rostools / manifesto, commit 7efc124d
Authored Jan 14, 2020 by Luke Johnston

Merge branch 'manifesto-suggestions' into 'master'

Manifesto suggestions for discussion. See merge request !7

Parents: f4b06ee8, 72324b3f
Pipeline #109367231 passed with stage in 5 minutes and 40 seconds
Showing 4 changed files with 81 additions and 68 deletions:

index.Rmd (+2, -2)
introduction.Rmd (+15, -16)
preamble.tex (+1, -1)
recommendations.Rmd (+63, -49)
index.Rmd
 ---
-title: "A Generalized and Structured Analytical Workflow (GSAW) for Reproducible and Openly Scientific (ROS) Projects"
+title: "A Generalized and Structured Analytical Workflow for Reproducible and Openly Scientific Projects"
 author:
   - "Luke Johnston"
   - "Joel Östblom"
...
@@ -11,7 +11,7 @@ documentclass: book
 biblio-style: apalike
 link-citations: yes
 # github-repo:
-description: "Heavily Opinionated Manifesto on Reproducible and Open Science Projects"
+description: "A Heavily Opinionated Manifesto on Reproducible and Open Science Projects"
 ---
 \mainmatter
...
introduction.Rmd
 # Manifesto for ROS Projects {#ros-manifesto}
-TODO: Need to incorporate the GSAW or other acronym throughout the manifesto
 ```{r, child="preamble-note.md"}
 ```
...
@@ -28,7 +26,7 @@ more important in this changing scientific landscape.
 These new demands should herald in better approaches to doing science, such as
 greater training in computational aspects of research, data management, and
 dissemination of findings. While there are small pockets of change adapting to
-this new landscape, this however, is the exception and mainstream academic
+this new landscape these are the exceptions and mainstream academic
 science continues as it has for decades. Academia still (obsessively) rewards
 publications as the currency for promotion, funding, and achievement. Since
 following open and reproducible scientific practices is presently extremely
...
@@ -37,7 +35,7 @@ is currently little incentive to do these practices as that would reduce the
 effective number of publications produced in any given amount of time. Therefore,
 until the current obsession with publication numbers declines, efforts to
 simplify and make doing open and reproducible science (ROS) accessible and
-(relatively easily) acheivable are a way to increase adherence and acceptance of
+(relatively easily) achievable are a way to increase adherence and acceptance of
 these practices. Even without the current incentive structure, simplifying the
 process for doing these "ROS" practices would regardless be beneficial to
 scientists given the high expectations placed on scientists already.
...
@@ -47,7 +45,7 @@ scientists given the high expectations placed on scientists already.
 There are many benefits to adopting ROS practices for research and scientific
 activities. Publishing findings under an open access license increases exposure
 to the public, both via media and direct download, and also increases the number
-of scientist that may end up benefitting from the findings. Being open with the
+of scientist that may end up benefiting from the findings. Being open with the
 data and the analysis code increases the transparency and reproducibility of the
 results and facilitates in assessing the validity of any claims made in the
 paper, improving the scientific rigor and strength of the study.
...
@@ -83,7 +81,7 @@ There seems to be two main problems with this lack of integration and uptake of
 doing ROS. One, there are not many opinionated workflow tools that try to
 automate and simplify many aspects of ROS. Two, the documentation on many of
 these ROS tools and services is often incomplete, not comprehensive enough, or
-not effectively targetted to the end user who is likely completely unfamiliar
+not effectively targeted to the end user who is likely completely unfamiliar
 with many of the ROS terms and concepts. There are other reasons for
 non-adherence to ROS practices, such as the aforementioned lack of incentive
 structures. However, these are massive systemic problems that can only be
...
@@ -108,8 +106,8 @@ Our philosophy is to encourage reproducible and open scientific practices by
 automating and streamlining many aspects of a ROS project and by providing an
 opinionated view on which tools, services, and workflows to use when doing
 research. The goal is to reduce the burden on researchers and lower the barrier
-to doing open and reproducible science by creating a Generalized and Structured
-Analytical Workflow (GSAW).
+to doing open and reproducible science by creating a generalized and structured
+analytical workflow and approach to research projects.
 For now, we are focusing on typical scientific activities such as creating
 abstracts, slides, posters, and manuscripts. We aim to incorporate creating
...
@@ -134,15 +132,16 @@ not-for-profit (or at least have a strong history of supporting open source and
 open science activities)
 - Should be actively developed and well-maintained
 - Should have well-developed documentation, resources, and learning material
-- The company, organization, or community responsible for the tools or services
-should be ethical, have strong principles in favour of openness, and be a strong
-advocate and supporter of fairness and equity
+- (Optional) Preferably, the company, organization, or community responsible
+for the tools or services should have strong principles, policies, and actions
+in favour of openness and be a strong advocate and supporter of fairness and
+equity
 [open source]: https://opensource.org/osd
 When a tool and/or service is mostly equal, consider that:
-- The design focuses and emphasizes simplicity, useability, and accessibility
+- The design focuses and emphasizes simplicity, usability, and accessibility
 - It is already widely used and accepted within the ROS community
 - Has a system to allow easy programmatic access (e.g. has a public [API])
...
@@ -150,12 +149,12 @@ When a tool and/or service is mostly equal, consider that:
 ### Guiding principles on workflow and processes
-Likewise, for the analysis and workflow (the GSAW) aspect of ROS, we follow
+Likewise, for the analysis and workflow aspect of ROS, we follow
 these guiding principles:
 - Favour readability over concision
 - Favour well-established infrastructures and approaches
-- Be internally consistent in file naming, code style syntax, and language
+- Be internally consistent in file names, code style syntax, and language
 - Consider and abide by privacy rules and laws (e.g. [GDPR] in Europe)
 - Use and adhere to existing checklists (e.g. [STROBE] in epidemiology)
 - Favour approaches that explicitly show steps taken from data to final
...
@@ -191,7 +190,7 @@ being in the "Advanced" stage.
 ### Phases of a research project
-To help navigate the recommendations and steps for a GSAW-ROS project, phases
+To help navigate the recommendations and steps for a ROS project, phases
 of a research project are split into:
 - Project management throughout (specifically regarding files, folders)
...
@@ -202,7 +201,7 @@ of a research project are split into:
 - Dissemination
 All current and future tools, services, and workflows incorporated into a
-GSAW-ROS project template must be based on these guiding principles and
+ROS project template must be based on these guiding principles and
 considerations.
 TODO: Include guiding principles for creating teaching material
preamble.tex
...
@@ -5,7 +5,7 @@
 \title{
 \normalfont
 \horrule{1pt} \\[0.4cm]
-\huge A Generalized and Structured Analytical Workflow (GSAW) for Reproducible and Openly Scientific (ROS) Projects
+\huge A Generalized and Structured Analytical Workflow for Reproducible and Openly Scientific Projects
 \horrule{1pt} \\[0.5cm]
 }
...
recommendations.Rmd
# Specific recommendations {#recommendations}
```{r, child="preamble-note.md"}
...
...
@@ -12,13 +11,13 @@ some comparisons between options.
 Open science encompasses a vast number of diverse tools and services that is
 continuously increasing. This encouraging growth indicates that open science is
 actively evolving and that there is a rich network of people and organizations
-devoted to improving current scientific practices. A downside to this plenty is
-that is can act as a barrier for researchers who desire to work more in the
+devoted to improving current scientific practices. A downside to this abundance is
+that it can act as a barrier for researchers who desire to work more in the
 open. The range of tool choices and the lack of guidance on what to use
 particularly risks to overwhelm and discourage researchers seeking to open up
 their workflow for the first time.
-To provide a solution to this problem, the GSAW-ROS framework provides heavily
+To provide a solution to this problem, the ROS framework provides heavily
 opinionated recommendations on open tools, workflows, and services. Below is a
 brief summary of the specific recommendations we make, followed by more detailed
 explanations and comparisons between tools, services, and workflows.
...
@@ -28,16 +27,27 @@ explanations and comparisons between tools, services, and workflows.
 - **File management and version control**: [Git], combined with [GitHub] or [GitLab]
 - **Statistical and/or programming language**: [R] or [Python]
 - **For writing documents**: [Pandoc Markdown] (e.g. [R Markdown])
-- **Analytic and writing platform**: [RStudio] (for R) or [JupyterLab] (for Python)
+- **Analytic platform**: [RStudio] (for R) or [JupyterLab] (for Python)
+- **Writing platform**: [RStudio]
 - **Dissemination** for getting a DOI and for discoverability:
     - **Code and other project files**: [Zenodo]
-    - **Preprint manuscripts**: [bioRxiv] or [PeerJ Preprints] or [OSF Preprints]
+    - **Preprint manuscripts**: [bioRxiv], [medRxiv], or [OSF Preprints]
     - **Posters**: [figshare] or [PeerJ Preprints]
     - **Slides**: ??? [figshare]?
-- **All activities**: For R projects, preferably everything is done in [RStudio].
+- **All activities**: For R projects, preferably everything is done in
+[RStudio]. See the [workflow section](#workflow) below for more detail. For
+Python projects the environment is a bit more complicated and we are still
+thinking through how it would look.
+<!--
 For Python projects, most work can be done in [JupyterLab], however other tools
-will also need to be used. See the [workflow section](#workflow) below for more
-detail.
+will also need to be used.
+For Jupyter Notebook, it might make sense to always have Rmd as the backend and
+then use RStudio as a Markdown and Git GUI. There is no other platform with as
+much support for different publishing option through a GUI, so I think it will
+be used for writing. For git, there is git kraken and nbdime and git jupyterlab
+exteisnion as an alternative.
+-->
 [Git]: https://git-scm.com/
 [GitHub]: https://github.com/
...
@@ -50,7 +60,7 @@ detail.
 [JupyterLab]: https://jupyterlab.readthedocs.io/en/stable/
 [Zenodo]: https://zenodo.org/
 [bioRxiv]: https://www.biorxiv.org/
-[PeerJ Preprints]: https://peerj.com/preprints/
+[medRxiv]: https://www.medrxiv.org/
 [OSF Preprints]: https://osf.io/preprints/
 [figshare]: https://figshare.com/
...
@@ -151,8 +161,9 @@ then excludes it from being part of a ROS workflow.
 There are many programming and statistical computing languages available,
 both open source and proprietary. However, of them all we recommend using [R]
-and [Python]. Both languages are open source, have active and (mostly) welcoming
-communities, have very well developed packages and extensions for all types of
+and [Python]. Both languages are open source, have active communities, are
+working at being more welcoming and inclusive,
+have very well developed packages and extensions for all types of
 analyses projects, are well maintained and documented, are (mostly) readable,
 are widely used in the scientific community, and are the two most widely used
 languages in the world for data science. The R community in particular is very
...
@@ -179,49 +190,52 @@ researcher's institution can't afford a license, the text of that document will
 be inaccessible.
 More commonly, if one finds a document written using an older version of the
-software (e.g. `.doc` vs `.docx`), there is no guarantee it can be opened in the
-new version of the software. Documents can only be opened by people who can
-afford to purchase the products sold by the vendor. Opening the same document in
-different versions of the same software or on different computers could render
-different results (such as when opening a Windows PowerPoint presentation on a
-Mac). Storing either data or manuscripts in such formats means that they can be
-lost forever or could be inaccessible to certain groups of people. In contrast,
-writing in an open, text-based source format means that the document can be
-opened by anyone with access to a computer or mobile device.
-There are several plain-text "[markup language]" formates, such as [LaTeX] or
-[HTML]. However, there are major drawbacks to these "languages", including the
-difficulty and effort required to learn them. Luckily, there is the
-[Markdown] format which is simple to learn and to use. Since Markdown is just
-plain text, changes can also be easily tracked using [Git] and collaboration can
-happen on [GitHub] or [GitLab]. There are also promising online text editors
-emerging which support Markdown with track changes to ease the transition for
-those not wanting to learn GitHub, e.g. Authorea. Plus, when using
-[Pandoc Markdown], the document can be converted to a large range of output
-formats, including Word `.docx`, beautifully typeset [LaTeX] PDFs, or web friendly
-[HTML] files. Markdown also has features typically required of scientific
-writing such as citation and bibliography insertion (including plain text
-formats such as [BibTeX]). Taken together Markdown is a simple, powerful plain
-text format that ensures documents will stick around well into the future.
+software (e.g. `.doc` vs `.docx`), there is no guarantee it can be opened in
+the new version of the software. Opening the same document in different
+versions of the same software or on different computers could render different
+results (such as when opening a Windows PowerPoint presentation on a Mac).
+Documents can only be opened by people who can afford to purchase the products
+sold by the vendor. Storing either data or manuscripts in such formats means
+that they can be lost forever or could be inaccessible to certain groups of
+people. In contrast, writing in an open, text-based source format means that
+the document can be opened by anyone with access to a computer or mobile
+device.
+Open, text-based formats are commonly referred to as plain text documents.
+Although plain text itself cannot be formatted into headings, bold font, etc;
+the addition of text-based markup, such as `*` or `[bold]` surrounding a word,
+enables text editors to display plain text as formatted documents. There are
+several plain-text "[markup language]", such as [LaTeX] or [HTML], but many of
+these have verbose markup that make them inefficient to type and difficult to
+learn. [Markdown] is a markup language that was designed from markup
+conventions used over email so it is simple to learn and easy to type. A flavor
+of Markdown called [Pandoc Markdown]) is specialized on scholarly communication
+and support features required in scientific writing such as automatic figure
+referencing, in text citations, and bibliography insertion (including plain
+text formats such as [BibTeX]). [Pandoc Markdown] documents can also be
+converted to a large range of output formats, including Word `.docx`,
+beautifully typeset [LaTeX] PDFs, or web friendly [HTML] files. [R Markdown] is
+an extension of [Pandoc Markdown] that allows R and Python code to be executed
+within and inserted into a document, increasing document-level reproducibility.
+Since Markdown is just plain text, changes can be easily tracked using [Git]
+and collaboration can happen on [GitHub] or [GitLab]. There are also promising
+online text editors emerging which support Markdown with track changes to ease
+the transition for people used to conventional word processors, e.g. [Authorea]
+and [Stencila]. Taken together the Markdown format is an open plain text format
+that is accessible and usable on all operating systems, has an active community
+of users, is well maintained and documented (e.g. the [Pandoc Markdown] manual
+or the [R Markdown Book]), can be converted in a wide range of document types
+(see the [Pandoc Markdown] about page for examples), is designed for simplicity
+and readability, and has flavors dedicated to scholarly communication.
 [HTML]: https://en.wikipedia.org/wiki/HTML
 [LaTeX]: https://www.latex-project.org/
 [markup language]: https://en.wikipedia.org/wiki/Markup_language
 [Markdown]: https://en.wikipedia.org/wiki/Markdown
 [BibTeX]: https://en.wikipedia.org/wiki/BibTeX
-TODO: Determine if there is a "Python Markdown" available
-The Markdown format is open source, is simple plain text (accessible and usable
-on all operating systems and versions), has an active community of users, is
-well maintained and documented (e.g. the [Pandoc Markdown] or the [R Markdown
-Book]), can be converted in a wide range of document types (see the [Pandoc
-Markdown] about page for examples), is designed for simplicity and readability.
-For R based projects, use [R Markdown], an extension of Markdown that allows R
-code to be executed within and inserted into a document, increasing
-document-level reproducibility. [RStudio] offers a great environment to write R
-Markdown.
+[Authorea]: https://www.authorea.com/
+[Stencila]: https://stenci.la/
+[R Markdown Book]: https://bookdown.org/yihui/rmarkdown/
 #### Dissemination phase
...