├── .github └── workflows │ └── slides.yml ├── .gitignore ├── CODE_OF_CONDUCT.md ├── CONTRIBUTING.md ├── LICENSE.md ├── README.md ├── answers ├── module_3 │ ├── ex1.yml │ ├── ex2.yml │ ├── ex3.yml │ └── exercise-module-3.txt └── module_4 │ ├── ex1.yml │ └── ex2.yml ├── book_module4 ├── _toc.yml ├── figures │ ├── ReproducibleMatrix.jpg │ ├── favicon-32x32.png │ ├── logo.png │ ├── open_access_citatations.jpg │ ├── open_umbrella.png │ ├── reasons_reproducibility.png │ ├── reproducibility.jpg │ └── welcome.jpg ├── open │ ├── open-access.md │ ├── open-data.md │ ├── open-hardware.md │ ├── open-notebooks.md │ ├── open-scholarship.md │ ├── open-source.md │ └── open.md ├── overview │ ├── overview-benefit.md │ ├── overview-definitions.md │ ├── overview-resources.md │ └── overview.md ├── references.bib ├── reproducible-research.md └── welcome.md ├── book_module5 ├── _config.yml ├── _toc.yml ├── demo.ipynb ├── figures │ ├── ReproducibleMatrix.jpg │ ├── favicon-32x32.png │ ├── logo.png │ ├── open_access_citatations.jpg │ ├── open_umbrella.png │ ├── reasons_reproducibility.png │ ├── reproducibility.jpg │ └── welcome.jpg ├── open │ ├── open-access.md │ ├── open-data.md │ ├── open-hardware.md │ ├── open-notebooks.md │ ├── open-scholarship.md │ ├── open-source.md │ └── open.md ├── overview │ ├── overview-benefit.md │ ├── overview-definitions.md │ ├── overview-resources.md │ └── overview.md ├── references.bib ├── reproducible-research.md └── welcome.md ├── book_module6 ├── _config.yml ├── _toc.yml ├── demo.ipynb ├── figures │ ├── ReproducibleMatrix.jpg │ ├── favicon-32x32.png │ ├── logo.png │ ├── open_access_citatations.jpg │ ├── open_umbrella.png │ ├── reasons_reproducibility.png │ ├── reproducibility.jpg │ └── welcome.jpg ├── open │ ├── open-access.md │ ├── open-data.md │ ├── open-hardware.md │ ├── open-notebooks.md │ ├── open-scholarship.md │ ├── open-source.md │ └── open.md ├── overview │ ├── overview-benefit.md │ ├── overview-definitions.md │ ├── overview-resources.md │ └── 
overview.md ├── references.bib ├── reproducible-research.md └── welcome.md ├── content ├── bibliography.md ├── demo.ipynb ├── demo_2.ipynb ├── figures │ ├── ReproducibleMatrix.jpg │ ├── favicon-32x32.png │ ├── logo.png │ ├── open_access_citatations.jpg │ ├── open_umbrella.png │ ├── reasons_reproducibility.png │ ├── reproducibility.jpg │ └── welcome.jpg ├── open │ ├── open-access.md │ ├── open-data.md │ ├── open-hardware.md │ ├── open-notebooks.md │ ├── open-scholarship.md │ ├── open-source.md │ └── open.md ├── overview │ ├── overview-benefit.md │ ├── overview-definitions.md │ ├── overview-resources.md │ └── overview.md ├── references.bib ├── reproducible-research.md └── welcome.md ├── data └── README_data.md ├── images ├── MalvikaSharan.jpg ├── MartinaVilas.jpg ├── README_images.md ├── SarahGibson.jpg ├── cell_options.png ├── create_assignment.png ├── github-button-turing.png ├── graded_exercise_sample.png ├── jupyterbook-jc2020.png ├── reproducible-jc2020.jpg ├── theturingway-intro-jc2020.png └── theturingway-navigation-jc2020.png ├── notebooks ├── 1-welcome.ipynb ├── 2-introduction.ipynb ├── 3-setup-jupyterbook.ipynb ├── 4-config-jupyterbook.ipynb ├── 5-more-jupyterbook.ipynb ├── 6-ci-jupyterbook.ipynb ├── 7-final-demo.ipynb └── README.md ├── postBuild ├── presentation ├── README.md ├── module_1_presentation.pdf ├── module_2.1_presentation.pdf ├── module_2.2_presentation.pdf ├── module_3_presentation.pdf ├── module_4_presentation.pdf ├── module_5_presentation.pdf ├── module_6_presentation.pdf └── module_7_presentation.pdf ├── requirements.txt └── slides └── README_slides.md /.github/workflows/slides.yml: -------------------------------------------------------------------------------- 1 | name: deploy-slides 2 | 3 | # Only run this when the master branch changes 4 | on: 5 | push: 6 | branches: 7 | - master 8 | 9 | # This job installs dependencies, builds the slides, and pushes it to `gh-pages` 10 | jobs: 11 | deploy-book: 12 | runs-on: ubuntu-latest 13 | steps: 14 
| - uses: actions/checkout@v2 15 | 16 | # Install dependencies 17 | - name: Set up Python 3.7 18 | uses: actions/setup-python@v1 19 | with: 20 | python-version: 3.7 21 | 22 | - name: Install dependencies 23 | run: | 24 | pip install -r requirements.txt 25 | 26 | 27 | 28 | # Build the slides 29 | - name: Build the slides 30 | run: | 31 | cd slides 32 | mkdir html 33 | jupytext README_slides.md --to ipynb 34 | jupyter nbconvert README_slides.ipynb --to slides --SlidesExporter.reveal_theme=solarized --stdout > html/index.html 35 | 36 | # Push the book's HTML to github-pages 37 | - name: GitHub Pages action 38 | uses: peaceiris/actions-gh-pages@v3.5.9 39 | with: 40 | github_token: ${{ secrets.GITHUB_TOKEN }} 41 | publish_dir: slides/html 42 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Ignore jupyter notebook checkpoints 2 | .ipynb_checkpoints 3 | 4 | # Ignore local build of the book 5 | _build/ 6 | 7 | # Ignore files that should only be created during tutorial 8 | book/ 9 | book_module4/_config.yml -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Code of Conduct for the tutorial repository for "Building a Jupyter Book with *The Turing Way*". 2 | 3 | As a community-oriented project we welcome everyone, and encourage a friendly and positive environment. 🌺 4 | 5 | This code of conduct outlines our expectations for participants within the 6 | community, as well as steps to reporting unacceptable behavior. We are committed 7 | to providing a welcoming and inspiring community for all and expect our code of 8 | conduct to be honored. Anyone who violates this code of conduct may be banned 9 | from the community. 
10 | 11 | Our open source community strives to: 12 | 13 | - **Be friendly and patient.** 14 | 15 | - **Be welcoming**: We strive to be a community that welcomes and supports 16 | people of all backgrounds and identities. This includes, but is not limited to 17 | members of any race, ethnicity, culture, national origin, colour, immigration 18 | status, social and economic class, educational level, sex, sexual orientation, 19 | gender identity and expression, age, size, family status, political belief, 20 | religion, and mental and physical ability. 21 | 22 | - **Be considerate**: Your work will be used by other people, and you in turn 23 | will depend on the work of others. Any decision you take will affect users and 24 | colleagues, and you should take those consequences into account when making 25 | decisions. Remember that we're a world-wide community, so you might not be 26 | communicating in someone else's primary language. 27 | 28 | - **Be respectful**: Not all of us will agree all the time, but disagreement is 29 | no excuse for poor behavior and poor manners. We might all experience some 30 | frustration now and then, but we cannot allow that frustration to turn into a 31 | personal attack. It’s important to remember that a community where people feel 32 | uncomfortable or threatened is not a productive one. 33 | 34 | - **Be careful in the words that we choose**: We are a community of 35 | professionals, and we conduct ourselves professionally. Be kind to others. Do 36 | not insult or put down other participants. Harassment and other exclusionary 37 | behavior aren't acceptable. 
This includes, but is not limited to: Violent 38 | threats or language directed against another person, Discriminatory jokes and 39 | language, Posting sexually explicit or violent material, Posting (or 40 | threatening to post) other people’s personally identifying information 41 | (“doxing”), Personal insults, especially those using racist or sexist terms, 42 | Unwelcome sexual attention, Advocating for, or encouraging, any of the above 43 | behavior, Repeated harassment of others. In general, if someone asks you to 44 | stop, then stop. 45 | 46 | - **Try to understand why we disagree**: Disagreements, both social and 47 | technical, happen all the time. It is important that we resolve disagreements 48 | and differing views constructively. Remember that we’re different. Diversity 49 | contributes to the strength of our community, which is composed of people from 50 | a wide range of backgrounds. Different people have different perspectives on 51 | issues. Being unable to understand why someone holds a viewpoint doesn’t mean 52 | that they’re wrong. Don’t forget that it is human to err and blaming each 53 | other doesn’t get us anywhere. Instead, focus on helping to resolve issues and 54 | learning from mistakes. 55 | 56 | ### Diversity Statement 57 | 58 | We encourage everyone to participate and are committed to building a community 59 | for all. Although we will fail at times, we seek to treat everyone both as 60 | fairly and equally as possible. Whenever a participant has made a mistake, we 61 | expect them to take responsibility for it. If someone has been harmed or 62 | offended, it is our responsibility to listen carefully and respectfully, and do 63 | our best to right the wrong. 
64 | 65 | Although this list cannot be exhaustive, we explicitly honor diversity in age, 66 | gender, gender identity or expression, culture, ethnicity, language, national 67 | origin, political beliefs, profession, race, religion, sexual orientation, 68 | socioeconomic status, and technical ability. We will not tolerate discrimination 69 | based on any of the protected characteristics above, including participants with 70 | disabilities. 71 | 72 | ### Reporting Issues 73 | 74 | If you experience or witness unacceptable behavior, or have any other concerns, 75 | please report it by contacting the project developers and maintainers: 76 | Malvika Sharan (email: msharan@turing.ac.uk), Martina G. Vilas (email: martinagonzalezvilas@gmail.com) 77 | and Sarah Gibson (sgibson@turing.ac.uk). 78 | 79 | To report an issue involving one of the members, please email other members individually. 80 | 81 | All reports will be handled with discretion. In your report please include: 82 | 83 | - Your contact information. 84 | 85 | - Names (real, nicknames, or pseudonyms) of any individuals involved. If there 86 | are additional witnesses, please include them as well. Your account of what 87 | occurred, and if you believe the incident is ongoing. If there is a publicly 88 | available record (e.g. a mailing list archive or a public IRC logger), please 89 | include a link. 90 | 91 | - Any additional information that may be helpful. 92 | 93 | After filing a report, a representative will contact you personally, review the 94 | incident, follow up with any additional questions, and make a decision as to how 95 | to respond. If the person who is harassing you is part of the response team, 96 | they will recuse themselves from handling your incident. If the complaint 97 | originates from a member of the response team, it will be handled by a different 98 | member of the response team. We will respect confidentiality requests for the 99 | purpose of protecting victims of abuse. 
100 | 101 | ### Attribution & Acknowledgements 102 | 103 | This code of conduct is based on the Open Code of Conduct from the [TODO Group](https://github.com/todogroup/opencodeofconduct/). 104 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # How to contribute? 2 | 3 | First of all, thanks for taking the time to contribute! 🎉👍 4 | 5 | This repository is a place to share resources for the tutorial "Building a 6 | Jupyter Book with *The Turing Way*". We welcome all suggestions and contributions 7 | to improve this tutorial. 8 | You can report mistakes and errors, propose a topic for discussion, or offer resources 9 | to enhance the current modules. 10 | 11 | We have a [Code of Conduct](./CODE_OF_CONDUCT.md) that applies to all the activities 12 | on this repository. 13 | 14 | Whatever your background and availability, there is a way to contribute to this 15 | GitHub repository. 16 | 17 | 🏃 I'm busy, I only have 1 minute 18 | --- 19 | 20 | Get in touch with the developers and maintainers of this project by dropping an email 21 | to express your interest in getting involved. 22 | (Malvika Sharan (email: msharan@turing.ac.uk), Martina G. Vilas 23 | (email: martinagonzalezvilas@gmail.com) and/or Sarah Gibson (email: sgibson@turing.ac.uk)) 24 | 25 | ⏳ I've read/used the resources and now I have 5 minutes - tell me what I can do 26 | --- 27 | 28 | - Open an issue to report errors/bugs, share your feedback or suggest any improvement 29 | that can help other users of this tutorial. 30 | - Open a pull request to fix any error, clarify any part of this tutorial that is not clear or 31 | contribute examples.
32 | 33 | 🎉 I am committed to contributing to *The Turing Way* and Jupyter Book in the future 34 | --- 35 | 36 | Both projects have contribution guidelines that will help you get started as 37 | a contributor and get onboarded in the community. 38 | - [*The Turing Way* CONTRIBUTING.MD](https://github.com/alan-turing-institute/the-turing-way/blob/master/CONTRIBUTING.md) 39 | - [Jupyter Book Contributor's guide](https://jupyterbook.org/contribute/intro.html) 40 | 41 | Please visit their GitHub repositories for more details: 42 | - [*The Turing Way* GitHub repository](https://github.com/alan-turing-institute/the-turing-way) 43 | - [Jupyter Book GitHub repository](https://github.com/executablebooks/jupyter-book) 44 | 45 | 🛠 I am ready to contribute to the tutorial and/or the projects 46 | --- 47 | 48 | - For open tasks in this repository, please see the 49 | [Issues section of this repository](https://github.com/martinagvilas/jupytercon_tutorial/issues). 50 | - For open tasks in *The Turing Way* repository, please visit their 51 | [Issues section](https://github.com/alan-turing-institute/the-turing-way/issues). 52 | - For open tasks in the Jupyter Book repository, please visit their 53 | [Issues section](https://github.com/executablebooks/jupyter-book/issues). 54 | - Fix mistakes, errors or missing information in these repositories by opening a pull request 55 | - Read details on [how to open a pull request](https://opensource.guide/how-to-contribute/#opening-a-pull-request) 56 | - Submit trivial fixes (for example, a typo, a broken link or an obvious error) 57 | - Start work on a contribution that is already listed as an issue, or something you’ve already discussed 58 | - A pull request doesn’t have to represent finished work. It’s usually better to open a 59 | pull request early on, so others can watch or give feedback on your progress. 60 | You can mark it as a “WIP” (Work in Progress) in the subject line. You can always add more commits later.
61 | 62 | Acknowledgements 🙌 63 | --- 64 | 65 | This project is developed and maintained by Malvika Sharan, Martina Vilas and Sarah Gibson. 66 | The project leads of *The Turing Way*, Kirstie Whitaker, and Jupyter Book, Chris Holdgraf, 67 | have provided support in developing this tutorial. 68 | 69 | This work is licensed under a Creative Commons Attribution 4.0 International license. 70 | You are free to share and adapt the material for any purpose, even commercially, 71 | as long as you provide attribution (give appropriate credit, provide a link to the license, 72 | and indicate if changes were made) in any reasonable manner, but not in any way that suggests the 73 | licensor endorses you or your use, and with no additional restrictions. 74 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # JupyterCon Tutorial 2020 2 | 3 | - Website: https://jupytercon.com/ 4 | - Location: Online 5 | - Date: 5-9 October 2020 6 | 7 | ## Title: Creating a Jupyter Book with The Turing Way 8 | 9 | - **Subtitle:** Create a Jupyter Book from scratch using chapters from *The Turing Way* on Reproducible Research. 10 | - **Duration:** 2 - 4h (based on familiarity) 11 | - **Audience level**: Novice programmers/Intermediate GitHub users 12 | - **Prerequisite**: previous experience with version control, GitHub, Markdown, Jupyter Notebooks, basic bash commands and basic Python 13 | - **Tutorial files**: This tutorial is organised in 7 short modules that are developed in Jupyter Notebooks and paired with introductory videos. 14 | - Please visit the [notebooks directory](./notebooks) to find the tutorial files (named module-wise).
15 | - All the introductory videos are available in this [YouTube playlist](https://www.youtube.com/playlist?list=PLBxcQEfGu3Dmdo6oKg6o9V7Q_e7WSX-vu) (named and ordered module-wise) and the slides used in those presentations are provided in the [presentation directory](./presentation). 16 | 17 | ### Description 18 | 19 | **Short Description:** 20 | 21 | Jupyter Book is an open source project for building publication-ready online books with computational files. *The Turing Way* is a community-led book project on learning computational skills, which is hosted online as a Jupyter Book. In this tutorial, you will learn about the collaborative nature of both projects and create your own Jupyter Book using files and chapters from *The Turing Way* as examples. 22 | 23 | **Session detail:** 24 | 25 | The topics and activities to be covered in this workshop are the following: 26 | - We will start by introducing *The Turing Way* and Jupyter Book. 27 | - The session leads will present *The Turing Way* as a community-developed book project on research reproducibility, project design, communication, collaboration and ethics. 28 | - A demo of *The Turing Way*'s GitHub repository will explain how a Jupyter Book is created and hosted online. 29 | - A hands-on session will be carried out to create a Jupyter Book using *The Turing Way* chapters as examples. 30 | - We will explain what Continuous Integration (CI) is and how to deploy CI tests using GitHub Actions. 31 | - We will show the collaborative workflow of Jupyter Book that allows GitHub-based contributions by the users of the book. 32 | - The session will end with sharing details on how participants can gain further support when working with Jupyter Books and *The Turing Way*. 33 | 34 | ### Learning outcomes 35 | 36 | In this tutorial, our learners will: 37 | - Get introduced to *The Turing Way* and Jupyter Book projects as reproducible and collaborative platforms for community-developed computational resources.
38 | - Learn how to create and structure a Jupyter Book using example chapters from *The Turing Way*. 39 | - Configure and personalise their Jupyter Book locally and connect it to an online GitHub repository. 40 | - Learn how Jupyter Notebooks can be used as chapters and executed using Binder. 41 | - Learn what Continuous Integration (CI) and Continuous Deployment (CD) are and how to use them with GitHub Actions. 42 | - Get introduced to Sphinx-based features in Jupyter Book for citing external resources and cross-referencing its chapters. 43 | 44 | ### Instructor details 45 | 46 | - Name: Malvika Sharan 47 | - Title: Dr. 48 | - Organization: *The Turing Way*, The Alan Turing Institute, London, UK 49 | - Biography: Malvika is the community manager of *[The Turing Way](https://the-turing-way.netlify.app)* at [The Alan Turing Institute, UK](https://www.turing.ac.uk/). She works with its community of diverse members to develop resources and ways that can make data science accessible for a wider audience. 50 | Malvika has a PhD in Bioinformatics and worked at the European Molecular Biology Laboratory, Germany, an experience that helped her solidify her values as an Open Researcher and community builder. 51 | She is a co-founder of the [Open Life Science](https://openlifesci.org/) mentoring program, a fellow of the [Software Sustainability Institute](https://www.software.ac.uk/) and a board member of the [Open Bioinformatics Foundation](https://www.open-bio.org/event-awards/), where she focuses on training resources and fellowship programs to enhance the training, skill building and representation of marginalised groups in data science and bioinformatics. 52 | - Photo: [LINK](images/MalvikaSharan.jpg) 53 | 54 | - Name: Martina G. Vilas 55 | - Title: Ms.
56 | - Organization: Max Planck Institute for Empirical Aesthetics, Frankfurt, Germany 57 | - Biography: Martina is currently working at the [Max-Planck-Institute AE](https://www.aesthetics.mpg.de/en/the-institute/people/m-vilas.html), where she is conducting her research in cognitive neuroscience using computational modeling techniques. She is an open-science advocate who enjoys programming and contributing to open-source projects and communities. As a core contributor and maintainer, she provides infrastructure support for the *[The Turing Way](https://the-turing-way.netlify.app)* project. 58 | - Photo: [LINK](images/MartinaVilas.jpg) 59 | 60 | - Name: Sarah Gibson 61 | - Title: Dr. 62 | - Organization: The Alan Turing Institute, London, UK 63 | - Biography: Sarah is a Research Software Engineer at [The Alan Turing Institute, UK](https://www.turing.ac.uk/) where she implements software best practices to translate academic research into real-world solutions through the Turing’s collaborative network. As a maintainer and operator of the [Binder](https://mybinder.org/) project, she operates a [BinderHub](https://binderhub.readthedocs.io/en/latest/) cluster at the Turing. Sarah is a Software Sustainability Institute Fellow where she focuses on nurturing and diversifying the Binder community. She is also a core contributor to the *[The Turing Way](https://the-turing-way.netlify.app)* project. 64 | - Photo: [LINK](images/SarahGibson.jpg) 65 | 66 | ### Project Leads and video contributors for module 2 67 | 68 | - *The Turing Way*: Kirstie Whitaker, head of the Tools, Practices, and Systems research programme, The Alan Turing Institute, UK. 69 | - Jupyter Book: Chris Holdgraf, member of Project Jupyter and Binder, co-founder of The International Interactive Computing Collaboration (2i2c), USA.
70 | -------------------------------------------------------------------------------- /answers/module_3/ex1.yml: -------------------------------------------------------------------------------- 1 | - file: welcome 2 | - file: overview/overview 3 | title: Reproducibility Overview 4 | sections: 5 | - file: overview/overview-definitions 6 | title: Definitions 7 | - file: overview/overview-benefit 8 | title: Benefits 9 | - file: overview/overview-resources 10 | title: Resources -------------------------------------------------------------------------------- /answers/module_3/ex2.yml: -------------------------------------------------------------------------------- 1 | - file: welcome 2 | - part: Guide for Reproducible Research 3 | chapters: 4 | - file: overview/overview 5 | title: Overview 6 | sections: 7 | - file: overview/overview-definitions 8 | title: Definitions 9 | - file: overview/overview-benefit 10 | title: Benefits 11 | - file: overview/overview-resources 12 | title: Resources 13 | - file: open/open 14 | title: Open Research 15 | sections: 16 | - file: open/open-data 17 | title: Open Data 18 | - file: open/open-source 19 | title: Open Source 20 | - file: open/open-hardware 21 | title: Open Hardware 22 | - file: open/open-access 23 | title: Open Access 24 | - file: open/open-notebooks 25 | title: Open Notebooks 26 | - file: open/open-scholarship 27 | title: Open Scholarship -------------------------------------------------------------------------------- /answers/module_3/ex3.yml: -------------------------------------------------------------------------------- 1 | - file: welcome 2 | - file: reproducible-research 3 | chapters: 4 | - file: overview/overview 5 | title: Overview 6 | numbered: true 7 | sections: 8 | - file: overview/overview-definitions 9 | title: Definitions 10 | - file: overview/overview-benefit 11 | title: Benefits 12 | - file: overview/overview-resources 13 | title: Resources 14 | - file: open/open 15 | title: Open Research 16 | numbered: true 
17 | sections: 18 | - file: open/open-data 19 | title: Open Data 20 | - file: open/open-source 21 | title: Open Source 22 | - file: open/open-hardware 23 | title: Open Hardware 24 | - file: open/open-access 25 | title: Open Access 26 | - file: open/open-notebooks 27 | title: Open Notebooks 28 | - file: open/open-scholarship 29 | title: Open Scholarship -------------------------------------------------------------------------------- /answers/module_3/exercise-module-3.txt: -------------------------------------------------------------------------------- 1 | yaml = YAML() 2 | 3 | # Define the contents of our _toc.yml 4 | toc_document = """ 5 | - file: welcome 6 | - file: reproducible-research 7 | - file: overview/overview 8 | sections: 9 | - file: overview/overview-definitions.md 10 | - file: overview/overview-benefit.md 11 | - file: overview/overview-resources.md 12 | - file: open/open 13 | title: Open Research 14 | sections: 15 | - file: open/open-data 16 | - file: open/open-source 17 | - file: open/open-hardware 18 | - file: open/open-access 19 | - file: open/open-notebooks 20 | - file: open/open-scholarship 21 | """ 22 | 23 | # Save _toc.yml in the book directory 24 | toc_file = open('../book/_toc.yml', 'w') 25 | yaml.dump(yaml.load(toc_document), toc_file) 26 | -------------------------------------------------------------------------------- /answers/module_4/ex1.yml: -------------------------------------------------------------------------------- 1 | title : The Turing Way # The title of the book 2 | author : Reader! 
# The author of the book to be placed in the footer 3 | copyright : "2020" # Copyright year to be placed in the footer 4 | logo : "./figures/reproducibility.jpg" # A path to the book logo 5 | -------------------------------------------------------------------------------- /answers/module_4/ex2.yml: -------------------------------------------------------------------------------- 1 | title : The Turing Way # The title of the book 2 | author : The Turing Way Community # The author of the book to be placed in the footer 3 | copyright : "2020" # Copyright year to be placed in the footer 4 | logo : "./figures/logo.png" # A path to the book logo 5 | 6 | html: 7 | favicon : "./figures/favicon-32x32.png" # A path to a favicon image 8 | navbar_footer_text : 9 | 'Visit our GitHub Repository 10 |
11 | This book is powered by Jupyter Book 12 |
' # Will be displayed underneath the left navigation bar. 13 | home_page_in_navbar : false # Whether to include your home page in the left Navigation Bar -------------------------------------------------------------------------------- /book_module4/_toc.yml: -------------------------------------------------------------------------------- 1 | - file: welcome 2 | - file: reproducible-research 3 | title: Reproducibility Guide 4 | chapters: 5 | - file: overview/overview 6 | title: Overview 7 | sections: 8 | - file: overview/overview-definitions 9 | title: Definitions 10 | - file: overview/overview-benefit 11 | title: Benefits 12 | - file: overview/overview-resources 13 | title: Resources 14 | - file: open/open 15 | title: Open Research 16 | sections: 17 | - file: open/open-data 18 | title: Open Data 19 | - file: open/open-source 20 | title: Open Source 21 | - file: open/open-hardware 22 | title: Open Hardware 23 | - file: open/open-access 24 | title: Open Access 25 | - file: open/open-notebooks 26 | title: Open Notebooks 27 | - file: open/open-scholarship 28 | title: Open Scholarship -------------------------------------------------------------------------------- /book_module4/figures/ReproducibleMatrix.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/book_module4/figures/ReproducibleMatrix.jpg -------------------------------------------------------------------------------- /book_module4/figures/favicon-32x32.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/book_module4/figures/favicon-32x32.png -------------------------------------------------------------------------------- /book_module4/figures/logo.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/book_module4/figures/logo.png -------------------------------------------------------------------------------- /book_module4/figures/open_access_citatations.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/book_module4/figures/open_access_citatations.jpg -------------------------------------------------------------------------------- /book_module4/figures/open_umbrella.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/book_module4/figures/open_umbrella.png -------------------------------------------------------------------------------- /book_module4/figures/reasons_reproducibility.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/book_module4/figures/reasons_reproducibility.png -------------------------------------------------------------------------------- /book_module4/figures/reproducibility.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/book_module4/figures/reproducibility.jpg -------------------------------------------------------------------------------- /book_module4/figures/welcome.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/book_module4/figures/welcome.jpg -------------------------------------------------------------------------------- /book_module4/open/open-access.md: -------------------------------------------------------------------------------- 1 | # Open access 2 | 3 | ## What is open access? 4 | 5 | One of the most common ways to disseminate research results is by writing a manuscript and publishing it in a journal, conference proceedings or book. For many years those publications were available to the public only if purchased, either through a subscription fee or individually. 6 | However, new knowledge is built by synthesizing current scholarship and then building upon it. 7 | At the turn of the 21st century a new movement appeared with a clear objective: make all research results available to anyone interested in reading them, free of charge and with no technical obstacles such as mandatory registration or login to specific platforms. 8 | This movement took the name of Open access and established two initial strategies to achieve its final goal: self-archiving and open access publishing. 9 | 10 | ### Repositories and self-archiving 11 | 12 | The aim of the self-archiving movement is to provide tools and assistance to scholars to deposit their refereed journal articles in open electronic repositories. 13 | As a result of the first strategy we see self-archiving practices: researchers depositing and disseminating papers in institutional or subject-based repositories. 14 | There has also been a growth in the publication of preprints through institutional repositories and preprint servers. 15 | Preprints are widely used in the physical sciences and are now emerging in the life sciences and other fields. 16 | Preprints are documents that have not been peer reviewed but are considered a complete publication at a first stage.
17 | Some of the preprint servers include open peer review services and the ability to post new versions of the initial paper once reviewed by peers. 18 | 19 | At the beginning of 2019 more than 4000 repositories were available for researchers to self-archive their publications according to the [registry of open access repositories](http://roar.eprints.org/). 20 | In this list there are institutional repositories, subject based or thematic repositories, and harvesters. 21 | Institutional repositories are generally managed by research performing institutions to give their community a place to archive and openly share papers and other research outputs. 22 | Subject based repositories are usually managed by research communities and most of the contents are related to a certain discipline. 23 | Finally, harvesters aggregate content from different repositories becoming sites to perform general searches and build other value-added services. 24 | 25 | When choosing a journal to publish research results, researchers should take a moment to read the journal policy regarding the transfer of copyright. 26 | Many journals still require for publication that authors transfer full copyright. 27 | This transfer of rights implies that authors must ask for permission to reuse their own work beyond what is allowed by the applicable law, unless there are some uses already granted. 28 | Such granted uses may include teaching purposes, sharing with colleagues, and self-archiving by researchers of their papers in repositories. 29 | Sometimes there is a common policy among all the journals published by the same publisher, but in general journals have their own policy, especially when they are published on behalf of a scientific society. 30 | When looking at the self-archiving conditions we must identify two key issues: the version of the paper that can be deposited and when it can be made publicly available. 
31 | 32 | Regarding the version, some journals allow the dissemination of the submitted version, also known as a preprint, and they allow its replacement with a reviewed version once the final paper has been published. 33 | Due to the increase of policies requiring access to research results, most of the journals allow self-archiving of the accepted version of the paper, also known as the author manuscript or postprint. 34 | This version is the final text once the peer review process has ended but it does not have the final layout of the publication. 35 | Finally, some journals do allow researchers to deposit the final published version, also known as the version of record. 36 | 37 | In relation to the moment to make the paper publicly available, many journals establish a period of time from its original publication - the embargo period, which can range from zero to 60 months - when making the paper publicly available is not permitted. 38 | Some journals include or exclude embargoes depending on the versions. 39 | For instance the accepted version could be made publicly available after publication but the published version must wait 12 months. 40 | 41 | ### Open access publishing 42 | 43 | Open access publishing attempts to ensure permanent open access to all the articles published in journals, and as a result we have seen the creation of open access journals. 44 | The number of open access journals has increased in recent years; according to the Directory of Open access Journals \([DOAJ](http://www.doaj.org)\), there are currently more than 12,000. 45 | An open access journal must provide free access to its contents, and it must also licence them to allow reusability. 46 | 47 | Currently many paywalled journals offer individual open access options to researchers once the paper is accepted after peer review. 48 | Those options include the publication under a free content licence and free accessibility to anyone from its first publication. 
49 | This model is commonly known as the hybrid model because in the same issue of a journal, readers can find open access and paywalled contributions. 50 | Usually publishers ask for a fee to open individual contributions. 51 | 52 | Open access publishing has two primary versions — gratis and libre. 53 | Gratis open access is simply making research available for others to read without having to pay for it. 54 | However, it does not grant the user the right to make copies, distribute, or modify the work in any way beyond fair use. 55 | Libre open access is gratis, meaning the research is available free of charge, but it goes further by granting users additional rights, usually via a Creative Commons licence, so that people are free to reuse and remix the research. 56 | There are varying degrees of what may be considered libre open access. 57 | For example, some scholarly articles may permit all uses except commercial use, some may permit all uses except derivative works, and some may permit all uses and simply require attribution. 58 | While some would argue that libre open access should be free of any copyright restrictions (except attribution), other scholars consider a work that removes at least some permission barriers to be libre. 59 | 60 | ## Why does open access matter? 61 | 62 | Research is useless if it’s not shared; even the best research is ineffectual if others aren’t able to read and build on it. 63 | When price barriers keep articles locked away, research cannot achieve its full potential. 64 | Open access benefits researchers who can work more effectively with a better understanding of the literature. 65 | It also helps avoid duplication of effort. 66 | No researcher (or funder) wants to waste time and money conducting a study if they know it has been attempted elsewhere. 67 | But, duplication of effort is all-too-possible when researchers can’t effectively communicate with one another and make results known to others in their field and beyond. 
68 | It also benefits researchers by providing better visibility and therefore higher impact/citation rate for their scholarship. 69 | Numerous publishers, both non-profit and for-profit, voluntarily make their articles openly available at the time of publication or within 6-12 months. 70 | Many have switched from a closed, subscription model to an open one as a strategic business decision to increase their journal's exposure and impact. 71 | Further it can be argued that taxpayers who pay for much of the research published in journals have a right to access the information resulting from that investment without charge. 72 | Finally, if research is available to the widest possible pool of readers then it is more likely/easy for it to be checked and reproduced. 73 | 74 | ## Best practice for open access 75 | 76 | ### Self-archiving 77 | 78 | Self-archive a publication in a suitable repository, institutional or subject-based, following the possible restrictions posed by the publisher, for example an embargo period, or limits on the allowed version to be deposited in such archives. 79 | In doing this it is important to make sure you are aware of the copyright implications of any documents/agreements you make when submitting your manuscript to a journal. 80 | If your institution does not have an institutional repository, advocate for the creation of one. 81 | You can check journal policies on self-archiving using [SHERPA/RoMEO](http://www.sherpa.ac.uk/romeo/index.php). 82 | 83 | ### Publication 84 | 85 | Consider submitting your work to a journal that is open access. 86 | When doing this be aware that there may be funds or discounts available to cover any associated costs. 
87 | -------------------------------------------------------------------------------- /book_module4/open/open-notebooks.md: -------------------------------------------------------------------------------- 1 | # Open notebooks 2 | 3 | Electronic Lab Notebooks (ELNs) enable researchers to organize and store experimental procedures, protocols, plans, notes, data, and even unfiltered interpretations using their computer or mobile device. 4 | They are a digital analogue to the paper notebook most researchers keep. 5 | ELNs can offer several advantages over the traditional paper notebook in documenting research during the active phase of a project, including searchability within and across notebooks, secure storage with multiple redundancies, remote access to notebooks, and the ability to easily share notebooks among team members and collaborators. 6 | 7 | Open notebook research is simply the practice of making such notebooks openly available, usually online. 8 | Some researchers choose to keep their notebooks open from the very beginning of their projects. 9 | Rather than wait months, even years, to share their research through journal publication as is the current practice, this allows researchers to post their experimental data and protocols online and in real-time. 10 | Sharing research in this open and timely manner helps to reduce duplication of work, helps foster new collaborations, and cultivates a more open dialogue with others. 11 | It also helps researchers avoid exploring dead ends and making mistakes that have already been covered by their colleagues, but went unpublished because of lack of scientific interest. 12 | 13 | Open notebooks have the further benefit of increasing the quality of scientific outputs by forcing researchers to be careful, thorough, and explicit. 14 | Making research open has the added benefit of increasing the likelihood that any errors made in an investigation will be spotted quickly, instead of down the line. 
15 | Immediate fixes will have much less impact on a research project, which will save a researcher time, the lab money, and pride. 16 | 17 | Ideally, every scientist would maintain an open notebook in real-time which would encompass all aspects of their research. 18 | But fears about complete open access, conflicts with intellectual property and publications, and online data overload hamper this movement. 19 | To combat this, practitioners encourage any form of open notebook research, "make open what you can", even if that means uploading some information for a project from many years ago that never saw the light of day. 20 | -------------------------------------------------------------------------------- /book_module4/open/open-scholarship.md: -------------------------------------------------------------------------------- 1 | # Open scholarship 2 | 3 | Open research and its subcomponents fit under the umbrella of a broader concept - open scholarship. 4 | 5 | ![open_umbrella](../figures/open_umbrella.png) 6 | 7 | ## Open educational resources 8 | 9 | Open educational resources (OERs) are teaching and learning materials that can be freely used and reused for learning and/or teaching at no cost, and without needing to ask permission. Examples are courses, including Massive Open Online Courses (MOOCs), lectures, teaching materials, assignments, and various other resources. OERs are available in many different formats compatible with online usage, most obviously text, images, audio, and video. Anyone with internet access can access and use OERs; access is not dependent on location or membership of a particular institution. 10 | 11 | Unlike copyrighted resources, OERs have been authored or created by an individual or organization that chooses to retain few, if any, ownership rights. In some cases, that means anyone can download a resource and share it with colleagues and students. 
In other cases, this may go further and enable people to edit resources and then re-post them as a remixed work. How do you know your options? OERs often have a Creative Commons licence or other permission to let you know how the material may be used, reused, adapted, and shared. 12 | 13 | Fully open OERs comply with the 5 Rs: 14 | 15 | - Retain: the right to make, own, and control copies of the content. 16 | - Reuse: the right to use the content in a wide range of ways (for example, in a class, in a study group, on a website, in a video). 17 | - Revise: the right to adapt, adjust, modify, or alter the content itself (for example, translate the content into another language). 18 | - Remix: the right to combine the original or revised content with other open content to create something new (for example, incorporate the content into a mashup). 19 | - Redistribute: the right to share copies of the original content, your revisions, or your remixes with others (for example, give a copy of the content to a friend). 20 | 21 | Researchers generate a great deal of educational resources in the course of teaching students and each other (at workshops, for example). 22 | By making these openly available, for example in the [open educational resource commons](https://www.oercommons.org/), the wider community can benefit from them in three main ways: 23 | 24 | 1. Most obviously, the community can use the materials to learn about the material they cover. 25 | 2. Sharing resources reduces duplication of effort. 26 | If an educator needs materials for teaching and such materials already exist openly then they need not make their own from scratch, saving time. 27 | 3. Making materials openly available helps a community build better resources by improving resources that already exist and combining OERs to take advantage of their different strengths, such as a great diagram or explanation. 
28 | 29 | Beyond the raw practical benefits, the worldwide OER movement is rooted in the human right to access high-quality education. 30 | This shift in educational practice is about participation and co-creation. 31 | Open Educational Resources (OERs) offer opportunities for systemic change in teaching and learning content through engaging educators in new participatory processes and effective technologies for engaging with learning. 32 | 33 | ## Equity, diversity, inclusion 34 | 35 | Open scholarship means open to *everyone* without discrimination based on factors such as race, gender, sexual orientation, or any number of other factors. 36 | As a community we should undertake to ensure equitable opportunities for all. 37 | We can go about that by deliberately fostering welcoming, inclusive cultures within our communities. 38 | For example, reasonable accommodations should be made wherever possible to include community members with disabilities to enable them to participate fully, and this can be as simple as choosing colourblind-safe colour schemes when making graphs. 39 | 40 | ## Citizen science 41 | 42 | Citizen science is the involvement of the public in scientific research – whether community-driven research or global investigations. The Oxford English Dictionary recently defined it as: "scientific work undertaken by members of the general public, often in collaboration with or under the direction of professional scientists and scientific institutions". 43 | Citizen science offers the power of science to everyone, and the power of everyone to science. 44 | 45 | By allowing members of the public to contribute to scientific research, citizen science helps engage and invest the wider world in science. 46 | It also benefits researchers by offering manpower that simply would not be accessible otherwise. 
47 | Examples of this include [finding](https://citizensciencegames.com/games/eterna/) ways of folding molecules, and [classifying](https://www.zooniverse.org/) different types of galaxies. 48 | 49 | ## Patient and Public Involvement 50 | Whilst citizen science encompasses one way of contributing to scientific research, Patient and Public Involvement (PPI) is a far more specialised form of citizen science which is particularly useful when doing research on health and/or social issues. 51 | 52 | PPI is *not*: 53 | - Participation: Recruitment of participants (such as for a clinical trial or survey) to contribute data to a project. 54 | - Engagement: Dissemination, such as presenting at patient interest groups or writing a blog post. 55 | 56 | PPI *is*: 57 | - Involvement: patients and members of the public contribute at *all* stages of the research cycle. 58 | 59 | When incorporating PPI into research, researchers work *with* volunteers, rather than doing work *about* them. 60 | PPI volunteers are usually patients or members of the public with a particular interest in some area of research which means that the topic is often very personal, and being involved in the research cycle can be an empowering experience. 61 | For the researcher, PPI often generates unique and invaluable insights from the volunteers' own personal expertise which cannot always be predicted by the researchers themselves. 62 | 63 | It is a good idea to consider PPI very early in a project, ideally before any grant applications or submissions for ethical approval have been written. 64 | PPI volunteers can help researchers in many ways, such as the following: 65 | 1. Generate or shape research questions. 66 | 2. Contribute to, or review, study design. 67 | 3. Help with grant applications or submissions to research ethics committees (particularly the lay summary). 68 | 4. Collect data. 69 | 5. Analyse data. 70 | 6. Contribute to the manuscript and be listed as a co-author. 71 | 7. 
Disseminate findings in plain English. 72 | 73 | One of the biggest barriers to PPI is not knowing how to get started. 74 | The UK National Institute for Health Research have their own site, [INVOLVE](https://www.invo.org.uk/), to help familiarise yourself with the foundations of PPI. 75 | Additionally, charities related to your specific research field may be able to facilitate or support PPI; for example [Cancer Research UK](https://www.cancerresearchuk.org/funding-for-researchers/patient-involvement-toolkit-for-researchers) and [Parkinson's UK](https://www.parkinsons.org.uk/research/patient-and-public-involvement-ppi) have formal guides in place that provide a comprehensive overview of PPI. 76 | -------------------------------------------------------------------------------- /book_module4/open/open.md: -------------------------------------------------------------------------------- 1 | (rr-open)= 2 | # Open research 3 | 4 | ## Prerequisites / recommended skill level 5 | 6 | | Prerequisite | Importance | Notes | 7 | | -------------|----------|------| 8 | | Experience with version control | Helpful | Experience with GitHub is particularly useful | 9 | 10 | ## Summary 11 | 12 | Open research aims to transform research by making it more reproducible, transparent, re-usable, collaborative, accountable, and accessible to society. It pushes for change in the way that research is carried out and disseminated by digital tools. One definition of open research, [as given by the Organisation for Economic Co-operation and Development (OECD)](https://www.fct.pt/dsi/docs/Making_Open_Science_a_Reality.pdf "Making Open Science a Reality, OECD Science, Technology and Industry Policy Papers No. 25"), is the practice of making "the primary outputs of publicly funded research results – publications and the research data – publicly accessible in digital format with no or minimal restriction." 
In order to achieve this openness in research, each element of the research process should: 13 | 14 | - Be publicly available: It is difficult to use and benefit from knowledge hidden behind barriers such as passwords and paywalls. 15 | - Be reusable: Research outputs need to be licensed appropriately so that prospective users clearly know any limitations on re-use. 16 | - Be transparent: With appropriate metadata to provide clear statements of how research output was produced and what it contains. 17 | 18 | The research process typically has the following form: data is collected and then analysed (usually using software). This process may involve the use of specialist hardware. The results of the research are then published. Throughout the process it is good practice for researchers to document their working in notebooks. Open research aims to make each of these elements open: 19 | 20 | - Open data: Documenting and sharing research data openly for re-use. 21 | - Open source software: Documenting research code and routines, and making them freely accessible and available. 22 | - Open hardware: Documenting designs, materials, and other relevant information related to hardware, and making them freely accessible and available. 23 | - Open access: Making all published outputs freely accessible for maximum use and impact. 24 | - Open notebooks: An emerging practice, documenting and sharing the experimental process of trial and error. 25 | 26 | These elements are expanded upon in this chapter. 27 | 28 | Open scholarship is a concept that extends open research further. It relates to making other aspects of scientific research open to the public, for example: 29 | 30 | - Open educational resources: Making educational resources publicly available to be re-used and modified. 31 | - Equity, diversity, inclusion: Ensuring scholarship is open to anyone without barriers based on factors such as race, background, gender, and sexual orientation. 
32 | - Citizen science: The inclusion of members of the public in scientific research. 33 | 34 | These elements are also discussed in detail in this chapter. 35 | 36 | ## How this will help you / why this is useful 37 | 38 | There are five main schools of thought motivating open practices to benefit research: 39 | 40 | | School | Belief | Aim | 41 | | -------------------------- | -------------------- | ------------------------------------------------- | 42 | | Infrastructure | Efficient research depends on the available tools and applications. | Creating openly available platforms, tools, and services for researchers. | 43 | | Pragmatic | Knowledge-creation could be more efficient if researchers worked together. | Opening up the process of knowledge creation. | 44 | | Measurement | Academic contributions today need alternative impact measurements. | Developing an alternative metric system for research impact. | 45 | | Democratic | The access to knowledge is unequally distributed. | Making knowledge freely available for everyone. | 46 | | Public | Research needs to be made accessible to the public. | Making research accessible for citizens. | 47 | 48 | Open practices also benefit the researchers that propagate them. For example there is evidence [(Mckiernan et al. 2016)](https://elifesciences.org/articles/16800) that open access articles are cited more often, as shown by the metastudy presented in the figure below. 49 | 50 | | ![open_access_citatations](../figures/open_access_citatations.jpg) | 51 | | -----------------------------------------------------| 52 | | The relative citation rate (OA: non-OA) in 19 fields of research. This rate is defined as the mean citation rate of OA articles divided by the mean citation rate of non-OA articles. Multiple points for the same discipline indicate different estimates from the same study, or estimates from several studies. (See footnote 1 for references.) 
| 53 | 54 | Another benefit of openness is that while research collaborations are essential to advancing knowledge, identifying and connecting with appropriate collaborators is not trivial. Open practices can make it easier for researchers to connect with one another by increasing the discoverability and visibility of one’s work, facilitating rapid access to novel data and software resources, and creating new opportunities to interact with and contribute to ongoing communal projects. 55 | -------------------------------------------------------------------------------- /book_module4/overview/overview-benefit.md: -------------------------------------------------------------------------------- 1 | (rr-overview-benefits)= 2 | # Added Advantages 3 | 4 | In the section, we discussed the different aspects of reproducible research that are beneficial for the scientific community. 5 | In this chapter, we will share some less obvious aspects of working reproducibly for individual researchers and teams. 6 | 7 | ![Why we should care about working reproducibly](../figures/reasons_reproducibility.png) 8 | 9 | **1. Track a complete history of your research** 10 | 11 | Reproducible research must contain a complete history and narrative (also known as [Provenance](https://en.wikipedia.org/wiki/Provenance)) of the project planning and development process. 12 | This includes information on the data, tools, methods, codes, and documentation used in the research project. 13 | By storing a complete track-record of our work, we can ensure research sustainability, fair citation/acknowledgment, and usefulness of our and others' work in our research fields. 14 | 15 | **2. Facilitate collaboration and review process** 16 | 17 | By designing reproducible workflows and sharing them with the different components of our research project, we can allow others to develop an in-depth understanding of our work. 
18 | This encourages them to review our methods, test our code, propose useful changes and make thoughtful contributions to develop our project further. 19 | Reproducible workflows facilitate the peer review process tremendously by allowing reviewers access to the different parts of the projects that are necessary to validate the research outcomes. 20 | 21 | **3. Publish validated research and avoid misinformation** 22 | 23 | Lack of reproducibility is one of the major factors that lead to paper retractions (source [Retraction Watch](https://retractionwatch.com/)). 24 | The best-known analyses of the scientific literature in psychology {cite}`OpenScienceCollaboration2015Reproducibility` and cancer biology {cite}`Begley2012` found reproducibility rates of around 40% and 10%, respectively. 25 | By working reproducibly, we can develop validated research work, avoid misinformation that can limit replicability of our work and publish accurate research outputs. 26 | This aspect supports not only the validity of the current work, but also any future studies that are based on reproducible research {cite}`MozillaScienceLab`. 27 | 28 | **4. Write your papers, thesis and reports efficiently** 29 | 30 | Well documented analyses help us maintain easy access to all the results generated within a project, so that they can be written up efficiently. 31 | If working in a team, collaborators can easily get recognition in terms of authorship for their contributions. Furthermore, by making the underlying dataset and methods available we can easily comply with the highest-level journal guidelines. 32 | 33 | **5. Get credits for your work fairly** 34 | 35 | Applying reproducibility practices separately on different parts of the project such as data, independently executable codes and scripts, protocols, and reports allows other researchers to test and reuse our work in their research, and brings fair recognition for our work. 
36 | Researchers who publish their work with the underlying information get cited more often as their research outcome can be broadly replicated and trusted. 37 | This fair credit system encourages researchers to further maintain reproducibility practices in their work. 38 | 39 | **6. Ensure continuity of your work** 40 | 41 | By following guidelines for reproducibility, we can easily communicate our work with different stakeholders such as our supervisors, funders, reviewers, students, and potential collaborators. 42 | This aspect of reproducibility increases the usefulness of our research by enabling others to easily build on our results, and re-use our research materials {cite}`MozillaScienceLab`. 43 | This ensures the continuity of a research idea and can even find fresh applications in other contexts. 44 | Progress of such projects can easily be tracked and continued - either by other researchers, or yourself if you want to build on your own work after a longer period {cite}`Markowetz2015`. 45 | 46 | Other benefits of working reproducibly on Open Research projects are covered in our Open chapter. 47 | 48 | --- 49 | ## References 50 | ```{bibliography} ../references.bib 51 | :filter: docname in docnames 52 | ``` -------------------------------------------------------------------------------- /book_module4/overview/overview-definitions.md: -------------------------------------------------------------------------------- 1 | (rr-overview-definitions)= 2 | # Definitions of Reproducibility 3 | 4 | The most common definition of reproducibility (and replication) was first noted by Claerbout and Karrenbach in 1992 {cite}`ClaerboutKarrenbach1992Reproducibility` and has been used in computational science literature since then. 
5 | Another popular definition was introduced in 2013 by the Association for Computing Machinery (ACM) {cite}`Ivie2018SciComp`, which swapped the meaning of the terms 'reproducible' and 'replicable' compared to Claerbout and Karrenbach. 6 | 7 | The following table contrasts the two definitions {cite}`Heroux2018Reproducibility`. 8 | 9 | | Term | Claerbout & Karrenbach | ACM | 10 | | -----|------------------------|-----| 11 | | Reproducible | Authors provide all the necessary data and the computer codes to run the analysis again, re-creating the results.| (Different team, different experimental setup.) The measurement can be obtained with stated precision by a different team, a different measuring system, in a different location on multiple trials. For computational experiments, this means that an independent group can obtain the same result using artifacts which they develop completely independently. | 12 | | Replicable | A study that arrives at the same scientific findings as another study, collecting new data (possibly with different methods) and completing new analyses. | (Different team, same experimental setup.) The measurement can be obtained with stated precision by a different team using the same measurement procedure, the same measuring system, under the same operating conditions, in the same or a different location on multiple trials. For computational experiments, this means that an independent group can obtain the same result using the author's own artifacts. | 13 | 14 | Barba (2018) {cite}`Barba2018Reproducibility` conducted a detailed literature review on the usage of reproducible/replicable covering several disciplines. 15 | Most papers and disciplines use the terminology as defined by Claerbout and Karrenbach, whereas microbiology, immunology and computer science tend to follow the ACM use of reproducibility and replication. 16 | In political science and economics literature, both terms are used interchangeably. 
17 | 18 | In addition to these high level definitions of reproducibility, some authors provide more detailed distinctions. 19 | Victoria Stodden {cite}`Victoria2014Reproducibility`, a prominent scholar on this topic, has for example identified the following further distinctions: 20 | 21 | - _Computational reproducibility_: When detailed information is provided about code, software, hardware and implementation details. 22 | 23 | - _Empirical reproducibility_: When detailed information is provided about non-computational empirical scientific experiments and observations. In practice this is enabled by making data freely available, as well as details of how the data was collected. 24 | 25 | - _Statistical reproducibility_: When detailed information is provided, for example, about the choice of statistical tests, model parameters, and threshold values. This mostly relates to pre-registration of study design to prevent p-value hacking and other manipulations. 26 | 27 | (rr-overview-definitions-table)= 28 | ## Table of definitions for reproducibility 29 | 30 | At _The Turing Way_ we define **reproducible research** as work that can be independently recreated from the same data and the same code that the original team used. 31 | Reproducible is distinct from replicable, robust and generalisable as described in the figure below. 32 | 33 | | ![Kirstie's definition of reproducible research](../figures/ReproducibleMatrix.jpg) | 34 | | -------------------------------------------------------------------------------------------------------- | 35 | | How the Turing Way defines reproducible research | 36 | 37 | The different dimensions of reproducible research described in the matrix above have the following definitions: 38 | 39 | - **Reproducible:** A result is reproducible when the _same_ analysis steps performed on the _same_ dataset consistently produce the _same_ answer. 
40 | - **Replicable:** A result is replicable when the _same_ analysis performed on _different_ datasets produces qualitatively similar answers. 41 | - **Robust:** A result is robust when the _same_ dataset is subjected to _different_ analysis workflows to answer the same research question (for example one pipeline written in R and another written in Python) and a qualitatively similar or identical answer is produced. 42 | Robust results show that the work is not dependent on the specificities of the programming language chosen to perform the analysis. 43 | - **Generalisable:** Combining replicable and robust findings allows us to form generalisable results. 44 | Note that running an analysis on a different software implementation and with a different dataset does not provide _generalised_ results. 45 | There will be many more steps to know how well the work applies to all the different aspects of the research question. 46 | Generalisation is an important step towards understanding that the result is not dependent on a particular dataset nor a particular version of the analysis pipeline. 47 | 48 | More information on these definitions can be found in "Reproducibility vs. Replicability: A Brief History of a Confused Terminology" by Hans E. Plesser {cite}`Plesser2018Reproducibility`. 49 | 50 | ## Reproducible but not open 51 | 52 | _The Turing Way_ recognises that some research will use sensitive data that cannot be shared and this handbook will provide guides on how your research can be reproducible without all parts necessarily being open.
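As a rough illustration of the reproducible, replicable and robust definitions given earlier in this section, here is a minimal Python sketch on simulated data. The helper names `simulate`, `analyse` and `analyse_alt`, the seeds, and the 0.25 tolerance are assumptions made up for this example; they are not part of _The Turing Way_.

```python
import math
import random
import statistics


def simulate(seed, n=100):
    """Hypothetical data source: n uniform draws from a seeded generator."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


def analyse(data):
    """Hypothetical analysis: the sample mean."""
    return statistics.mean(data)


def analyse_alt(data):
    """A second, independently written workflow for the same question."""
    return math.fsum(data) / len(data)


dataset_a = simulate(seed=0)
dataset_b = simulate(seed=1)  # stands in for newly collected data

# Reproducible: same data, same analysis, identical answer.
reproducible = analyse(dataset_a) == analyse(simulate(seed=0))

# Replicable: different data, same analysis, qualitatively similar answer.
# The 0.25 tolerance is an arbitrary choice for this sketch.
replicable = abs(analyse(dataset_a) - analyse(dataset_b)) < 0.25

# Robust: same data, different workflow, similar or identical answer.
robust = math.isclose(analyse(dataset_a), analyse_alt(dataset_a), rel_tol=1e-9)

print(reproducible, replicable, robust)
```

The first check only holds because the generator is seeded: with an unseeded generator, rerunning the simulation would produce different data each time, and the comparison would no longer be a test of reproducibility.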
53 | 54 | --- 55 | ## References 56 | ```{bibliography} ../references.bib 57 | :filter: docname in docnames 58 | ``` -------------------------------------------------------------------------------- /book_module4/overview/overview-resources.md: -------------------------------------------------------------------------------- 1 | (rr-overview-resources)= 2 | # Resources for reproducibility chapter 3 | For additional resources like videos and reference papers on reproducibility, see the {ref}`rr-overview-resources-reading` and {ref}`rr-overview-resources-addmaterial` sections. 4 | 5 | ## Checklist / Exercise 6 | - [ ] Define reproducibility for yourself. 7 | 8 | ## What to learn next? 9 | Open Research would be a good chapter to read next. 10 | If you want to start learning hands-on practices, we recommend reading the version control chapter next. 11 | 12 | (rr-overview-resources-reading)= 13 | ## Further Reading 14 | 15 | * Baker, M. (2016). 1,500 scientists lift the lid on reproducibility. Nature, 533(7604), 452–454. https://doi.org/10.1038/533452a 16 | 17 | * Barba, L. (2017): Barba-group Reproducibility Syllabus. figshare. Paper. https://doi.org/10.6084/m9.figshare.4879928.v1 18 | 19 | * Piwowar, H. A., & Vision, T. J. (2013). Data reuse and the open data citation advantage. PeerJ, 1, e175. https://doi.org/10.7717/peerj.175 20 | 21 | * Whitaker, Kirstie (2018): Barriers to reproducible research (and how to overcome them). figshare. Paper. https://doi.org/10.6084/m9.figshare.7140050.v2 22 | 23 | (rr-overview-resources-addmaterial)= 24 | ## Additional material 25 | 26 | ### Videos 27 | 28 | * Markowetz, F. (2016). 5 selfish reasons to work reproducibly. Talk at scidata 2016. https://www.youtube.com/watch?v=Is15CMVPHas&feature=youtu.be 29 | 30 | ### Other useful links 31 | 32 | * Markowetz, F. (2018). 5 selfish reasons to work reproducibly. Slides available at https://osf.io/a8wq4/ 33 | 34 | * Leipzig, J (2020). 
Awesome Reproducible Research: A curated list of reproducible research case studies, projects, tutorials, and media. GitHub repo. https://github.com/leipzig/awesome-reproducible-research 35 | -------------------------------------------------------------------------------- /book_module4/overview/overview.md: -------------------------------------------------------------------------------- 1 | (rr-overview)= 2 | # Overview of Reproducible Research 3 | 4 | Scientific results and evidence are strengthened if they are reproduced 5 | and confirmed by several independent researchers (see {ref}`definitions <rr-overview-definitions>`). 6 | With all parts used in an analysis being available and/or documented, valuable time is saved reproducing published results and other researchers can easily build on these research results and re-use data or code for their analyses. 7 | 8 | Learn about the less obvious benefits of working reproducibly in the {ref}`added advantages <rr-overview-benefits>` subchapter. 9 | 10 | Major media outlets have [reported on](https://www.theguardian.com/science/2018/aug/27/attempt-to-replicate-major-social-scientific-findings-of-past-decade-fails) investigations showing that a significant percentage of scientific studies cannot be reproduced. 11 | 12 | This leads to other academics and society losing trust in scientific results {cite}`baker2016reproducibility`. 13 | Working reproducibly means others can check your results - even early on in the research process. 14 | Thus, the full analysis and methodology are transparent. 15 | 16 | In addition, so-called "negative results" can be published easily, helping other researchers avoid wasting time repeating analyses that will not return the expected results {cite}`Dirnagl2010bias`. 17 | For further reading resources on reproducibility, please check out the {ref}`resources <rr-overview-resources>` subchapter. 18 | 19 | ## Prerequisites / recommended skill level 20 | No previous knowledge needed.
21 | 22 | --- 23 | ## References 24 | ```{bibliography} ../references.bib 25 | :filter: docname in docnames 26 | ``` 27 | -------------------------------------------------------------------------------- /book_module4/reproducible-research.md: -------------------------------------------------------------------------------- 1 | (rr)= 2 | # Guide for Reproducible Research 3 | 4 | ***This guide covers topics related to skills, tools and best practices for research reproducibility.*** 5 | 6 | _The Turing Way_ defines reproducibility in data research as data and code being available to fully rerun the analysis. 7 | 8 | There are several definitions of reproducibility in use, and we discuss these in more detail in the {ref}`rr-overview-definitions` section of this chapter. 9 | While it is absolutely fine for us each to use different words, it will be useful for you to know how _The Turing Way_ defines *reproducibility* to avoid misunderstandings when reading the rest of the handbook. 10 | 11 | | ![A person showing another person what steps to take to make your data research reproducible](./figures/reproducibility.jpg) | 12 | | ---------------| 13 | | _The Turing Way_ project illustration by Scriberia. Zenodo. [http://doi.org/10.5281/zenodo.3332807](http://doi.org/10.5281/zenodo.3332807) | 14 | 15 | _The Turing Way_ started by defining reproducibility in the context of this handbook, laying out its importance for science and scientists, and providing an overview of the common concepts, tools and resources. 16 | The first few chapters were on {ref}`rr-overview-benefits`, testing and reproducible computational environments. 17 | Since the start of this project in 2019, many additional chapters have been written, edited, reviewed, read and promoted by over 100 contributors. 18 | 19 | We welcome your contributions to improve these chapters and to add other important concepts in reproducibility and how to empower researchers to work reproducibly from the start.
20 | Check out our [contributing guidelines](https://github.com/alan-turing-institute/the-turing-way/blob/master/CONTRIBUTING.md) to get involved. 21 | -------------------------------------------------------------------------------- /book_module4/welcome.md: -------------------------------------------------------------------------------- 1 | # Welcome 2 | 3 | _The Turing Way_ is an open source community-driven guide to reproducible, ethical, inclusive and collaborative data science. 4 | 5 | Our goal is to provide all the information that data scientists in academia, industry, government and in the third sector need at the start of their projects to ensure that they are easy to reproduce and reuse at the end. 6 | 7 | The book started as a guide for reproducibility, covering version control, testing, and continuous integration. 8 | But technical skills are just one aspect of making data science research "open for all". 9 | 10 | In February 2020, _The Turing Way_ expanded to a series of books covering reproducible research, project design, communication, collaboration, and ethical research. 11 | 12 | | ![The Turing Way project is illustrated as a road or path with shops for different data science skills. People can go in and out with their shopping cart and pick and choose what they need](./figures/welcome.jpg) | 13 | | ---------------| 14 | | _The Turing Way_ project illustration by Scriberia. Zenodo. [http://doi.org/10.5281/zenodo.3332807](http://doi.org/10.5281/zenodo.3332807) | 15 | 16 | ## Our community 17 | 18 | _The Turing Way_ community is dedicated to making collaborative, reusable and transparent research "too easy not to do". 19 | That means investing in the socio-technical skills required to work in a team, to build something greater than any individual person could deliver alone. 20 | 21 | _The Turing Way_ is: 22 | 23 | * a book 24 | * a community 25 | * a global collaboration 26 | 27 | We hope you find the content in the book helpful.
28 | Everything here is available for free under a [CC-BY licence](https://github.com/alan-turing-institute/the-turing-way/blob/master/LICENSE.md). 29 | Please use and re-use whatever you need for any purpose. 30 | 31 | The book is collaboratively written and open from the start. 32 | To make this project truly accessible and useful for everyone, we invite you to contribute your skills and bring your perspectives into this project. 33 | To join this community, please read our [contribution guidelines](https://github.com/alan-turing-institute/the-turing-way/blob/master/CONTRIBUTING.md) and ways to [get in touch](https://github.com/alan-turing-institute/the-turing-way#get-in-touch). 34 | More information about the community and the project is available in the Community Handbook. 35 | We look forward to expanding and building _The Turing Way_ together. 36 | 37 | Although _The Turing Way_ receives support and funding from [The Alan Turing Institute](https://www.turing.ac.uk/), the project is designed to be a global collaboration. 38 | We have contributions from across the UK, and from India, Mexico, Australia, USA, and many European countries. 39 | Chapters have been written, reviewed and curated by members of research institutes and universities, government departments, and industry. 40 | We are committed to creating a space where people with diverse expertise and lived experiences can share their knowledge with others to allow us all to use data science to improve the world. 41 | 42 | We value the participation of every member of our community and want to ensure that every contributor has an enjoyable and fulfilling experience. 43 | Accordingly, everyone who participates in _The Turing Way_ project is expected to show respect and courtesy to other community members at all times. 44 | All contributions must abide by our [code of conduct](https://github.com/alan-turing-institute/the-turing-way/blob/master/CODE_OF_CONDUCT.md). 
45 | 46 | ![Gif showing screen capture of contributors table, smiling faces and emojis representing the types of contributions in a table](https://media.giphy.com/media/gKIUisnjpj2PS75nOJ/giphy.gif) 47 | 48 | (cite-tag)= 49 | ## Citing _The Turing Way_ 50 | 51 | All material in _The Turing Way_ is available under a [CC-BY 4.0 licence](https://github.com/alan-turing-institute/the-turing-way/blob/master/LICENSE.md). 52 | 53 | You can cite _The Turing Way_ through the project's Zenodo archive using doi: [10.5281/zenodo.3233853](https://doi.org/10.5281/zenodo.3233853). 54 | 55 | The citation will look something like: 56 | 57 | > The Turing Way Community, Becky Arnold, Louise Bowler, Sarah Gibson, Patricia Herterich, Rosie Higman, … Kirstie Whitaker. (2019, March 25). The Turing Way: A Handbook for Reproducible Data Science (Version v0.0.4). Zenodo. http://doi.org/10.5281/zenodo.3233986 58 | 59 | Please visit the [DOI link](https://doi.org/10.5281/zenodo.3233853) though to get the most recent version - the one above is not automatically generated and therefore may be out of date. 60 | DOIs allow us to archive the repository and they are really valuable to ensure that the work is tracked in academic publications. 61 | 62 | You can also share the human-readable URL to a page in the book, for example: [https://the-turing-way.netlify.app/reproducible-research/overview/overview-definitions.html](./overview/overview-definitions), but be aware that the project is under development and therefore these links may change over time. 63 | You might want to include a [web archive link](http://web.archive.org) such as: [https://web.archive.org/web/20191030093753/https://the-turing-way.netlify.com/reproducibility/03/definitions.html](https://web.archive.org/web/20191030093753/https://the-turing-way.netlify.com/reproducibility/03/definitions.html) to make sure that you don't end up with broken links everywhere! 
64 | 65 | We really appreciate any references that you make to _The Turing Way_ project in your work and we hope it is useful. 66 | If you have any questions please [get in touch](https://github.com/alan-turing-institute/the-turing-way#get-in-touch). 67 | -------------------------------------------------------------------------------- /book_module5/_config.yml: -------------------------------------------------------------------------------- 1 | ####################################################################################### 2 | # Book settings 3 | title: The Turing Way # The title of the book 4 | author: The Turing Way Community # The author of the book to be placed in the footer 5 | copyright: '2020' # Copyright year to be placed in the footer 6 | logo: ./figures/logo.png # A path to the book logo 7 | 8 | ####################################################################################### 9 | # HTML-specific settings 10 | html: 11 | favicon: ./figures/favicon-32x32.png # A path to a favicon image 12 | navbar_footer_text: Visit our GitHub 13 | Repository
This book is powered by Jupyter 14 | Book
# Will be displayed underneath the left navigation bar. 15 | home_page_in_navbar: false # Whether to include your home page in the left Navigation Bar 16 | use_repository_button: true # Whether to add a link to your repository button 17 | use_issues_button: true # Whether to add an "open an issue" button 18 | 19 | ####################################################################################### 20 | # Launch button settings 21 | repository: 22 | url: https://github.com/alan-turing-institute/the-turing-way # The URL to your book's repository 23 | path_to_book: book/website # A path to your book's folder, relative to the repository root 24 | branch: master # Which branch of the repository should be used when creating links 25 | 26 | -------------------------------------------------------------------------------- /book_module5/_toc.yml: -------------------------------------------------------------------------------- 1 | - file: welcome 2 | - file: reproducible-research 3 | title: Reproducibility Guide 4 | chapters: 5 | - file: overview/overview 6 | title: Overview 7 | sections: 8 | - file: overview/overview-definitions 9 | title: Definitions 10 | - file: overview/overview-benefit 11 | title: Benefits 12 | - file: overview/overview-resources 13 | title: Resources 14 | - file: open/open 15 | title: Open Research 16 | sections: 17 | - file: open/open-data 18 | title: Open Data 19 | - file: open/open-source 20 | title: Open Source 21 | - file: open/open-hardware 22 | title: Open Hardware 23 | - file: open/open-access 24 | title: Open Access 25 | - file: open/open-notebooks 26 | title: Open Notebooks 27 | - file: open/open-scholarship 28 | title: Open Scholarship -------------------------------------------------------------------------------- /book_module5/demo.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Demo notebook" 8 | ] 9 
| }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "We can also create parts of our Jupyter Book based on Jupyter Notebooks." 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "Let's simulate data for two conditions and print their first ten rows:" 22 | ] 23 | }, 24 | { 25 | "cell_type": "code", 26 | "execution_count": null, 27 | "metadata": {}, 28 | "outputs": [], 29 | "source": [ 30 | "import numpy as np\n", 31 | "\n", 32 | "cond_1 = np.random.rand(100)\n", 33 | "print(f'Condition 1 = {cond_1[:10]}')\n", 34 | "\n", 35 | "cond_2 = cond_1 + (np.random.rand(100))\n", 36 | "print(f'Condition 2 = {cond_2[:10]}')" 37 | ] 38 | }, 39 | { 40 | "cell_type": "markdown", 41 | "metadata": {}, 42 | "source": [ 43 | "We can also display in our Jupyter Book more complex datastructures, like pandas dataframes:" 44 | ] 45 | }, 46 | { 47 | "cell_type": "code", 48 | "execution_count": null, 49 | "metadata": {}, 50 | "outputs": [], 51 | "source": [ 52 | "import pandas as pd\n", 53 | "\n", 54 | "df = pd.DataFrame(\n", 55 | " {'condition_1': cond_1, 'condition_2': cond_2}, \n", 56 | " index=np.arange(100)\n", 57 | ")\n", 58 | "\n", 59 | "df[:10]" 60 | ] 61 | }, 62 | { 63 | "cell_type": "markdown", 64 | "metadata": {}, 65 | "source": [ 66 | "And of course, we can display plots as well!" 
67 | ] 68 | }, 69 | { 70 | "cell_type": "code", 71 | "execution_count": null, 72 | "metadata": {}, 73 | "outputs": [], 74 | "source": [ 75 | "import matplotlib.pyplot as plt\n", 76 | "\n", 77 | "plt.scatter(cond_1, cond_2, alpha=.6)\n", 78 | "plt.xlabel('condition 1')\n", 79 | "plt.ylabel('condition 2')\n", 80 | "plt.title('Scatterplot')\n", 81 | "plt.show()" 82 | ] 83 | } 84 | ], 85 | "metadata": { 86 | "kernelspec": { 87 | "display_name": "Python 3", 88 | "language": "python", 89 | "name": "python3" 90 | }, 91 | "language_info": { 92 | "codemirror_mode": { 93 | "name": "ipython", 94 | "version": 3 95 | }, 96 | "file_extension": ".py", 97 | "mimetype": "text/x-python", 98 | "name": "python", 99 | "nbconvert_exporter": "python", 100 | "pygments_lexer": "ipython3", 101 | "version": "3.7.4" 102 | } 103 | }, 104 | "nbformat": 4, 105 | "nbformat_minor": 4 106 | } -------------------------------------------------------------------------------- /book_module5/figures/ReproducibleMatrix.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/book_module5/figures/ReproducibleMatrix.jpg -------------------------------------------------------------------------------- /book_module5/figures/favicon-32x32.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/book_module5/figures/favicon-32x32.png -------------------------------------------------------------------------------- /book_module5/figures/logo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/book_module5/figures/logo.png 
-------------------------------------------------------------------------------- /book_module5/figures/open_access_citatations.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/book_module5/figures/open_access_citatations.jpg -------------------------------------------------------------------------------- /book_module5/figures/open_umbrella.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/book_module5/figures/open_umbrella.png -------------------------------------------------------------------------------- /book_module5/figures/reasons_reproducibility.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/book_module5/figures/reasons_reproducibility.png -------------------------------------------------------------------------------- /book_module5/figures/reproducibility.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/book_module5/figures/reproducibility.jpg -------------------------------------------------------------------------------- /book_module5/figures/welcome.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/book_module5/figures/welcome.jpg -------------------------------------------------------------------------------- /book_module5/open/open-notebooks.md: 
-------------------------------------------------------------------------------- 1 | # Open notebooks 2 | 3 | Electronic Lab Notebooks (ELNs) enable researchers to organize and store experimental procedures, protocols, plans, notes, data, and even unfiltered interpretations using their computer or mobile device. 4 | They are a digital analogue to the paper notebook most researchers keep. 5 | ELNs can offer several advantages over the traditional paper notebook in documenting research during the active phase of a project, including searchability within and across notebooks, secure storage with multiple redundancies, remote access to notebooks, and the ability to easily share notebooks among team members and collaborators. 6 | 7 | Open notebook research is simply the practice of making such notebooks openly available, usually online. 8 | Some researchers choose to keep their notebooks open from the very beginning of their projects. 9 | Rather than wait months, even years, to share their research through journal publication as is the current practice, this allows researchers to post their experimental data and protocols online and in real-time. 10 | Sharing research in this open and timely manner helps to reduce duplication of work, helps foster new collaborations, and cultivates a more open dialogue with others. 11 | It also helps researchers avoid exploring dead ends and making mistakes that their colleagues have already encountered, but that went unpublished because of a lack of scientific interest. 12 | 13 | Open notebooks have the further benefit of increasing the quality of scientific outputs by forcing researchers to be careful, thorough, and explicit. 14 | Making research open has the added benefit of increasing the likelihood that any errors made in an investigation will be spotted quickly, instead of down the line. 15 | Immediate fixes will have much less impact on a research project, which will save a researcher time, the lab money, and pride.
16 | 17 | Ideally, every scientist would maintain an open notebook in real-time which would encompass all aspects of their research. 18 | But fears about dealing with complete open access, conflicts with intellectual property and publications, and online data overload hamper this movement. 19 | To combat this, practitioners encourage any form of open notebook research, "make open what you can", even if that means uploading some information for a project from many years ago that never saw the light of day. 20 | -------------------------------------------------------------------------------- /book_module5/open/open-scholarship.md: -------------------------------------------------------------------------------- 1 | # Open scholarship 2 | 3 | Open research and its subcomponents fit under the umbrella of a broader concept - open scholarship. 4 | 5 | ![open_umbrella](../figures/open_umbrella.png) 6 | 7 | ## Open educational resources 8 | 9 | Open educational resources (OERs) are teaching and learning materials that can be freely used and reused for learning and/or teaching at no cost, and without needing to ask permission. Examples are courses, including Massive Open Online Courses (MOOCs), lectures, teaching materials, assignments, and various other resources. OERs are available in many different formats compatible with online usage, most obviously text, images, audio, and video. Anyone with internet access can access and use OERs; access is not dependent on location or membership of a particular institution. 10 | 11 | Unlike copyrighted resources, OERs have been authored or created by an individual or organization that chooses to retain few, if any, ownership rights. In some cases, that means anyone can download a resource and share it with colleagues and students. In other cases, this may go further and enable people to edit resources and then re-post them as a remixed work. How do you know your options?
OERs often have a Creative Commons licence or other permission to let you know how the material may be used, reused, adapted, and shared. 12 | 13 | Fully open OERs comply with the 5 Rs: 14 | 15 | - Retain: the right to make, own, and control copies of the content. 16 | - Reuse: the right to use the content in a wide range of ways (for example, in a class, in a study group, on a website, in a video). 17 | - Revise: the right to adapt, adjust, modify, or alter the content itself (for example, translate the content into another language). 18 | - Remix: the right to combine the original or revised content with other open content to create something new (for example, incorporate the content into a mashup). 19 | - Redistribute: the right to share copies of the original content, your revisions, or your remixes with others (for example, give a copy of the content to a friend). 20 | 21 | Researchers generate a great deal of educational resources in the course of teaching students and each other (at workshops, for example). 22 | By making these openly available, for example in the [open educational resource commons](https://www.oercommons.org/), the wider community can benefit from them in three main ways: 23 | 24 | 1. Most obviously, the community can use the materials to learn about the material they cover. 25 | 2. Sharing resources reduces duplication of effort. 26 | If an educator needs materials for teaching and such materials already exist openly then they need not make their own from scratch, saving time. 27 | 3. Making materials openly available helps a community build better resources by improving resources that already exist and combining OERs to take advantage of their different strengths, such as a great diagram or explanation. 28 | 29 | Beyond the raw practical benefits the worldwide OER movement is rooted in the human right to access high-quality education. 30 | This shift in educational practice is about participation and co-creation. 
31 | Open Educational Resources (OERs) offer opportunities for systemic change in teaching and learning content through engaging educators in new participatory processes and effective technologies for engaging with learning. 32 | 33 | ## Equity, diversity, inclusion 34 | 35 | Open scholarship means open to *everyone* without discrimination based on factors such as race, gender, sexual orientation, or any number of other factors. 36 | As a community we should undertake to ensure equitable opportunities for all. 37 | We can go about that by deliberately fostering welcoming, inclusive cultures within our communities. 38 | For example, reasonable accommodations should be made wherever possible to include community members with disabilities to enable them to participate fully, and this can be as simple as choosing colourblind-safe colour schemes when making graphs. 39 | 40 | ## Citizen science 41 | 42 | Citizen science is the involvement of the public in scientific research, whether community-driven research or global investigations. The Oxford English Dictionary recently defined it as: "scientific work undertaken by members of the general public, often in collaboration with or under the direction of professional scientists and scientific institutions". 43 | Citizen science offers the power of science to everyone, and the power of everyone to science. 44 | 45 | By allowing members of the public to contribute to scientific research, citizen science helps engage and invest the wider world in science. 46 | It also benefits researchers by offering manpower that simply would not be accessible otherwise. 47 | Examples of this include [finding](https://citizensciencegames.com/games/eterna/) ways of folding molecules, and [classifying](https://www.zooniverse.org/) different types of galaxies.
48 | 49 | ## Patient and Public Involvement 50 | Whilst citizen science encompasses one way of contributing to scientific research, Patient and Public Involvement (PPI) is a far more specialised form of citizen science which is particularly useful when doing research on health and/or social issues. 51 | 52 | PPI is *not*: 53 | - Participation: Recruitment of participants (such as for a clinical trial or survey) to contribute data to a project. 54 | - Engagement: Dissemination, such as presenting at patient interest groups or writing a blog post. 55 | 56 | PPI *is*: 57 | - Involvement: patients and members of the public contribute at *all* stages of the research cycle. 58 | 59 | When incorporating PPI into research, researchers work *with* volunteers, rather than doing work *about* them. 60 | PPI volunteers are usually patients or members of the public with a particular interest in some area of research which means that the topic is often very personal, and being involved in the research cycle can be an empowering experience. 61 | For the researcher, PPI often generates unique and invaluable insights from the volunteers' own personal expertise which cannot always be predicted by the researchers themselves. 62 | 63 | It is a good idea to consider PPI very early in a project, ideally before any grant applications or submissions for ethical approval have been written. 64 | PPI volunteers can help researchers in many ways, such as the following: 65 | 1. Generate or shape research questions. 66 | 2. Contribute to, or review, study design. 67 | 3. Help with grant applications or submissions to research ethics committees (particularly the lay summary). 68 | 4. Collect data. 69 | 5. Analyse data. 70 | 6. Contribute to the manuscript and be listed as a co-author. 71 | 7. Disseminate findings in plain English. 72 | 73 | One of the biggest barriers to PPI is not knowing how to get started. 
74 | The UK National Institute for Health Research have their own site, [INVOLVE](https://www.invo.org.uk/), to help you familiarise yourself with the foundations of PPI. 75 | Additionally, charities related to your specific research field may be able to facilitate or support PPI; for example [Cancer Research UK](https://www.cancerresearchuk.org/funding-for-researchers/patient-involvement-toolkit-for-researchers) and [Parkinson's UK](https://www.parkinsons.org.uk/research/patient-and-public-involvement-ppi) have formal guides in place that provide a comprehensive overview of PPI. 76 | -------------------------------------------------------------------------------- /book_module5/open/open.md: -------------------------------------------------------------------------------- 1 | (rr-open)= 2 | # Open research 3 | 4 | ## Prerequisites / recommended skill level 5 | 6 | | Prerequisite | Importance | Notes | 7 | | -------------|----------|------| 8 | | Experience with version control | Helpful | Experience with GitHub is particularly useful | 9 | 10 | ## Summary 11 | 12 | Open research aims to transform research by making it more reproducible, transparent, re-usable, collaborative, accountable, and accessible to society. It pushes for change in the way that research is carried out and disseminated by digital tools. One definition of open research, [as given by the Organisation for Economic Co-operation and Development (OECD)](https://www.fct.pt/dsi/docs/Making_Open_Science_a_Reality.pdf "Making Open Science a Reality, OECD Science, Technology and Industry Policy Papers No. 25"), is the practice of making "the primary outputs of publicly funded research results – publications and the research data – publicly accessible in digital format with no or minimal restriction."
In order to achieve this openness in research, each element of the research process should: 13 | 14 | - Be publicly available: It is difficult to use and benefit from knowledge hidden behind barriers such as passwords and paywalls. 15 | - Be reusable: Research outputs need to be licensed appropriately so that prospective users clearly know any limitations on re-use. 16 | - Be transparent: With appropriate metadata to provide clear statements of how research output was produced and what it contains. 17 | 18 | The research process typically has the following form: data is collected and then analysed (usually using software). This process may involve the use of specialist hardware. The results of the research are then published. Throughout the process it is good practice for researchers to document their working in notebooks. Open research aims to make each of these elements open: 19 | 20 | - Open data: Documenting and sharing research data openly for re-use. 21 | - Open source software: Documenting research code and routines, and making them freely accessible and available. 22 | - Open hardware: Documenting designs, materials, and other relevant information related to hardware, and making them freely accessible and available. 23 | - Open access: Making all published outputs freely accessible for maximum use and impact. 24 | - Open notebooks: An emerging practice, documenting and sharing the experimental process of trial and error. 25 | 26 | These elements are expanded upon in this chapter. 27 | 28 | Open scholarship is a concept that extends open research further. It relates to making other aspects of scientific research open to the public, for example: 29 | 30 | - Open educational resources: Making educational resources publicly available to be re-used and modified. 31 | - Equity, diversity, inclusion: Ensuring scholarship is open to anyone without barriers based on factors such as race, background, gender, and sexual orientation. 
32 | - Citizen science: The inclusion of members of the public in scientific research. 33 | 34 | These elements are also discussed in detail in this chapter. 35 | 36 | ## How this will help you / why this is useful 37 | 38 | There are five main schools of thought motivating open practices to benefit research: 39 | 40 | | School | Belief | Aim | 41 | | -------------------------- | -------------------- | ------------------------------------------------- | 42 | | Infrastructure | Efficient research depends on the available tools and applications. | Creating openly available platforms, tools, and services for researchers. | 43 | | Pragmatic | Knowledge-creation could be more efficient if researchers worked together. | Opening up the process of knowledge creation. | 44 | | Measurement | Academic contributions today need alternative impact measurements. | Developing an alternative metric system for research impact. | 45 | | Democratic | The access to knowledge is unequally distributed. | Making knowledge freely available for everyone. | 46 | | Public | Research needs to be made accessible to the public. | Making research accessible for citizens. | 47 | 48 | Open practices also benefit the researchers that propagate them. For example there is evidence [(Mckiernan et al. 2016)](https://elifesciences.org/articles/16800) that open access articles are cited more often, as shown by the metastudy presented in the figure below. 49 | 50 | | ![open_access_citatations](../figures/open_access_citatations.jpg) | 51 | | -----------------------------------------------------| 52 | | The relative citation rate (OA: non-OA) in 19 fields of research. This rate is defined as the mean citation rate of OA articles divided by the mean citation rate of non-OA articles. Multiple points for the same discipline indicate different estimates from the same study, or estimates from several studies. (See footnote 1 for references.) 
| 53 | 54 | Another benefit of openness is that while research collaborations are essential to advancing knowledge, identifying and connecting with appropriate collaborators is not trivial. Open practices can make it easier for researchers to connect with one another by increasing the discoverability and visibility of one’s work, facilitating rapid access to novel data and software resources, and creating new opportunities to interact with and contribute to ongoing communal projects. 55 | -------------------------------------------------------------------------------- /book_module5/overview/overview-benefit.md: -------------------------------------------------------------------------------- 1 | (rr-overview-benefits)= 2 | # Added Advantages 3 | 4 | In the {ref}`rr-overview` section, we discussed the different aspects of reproducible research that are beneficial for the scientific community. 5 | In this chapter, we will share some less obvious aspects of working reproducibly for individual researchers and teams. 6 | 7 | ![Why we should care about working reproducibly](../figures/reasons_reproducibility.png) 8 | 9 | **1. Track a complete history of your research** 10 | 11 | Reproducible research must contain a complete history and narrative (also known as [Provenance](https://en.wikipedia.org/wiki/Provenance)) of the project planning and development process. 12 | This includes information on the data, tools, methods, codes, and documentation used in the research project. 13 | By storing a complete track-record of our work, we can ensure research sustainability, fair citation/acknowledgment, and usefulness of our and others' work in our research fields. 14 | 15 | **2. Facilitate collaboration and review process** 16 | 17 | By designing reproducible workflows and sharing them with the different components of our research project, we can allow others to develop an in-depth understanding of our work.
18 | This encourages them to review our methods, test our code, propose useful changes and make thoughtful contributions to develop our project further. 19 | Reproducible workflows facilitate the peer review process tremendously by allowing reviewers access to the different parts of the projects that are necessary to validate the research outcomes. 20 | 21 | **3. Publish validated research and avoid misinformation** 22 | 23 | Lack of reproducibility is one of the major factors that lead to paper retractions (source [Retraction Watch](https://retractionwatch.com/)). 24 | The best-known analyses of the scientific literature in psychology {cite}`OpenScienceCollaboration2015Reproducibility` and cancer biology {cite}`Begley2012` found reproducibility rates of around 40% and 10%, respectively. 25 | By working reproducibly, we can develop validated research work, avoid misinformation that can limit replicability of our work and publish accurate research outputs. 26 | This aspect supports not only the validity of the current work, but also any future studies that are based on reproducible research {cite}`MozillaScienceLab`. 27 | 28 | **4. Write your papers, thesis and reports efficiently** 29 | 30 | Well-documented analyses help us maintain easy access to all the results generated within a project, so that they can be written up efficiently. 31 | If working in a team, collaborators can easily get recognition in terms of authorship for their contributions. Furthermore, by making the underlying dataset and methods available, we can easily comply with the highest-level journal guidelines. 32 | 33 | **5. Get credit for your work fairly** 34 | 35 | Applying reproducibility practices separately on different parts of the project such as data, independently executable codes and scripts, protocols, and reports allows other researchers to test and reuse our work in their research, and brings fair recognition for our work.
36 | Researchers who publish their work with the underlying information get cited more often, as their research outcome can be broadly replicated and trusted. 37 | This fair credit system encourages researchers to further maintain reproducibility practices in their work. 38 | 39 | **6. Ensure continuity of your work** 40 | 41 | By following guidelines for reproducibility, we can easily communicate our work to different stakeholders such as our supervisors, funders, reviewers, students, and potential collaborators. 42 | This aspect of reproducibility increases the usefulness of our research by enabling others to easily build on our results, and re-use our research materials {cite}`MozillaScienceLab`. 43 | This ensures the continuity of a research idea and can even lead to fresh applications in other contexts. 44 | Progress of such projects can easily be tracked and continued - either by other researchers, or yourself if you want to build on your own work after a longer period {cite}`Markowetz2015`. 45 | 46 | Other benefits of working reproducibly on Open Research projects are covered in our Open chapter. 47 | 48 | --- 49 | ## References 50 | ```{bibliography} ../references.bib 51 | :filter: docname in docnames 52 | ``` -------------------------------------------------------------------------------- /book_module5/overview/overview-definitions.md: -------------------------------------------------------------------------------- 1 | (rr-overview-definitions)= 2 | # Definitions of Reproducibility 3 | 4 | The most common definition of reproducibility (and replication) was first noted by Claerbout and Karrenbach in 1992 {cite}`ClaerboutKarrenbach1992Reproducibility` and has been used in computational science literature since then.
5 | Another popular definition was introduced in 2013 by the Association for Computing Machinery (ACM) {cite}`Ivie2018SciComp`, which swapped the meaning of the terms 'reproducible' and 'replicable' compared to Claerbout and Karrenbach. 6 | 7 | The following table contrasts the two definitions {cite}`Heroux2018Reproducibility`. 8 | 9 | | Term | Claerbout & Karrenbach | ACM | 10 | | -----|------------------------|-----| 11 | | Reproducible | Authors provide all the necessary data and the computer codes to run the analysis again, re-creating the results.| (Different team, different experimental setup.) The measurement can be obtained with stated precision by a different team, a different measuring system, in a different location on multiple trials. For computational experiments, this means that an independent group can obtain the same result using artifacts which they develop completely independently. | 12 | | Replicable | A study that arrives at the same scientific findings as another study, collecting new data (possibly with different methods) and completing new analyses. | (Different team, same experimental setup.) The measurement can be obtained with stated precision by a different team using the same measurement procedure, the same measuring system, under the same operating conditions, in the same or a different location on multiple trials. For computational experiments, this means that an independent group can obtain the same result using the author's own artifacts. | 13 | 14 | Barba (2018) {cite}`Barba2018Reproducibility` conducted a detailed literature review on the usage of reproducible/replicable covering several disciplines. 15 | Most papers and disciplines use the terminology as defined by Claerbout and Karrenbach, whereas microbiology, immunology and computer science tend to follow the ACM use of reproducibility and replication. 16 | In political science and economics literature, both terms are used interchangeably.
17 | 18 | In addition to these high-level definitions of reproducibility, some authors provide more detailed distinctions. 19 | Victoria Stodden {cite}`Victoria2014Reproducibility`, a prominent scholar on this topic, has, for example, identified the following further distinctions: 20 | 21 | - _Computational reproducibility_: When detailed information is provided about code, software, hardware and implementation details. 22 | 23 | - _Empirical reproducibility_: When detailed information is provided about non-computational empirical scientific experiments and observations. In practice this is enabled by making data freely available, as well as details of how the data was collected. 24 | 25 | - _Statistical reproducibility_: When detailed information is provided, for example, about the choice of statistical tests, model parameters, and threshold values. This mostly relates to pre-registration of study design to prevent p-value hacking and other manipulations. 26 | 27 | (rr-overview-definitions-table)= 28 | ## Table of definitions for reproducibility 29 | 30 | At _The Turing Way_ we define **reproducible research** as work that can be independently recreated from the same data and the same code that the original team used. 31 | Reproducible is distinct from replicable, robust and generalisable as described in the figure below. 32 | 33 | | ![Kirstie's definition of reproducible research](../figures/ReproducibleMatrix.jpg) | 34 | | -------------------------------------------------------------------------------------------------------- | 35 | | How the Turing Way defines reproducible research | 36 | 37 | The different dimensions of reproducible research described in the matrix above have the following definitions: 38 | 39 | - **Reproducible:** A result is reproducible when the _same_ analysis steps performed on the _same_ dataset consistently produce the _same_ answer.
40 | - **Replicable:** A result is replicable when the _same_ analysis performed on _different_ datasets produces qualitatively similar answers. 41 | - **Robust:** A result is robust when the _same_ dataset is subjected to _different_ analysis workflows to answer the same research question (for example one pipeline written in R and another written in Python) and a qualitatively similar or identical answer is produced. 42 | Robust results show that the work is not dependent on the specificities of the programming language chosen to perform the analysis. 43 | - **Generalisable:** Combining replicable and robust findings allows us to form generalisable results. 44 | Note that running an analysis on a different software implementation and with a different dataset does not provide _generalised_ results. 45 | There will be many more steps to know how well the work applies to all the different aspects of the research question. 46 | Generalisation is an important step towards understanding that the result is not dependent on a particular dataset nor a particular version of the analysis pipeline. 47 | 48 | More information on these definitions can be found in "Reproducibility vs. Replicability: A Brief History of a Confused Terminology" by Hans E. Plesser {cite}`Plesser2018Reproducibility`. 49 | 50 | ## Reproducible but not open 51 | 52 | _The Turing Way_ recognises that some research will use sensitive data that cannot be shared and this handbook will provide guides on how your research can be reproducible without all parts necessarily being open.
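The reproducible, replicable and robust definitions given in this file can be made concrete with a small simulation. The following is only a minimal sketch under stated assumptions, not part of _The Turing Way_ itself: the helper names (`make_dataset`, `analysis_a`, `analysis_b`), the seeds, and the tolerances used to stand in for "qualitatively similar" are all hypothetical choices made for illustration, and the "analysis" is just a sample mean.

```python
import math
import random
import statistics


def make_dataset(seed, n=1000):
    """Hypothetical helper: simulate n values in [0, 1) from a seeded generator."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


def analysis_a(data):
    """One analysis workflow: the mean via the statistics module."""
    return statistics.mean(data)


def analysis_b(data):
    """A different workflow answering the same question: a hand-written mean."""
    return sum(data) / len(data)


dataset = make_dataset(seed=42)

# Reproducible: same analysis on the same dataset gives the same answer.
reproducible = analysis_a(dataset) == analysis_a(make_dataset(seed=42))

# Replicable: same analysis on a different dataset gives a similar answer.
# The tolerance of 0.1 is an illustrative assumption, not a standard threshold.
new_dataset = make_dataset(seed=7)
replicable = math.isclose(analysis_a(dataset), analysis_a(new_dataset), abs_tol=0.1)

# Robust: a different workflow on the same dataset gives a similar answer.
robust = math.isclose(analysis_a(dataset), analysis_b(dataset), abs_tol=1e-9)

print(f"reproducible={reproducible}, replicable={replicable}, robust={robust}")
```

Fixing the seed is what makes the first check repeatable here; without it, each run would draw a new dataset and the comparison would test replicability rather than reproducibility.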
53 | 54 | --- 55 | ## References 56 | ```{bibliography} ../references.bib 57 | :filter: docname in docnames 58 | ``` -------------------------------------------------------------------------------- /book_module5/overview/overview-resources.md: -------------------------------------------------------------------------------- 1 | (rr-overview-resources)= 2 | # Resources for reproducibility chapter 3 | For additional resources like videos and reference papers on reproducibility, see the {ref}`rr-overview-resources-reading` and {ref}`rr-overview-resources-addmaterial` sections. 4 | 5 | ## Checklist / Exercise 6 | - [ ] Define reproducibility for yourself. 7 | 8 | ## What to learn next? 9 | Open Research would be a good chapter to read next. 10 | If you want to start learning hands-on practices, we recommend reading the version control chapter next. 11 | 12 | (rr-overview-resources-reading)= 13 | ## Further Reading 14 | 15 | * Baker, M. (2016). 1,500 scientists lift the lid on reproducibility. Nature, 533(7604), 452–454. https://doi.org/10.1038/533452a 16 | 17 | * Barba, L. (2017): Barba-group Reproducibility Syllabus. figshare. Paper. https://doi.org/10.6084/m9.figshare.4879928.v1 18 | 19 | * Piwowar, H. A., & Vision, T. J. (2013). Data reuse and the open data citation advantage. PeerJ, 1, e175. https://doi.org/10.7717/peerj.175 20 | 21 | * Whitaker, Kirstie (2018): Barriers to reproducible research (and how to overcome them). figshare. Paper. https://doi.org/10.6084/m9.figshare.7140050.v2 22 | 23 | (rr-overview-resources-addmaterial)= 24 | ## Additional material 25 | 26 | ### Videos 27 | 28 | * Markowetz, F. (2016). 5 selfish reasons to work reproducibly. Talk at scidata 2016. https://www.youtube.com/watch?v=Is15CMVPHas&feature=youtu.be 29 | 30 | ### Other useful links 31 | 32 | * Markowetz, F. (2018). 5 selfish reasons to work reproducibly. Slides available at https://osf.io/a8wq4/ 33 | 34 | * Leipzig, J (2020). 
Awesome Reproducible Research: A curated list of reproducible research case studies, projects, tutorials, and media. GitHub repo. https://github.com/leipzig/awesome-reproducible-research 35 | -------------------------------------------------------------------------------- /book_module5/overview/overview.md: -------------------------------------------------------------------------------- 1 | (rr-overview)= 2 | # Overview of Reproducible Research 3 | 4 | Scientific results and evidence are strengthened if they are reproduced 5 | and confirmed by several independent researchers (see {ref}`definitions <rr-overview-definitions>`). 6 | With all parts used in an analysis being available and/or documented, valuable time is saved reproducing published results and other researchers can easily build on these research results and re-use data or code for their analyses. 7 | 8 | Learn about the less obvious benefits of working reproducibly in the {ref}`added advantages <rr-overview-benefits>` subchapter. 9 | 10 | Major media outlets have [reported on](https://www.theguardian.com/science/2018/aug/27/attempt-to-replicate-major-social-scientific-findings-of-past-decade-fails) investigations showing that a significant percentage of scientific studies cannot be reproduced. 11 | 12 | This leads to other academics and society losing trust in scientific results {cite}`baker2016reproducibility`. 13 | Working reproducibly means others can check your results - even early on in the research process. 14 | Thus, the full analysis and methodology are transparent. 15 | 16 | In addition, so-called "negative results" can be published easily, helping other researchers avoid wasting time repeating analyses that will not return the expected results {cite}`Dirnagl2010bias`. 17 | For further reading resources on reproducibility, please check out the {ref}`resources <rr-overview-resources>` subchapter. 18 | 19 | ## Prerequisites / recommended skill level 20 | No previous knowledge needed.
21 | 22 | --- 23 | ## References 24 | ```{bibliography} ../references.bib 25 | :filter: docname in docnames 26 | ``` 27 | -------------------------------------------------------------------------------- /book_module5/reproducible-research.md: -------------------------------------------------------------------------------- 1 | (rr)= 2 | # Guide for Reproducible Research 3 | 4 | ***This guide covers topics related to skills, tools and best practices for research reproducibility.*** 5 | 6 | _The Turing Way_ defines reproducibility in data research as data and code being available to fully rerun the analysis. 7 | 8 | There are several definitions of reproducibility in use, and we discuss these in more detail in the {ref}`rr-overview-definitions` section of this chapter. 9 | While it is absolutely fine for us each to use different words, it will be useful for you to know how _The Turing Way_ defines *reproducibility* to avoid misunderstandings when reading the rest of the handbook. 10 | 11 | | ![A person showing another person what steps to take to make your data research reproducible](./figures/reproducibility.jpg) | 12 | | ---------------| 13 | | _The Turing Way_ project illustration by Scriberia. Zenodo. [http://doi.org/10.5281/zenodo.3332807](http://doi.org/10.5281/zenodo.3332807) | 14 | 15 | _The Turing Way_ started by defining reproducibility in the context of this handbook, laying out its importance for science and scientists, and providing an overview of the common concepts, tools and resources. 16 | The first few chapters were on {ref}`rr-overview-benefits`, testing and reproducible computational environments. 17 | Since the start of this project in 2019, many additional chapters have been written, edited, reviewed, read and promoted by over 100 contributors. 18 | 19 | We welcome your contributions to improve these chapters and to add other important concepts in reproducibility and how to empower researchers to work reproducibly from the start.
20 | Check out our [contributing guidelines](https://github.com/alan-turing-institute/the-turing-way/blob/master/CONTRIBUTING.md) to get involved. 21 | -------------------------------------------------------------------------------- /book_module5/welcome.md: -------------------------------------------------------------------------------- 1 | # Welcome 2 | 3 | _The Turing Way_ is an open source community-driven guide to reproducible, ethical, inclusive and collaborative data science. 4 | 5 | Our goal is to provide all the information that data scientists in academia, industry, government and in the third sector need at the start of their projects to ensure that they are easy to reproduce and reuse at the end. 6 | 7 | The book started as a guide for reproducibility, covering version control, testing, and continuous integration. 8 | But technical skills are just one aspect of making data science research "open for all". 9 | 10 | In February 2020, _The Turing Way_ expanded to a series of books covering reproducible research, project design, communication, collaboration, and ethical research. 11 | 12 | | ![The Turing Way project is illustrated as a road or path with shops for different data science skills. People can go in and out with their shopping cart and pick and choose what they need](./figures/welcome.jpg) | 13 | | ---------------| 14 | | _The Turing Way_ project illustration by Scriberia. Zenodo. [http://doi.org/10.5281/zenodo.3332807](http://doi.org/10.5281/zenodo.3332807) | 15 | 16 | ## Our community 17 | 18 | _The Turing Way_ community is dedicated to making collaborative, reusable and transparent research "too easy not to do". 19 | That means investing in the socio-technical skills required to work in a team, to build something greater than any individual person could deliver alone. 20 | 21 | _The Turing Way_ is: 22 | 23 | * a book 24 | * a community 25 | * a global collaboration 26 | 27 | We hope you find the content in the book helpful.
28 | Everything here is available for free under a [CC-BY licence](https://github.com/alan-turing-institute/the-turing-way/blob/master/LICENSE.md). 29 | Please use and re-use whatever you need for any purpose. 30 | 31 | The book is collaboratively written and open from the start. 32 | To make this project truly accessible and useful for everyone, we invite you to contribute your skills and bring your perspectives into this project. 33 | To join this community, please read our [contribution guidelines](https://github.com/alan-turing-institute/the-turing-way/blob/master/CONTRIBUTING.md) and ways to [get in touch](https://github.com/alan-turing-institute/the-turing-way#get-in-touch). 34 | More information about the community and the project is available in the Community Handbook. 35 | We look forward to expanding and building _The Turing Way_ together. 36 | 37 | Although _The Turing Way_ receives support and funding from [The Alan Turing Institute](https://www.turing.ac.uk/), the project is designed to be a global collaboration. 38 | We have contributions from across the UK, and from India, Mexico, Australia, USA, and many European countries. 39 | Chapters have been written, reviewed and curated by members of research institutes and universities, government departments, and industry. 40 | We are committed to creating a space where people with diverse expertise and lived experiences can share their knowledge with others to allow us all to use data science to improve the world. 41 | 42 | We value the participation of every member of our community and want to ensure that every contributor has an enjoyable and fulfilling experience. 43 | Accordingly, everyone who participates in _The Turing Way_ project is expected to show respect and courtesy to other community members at all times. 44 | All contributions must abide by our [code of conduct](https://github.com/alan-turing-institute/the-turing-way/blob/master/CODE_OF_CONDUCT.md). 
45 | 46 | ![Gif showing screen capture of contributors table, smiling faces and emojis representing the types of contributions in a table](https://media.giphy.com/media/gKIUisnjpj2PS75nOJ/giphy.gif) 47 | 48 | (cite-tag)= 49 | ## Citing _The Turing Way_ 50 | 51 | All material in _The Turing Way_ is available under a [CC-BY 4.0 licence](https://github.com/alan-turing-institute/the-turing-way/blob/master/LICENSE.md). 52 | 53 | You can cite _The Turing Way_ through the project's Zenodo archive using doi: [10.5281/zenodo.3233853](https://doi.org/10.5281/zenodo.3233853). 54 | 55 | The citation will look something like: 56 | 57 | > The Turing Way Community, Becky Arnold, Louise Bowler, Sarah Gibson, Patricia Herterich, Rosie Higman, … Kirstie Whitaker. (2019, March 25). The Turing Way: A Handbook for Reproducible Data Science (Version v0.0.4). Zenodo. http://doi.org/10.5281/zenodo.3233986 58 | 59 | Please visit the [DOI link](https://doi.org/10.5281/zenodo.3233853) though to get the most recent version - the one above is not automatically generated and therefore may be out of date. 60 | DOIs allow us to archive the repository and they are really valuable to ensure that the work is tracked in academic publications. 61 | 62 | You can also share the human-readable URL to a page in the book, for example: [https://the-turing-way.netlify.app/reproducible-research/overview/overview-definitions.html](./overview/overview-definitions), but be aware that the project is under development and therefore these links may change over time. 63 | You might want to include a [web archive link](http://web.archive.org) such as: [https://web.archive.org/web/20191030093753/https://the-turing-way.netlify.com/reproducibility/03/definitions.html](https://web.archive.org/web/20191030093753/https://the-turing-way.netlify.com/reproducibility/03/definitions.html) to make sure that you don't end up with broken links everywhere! 
64 | 65 | We really appreciate any references that you make to _The Turing Way_ project in your work and we hope it is useful. 66 | If you have any questions please [get in touch](https://github.com/alan-turing-institute/the-turing-way#get-in-touch). 67 | -------------------------------------------------------------------------------- /book_module6/_config.yml: -------------------------------------------------------------------------------- 1 | ####################################################################################### 2 | # Book settings 3 | title: The Turing Way # The title of the book 4 | author: The Turing Way Community # The author of the book to be placed in the footer 5 | copyright: '2020' # Copyright year to be placed in the footer 6 | logo: ./figures/logo.png # A path to the book logo 7 | 8 | ####################################################################################### 9 | # HTML-specific settings 10 | html: 11 | favicon: ./figures/favicon-32x32.png # A path to a favicon image 12 | navbar_footer_text: Visit our GitHub 13 | Repository
This book is powered by Jupyter 14 | Book
# Will be displayed underneath the left navigation bar. 15 | home_page_in_navbar: false # Whether to include your home page in the left Navigation Bar 16 | use_repository_button: true # Whether to add a link to your repository button 17 | use_issues_button: true # Whether to add an "open an issue" button 18 | 19 | ####################################################################################### 20 | # Launch button settings 21 | repository: 22 | url: https://github.com/alan-turing-institute/the-turing-way # The URL to your book's repository 23 | path_to_book: book/website # A path to your book's folder, relative to the repository root 24 | branch: master # Which branch of the repository should be used when creating links 25 | 26 | -------------------------------------------------------------------------------- /book_module6/_toc.yml: -------------------------------------------------------------------------------- 1 | - file: welcome 2 | - file: reproducible-research 3 | title: Reproducibility Guide 4 | chapters: 5 | - file: overview/overview 6 | title: Overview 7 | sections: 8 | - file: overview/overview-definitions 9 | title: Definitions 10 | - file: overview/overview-benefit 11 | title: Benefits 12 | - file: overview/overview-resources 13 | title: Resources 14 | - file: open/open 15 | title: Open Research 16 | sections: 17 | - file: open/open-data 18 | title: Open Data 19 | - file: open/open-source 20 | title: Open Source 21 | - file: open/open-hardware 22 | title: Open Hardware 23 | - file: open/open-access 24 | title: Open Access 25 | - file: open/open-notebooks 26 | title: Open Notebooks 27 | - file: open/open-scholarship 28 | title: Open Scholarship 29 | - file: demo 30 | title: Reproducible notebook -------------------------------------------------------------------------------- /book_module6/demo.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | 
"metadata": {}, 6 | "source": [ 7 | "# Demo notebook" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "We can also create parts of our Jupyter Book based on Jupyter Notebooks." 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "Let's simulate data for two conditions and print their first ten rows:" 22 | ] 23 | }, 24 | { 25 | "cell_type": "code", 26 | "execution_count": null, 27 | "metadata": {}, 28 | "outputs": [], 29 | "source": [ 30 | "import numpy as np\n", 31 | "\n", 32 | "cond_1 = np.random.rand(100)\n", 33 | "print(f'Condition 1 = {cond_1[:10]}')\n", 34 | "\n", 35 | "cond_2 = cond_1 + (np.random.rand(100))\n", 36 | "print(f'Condition 2 = {cond_2[:10]}')" 37 | ] 38 | }, 39 | { 40 | "cell_type": "markdown", 41 | "metadata": {}, 42 | "source": [ 43 | "We can also display in our Jupyter Book more complex datastructures, like pandas dataframes:" 44 | ] 45 | }, 46 | { 47 | "cell_type": "code", 48 | "execution_count": null, 49 | "metadata": {}, 50 | "outputs": [], 51 | "source": [ 52 | "import pandas as pd\n", 53 | "\n", 54 | "df = pd.DataFrame(\n", 55 | " {'condition_1': cond_1, 'condition_2': cond_2}, \n", 56 | " index=np.arange(100)\n", 57 | ")\n", 58 | "\n", 59 | "df[:10]" 60 | ] 61 | }, 62 | { 63 | "cell_type": "markdown", 64 | "metadata": {}, 65 | "source": [ 66 | "And of course, we can display plots as well!" 
67 | ] 68 | }, 69 | { 70 | "cell_type": "code", 71 | "execution_count": null, 72 | "metadata": {}, 73 | "outputs": [], 74 | "source": [ 75 | "import matplotlib.pyplot as plt\n", 76 | "\n", 77 | "plt.scatter(cond_1, cond_2, alpha=.6)\n", 78 | "plt.xlabel('condition 1')\n", 79 | "plt.ylabel('condition 2')\n", 80 | "plt.title('Scatterplot')\n", 81 | "plt.show()" 82 | ] 83 | } 84 | ], 85 | "metadata": { 86 | "kernelspec": { 87 | "display_name": "Python 3", 88 | "language": "python", 89 | "name": "python3" 90 | }, 91 | "language_info": { 92 | "codemirror_mode": { 93 | "name": "ipython", 94 | "version": 3 95 | }, 96 | "file_extension": ".py", 97 | "mimetype": "text/x-python", 98 | "name": "python", 99 | "nbconvert_exporter": "python", 100 | "pygments_lexer": "ipython3", 101 | "version": "3.7.4" 102 | } 103 | }, 104 | "nbformat": 4, 105 | "nbformat_minor": 4 106 | } -------------------------------------------------------------------------------- /book_module6/figures/ReproducibleMatrix.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/book_module6/figures/ReproducibleMatrix.jpg -------------------------------------------------------------------------------- /book_module6/figures/favicon-32x32.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/book_module6/figures/favicon-32x32.png -------------------------------------------------------------------------------- /book_module6/figures/logo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/book_module6/figures/logo.png 
-------------------------------------------------------------------------------- /book_module6/figures/open_access_citatations.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/book_module6/figures/open_access_citatations.jpg -------------------------------------------------------------------------------- /book_module6/figures/open_umbrella.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/book_module6/figures/open_umbrella.png -------------------------------------------------------------------------------- /book_module6/figures/reasons_reproducibility.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/book_module6/figures/reasons_reproducibility.png -------------------------------------------------------------------------------- /book_module6/figures/reproducibility.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/book_module6/figures/reproducibility.jpg -------------------------------------------------------------------------------- /book_module6/figures/welcome.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/book_module6/figures/welcome.jpg -------------------------------------------------------------------------------- /book_module6/open/open-notebooks.md: 
-------------------------------------------------------------------------------- 1 | # Open notebooks 2 | 3 | Electronic Lab Notebooks (ELNs) enable researchers to organize and store experimental procedures, protocols, plans, notes, data, and even unfiltered interpretations using their computer or mobile device. 4 | They are a digital analogue to the paper notebook most researchers keep. 5 | ELNs can offer several advantages over the traditional paper notebook in documenting research during the active phase of a project, including searchability within and across notebooks, secure storage with multiple redundancies, remote access to notebooks, and the ability to easily share notebooks among team members and collaborators. 6 | 7 | Open notebook research is simply the practice of making such notebooks openly available, usually online. 8 | Some researchers choose to keep their notebooks open from the very beginning of their projects. 9 | Rather than wait months, even years, to share their research through journal publication as is the current practice, this allows researchers to post their experimental data and protocols online and in real time. 10 | Sharing research in this open and timely manner helps to reduce duplication of work, helps foster new collaborations, and cultivates a more open dialogue with others. 11 | It also helps researchers avoid exploring dead ends and making mistakes that their colleagues have already encountered but that went unpublished because of a lack of scientific interest. 12 | 13 | Open notebooks have the further benefit of increasing the quality of scientific outputs by forcing researchers to be careful, thorough, and explicit. 14 | Making research open has the added benefit of increasing the likelihood that any errors made in an investigation will be spotted quickly, instead of down the line. 15 | Immediate fixes will have much less impact on a research project, which will save a researcher time, the lab money, and pride. 
16 | 17 | Ideally, every scientist would maintain an open notebook in real time which would encompass all aspects of their research. 18 | But fears about dealing with complete open access, conflicts with intellectual property and publications, and online data overload hamper this movement. 19 | To combat this, practitioners encourage any form of open notebook research ("make open what you can"), even if that means uploading some information for a project from many years ago that never saw the light of day. 20 | -------------------------------------------------------------------------------- /book_module6/open/open-scholarship.md: -------------------------------------------------------------------------------- 1 | # Open scholarship 2 | 3 | Open research and its subcomponents fit under the umbrella of a broader concept - open scholarship. 4 | 5 | ![open_umbrella](../figures/open_umbrella.png) 6 | 7 | ## Open educational resources 8 | 9 | Open educational resources (OERs) are teaching and learning materials that can be freely used and reused for learning and/or teaching at no cost, and without needing to ask permission. Examples are courses, including Massive Open Online Courses (MOOCs), lectures, teaching materials, assignments, and various other resources. OERs are available in many different formats compatible with online usage, most obviously text, images, audio, and video. Anyone with internet access can access and use OERs; access is not dependent on location or membership of a particular institution. 10 | 11 | Unlike copyrighted resources, OERs have been authored or created by an individual or organization that chooses to retain few, if any, ownership rights. In some cases, that means anyone can download a resource and share it with colleagues and students. In other cases, this may go further and enable people to edit resources and then re-post them as a remixed work. How do you know your options? 
OERs often have a Creative Commons licence or other permission to let you know how the material may be used, reused, adapted, and shared. 12 | 13 | Fully open OERs comply with the 5 Rs: 14 | 15 | - Retain: the right to make, own, and control copies of the content. 16 | - Reuse: the right to use the content in a wide range of ways (for example, in a class, in a study group, on a website, in a video). 17 | - Revise: the right to adapt, adjust, modify, or alter the content itself (for example, translate the content into another language). 18 | - Remix: the right to combine the original or revised content with other open content to create something new (for example, incorporate the content into a mashup). 19 | - Redistribute: the right to share copies of the original content, your revisions, or your remixes with others (for example, give a copy of the content to a friend). 20 | 21 | Researchers generate a great deal of educational resources in the course of teaching students and each other (at workshops, for example). 22 | By making these openly available, for example in the [open educational resource commons](https://www.oercommons.org/), the wider community can benefit from them in three main ways: 23 | 24 | 1. Most obviously, the community can use the materials to learn about the material they cover. 25 | 2. Sharing resources reduces duplication of effort. 26 | If an educator needs materials for teaching and such materials already exist openly then they need not make their own from scratch, saving time. 27 | 3. Making materials openly available helps a community build better resources by improving resources that already exist and combining OERs to take advantage of their different strengths, such as a great diagram or explanation. 28 | 29 | Beyond the raw practical benefits the worldwide OER movement is rooted in the human right to access high-quality education. 30 | This shift in educational practice is about participation and co-creation. 
31 | Open Educational Resources (OERs) offer opportunities for systemic change in teaching and learning content through engaging educators in new participatory processes and effective technologies for engaging with learning. 32 | 33 | ## Equity, diversity, inclusion 34 | 35 | Open scholarship means open to *everyone* without discrimination based on factors such as race, gender, sexual orientation, or any number of other factors. 36 | As a community we should undertake to ensure equitable opportunities for all. 37 | We can go about that by deliberately fostering welcoming, inclusive cultures within our communities. 38 | For example, reasonable accommodations should be made wherever possible to include community members with disabilities to enable them to participate fully, and this can be as simple as choosing colourblind-safe colour schemes when making graphs. 39 | 40 | ## Citizen science 41 | 42 | Citizen science is the involvement of the public in scientific research, whether community-driven research or global investigations. The Oxford English Dictionary recently defined it as: "scientific work undertaken by members of the general public, often in collaboration with or under the direction of professional scientists and scientific institutions". 43 | Citizen science offers the power of science to everyone, and the power of everyone to science. 44 | 45 | By allowing members of the public to contribute to scientific research, citizen science helps engage and invest the wider world in science. 46 | It also benefits researchers by offering manpower that simply would not be accessible otherwise. 47 | Examples of this include [finding](https://citizensciencegames.com/games/eterna/) ways of folding molecules, and [classifying](https://www.zooniverse.org/) different types of galaxies. 
48 | 49 | ## Patient and Public Involvement 50 | Whilst citizen science encompasses one way of contributing to scientific research, Patient and Public Involvement (PPI) is a far more specialised form of citizen science which is particularly useful when doing research on health and/or social issues. 51 | 52 | PPI is *not*: 53 | - Participation: Recruitment of participants (such as for a clinical trial or survey) to contribute data to a project. 54 | - Engagement: Dissemination, such as presenting at patient interest groups or writing a blog post. 55 | 56 | PPI *is*: 57 | - Involvement: patients and members of the public contribute at *all* stages of the research cycle. 58 | 59 | When incorporating PPI into research, researchers work *with* volunteers, rather than doing work *about* them. 60 | PPI volunteers are usually patients or members of the public with a particular interest in some area of research which means that the topic is often very personal, and being involved in the research cycle can be an empowering experience. 61 | For the researcher, PPI often generates unique and invaluable insights from the volunteers' own personal expertise which cannot always be predicted by the researchers themselves. 62 | 63 | It is a good idea to consider PPI very early in a project, ideally before any grant applications or submissions for ethical approval have been written. 64 | PPI volunteers can help researchers in many ways, such as the following: 65 | 1. Generate or shape research questions. 66 | 2. Contribute to, or review, study design. 67 | 3. Help with grant applications or submissions to research ethics committees (particularly the lay summary). 68 | 4. Collect data. 69 | 5. Analyse data. 70 | 6. Contribute to the manuscript and be listed as a co-author. 71 | 7. Disseminate findings in plain English. 72 | 73 | One of the biggest barriers to PPI is not knowing how to get started. 
74 | The UK National Institute for Health Research have their own site, [INVOLVE](https://www.invo.org.uk/), to help you familiarise yourself with the foundations of PPI. 75 | Additionally, charities related to your specific research field may be able to facilitate or support PPI; for example [Cancer Research UK](https://www.cancerresearchuk.org/funding-for-researchers/patient-involvement-toolkit-for-researchers) and [Parkinson's UK](https://www.parkinsons.org.uk/research/patient-and-public-involvement-ppi) have formal guides in place that provide a comprehensive overview of PPI. 76 | -------------------------------------------------------------------------------- /book_module6/open/open.md: -------------------------------------------------------------------------------- 1 | (rr-open)= 2 | # Open research 3 | 4 | ## Prerequisites / recommended skill level 5 | 6 | | Prerequisite | Importance | Notes | 7 | | -------------|----------|------| 8 | | Experience with version control | Helpful | Experience with GitHub is particularly useful | 9 | 10 | ## Summary 11 | 12 | Open research aims to transform research by making it more reproducible, transparent, re-usable, collaborative, accountable, and accessible to society. It pushes for change in the way that research is carried out and disseminated by digital tools. One definition of open research, [as given by the Organisation for Economic Co-operation and Development (OECD)](https://www.fct.pt/dsi/docs/Making_Open_Science_a_Reality.pdf "Making Open Science a Reality, OECD Science, Technology and Industry Policy Papers No. 25"), is the practice of making "the primary outputs of publicly funded research results – publications and the research data – publicly accessible in digital format with no or minimal restriction." 
In order to achieve this openness in research, each element of the research process should: 13 | 14 | - Be publicly available: It is difficult to use and benefit from knowledge hidden behind barriers such as passwords and paywalls. 15 | - Be reusable: Research outputs need to be licensed appropriately so that prospective users clearly know any limitations on re-use. 16 | - Be transparent: With appropriate metadata to provide clear statements of how research output was produced and what it contains. 17 | 18 | The research process typically has the following form: data is collected and then analysed (usually using software). This process may involve the use of specialist hardware. The results of the research are then published. Throughout the process it is good practice for researchers to document their working in notebooks. Open research aims to make each of these elements open: 19 | 20 | - Open data: Documenting and sharing research data openly for re-use. 21 | - Open source software: Documenting research code and routines, and making them freely accessible and available. 22 | - Open hardware: Documenting designs, materials, and other relevant information related to hardware, and making them freely accessible and available. 23 | - Open access: Making all published outputs freely accessible for maximum use and impact. 24 | - Open notebooks: An emerging practice, documenting and sharing the experimental process of trial and error. 25 | 26 | These elements are expanded upon in this chapter. 27 | 28 | Open scholarship is a concept that extends open research further. It relates to making other aspects of scientific research open to the public, for example: 29 | 30 | - Open educational resources: Making educational resources publicly available to be re-used and modified. 31 | - Equity, diversity, inclusion: Ensuring scholarship is open to anyone without barriers based on factors such as race, background, gender, and sexual orientation. 
32 | - Citizen science: The inclusion of members of the public in scientific research. 33 | 34 | These elements are also discussed in detail in this chapter. 35 | 36 | ## How this will help you / why this is useful 37 | 38 | There are five main schools of thought motivating open practices to benefit research: 39 | 40 | | School | Belief | Aim | 41 | | -------------------------- | -------------------- | ------------------------------------------------- | 42 | | Infrastructure | Efficient research depends on the available tools and applications. | Creating openly available platforms, tools, and services for researchers. | 43 | | Pragmatic | Knowledge-creation could be more efficient if researchers worked together. | Opening up the process of knowledge creation. | 44 | | Measurement | Academic contributions today need alternative impact measurements. | Developing an alternative metric system for research impact. | 45 | | Democratic | The access to knowledge is unequally distributed. | Making knowledge freely available for everyone. | 46 | | Public | Research needs to be made accessible to the public. | Making research accessible for citizens. | 47 | 48 | Open practices also benefit the researchers who adopt them. For example, there is evidence [(McKiernan et al. 2016)](https://elifesciences.org/articles/16800) that open access articles are cited more often, as shown by the metastudy presented in the figure below. 49 | 50 | | ![open_access_citatations](../figures/open_access_citatations.jpg) | 51 | | -----------------------------------------------------| 52 | | The relative citation rate (OA: non-OA) in 19 fields of research. This rate is defined as the mean citation rate of OA articles divided by the mean citation rate of non-OA articles. Multiple points for the same discipline indicate different estimates from the same study, or estimates from several studies. (See footnote 1 for references.) 
| 53 | 54 | Another benefit of openness is that while research collaborations are essential to advancing knowledge, identifying and connecting with appropriate collaborators is not trivial. Open practices can make it easier for researchers to connect with one another by increasing the discoverability and visibility of one’s work, facilitating rapid access to novel data and software resources, and creating new opportunities to interact with and contribute to ongoing communal projects. 55 | -------------------------------------------------------------------------------- /book_module6/overview/overview-benefit.md: -------------------------------------------------------------------------------- 1 | (rr-overview-benefits)= 2 | # Added Advantages 3 | 4 | In the previous section, we discussed the different aspects of reproducible research that are beneficial for the scientific community. 5 | In this chapter, we will share some less obvious aspects of working reproducibly for individual researchers and teams. 6 | 7 | ![Why we should care about working reproducibly](../figures/reasons_reproducibility.png) 8 | 9 | **1. Track a complete history of your research** 10 | 11 | Reproducible research must contain a complete history and narrative (also known as [Provenance](https://en.wikipedia.org/wiki/Provenance)) of the project planning and development process. 12 | This includes information on the data, tools, methods, codes, and documentation used in the research project. 13 | By storing a complete track-record of our work, we can ensure research sustainability, fair citation/acknowledgment, and usefulness of our and others' work in our research fields. 14 | 15 | **2. Facilitate collaboration and review process** 16 | 17 | By designing reproducible workflows and sharing them along with the different components of our research project, we can allow others to develop an in-depth understanding of our work. 
18 | This encourages them to review our methods, test our code, propose useful changes and make thoughtful contributions to develop our project further. 19 | Reproducible workflows facilitate the peer review process tremendously by allowing reviewers access to the different parts of the projects that are necessary to validate the research outcomes. 20 | 21 | **3. Publish validated research and avoid misinformation** 22 | 23 | Lack of reproducibility is one of the major factors that lead to paper retractions (source [Retraction Watch](https://retractionwatch.com/)). 24 | The best-known analyses of the scientific literature in psychology {cite}`OpenScienceCollaboration2015Reproducibility` and cancer biology {cite}`Begley2012` found reproducibility rates of around 40% and 10%, respectively. 25 | By working reproducibly, we can develop validated research work, avoid misinformation that can limit the replicability of our work, and publish accurate research outputs. 26 | This aspect supports not only the validity of the current work, but also any future studies that are based on reproducible research {cite}`MozillaScienceLab`. 27 | 28 | **4. Write your papers, thesis and reports efficiently** 29 | 30 | Well documented analyses help us maintain easy access to all the results generated within a project that can be written up efficiently. 31 | If working in a team, collaborators can easily get recognition in terms of authorship for their contributions. Furthermore, by making the underlying dataset and methods available we can easily comply with the highest-level journal guidelines. 32 | 33 | **5. Get credits for your work fairly** 34 | 35 | Applying reproducibility practices separately on different parts of the project such as data, independently executable codes and scripts, protocols, and reports allows other researchers to test and reuse our work in their research, and brings fair recognition for our work. 
36 | Researchers who publish their work with the underlying information get cited more often as their research outcome can be broadly replicated and trusted. 37 | This fair credit system encourages researchers to further maintain reproducibility practices in their work. 38 | 39 | **6. Ensure continuity of your work** 40 | 41 | By following guidelines for reproducibility, we can easily communicate our work with different stakeholders such as our supervisors, funders, reviewers, students, and potential collaborators. 42 | This aspect of reproducibility increases the usefulness of our research by enabling others to easily build on our results, and re-use our research materials {cite}`MozillaScienceLab`. 43 | This ensures the continuity of a research idea and can even find fresh applications in other contexts. 44 | Progress of such projects can easily be tracked and continued - either by other researchers, or yourself if you want to build on your own work after a longer period {cite}`Markowetz2015`. 45 | 46 | Other benefits of working reproducibly on Open Research projects are covered in our Open chapter. 47 | 48 | --- 49 | ## References 50 | ```{bibliography} ../references.bib 51 | :filter: docname in docnames 52 | ``` -------------------------------------------------------------------------------- /book_module6/overview/overview-definitions.md: -------------------------------------------------------------------------------- 1 | (rr-overview-definitions)= 2 | # Definitions of Reproducibility 3 | 4 | The most common definition of reproducibility (and replication) was first noted by Claerbout and Karrenbach in 1992 {cite}`ClaerboutKarrenbach1992Reproducibility` and has been used in computational science literature since then. 
5 | Another popular definition was introduced in 2013 by the Association for Computing Machinery (ACM) {cite}`Ivie2018SciComp`, which swapped the meaning of the terms 'reproducible' and 'replicable' compared to Claerbout and Karrenbach. 6 | 7 | The following table contrasts both definitions {cite}`Heroux2018Reproducibility`. 8 | 9 | | Term | Claerbout & Karrenbach | ACM | 10 | | -----|------------------------|-----| 11 | | Reproducible | Authors provide all the necessary data and the computer codes to run the analysis again, re-creating the results.| (Different team, different experimental setup.) The measurement can be obtained with stated precision by a different team, a different measuring system, in a different location on multiple trials. For computational experiments, this means that an independent group can obtain the same result using artifacts which they develop completely independently. | 12 | | Replicable | A study that arrives at the same scientific findings as another study, collecting new data (possibly with different methods) and completing new analyses. | (Different team, same experimental setup.) The measurement can be obtained with stated precision by a different team using the same measurement procedure, the same measuring system, under the same operating conditions, in the same or a different location on multiple trials. For computational experiments, this means that an independent group can obtain the same result using the author's own artifacts. | 13 | 14 | Barba (2018) {cite}`Barba2018Reproducibility` conducted a detailed literature review on the usage of reproducible/replicable covering several disciplines. 15 | Most papers and disciplines use the terminology as defined by Claerbout and Karrenbach, whereas microbiology, immunology and computer science tend to follow the ACM use of reproducibility and replication. 16 | In political science and economics literature, both terms are used interchangeably. 
17 | 18 | In addition to these high level definitions of reproducibility, some authors provide more detailed distinctions. 19 | Victoria Stodden {cite}`Victoria2014Reproducibility`, a prominent scholar on this topic, has for example identified the following further distinctions: 20 | 21 | - _Computational reproducibility_: When detailed information is provided about code, software, hardware and implementation details. 22 | 23 | - _Empirical reproducibility_: When detailed information is provided about non-computational empirical scientific experiments and observations. In practice this is enabled by making data freely available, as well as details of how the data was collected. 24 | 25 | - _Statistical reproducibility_: When detailed information is provided, for example, about the choice of statistical tests, model parameters, and threshold values. This mostly relates to pre-registration of study design to prevent p-value hacking and other manipulations. 26 | 27 | (rr-overview-definitions-table)= 28 | ## Table of definitions for reproducibility 29 | 30 | At _The Turing Way_ we define **reproducible research** as work that can be independently recreated from the same data and the same code that the original team used. 31 | Reproducible is distinct from replicable, robust and generalisable as described in the figure below. 
32 | 33 | | ![Kirstie's definition of reproducible research](../figures/ReproducibleMatrix.jpg) | 34 | | -------------------------------------------------------------------------------------------------------- | 35 | | How the Turing Way defines reproducible research | 36 | 37 | The different dimensions of reproducible research described in the matrix above have the following definitions: 38 | 39 | - **Reproducible:** A result is reproducible when the _same_ analysis steps performed on the _same_ dataset consistently produce the _same_ answer. 
40 | - **Replicable:** A result is replicable when the _same_ analysis performed on _different_ datasets produces qualitatively similar answers. 41 | - **Robust:** A result is robust when the _same_ dataset is subjected to _different_ analysis workflows to answer the same research question (for example one pipeline written in R and another written in Python) and a qualitatively similar or identical answer is produced. 42 | Robust results show that the work is not dependent on the specificities of the programming language chosen to perform the analysis. 43 | - **Generalisable:** Combining replicable and robust findings allows us to form generalisable results. 44 | Note that running an analysis on a different software implementation and with a different dataset does not provide _generalised_ results. 45 | There will be many more steps to know how well the work applies to all the different aspects of the research question. 46 | Generalisation is an important step towards understanding that the result is not dependent on a particular dataset nor a particular version of the analysis pipeline. 47 | 48 | More information on these definitions can be found in "Reproducibility vs. Replicability: A Brief History of a Confused Terminology" by Hans E. Plesser {cite}`Plesser2018Reproducibility`. 49 | 50 | ## Reproducible but not open 51 | 52 | _The Turing Way_ recognises that some research will use sensitive data that cannot be shared and this handbook will provide guides on how your research can be reproducible without all parts necessarily being open. 
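
The difference between a *reproducible* and a *replicable* result, as defined in the list above, can be illustrated with a small simulation. This is only a minimal sketch and not part of the handbook itself: the helpers `simulate_dataset` and `analyse`, the seed values, and the tolerance are hypothetical choices made for illustration, and a fixed random seed stands in for "the same dataset".

```python
import numpy as np


def simulate_dataset(seed, n=100):
    """Hypothetical helper: simulate two conditions from a seeded generator."""
    rng = np.random.default_rng(seed)
    cond_1 = rng.random(n)
    cond_2 = cond_1 + rng.random(n)
    return cond_1, cond_2


def analyse(cond_1, cond_2):
    """Hypothetical analysis: mean difference between the two conditions."""
    return float(np.mean(cond_2 - cond_1))


# Reproducible: the same analysis on the same dataset (same seed)
# gives exactly the same answer on every run.
first_run = analyse(*simulate_dataset(seed=42))
second_run = analyse(*simulate_dataset(seed=42))
print("Reproducible:", first_run == second_run)

# Replicable: the same analysis on a different dataset (different seed)
# gives a qualitatively similar, though not identical, answer.
# The tolerance of 0.2 is an arbitrary illustrative threshold.
new_data_run = analyse(*simulate_dataset(seed=7))
print("Replicable:", abs(first_run - new_data_run) < 0.2)
```

Seeding the generator is what makes the first comparison exact; without a seed, each run would draw a new dataset and the example would only demonstrate replicability.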
53 | 54 | --- 55 | ## References 56 | ```{bibliography} ../references.bib 57 | :filter: docname in docnames 58 | ``` -------------------------------------------------------------------------------- /book_module6/overview/overview-resources.md: -------------------------------------------------------------------------------- 1 | (rr-overview-resources)= 2 | # Resources for reproducibility chapter 3 | For additional resources like videos and reference papers on reproducibility, see the {ref}`rr-overview-resources-reading` and {ref}`rr-overview-resources-addmaterial` sections. 4 | 5 | ## Checklist / Exercise 6 | - [ ] Define reproducibility for yourself. 7 | 8 | ## What to learn next? 9 | Open Research would be a good chapter to read next. 10 | If you want to start learning hands-on practices, we recommend reading the version control chapter next. 11 | 12 | (rr-overview-resources-reading)= 13 | ## Further Reading 14 | 15 | * Baker, M. (2016). 1,500 scientists lift the lid on reproducibility. Nature, 533(7604), 452–454. https://doi.org/10.1038/533452a 16 | 17 | * Barba, L. (2017): Barba-group Reproducibility Syllabus. figshare. Paper. https://doi.org/10.6084/m9.figshare.4879928.v1 18 | 19 | * Piwowar, H. A., & Vision, T. J. (2013). Data reuse and the open data citation advantage. PeerJ, 1, e175. https://doi.org/10.7717/peerj.175 20 | 21 | * Whitaker, Kirstie (2018): Barriers to reproducible research (and how to overcome them). figshare. Paper. https://doi.org/10.6084/m9.figshare.7140050.v2 22 | 23 | (rr-overview-resources-addmaterial)= 24 | ## Additional material 25 | 26 | ### Videos 27 | 28 | * Markowetz, F. (2016). 5 selfish reasons to work reproducibly. Talk at scidata 2016. https://www.youtube.com/watch?v=Is15CMVPHas&feature=youtu.be 29 | 30 | ### Other useful links 31 | 32 | * Markowetz, F. (2018). 5 selfish reasons to work reproducibly. Slides available at https://osf.io/a8wq4/ 33 | 34 | * Leipzig, J (2020). 
Awesome Reproducible Research: A curated list of reproducible research case studies, projects, tutorials, and media. GitHub repo. https://github.com/leipzig/awesome-reproducible-research 35 | -------------------------------------------------------------------------------- /book_module6/overview/overview.md: -------------------------------------------------------------------------------- 1 | (rr-overview)= 2 | # Overview of Reproducible Research 3 | 4 | Scientific results and evidence are strengthened if they are reproduced 5 | and confirmed by several independent researchers (see {ref}`definitions <rr-overview-definitions>`). 6 | With all parts used in an analysis being available and/or documented, valuable time is saved reproducing published results and other researchers can easily build on these research results and re-use data or code for their analyses. 7 | 8 | Learn about the less obvious benefits of working reproducibly in the {ref}`added advantages <rr-overview-benefits>` subchapter. 9 | 10 | Major media outlets have [reported on](https://www.theguardian.com/science/2018/aug/27/attempt-to-replicate-major-social-scientific-findings-of-past-decade-fails) investigations showing that a significant percentage of scientific studies cannot be reproduced. 11 | 12 | This leads to other academics and society losing trust in scientific results {cite}`baker2016reproducibility`. 13 | Working reproducibly means others can check your results - even early on in the research process. 14 | Thus, the full analysis and methodology is transparent. 15 | 16 | In addition, so-called "negative results" can be published easily, helping other researchers avoid wasting time repeating analyses that will not return the expected results {cite}`Dirnagl2010bias`. 17 | For further reading resources on reproducibility, please check out the {ref}`resources <rr-overview-resources>` subchapter. 18 | 19 | ## Prerequisites / recommended skill level 20 | No previous knowledge needed. 
21 | 22 | --- 23 | ## References 24 | ```{bibliography} ../references.bib 25 | :filter: docname in docnames 26 | ``` 27 | -------------------------------------------------------------------------------- /book_module6/reproducible-research.md: -------------------------------------------------------------------------------- 1 | (rr)= 2 | # Guide for Reproducible Research 3 | 4 | ***This guide covers topics related to skills, tools and best practices for research reproducibility.*** 5 | 6 | _The Turing Way_ defines reproducibility in data research as data and code being available to fully rerun the analysis. 7 | 8 | There are several definitions of reproducibility in use, and we discuss these in more detail in the {ref}`rr-overview-definitions` section of this chapter. 9 | While it is absolutely fine for us each to use different words, it will be useful for you to know how _The Turing Way_ defines *reproducibility* to avoid misunderstandings when reading the rest of the handbook. 10 | 11 | | ![A person showing another person what steps to take to make your data research reproducible](./figures/reproducibility.jpg) | 12 | | ---------------| 13 | | _The Turing Way_ project illustration by Scriberia. Zenodo. [http://doi.org/10.5281/zenodo.3332807](http://doi.org/10.5281/zenodo.3332807) | 14 | 15 | _The Turing Way_ started by defining reproducibility in the context of this handbook, laying out its importance for science and scientists, and providing an overview of the common concepts, tools and resources. 16 | The first few chapters were on {ref}`rr-overview-benefits`, testing and reproducible computational environments. 17 | Since the start of this project in 2019, many additional chapters have been written, edited, reviewed, read and promoted by over 100 contributors. 18 | 19 | We welcome your contributions to improve these chapters and to add other important concepts in reproducibility and how to empower researchers to work reproducibly from the start. 
20 | Check out our [contributing guidelines](https://github.com/alan-turing-institute/the-turing-way/blob/master/CONTRIBUTING.md) to get involved. 21 | -------------------------------------------------------------------------------- /book_module6/welcome.md: -------------------------------------------------------------------------------- 1 | # Welcome 2 | 3 | _The Turing Way_ is an open source community-driven guide to reproducible, ethical, inclusive and collaborative data science. 4 | 5 | Our goal is to provide all the information that data scientists in academia, industry, government and in the third sector need at the start of their projects to ensure that they are easy to reproduce and reuse at the end. 6 | 7 | The book started as a guide for reproducibility, covering version control, testing, and continuous integration. 8 | But technical skills are just one aspect of making data science research "open for all". 9 | 10 | In February 2020, _The Turing Way_ expanded to a series of books covering reproducible research, project design, communication, collaboration, and ethical research. 11 | 12 | | ![The Turing Way project is illustrated as a road or path with shops for different data science skills. People can go in and out with their shopping cart and pick and choose what they need](./figures/welcome.jpg) | 13 | | ---------------| 14 | | _The Turing Way_ project illustration by Scriberia. Zenodo. [http://doi.org/10.5281/zenodo.3332807](http://doi.org/10.5281/zenodo.3332807) | 15 | 16 | ## Our community 17 | 18 | _The Turing Way_ community is dedicated to making collaborative, reusable and transparent research "too easy not to do". 19 | That means investing in the socio-technical skills required to work in a team, to build something greater than any individual person could deliver alone. 20 | 21 | _The Turing Way_ is: 22 | 23 | * a book 24 | * a community 25 | * a global collaboration 26 | 27 | We hope you find the content in the book helpful. 
28 | Everything here is available for free under a [CC-BY licence](https://github.com/alan-turing-institute/the-turing-way/blob/master/LICENSE.md). 29 | Please use and re-use whatever you need for any purpose. 30 | 31 | The book is collaboratively written and open from the start. 32 | To make this project truly accessible and useful for everyone, we invite you to contribute your skills and bring your perspectives into this project. 33 | To join this community, please read our [contribution guidelines](https://github.com/alan-turing-institute/the-turing-way/blob/master/CONTRIBUTING.md) and ways to [get in touch](https://github.com/alan-turing-institute/the-turing-way#get-in-touch). 34 | More information about the community and the project is available in the Community Handbook. 35 | We look forward to expanding and building _The Turing Way_ together. 36 | 37 | Although _The Turing Way_ receives support and funding from [The Alan Turing Institute](https://www.turing.ac.uk/), the project is designed to be a global collaboration. 38 | We have contributions from across the UK, and from India, Mexico, Australia, USA, and many European countries. 39 | Chapters have been written, reviewed and curated by members of research institutes and universities, government departments, and industry. 40 | We are committed to creating a space where people with diverse expertise and lived experiences can share their knowledge with others to allow us all to use data science to improve the world. 41 | 42 | We value the participation of every member of our community and want to ensure that every contributor has an enjoyable and fulfilling experience. 43 | Accordingly, everyone who participates in _The Turing Way_ project is expected to show respect and courtesy to other community members at all times. 44 | All contributions must abide by our [code of conduct](https://github.com/alan-turing-institute/the-turing-way/blob/master/CODE_OF_CONDUCT.md). 
45 | 46 | ![Gif showing screen capture of contributors table, smiling faces and emojis representing the types of contributions in a table](https://media.giphy.com/media/gKIUisnjpj2PS75nOJ/giphy.gif) 47 | 48 | (cite-tag)= 49 | ## Citing _The Turing Way_ 50 | 51 | All material in _The Turing Way_ is available under a [CC-BY 4.0 licence](https://github.com/alan-turing-institute/the-turing-way/blob/master/LICENSE.md). 52 | 53 | You can cite _The Turing Way_ through the project's Zenodo archive using doi: [10.5281/zenodo.3233853](https://doi.org/10.5281/zenodo.3233853). 54 | 55 | The citation will look something like: 56 | 57 | > The Turing Way Community, Becky Arnold, Louise Bowler, Sarah Gibson, Patricia Herterich, Rosie Higman, … Kirstie Whitaker. (2019, March 25). The Turing Way: A Handbook for Reproducible Data Science (Version v0.0.4). Zenodo. http://doi.org/10.5281/zenodo.3233986 58 | 59 | Please visit the [DOI link](https://doi.org/10.5281/zenodo.3233853) though to get the most recent version - the one above is not automatically generated and therefore may be out of date. 60 | DOIs allow us to archive the repository and they are really valuable to ensure that the work is tracked in academic publications. 61 | 62 | You can also share the human-readable URL to a page in the book, for example: [https://the-turing-way.netlify.app/reproducible-research/overview/overview-definitions.html](./overview/overview-definitions), but be aware that the project is under development and therefore these links may change over time. 63 | You might want to include a [web archive link](http://web.archive.org) such as: [https://web.archive.org/web/20191030093753/https://the-turing-way.netlify.com/reproducibility/03/definitions.html](https://web.archive.org/web/20191030093753/https://the-turing-way.netlify.com/reproducibility/03/definitions.html) to make sure that you don't end up with broken links everywhere! 
64 | 65 | We really appreciate any references that you make to _The Turing Way_ project in your work and we hope it is useful. 66 | If you have any questions please [get in touch](https://github.com/alan-turing-institute/the-turing-way#get-in-touch). 67 | -------------------------------------------------------------------------------- /content/bibliography.md: -------------------------------------------------------------------------------- 1 | # Bibliography 2 | 3 | ```{bibliography} ./references.bib 4 | ``` 5 | -------------------------------------------------------------------------------- /content/demo.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Demo notebook" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "We can also create parts of our Jupyter Book based on Jupyter Notebooks." 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "Let's simulate data for two conditions and print their first ten rows:" 22 | ] 23 | }, 24 | { 25 | "cell_type": "code", 26 | "execution_count": null, 27 | "metadata": {}, 28 | "outputs": [], 29 | "source": [ 30 | "import numpy as np\n", 31 | "\n", 32 | "cond_1 = np.random.rand(100)\n", 33 | "print(f'Condition 1 = {cond_1[:10]}')\n", 34 | "\n", 35 | "cond_2 = cond_1 + (np.random.rand(100))\n", 36 | "print(f'Condition 2 = {cond_2[:10]}')" 37 | ] 38 | }, 39 | { 40 | "cell_type": "markdown", 41 | "metadata": {}, 42 | "source": [ 43 | "We can also display in our Jupyter Book more complex datastructures, like pandas dataframes:" 44 | ] 45 | }, 46 | { 47 | "cell_type": "code", 48 | "execution_count": null, 49 | "metadata": {}, 50 | "outputs": [], 51 | "source": [ 52 | "import pandas as pd\n", 53 | "\n", 54 | "df = pd.DataFrame(\n", 55 | " {'condition_1': cond_1, 'condition_2': cond_2}, \n", 56 | " 
index=np.arange(100)\n", 57 | ")\n", 58 | "\n", 59 | "df[:10]" 60 | ] 61 | }, 62 | { 63 | "cell_type": "markdown", 64 | "metadata": {}, 65 | "source": [ 66 | "And of course, we can display plots as well!" 67 | ] 68 | }, 69 | { 70 | "cell_type": "code", 71 | "execution_count": null, 72 | "metadata": {}, 73 | "outputs": [], 74 | "source": [ 75 | "import matplotlib.pyplot as plt\n", 76 | "\n", 77 | "plt.scatter(cond_1, cond_2, alpha=.6)\n", 78 | "plt.xlabel('condition 1')\n", 79 | "plt.ylabel('condition 2')\n", 80 | "plt.title('Scatterplot')\n", 81 | "plt.show()" 82 | ] 83 | } 84 | ], 85 | "metadata": { 86 | "kernelspec": { 87 | "display_name": "Python 3", 88 | "language": "python", 89 | "name": "python3" 90 | }, 91 | "language_info": { 92 | "codemirror_mode": { 93 | "name": "ipython", 94 | "version": 3 95 | }, 96 | "file_extension": ".py", 97 | "mimetype": "text/x-python", 98 | "name": "python", 99 | "nbconvert_exporter": "python", 100 | "pygments_lexer": "ipython3", 101 | "version": "3.7.4" 102 | } 103 | }, 104 | "nbformat": 4, 105 | "nbformat_minor": 4 106 | } -------------------------------------------------------------------------------- /content/demo_2.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Demo notebook" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "We can also create parts of our Jupyter Book based on Jupyter Notebooks." 
15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "Let's simulate data for two conditions and print their first ten rows:" 22 | ] 23 | }, 24 | { 25 | "cell_type": "code", 26 | "execution_count": null, 27 | "metadata": { 28 | "tags": [] 29 | }, 30 | "outputs": [], 31 | "source": [ 32 | "import numpy as np\n", 33 | "\n", 34 | "cond_1 = np.random.rand(100)\n", 35 | "print(f'Condition 1 = {cond_1[:10]}')\n", 36 | "\n", 37 | "cond_2 = cond_1 + (np.random.rand(100))\n", 38 | "print(f'Condition 2 = {cond_2[:10]}')" 39 | ] 40 | }, 41 | { 42 | "cell_type": "markdown", 43 | "metadata": {}, 44 | "source": [ 45 | "We can also display in our Jupyter Book more complex datastructures, like pandas dataframes:" 46 | ] 47 | }, 48 | { 49 | "cell_type": "code", 50 | "execution_count": null, 51 | "metadata": {}, 52 | "outputs": [], 53 | "source": [ 54 | "import pandas as pd\n", 55 | "\n", 56 | "df = pd.DataFrame(\n", 57 | " {'condition_1': cond_1, 'condition_2': cond_2}, \n", 58 | " index=np.arange(100)\n", 59 | ")\n", 60 | "\n", 61 | "df[:10]" 62 | ] 63 | }, 64 | { 65 | "cell_type": "markdown", 66 | "metadata": {}, 67 | "source": [ 68 | "And of course, we can display plots as well!" 69 | ] 70 | }, 71 | { 72 | "cell_type": "code", 73 | "execution_count": null, 74 | "metadata": {}, 75 | "outputs": [], 76 | "source": [ 77 | "import matplotlib.pyplot as plt\n", 78 | "\n", 79 | "plt.scatter(cond_1, cond_2, alpha=.6)\n", 80 | "plt.xlabel('condition 1')\n", 81 | "plt.ylabel('condition 2')\n", 82 | "plt.title('Scatterplot')\n", 83 | "plt.show()" 84 | ] 85 | }, 86 | { 87 | "source": [ 88 | "We want to know if there is a statistically significant difference between these two conditions. Let's run a [t-test](https://en.wikipedia.org/wiki/Student%27s_t-test) to find out. 
We will use the package [statsmodels](https://www.statsmodels.org/) to run the test:" 89 | ], 90 | "cell_type": "markdown", 91 | "metadata": {} 92 | }, 93 | { 94 | "cell_type": "code", 95 | "execution_count": null, 96 | "metadata": { 97 | "tags": [] 98 | }, 99 | "outputs": [], 100 | "source": [ 101 | "from statsmodels.stats.weightstats import ttest_ind\n", 102 | "\n", 103 | "t_value, p_value, df = ttest_ind(cond_1, cond_2)\n", 104 | "\n", 105 | "print(f'Obtained t-value: {t_value}')\n", 106 | "print(f'Obtained p-value: {p_value}')" 107 | ] 108 | } 109 | ], 110 | "metadata": { 111 | "kernelspec": { 112 | "display_name": "Python 3", 113 | "language": "python", 114 | "name": "python3" 115 | }, 116 | "language_info": { 117 | "codemirror_mode": { 118 | "name": "ipython", 119 | "version": 3 120 | }, 121 | "file_extension": ".py", 122 | "mimetype": "text/x-python", 123 | "name": "python", 124 | "nbconvert_exporter": "python", 125 | "pygments_lexer": "ipython3", 126 | "version": "3.8.3-final" 127 | } 128 | }, 129 | "nbformat": 4, 130 | "nbformat_minor": 4 131 | } -------------------------------------------------------------------------------- /content/figures/ReproducibleMatrix.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/content/figures/ReproducibleMatrix.jpg -------------------------------------------------------------------------------- /content/figures/favicon-32x32.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/content/figures/favicon-32x32.png -------------------------------------------------------------------------------- /content/figures/logo.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/content/figures/logo.png -------------------------------------------------------------------------------- /content/figures/open_access_citatations.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/content/figures/open_access_citatations.jpg -------------------------------------------------------------------------------- /content/figures/open_umbrella.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/content/figures/open_umbrella.png -------------------------------------------------------------------------------- /content/figures/reasons_reproducibility.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/content/figures/reasons_reproducibility.png -------------------------------------------------------------------------------- /content/figures/reproducibility.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/content/figures/reproducibility.jpg -------------------------------------------------------------------------------- /content/figures/welcome.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/content/figures/welcome.jpg 
-------------------------------------------------------------------------------- /content/open/open-access.md: -------------------------------------------------------------------------------- 1 | # Open access 2 | 3 | ## What is open access? 4 | 5 | One of the most common ways to disseminate research results is by writing a manuscript and publishing it in a journal, conference proceedings or book. For many years those publications were available to the public only if purchased, either through a subscription or individually. 6 | However, new knowledge is built by synthesizing current scholarship and then building upon it. 7 | At the turn of the 21st century a new movement appeared with a clear objective: make all research results available to anyone interested in reading them, free of charge and with no technical obstacles such as mandatory registration or login to specific platforms. 8 | This movement took the name of Open access and established two initial strategies to achieve its final goal: self-archiving and open access publishing. 9 | 10 | ### Repositories and self-archiving 11 | 12 | The aim of the self-archiving movement is to provide tools and assistance to scholars to deposit their refereed journal articles in open electronic repositories. 13 | As a result of the first strategy we see self-archiving practices: researchers depositing and disseminating papers in institutional or subject-based repositories. 14 | There has also been a growth in the publication of preprints through institutional repositories and preprint servers. 15 | Preprints are widely used in physical sciences and are now emerging in life sciences and other fields. 16 | Preprints are documents that have not been peer reviewed but are considered a complete publication at a first stage. 17 | Some of the preprint servers include open peer review services and the ability to post new versions of the initial paper once reviewed by peers. 
18 | 19 | At the beginning of 2019, more than 4000 repositories were available for researchers to self-archive their publications according to the [registry of open access repositories](http://roar.eprints.org/). 20 | In this list there are institutional repositories, subject-based or thematic repositories, and harvesters. 21 | Institutional repositories are generally managed by research performing institutions to provide their community with a place to openly archive and share papers and other research outputs. 22 | Subject-based repositories are usually managed by research communities and most of the contents are related to a certain discipline. 23 | Finally, harvesters aggregate content from different repositories, becoming sites to perform general searches and build other value-added services. 24 | 25 | When choosing a journal to publish research results, researchers should take a moment to read the journal policy regarding the transfer of copyright. 26 | Many journals still require for publication that authors transfer full copyright. 27 | This transfer of rights implies that authors must ask for permission to reuse their own work beyond what is allowed by the applicable law, unless there are some uses already granted. 28 | Such granted uses may include teaching purposes, sharing with colleagues, and self-archiving by researchers of their papers in repositories. 29 | Sometimes there is a common policy among all the journals published by the same publisher, but in general journals have their own policy, especially when they are published on behalf of a scientific society. 30 | When looking at the self-archiving conditions we must identify two key issues: the version of the paper that can be deposited and when it can be made publicly available. 31 | 32 | Regarding the version, some journals allow the dissemination of the submitted version, also known as a preprint, and they allow its replacement with a reviewed version once the final paper has been published. 
33 | Due to the increase of policies requiring access to research results, most of the journals allow self-archiving of the accepted version of the paper, also known as the author manuscript or postprint. 34 | This version is the final text once the peer review process has ended, but it does not yet have the final layout of the publication. 35 | Finally, some journals do allow researchers to deposit the final published version, also known as the version of record. 36 | 37 | In relation to the moment to make the paper publicly available, many journals establish a period of time from its original publication - the embargo period, which can range from zero to 60 months - when making the paper publicly available is not permitted. 38 | Some journals include or exclude embargoes depending on the versions. 39 | For instance, the accepted version could be made publicly available after publication but the published version must wait 12 months. 40 | 41 | ### Open access publishing 42 | 43 | Open access publishing attempts to ensure permanent open access to all the articles published in journals, and as a result we have seen the creation of open access journals. 44 | The number of open access journals has increased in recent years: according to the Directory of Open Access Journals \([DOAJ](http://www.doaj.org)\), there are currently more than 12,000. 45 | An open access journal must provide free access to its contents, and it must also license them to allow reuse. 46 | 47 | Currently many paywalled journals offer individual open access options to researchers once the paper is accepted after peer review. 48 | Those options include publication under a free content licence and free access for anyone from the moment of first publication. 49 | This model is commonly known as the hybrid model because in the same issue of a journal, readers can find open access and paywalled contributions. 50 | Usually publishers ask for a fee to open individual contributions. 
51 | 52 | Open access publishing has two primary versions — gratis and libre. 53 | Gratis open access is simply making research available for others to read without having to pay for it. 54 | However, it does not grant the user the right to make copies, distribute, or modify the work in any way beyond fair use. 55 | Libre open access is gratis, meaning the research is available free of charge, but it goes further by granting users additional rights, usually via a Creative Commons licence, so that people are free to reuse and remix the research. 56 | There are varying degrees of what may be considered libre open access. 57 | For example, some scholarly articles may permit all uses except commercial use, some may permit all uses except derivative works, and some may permit all uses and simply require attribution. 58 | While some would argue that libre open access should be free of any copyright restrictions (except attribution), other scholars consider a work that removes at least some permission barriers to be libre. 59 | 60 | ## Why does open access matter? 61 | 62 | Research is useless if it’s not shared; even the best research is ineffectual if others aren’t able to read and build on it. 63 | When price barriers keep articles locked away, research cannot achieve its full potential. 64 | Open access benefits researchers who can work more effectively with a better understanding of the literature. 65 | It also helps avoid duplication of effort. 66 | No researcher (or funder) wants to waste time and money conducting a study if they know it has been attempted elsewhere. 67 | But, duplication of effort is all-too-possible when researchers can’t effectively communicate with one another and make results known to others in their field and beyond. 68 | It also benefits researchers by providing better visibility and therefore higher impact/citation rate for their scholarship. 
69 | Numerous publishers, both non-profit and for-profit, voluntarily make their articles openly available at the time of publication or within 6-12 months. 70 | Many have switched from a closed, subscription model to an open one as a strategic business decision to increase their journal's exposure and impact. 71 | Further it can be argued that taxpayers who pay for much of the research published in journals have a right to access the information resulting from that investment without charge. 72 | Finally, if research is available to the widest possible pool of readers then it is more likely/easy for it to be checked and reproduced. 73 | 74 | ## Best practice for open access 75 | 76 | ### Self-archiving 77 | 78 | Self-archive a publication in a suitable repository, institutional or subject-based, following the possible restrictions posed by the publisher, for example an embargo period, or limits on the allowed version to be deposited in such archives. 79 | In doing this it is important to make sure you are aware of the copyright implications of any documents/agreements you make when submitting your manuscript to a journal. 80 | If your institution does not have an institutional repository, advocate for the creation of one. 81 | You can check journal policies on self-archiving using [SHERPA/RoMEO](http://www.sherpa.ac.uk/romeo/index.php). 82 | 83 | ### Publication 84 | 85 | Consider submitting your work to a journal that is open access. 86 | When doing this be aware that there may be funds or discounts available to cover any associated costs. 
87 | -------------------------------------------------------------------------------- /content/open/open-notebooks.md: -------------------------------------------------------------------------------- 1 | # Open notebooks 2 | 3 | Electronic Lab Notebooks (ELNs) enable researchers to organize and store experimental procedures, protocols, plans, notes, data, and even unfiltered interpretations using their computer or mobile device. 4 | They are a digital analogue to the paper notebook most researchers keep. 5 | ELNs can offer several advantages over the traditional paper notebook in documenting research during the active phase of a project, including searchability within and across notebooks, secure storage with multiple redundancies, remote access to notebooks, and the ability to easily share notebooks among team members and collaborators. 6 | 7 | Open notebook research is simply the practice of making such notebooks openly available, usually online. 8 | Some researchers choose to keep their notebooks open from the very beginning of their projects. 9 | Rather than wait months, even years, to share their research through journal publication as is the current practice, this allows researchers to post their experimental data and protocols online and in real-time. 10 | Sharing research in this open and timely manner helps to reduce duplication of work, helps foster new collaborations, and cultivates a more open dialogue with others. 11 | It also helps researchers avoid exploring dead ends and making mistakes that have already been covered by their colleagues but went unpublished because of a lack of scientific interest. 12 | 13 | Open notebooks have the further benefit of increasing the quality of scientific outputs by forcing researchers to be careful, thorough, and explicit. 14 | Making research open has the added benefit of increasing the likelihood that any errors made in an investigation will be spotted quickly, instead of down the line. 
15 | Immediate fixes will have much less impact on a research project, which will save a researcher time, the lab money, and pride. 16 | 17 | Ideally, every scientist would maintain an open notebook in real-time which would encompass all aspects of their research. 18 | But many fears about dealing with complete open access, conflicts with intellectual property and publications, and online data overload hamper this movement. 19 | To combat this, practitioners encourage any form of open notebook research, "make open what you can", even if that means uploading some information for a project from many years ago that never saw the light of day. 20 | -------------------------------------------------------------------------------- /content/open/open-scholarship.md: -------------------------------------------------------------------------------- 1 | # Open scholarship 2 | 3 | Open research and its subcomponents fit under the umbrella of a broader concept - open scholarship. 4 | 5 | ![open_umbrella](../figures/open_umbrella.png) 6 | 7 | ## Open educational resources 8 | 9 | Open educational resources (OERs) are teaching and learning materials that can be freely used and reused for learning and/or teaching at no cost, and without needing to ask permission. Examples are courses, including Massive Open Online Courses (MOOCs), lectures, teaching materials, assignments, and various other resources. OERs are available in many different formats compatible with online usage, most obviously text, images, audio, and video. Anyone with internet access can access and use OERs; access is not dependent on location or membership of a particular institution. 10 | 11 | Unlike copyrighted resources, OERs have been authored or created by an individual or organization that chooses to retain few, if any, ownership rights. In some cases, that means anyone can download a resource and share it with colleagues and students. 
In other cases, this may go further and enable people to edit resources and then re-post them as a remixed work. How do you know your options? OERs often have a Creative Commons licence or other permission to let you know how the material may be used, reused, adapted, and shared. 12 | 13 | Fully open OERs comply with the 5 Rs: 14 | 15 | - Retain: the right to make, own, and control copies of the content. 16 | - Reuse: the right to use the content in a wide range of ways (for example, in a class, in a study group, on a website, in a video). 17 | - Revise: the right to adapt, adjust, modify, or alter the content itself (for example, translate the content into another language). 18 | - Remix: the right to combine the original or revised content with other open content to create something new (for example, incorporate the content into a mashup). 19 | - Redistribute: the right to share copies of the original content, your revisions, or your remixes with others (for example, give a copy of the content to a friend). 20 | 21 | Researchers generate a great deal of educational resources in the course of teaching students and each other (at workshops, for example). 22 | By making these openly available, for example in the [open educational resource commons](https://www.oercommons.org/), the wider community can benefit from them in three main ways: 23 | 24 | 1. Most obviously, the community can use the materials to learn about the material they cover. 25 | 2. Sharing resources reduces duplication of effort. 26 | If an educator needs materials for teaching and such materials already exist openly then they need not make their own from scratch, saving time. 27 | 3. Making materials openly available helps a community build better resources by improving resources that already exist and combining OERs to take advantage of their different strengths, such as a great diagram or explanation. 
28 | 29 | Beyond the raw practical benefits, the worldwide OER movement is rooted in the human right to access high-quality education. 30 | This shift in educational practice is about participation and co-creation. 31 | Open Educational Resources (OERs) offer opportunities for systemic change in teaching and learning content through engaging educators in new participatory processes and effective technologies for engaging with learning. 32 | 33 | ## Equity, diversity, inclusion 34 | 35 | Open scholarship means open to *everyone* without discrimination based on factors such as race, gender, sexual orientation, or any number of other factors. 36 | As a community we should undertake to ensure equitable opportunities for all. 37 | We can go about that by deliberately fostering welcoming, inclusive cultures within our communities. 38 | For example, reasonable accommodations should be made wherever possible to include community members with disabilities to enable them to participate fully, and this can be as simple as choosing colourblind-safe colour schemes when making graphs. 39 | 40 | ## Citizen science 41 | 42 | Citizen science is the involvement of the public in scientific research, whether community-driven research or global investigations. The Oxford English Dictionary recently defined it as: "scientific work undertaken by members of the general public, often in collaboration with or under the direction of professional scientists and scientific institutions". 43 | Citizen science offers the power of science to everyone, and the power of everyone to science. 44 | 45 | By allowing members of the public to contribute to scientific research, citizen science helps engage and invest the wider world in science. 46 | It also benefits researchers by offering manpower that simply would not be accessible otherwise.
47 | Examples of this include [finding](https://citizensciencegames.com/games/eterna/) ways of folding molecules, and [classifying](https://www.zooniverse.org/) different types of galaxies. 48 | 49 | ## Patient and Public Involvement 50 | Whilst citizen science encompasses one way of contributing to scientific research, Patient and Public Involvement (PPI) is a far more specialised form of citizen science which is particularly useful when doing research on health and/or social issues. 51 | 52 | PPI is *not*: 53 | - Participation: Recruitment of participants (such as for a clinical trial or survey) to contribute data to a project. 54 | - Engagement: Dissemination, such as presenting at patient interest groups or writing a blog post. 55 | 56 | PPI *is*: 57 | - Involvement: patients and members of the public contribute at *all* stages of the research cycle. 58 | 59 | When incorporating PPI into research, researchers work *with* volunteers, rather than doing work *about* them. 60 | PPI volunteers are usually patients or members of the public with a particular interest in some area of research which means that the topic is often very personal, and being involved in the research cycle can be an empowering experience. 61 | For the researcher, PPI often generates unique and invaluable insights from the volunteers' own personal expertise which cannot always be predicted by the researchers themselves. 62 | 63 | It is a good idea to consider PPI very early in a project, ideally before any grant applications or submissions for ethical approval have been written. 64 | PPI volunteers can help researchers in many ways, such as the following: 65 | 1. Generate or shape research questions. 66 | 2. Contribute to, or review, study design. 67 | 3. Help with grant applications or submissions to research ethics committees (particularly the lay summary). 68 | 4. Collect data. 69 | 5. Analyse data. 70 | 6. Contribute to the manuscript and be listed as a co-author. 71 | 7. 
Disseminate findings in plain English. 72 | 73 | One of the biggest barriers to PPI is not knowing how to get started. 74 | The UK National Institute for Health Research have their own site, [INVOLVE](https://www.invo.org.uk/), to help familiarise yourself with the foundations of PPI. 75 | Additionally, charities related to your specific research field may be able to facilitate or support PPI; for example [Cancer Research UK](https://www.cancerresearchuk.org/funding-for-researchers/patient-involvement-toolkit-for-researchers) and [Parkinson's UK](https://www.parkinsons.org.uk/research/patient-and-public-involvement-ppi) have formal guides in place that provide a comprehensive overview of PPI. 76 | -------------------------------------------------------------------------------- /content/open/open.md: -------------------------------------------------------------------------------- 1 | (rr-open)= 2 | # Open research 3 | 4 | ## Prerequisites / recommended skill level 5 | 6 | | Prerequisite | Importance | Notes | 7 | | -------------|----------|------| 8 | | Experience with version control | Helpful | Experience with GitHub is particularly useful | 9 | 10 | ## Summary 11 | 12 | Open research aims to transform research by making it more reproducible, transparent, re-usable, collaborative, accountable, and accessible to society. It pushes for change in the way that research is carried out and disseminated by digital tools. One definition of open research, [as given by the Organisation for Economic Co-operation and Development (OECD)](https://www.fct.pt/dsi/docs/Making_Open_Science_a_Reality.pdf "Making Open Science a Reality, OECD Science, Technology and Industry Policy Papers No. 25"), is the practice of making "the primary outputs of publicly funded research results – publications and the research data – publicly accessible in digital format with no or minimal restriction." 
In order to achieve this openness in research, each element of the research process should: 13 | 14 | - Be publicly available: It is difficult to use and benefit from knowledge hidden behind barriers such as passwords and paywalls. 15 | - Be reusable: Research outputs need to be licensed appropriately so that prospective users clearly know any limitations on re-use. 16 | - Be transparent: With appropriate metadata to provide clear statements of how research output was produced and what it contains. 17 | 18 | The research process typically has the following form: data is collected and then analysed (usually using software). This process may involve the use of specialist hardware. The results of the research are then published. Throughout the process it is good practice for researchers to document their working in notebooks. Open research aims to make each of these elements open: 19 | 20 | - Open data: Documenting and sharing research data openly for re-use. 21 | - Open source software: Documenting research code and routines, and making them freely accessible and available. 22 | - Open hardware: Documenting designs, materials, and other relevant information related to hardware, and making them freely accessible and available. 23 | - Open access: Making all published outputs freely accessible for maximum use and impact. 24 | - Open notebooks: An emerging practice, documenting and sharing the experimental process of trial and error. 25 | 26 | These elements are expanded upon in this chapter. 27 | 28 | Open scholarship is a concept that extends open research further. It relates to making other aspects of scientific research open to the public, for example: 29 | 30 | - Open educational resources: Making educational resources publicly available to be re-used and modified. 31 | - Equity, diversity, inclusion: Ensuring scholarship is open to anyone without barriers based on factors such as race, background, gender, and sexual orientation. 
32 | - Citizen science: The inclusion of members of the public in scientific research. 33 | 34 | These elements are also discussed in detail in this chapter. 35 | 36 | ## How this will help you / why this is useful 37 | 38 | There are five main schools of thought motivating open practices to benefit research: 39 | 40 | | School | Belief | Aim | 41 | | -------------------------- | -------------------- | ------------------------------------------------- | 42 | | Infrastructure | Efficient research depends on the available tools and applications. | Creating openly available platforms, tools, and services for researchers. | 43 | | Pragmatic | Knowledge-creation could be more efficient if researchers worked together. | Opening up the process of knowledge creation. | 44 | | Measurement | Academic contributions today need alternative impact measurements. | Developing an alternative metric system for research impact. | 45 | | Democratic | The access to knowledge is unequally distributed. | Making knowledge freely available for everyone. | 46 | | Public | Research needs to be made accessible to the public. | Making research accessible for citizens. | 47 | 48 | Open practices also benefit the researchers that propagate them. For example there is evidence [(Mckiernan et al. 2016)](https://elifesciences.org/articles/16800) that open access articles are cited more often, as shown by the metastudy presented in the figure below. 49 | 50 | | ![open_access_citatations](../figures/open_access_citatations.jpg) | 51 | | -----------------------------------------------------| 52 | | The relative citation rate (OA: non-OA) in 19 fields of research. This rate is defined as the mean citation rate of OA articles divided by the mean citation rate of non-OA articles. Multiple points for the same discipline indicate different estimates from the same study, or estimates from several studies. (See footnote 1 for references.) 
| 53 | 54 | Another benefit of openness is that while research collaborations are essential to advancing knowledge, identifying and connecting with appropriate collaborators is not trivial. Open practices can make it easier for researchers to connect with one another by increasing the discoverability and visibility of one’s work, facilitating rapid access to novel data and software resources, and creating new opportunities to interact with and contribute to ongoing communal projects. 55 | -------------------------------------------------------------------------------- /content/overview/overview-benefit.md: -------------------------------------------------------------------------------- 1 | (rr-overview-benefits)= 2 | # Added Advantages 3 | 4 | In the previous section, we discussed the different aspects of reproducible research that are beneficial for the scientific community. 5 | In this chapter, we will share some less obvious aspects of working reproducibly for individual researchers and teams. 6 | 7 | ![Why we should care about working reproducibly](../figures/reasons_reproducibility.png) 8 | 9 | **1. Track a complete history of your research** 10 | 11 | Reproducible research must contain a complete history and narrative (also known as [Provenance](https://en.wikipedia.org/wiki/Provenance)) of the project planning and development process. 12 | This includes information on the data, tools, methods, codes, and documentation used in the research project. 13 | By storing a complete track-record of our work, we can ensure research sustainability, fair citation/acknowledgment, and usefulness of our and others' work in our research fields. 14 | 15 | **2. Facilitate collaboration and review process** 16 | 17 | By designing reproducible workflows and sharing them with the different components of our research project, we can allow others to develop an in-depth understanding of our work.
18 | This encourages them to review our methods, test our code, propose useful changes and make thoughtful contributions to develop our project further. 19 | Reproducible workflows facilitate the peer review process tremendously by allowing reviewers access to the different parts of the projects that are necessary to validate the research outcomes. 20 | 21 | **3. Publish validated research and avoid misinformation** 22 | 23 | Lack of reproducibility is one of the major factors that lead to paper retractions (source [Retraction Watch](https://retractionwatch.com/)). 24 | The best-known analyses of the scientific literature in psychology {cite}`OpenScienceCollaboration2015Reproducibility` and cancer biology {cite}`Begley2012` found reproducibility rates of around 40% and 10%, respectively. 25 | By working reproducibly, we can develop validated research work, avoid misinformation that can limit replicability of our work and publish accurate research outputs. 26 | This aspect supports not only the validity of the current work, but also that of any future studies that are based on reproducible research {cite}`MozillaScienceLab`. 27 | 28 | **4. Write your papers, thesis and reports efficiently** 29 | 30 | Well-documented analyses help us maintain easy access to all the results generated within a project that can be written up efficiently. 31 | If working in a team, collaborators can easily get recognition in terms of authorship for their contributions. Furthermore, by making the underlying dataset and methods available we can easily comply with the highest-level journal guidelines. 32 | 33 | **5. Get credit for your work fairly** 34 | 35 | Applying reproducibility practices separately on different parts of the project such as data, independently executable codes and scripts, protocols, and reports allows other researchers to test and reuse our work in their research, and brings fair recognition for our work.
36 | Researchers who publish their work with the underlying information get cited more often as their research outcome can be broadly replicated and trusted. 37 | This fair credit system encourages researchers to further maintain reproducibility practices in their work. 38 | 39 | **6. Ensure continuity of your work** 40 | 41 | By following guidelines for reproducibility, we can easily communicate our work with different stakeholders such as our supervisors, funders, reviewers, students, and potential collaborators. 42 | This aspect of reproducibility increases the usefulness of our research by enabling others to easily build on our results, and re-use our research materials {cite}`MozillaScienceLab`. 43 | This ensures the continuity of a research idea and can even lead to fresh applications in other contexts. 44 | Progress of such projects can easily be tracked and continued - either by other researchers, or yourself if you want to build on your own work after a longer period {cite}`Markowetz2015`. 45 | 46 | Other benefits of working reproducibly on Open Research projects are covered in our Open chapter. 47 | 48 | --- 49 | ## References 50 | ```{bibliography} ../references.bib 51 | :filter: docname in docnames 52 | ``` -------------------------------------------------------------------------------- /content/overview/overview-definitions.md: -------------------------------------------------------------------------------- 1 | (rr-overview-definitions)= 2 | # Definitions of Reproducibility 3 | 4 | The most common definition of reproducibility (and replication) was first noted by Claerbout and Karrenbach in 1992 {cite}`ClaerboutKarrenbach1992Reproducibility` and has been used in computational science literature since then.
5 | Another popular definition was introduced in 2013 by the Association for Computing Machinery (ACM) {cite}`Ivie2018SciComp`, which swapped the meaning of the terms 'reproducible' and 'replicable' compared to Claerbout and Karrenbach. 6 | 7 | The following table contrasts both definitions {cite}`Heroux2018Reproducibility`. 8 | 9 | | Term | Claerbout & Karrenbach | ACM | 10 | | -----|------------------------|-----| 11 | | Reproducible | Authors provide all the necessary data and the computer codes to run the analysis again, re-creating the results.| (Different team, different experimental setup.) The measurement can be obtained with stated precision by a different team, a different measuring system, in a different location on multiple trials. For computational experiments, this means that an independent group can obtain the same result using artifacts which they develop completely independently. | 12 | | Replicable | A study that arrives at the same scientific findings as another study, collecting new data (possibly with different methods) and completing new analyses. | (Different team, same experimental setup.) The measurement can be obtained with stated precision by a different team using the same measurement procedure, the same measuring system, under the same operating conditions, in the same or a different location on multiple trials. For computational experiments, this means that an independent group can obtain the same result using the author's own artifacts. | 13 | 14 | Barba (2018) {cite}`Barba2018Reproducibility` conducted a detailed literature review on the usage of reproducible/replicable covering several disciplines. 15 | Most papers and disciplines use the terminology as defined by Claerbout and Karrenbach, whereas microbiology, immunology and computer science tend to follow the ACM use of reproducibility and replication. 16 | In political science and economics literature, both terms are used interchangeably.
17 | 18 | In addition to these high-level definitions of reproducibility, some authors provide more detailed distinctions. 19 | Victoria Stodden {cite}`Victoria2014Reproducibility`, a prominent scholar on this topic, has for example identified the following further distinctions: 20 | 21 | - _Computational reproducibility_: When detailed information is provided about code, software, hardware and implementation details. 22 | 23 | - _Empirical reproducibility_: When detailed information is provided about non-computational empirical scientific experiments and observations. In practice this is enabled by making data freely available, as well as details of how the data was collected. 24 | 25 | - _Statistical reproducibility_: When detailed information is provided, for example, about the choice of statistical tests, model parameters, and threshold values. This mostly relates to pre-registration of study design to prevent p-value hacking and other manipulations. 26 | 27 | (rr-overview-definitions-table)= 28 | ## Table of definitions for reproducibility 29 | 30 | At _The Turing Way_ we define **reproducible research** as work that can be independently recreated from the same data and the same code that the original team used. 31 | Reproducible is distinct from replicable, robust and generalisable as described in the figure below. 32 | 33 | | ![Kirstie's definition of reproducible research](../figures/ReproducibleMatrix.jpg) | 34 | | -------------------------------------------------------------------------------------------------------- | 35 | | How the Turing Way defines reproducible research | 36 | 37 | The different dimensions of reproducible research described in the matrix above have the following definitions: 38 | 39 | - **Reproducible:** A result is reproducible when the _same_ analysis steps performed on the _same_ dataset consistently produce the _same_ answer.
40 | - **Replicable:** A result is replicable when the _same_ analysis performed on _different_ datasets produces qualitatively similar answers. 41 | - **Robust:** A result is robust when the _same_ dataset is subjected to _different_ analysis workflows to answer the same research question (for example one pipeline written in R and another written in Python) and a qualitatively similar or identical answer is produced. 42 | Robust results show that the work is not dependent on the specificities of the programming language chosen to perform the analysis. 43 | - **Generalisable:** Combining replicable and robust findings allows us to form generalisable results. 44 | Note that running an analysis on a different software implementation and with a different dataset does not provide _generalised_ results. 45 | There will be many more steps to know how well the work applies to all the different aspects of the research question. 46 | Generalisation is an important step towards understanding that the result is not dependent on a particular dataset or a particular version of the analysis pipeline. 47 | 48 | More information on these definitions can be found in "Reproducibility vs. Replicability: A Brief History of a Confused Terminology" by Hans E. Plesser {cite}`Plesser2018Reproducibility`. 49 | 50 | ## Reproducible but not open 51 | 52 | _The Turing Way_ recognises that some research will use sensitive data that cannot be shared and this handbook will provide guides on how your research can be reproducible without all parts necessarily being open.
53 | 54 | --- 55 | ## References 56 | ```{bibliography} ../references.bib 57 | :filter: docname in docnames 58 | ``` -------------------------------------------------------------------------------- /content/overview/overview-resources.md: -------------------------------------------------------------------------------- 1 | (rr-overview-resources)= 2 | # Resources for reproducibility chapter 3 | For additional resources like videos and reference papers on reproducibility, see the {ref}`rr-overview-resources-reading` and {ref}`rr-overview-resources-addmaterial` sections. 4 | 5 | ## Checklist / Exercise 6 | - [ ] Define reproducibility for yourself. 7 | 8 | ## What to learn next? 9 | Open Research would be a good chapter to read next. 10 | If you want to start learning hands-on practices, we recommend reading the version control chapter next. 11 | 12 | (rr-overview-resources-reading)= 13 | ## Further Reading 14 | 15 | * Baker, M. (2016). 1,500 scientists lift the lid on reproducibility. Nature, 533(7604), 452–454. https://doi.org/10.1038/533452a 16 | 17 | * Barba, L. (2017): Barba-group Reproducibility Syllabus. figshare. Paper. https://doi.org/10.6084/m9.figshare.4879928.v1 18 | 19 | * Piwowar, H. A., & Vision, T. J. (2013). Data reuse and the open data citation advantage. PeerJ, 1, e175. https://doi.org/10.7717/peerj.175 20 | 21 | * Whitaker, Kirstie (2018): Barriers to reproducible research (and how to overcome them). figshare. Paper. https://doi.org/10.6084/m9.figshare.7140050.v2 22 | 23 | (rr-overview-resources-addmaterial)= 24 | ## Additional material 25 | 26 | ### Videos 27 | 28 | * Markowetz, F. (2016). 5 selfish reasons to work reproducibly. Talk at scidata 2016. https://www.youtube.com/watch?v=Is15CMVPHas&feature=youtu.be 29 | 30 | ### Other useful links 31 | 32 | * Markowetz, F. (2018). 5 selfish reasons to work reproducibly. Slides available at https://osf.io/a8wq4/ 33 | 34 | * Leipzig, J (2020). 
Awesome Reproducible Research: A curated list of reproducible research case studies, projects, tutorials, and media. GitHub repo. https://github.com/leipzig/awesome-reproducible-research 35 | -------------------------------------------------------------------------------- /content/overview/overview.md: -------------------------------------------------------------------------------- 1 | (rr-overview)= 2 | # Overview of Reproducible Research 3 | 4 | Scientific results and evidence are strengthened if they are reproduced 5 | and confirmed by several independent researchers (see {ref}`definitions `). 6 | With all parts used in an analysis being available and/or documented, valuable time is saved reproducing published results and other researchers can easily build on these research results and re-use data or code for their analyses. 7 | 8 | Learn about the less obvious benefits of working reproducibly in the {ref}`added advantages ` subchapter. 9 | 10 | Major media outlets have [reported on](https://www.theguardian.com/science/2018/aug/27/attempt-to-replicate-major-social-scientific-findings-of-past-decade-fails) investigations showing that a significant percentage of scientific studies cannot be reproduced. 11 | 12 | This leads to other academics and society losing trust in scientific results {cite}`baker2016reproducibility`. 13 | Working reproducibly means others can check your results - even early on in the research process. 14 | Thus, the full analysis and methodology is transparent. 15 | 16 | In addition, so-called "negative results" can be published easily, helping avoid other researchers wasting time repeating analyses that will not return the expected results {cite}`Dirnagl2010bias`. 17 | For further reading resources on reproducibility, please check out the {ref}`resources ` subchapter. 18 | 19 | ## Prerequisites / recommended skill level 20 | No previous knowledge needed.
21 | 22 | --- 23 | ## References 24 | ```{bibliography} ../references.bib 25 | :filter: docname in docnames 26 | ``` 27 | -------------------------------------------------------------------------------- /content/references.bib: -------------------------------------------------------------------------------- 1 | @article{baker2016reproducibility, 2 | author={Baker, Monya}, 3 | title={Reproducibility crisis?}, 4 | journal={Nature}, 5 | volume={533}, 6 | number={26}, 7 | pages={353--66}, 8 | year={2016}} 9 | 10 | @article{Dirnagl2010bias, 11 | author = {Dirnagl, Ulrich and Lauritzen, Martin}, 12 | title ={Fighting Publication Bias: Introducing the Negative Results Section}, 13 | journal = {Journal of Cerebral Blood Flow \& Metabolism}, 14 | volume = {30}, 15 | number = {7}, 16 | pages = {1263-1264}, 17 | year = {2010}, 18 | doi = {10.1038/jcbfm.2010.51}, 19 | note ={PMID: 20596038}, 20 | URL = {https://doi.org/10.1038/jcbfm.2010.51}, 21 | eprint = {https://doi.org/10.1038/jcbfm.2010.51}} 22 | 23 | @article{Ivie2018SciComp, 24 | author = {Ivie, Peter and Thain, Douglas}, 25 | title = {{Reproducibility in Scientific Computing}}, 26 | journal = {ACM Comput. Surv.}, 27 | volume = {51}, 28 | number = {3}, 29 | pages = {1--36}, 30 | year = {2018}, 31 | month = {Jul}, 32 | issn = {0360-0300}, 33 | publisher = {Association for Computing Machinery}, 34 | doi = {10.1145/3186266}} 35 | 36 | @article{Heroux2018Reproducibility, 37 | author = {Heroux, Michael A. and Barba, Lorena and Parashar, Manish and Stodden, Victoria and Taufer, Michela}, 38 | title = {{Toward a Compatible Reproducibility Taxonomy for Computational and Computing Sciences.}}, 39 | journal = {OSTI.GOV collections}, 40 | year = {2018}, 41 | month = {Oct}, 42 | note = {[Online; accessed 27. 
May 2020]}, 43 | URL = {https://www.osti.gov/biblio/1481626-toward-compatible-reproducibility-taxonomy-computational-computing-sciences}, 44 | doi = {10.2172/1481626}} 45 | 46 | @article{Barba2018Reproducibility, 47 | author = {Barba, Lorena A.}, 48 | title = {{Terminologies for Reproducible Research}}, 49 | journal = {arXiv}, 50 | year = {2018}, 51 | month = {Feb}, 52 | eprint = {1802.03311}, 53 | url = {https://arxiv.org/abs/1802.03311v1}} 54 | 55 | @misc{Victoria2014Reproducibility, 56 | author = {Stodden, Victoria}, 57 | title = {{Edge.org}}, 58 | year = {2014}, 59 | month = {May}, 60 | note = {[Online; accessed 27. May 2020]}, 61 | url = {https://www.edge.org/response-detail/25340}} 62 | 63 | @article{Plesser2018Reproducibility, 64 | author = {Plesser, Hans E.}, 65 | title = {{Reproducibility vs. Replicability: A Brief History of a Confused Terminology}}, 66 | journal = {Front. Neuroinf.}, 67 | volume = {11}, 68 | year = {2018}, 69 | month = {Jan}, 70 | issn = {1662-5196}, 71 | publisher = {Frontiers}, 72 | doi = {10.3389/fninf.2017.00076} 73 | } 74 | 75 | @article{Begley2012, 76 | Author = {Begley, C. 
and Ellis, L.}, 77 | Date-Added = {2012-03-28 0:00:00 am +0100}, 78 | Date-Modified = {2012-03-29 0:00:00 am +0100}, 79 | Journal = {Nature}, 80 | Title = {Raise standards for preclinical cancer research}, 81 | Year = {2012}, 82 | doi = {10.1038/483531a}} 83 | 84 | @article{OpenScienceCollaboration2015Reproducibility, 85 | author = {{Open Science Collaboration}}, 86 | title = {{Estimating the reproducibility of psychological science}}, 87 | journal = {Science}, 88 | volume = {349}, 89 | number = {6251}, 90 | pages = {aac4716}, 91 | year = {2015}, 92 | month = {Aug}, 93 | issn = {0036-8075}, 94 | publisher = {American Association for the Advancement of Science}, 95 | doi = {10.1126/science.aac4716}} 96 | 97 | @article{MozillaScienceLab, 98 | Author = {Mozilla Science Lab}, 99 | Date-Added = {2016-10-02 0:00:00 am +0100}, 100 | Date-Modified = {2018-01-22 0:00:00 am +0100}, 101 | Doi = {}, 102 | Journal = {GitHub}, 103 | Title = {Mozilla Science Lab's Study Group}, 104 | Year = {2016}, 105 | Bdsk-Url-1 = {https://mozillascience.github.io/study-group-orientation/}} 106 | 107 | @article{Marwick2018Mar, 108 | author = {Marwick, Ben and Boettiger, Carl and Mullen, Lincoln}, 109 | title = {{Packaging data analytical work reproducibly using R (and friends)}}, 110 | journal = {PeerJ Preprints}, 111 | year = {2018}, 112 | month = {Mar}, 113 | issn = {2167-9843}, 114 | publisher = {PeerJ Inc.}, 115 | doi = {10.7287/peerj.preprints.3192v2} 116 | } 117 | 118 | @article{Markowetz2015, 119 | Author = {Florian Markowetz}, 120 | Date-Added = {2015-08-12 0:00:00 am +0100}, 121 | Date-Modified = {2015-08-12 0:00:00 am +0100}, 122 | Doi = {10.1186/s13059-015-0850-7}, 123 | Journal = {Genome Biology}, 124 | Title = {Five selfish reasons to work reproducibly}, 125 | Year = {2015}, 126 | Bdsk-Url-1 = {https://doi.org/10.1186/s13059-015-0850-7}} -------------------------------------------------------------------------------- /content/reproducible-research.md:
-------------------------------------------------------------------------------- 1 | (rr)= 2 | # Guide for Reproducible Research 3 | 4 | ***This guide covers topics related to skills, tools and best practices for research reproducibility.*** 5 | 6 | _The Turing Way_ defines reproducibility in data research as data and code being available to fully rerun the analysis. 7 | 8 | There are several definitions of reproducibility in use, and we discuss these in more detail in the definitions section of this chapter. 9 | While it is absolutely fine for us each to use different words, it will be useful for you to know how _The Turing Way_ defines *reproducibility* to avoid misunderstandings when reading the rest of the handbook. 10 | 11 | | ![A person showing another person what steps to take to make your data research reproducible](./figures/reproducibility.jpg) | 12 | | ---------------| 13 | | _The Turing Way_ project illustration by Scriberia. Zenodo. [http://doi.org/10.5281/zenodo.3332807](http://doi.org/10.5281/zenodo.3332807) | 14 | 15 | _The Turing Way_ started by defining reproducibility in the context of this handbook, laying out its importance for science and scientists, and providing an overview of the common concepts, tools and resources. 16 | The first few chapters were on the benefits of reproducibility, testing and reproducible computational environments. 17 | Since the start of this project in 2019, many additional chapters have been written, edited, reviewed, read and promoted by over 100 contributors. 18 | 19 | We welcome your contributions to improve these chapters and to add other important concepts in reproducibility and how to empower researchers to work reproducibly from the start. 20 | Check out our [contributing guidelines](https://github.com/alan-turing-institute/the-turing-way/blob/master/CONTRIBUTING.md) to get involved.
21 | -------------------------------------------------------------------------------- /content/welcome.md: -------------------------------------------------------------------------------- 1 | # Welcome 2 | 3 | _The Turing Way_ is an open source community-driven guide to reproducible, ethical, inclusive and collaborative data science. 4 | 5 | Our goal is to provide all the information that data scientists in academia, industry, government and in the third sector need at the start of their projects to ensure that they are easy to reproduce and reuse at the end. 6 | 7 | The book started as a guide for reproducibility, covering version control, testing, and continuous integration. 8 | But technical skills are just one aspect of making data science research "open for all". 9 | 10 | In February 2020, _The Turing Way_ expanded to a series of books covering reproducible research, project design, communication, collaboration, and ethical research. 11 | 12 | | ![The Turing Way project is illustrated as a road or path with shops for different data science skills. People can go in and out with their shopping cart and pick and choose what they need](./figures/welcome.jpg) | 13 | | ---------------| 14 | | _The Turing Way_ project illustration by Scriberia. Zenodo. [http://doi.org/10.5281/zenodo.3332807](http://doi.org/10.5281/zenodo.3332807) | 15 | 16 | ## Our community 17 | 18 | _The Turing Way_ community is dedicated to making collaborative, reusable and transparent research "too easy not to do". 19 | That means investing in the socio-technical skills required to work in a team, to build something greater than any individual person could deliver alone. 20 | 21 | _The Turing Way_ is: 22 | 23 | * a book 24 | * a community 25 | * a global collaboration 26 | 27 | We hope you find the content in the book helpful. 28 | Everything here is available for free under a [CC-BY licence](https://github.com/alan-turing-institute/the-turing-way/blob/master/LICENSE.md).
29 | Please use and re-use whatever you need for any purpose. 30 | 31 | The book is collaboratively written and open from the start. 32 | To make this project truly accessible and useful for everyone, we invite you to contribute your skills and bring your perspectives into this project. 33 | To join this community, please read our [contribution guidelines](https://github.com/alan-turing-institute/the-turing-way/blob/master/CONTRIBUTING.md) and ways to [get in touch](https://github.com/alan-turing-institute/the-turing-way#get-in-touch). 34 | More information about the community and the project is available in the Community Handbook. 35 | We look forward to expanding and building _The Turing Way_ together. 36 | 37 | Although _The Turing Way_ receives support and funding from [The Alan Turing Institute](https://www.turing.ac.uk/), the project is designed to be a global collaboration. 38 | We have contributions from across the UK, and from India, Mexico, Australia, USA, and many European countries. 39 | Chapters have been written, reviewed and curated by members of research institutes and universities, government departments, and industry. 40 | We are committed to creating a space where people with diverse expertise and lived experiences can share their knowledge with others to allow us all to use data science to improve the world. 41 | 42 | We value the participation of every member of our community and want to ensure that every contributor has an enjoyable and fulfilling experience. 43 | Accordingly, everyone who participates in _The Turing Way_ project is expected to show respect and courtesy to other community members at all times. 44 | All contributions must abide by our [code of conduct](https://github.com/alan-turing-institute/the-turing-way/blob/master/CODE_OF_CONDUCT.md). 
45 | 46 | ![Gif showing screen capture of contributors table, smiling faces and emojis representing the types of contributions in a table](https://media.giphy.com/media/gKIUisnjpj2PS75nOJ/giphy.gif) 47 | 48 | (cite-tag)= 49 | ## Citing _The Turing Way_ 50 | 51 | All material in _The Turing Way_ is available under a [CC-BY 4.0 licence](https://github.com/alan-turing-institute/the-turing-way/blob/master/LICENSE.md). 52 | 53 | You can cite _The Turing Way_ through the project's Zenodo archive using doi: [10.5281/zenodo.3233853](https://doi.org/10.5281/zenodo.3233853). 54 | 55 | The citation will look something like: 56 | 57 | > The Turing Way Community, Becky Arnold, Louise Bowler, Sarah Gibson, Patricia Herterich, Rosie Higman, … Kirstie Whitaker. (2019, March 25). The Turing Way: A Handbook for Reproducible Data Science (Version v0.0.4). Zenodo. http://doi.org/10.5281/zenodo.3233986 58 | 59 | Please visit the [DOI link](https://doi.org/10.5281/zenodo.3233853) though to get the most recent version - the one above is not automatically generated and therefore may be out of date. 60 | DOIs allow us to archive the repository and they are really valuable to ensure that the work is tracked in academic publications. 61 | 62 | You can also share the human-readable URL to a page in the book, for example: [https://the-turing-way.netlify.app/reproducible-research/overview/overview-definitions.html](https://the-turing-way.netlify.app/reproducible-research/overview/overview-definitions.html), but be aware that the project is under development and therefore these links may change over time. 63 | You might want to include a [web archive link](http://web.archive.org) such as: [https://web.archive.org/web/20191030093753/https://the-turing-way.netlify.com/reproducibility/03/definitions.html](https://web.archive.org/web/20191030093753/https://the-turing-way.netlify.com/reproducibility/03/definitions.html) to make sure that you don't end up with broken links everywhere! 
64 | 65 | We really appreciate any references that you make to _The Turing Way_ project in your work and we hope it is useful. 66 | If you have any questions, please [get in touch](https://github.com/alan-turing-institute/the-turing-way#get-in-touch). 67 | -------------------------------------------------------------------------------- /data/README_data.md: -------------------------------------------------------------------------------- 1 | # Data 2 | 3 | - This tutorial is designed to guide learners to work on their local computers. 4 | - Data source used for this tutorial: https://github.com/martinagvilas/jupytercon_tutorial. 5 | - Through the tutorial we guide learners to download the [Zip folder](https://github.com/martinagvilas/jupytercon_tutorial/archive/master.zip) or clone the entire repository with `git clone https://github.com/martinagvilas/jupytercon_tutorial.git`. 6 | - We use example files that are present in the [./content] folder. 7 | -------------------------------------------------------------------------------- /images/MalvikaSharan.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/images/MalvikaSharan.jpg -------------------------------------------------------------------------------- /images/MartinaVilas.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/images/MartinaVilas.jpg -------------------------------------------------------------------------------- /images/README_images.md: -------------------------------------------------------------------------------- 1 | # Images 2 | 3 | Any images embedded in markdown cells within the Jupyter notebooks should go in this folder. 4 | 5 | List any image sources and authors in this README.
6 | 7 | **Recommend** that any images used in the materials be shared under a [CC-BY](https://creativecommons.org/licenses/by/2.0/) license. 8 | -------------------------------------------------------------------------------- /images/SarahGibson.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/images/SarahGibson.jpg -------------------------------------------------------------------------------- /images/cell_options.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/images/cell_options.png -------------------------------------------------------------------------------- /images/create_assignment.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/images/create_assignment.png -------------------------------------------------------------------------------- /images/github-button-turing.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/images/github-button-turing.png -------------------------------------------------------------------------------- /images/graded_exercise_sample.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/images/graded_exercise_sample.png -------------------------------------------------------------------------------- /images/jupyterbook-jc2020.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/images/jupyterbook-jc2020.png -------------------------------------------------------------------------------- /images/reproducible-jc2020.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/images/reproducible-jc2020.jpg -------------------------------------------------------------------------------- /images/theturingway-intro-jc2020.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/images/theturingway-intro-jc2020.png -------------------------------------------------------------------------------- /images/theturingway-navigation-jc2020.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/images/theturingway-navigation-jc2020.png -------------------------------------------------------------------------------- /notebooks/README.md: -------------------------------------------------------------------------------- 1 | # Overview of Modules 2 | 3 | Each module has a corresponding Jupyter Notebook. In this tutorial you are 4 | meant to execute the Jupyter Notebooks in order. 
5 | 6 | The following is an overview of the contents in each module: 7 | 8 | - [Module 1](./1-welcome.ipynb) 9 | - Introduction to the workshop and initial setup 10 | - [Module 2](./2-introduction.ipynb) 11 | - Introduction to The Turing Way and reproducible research 12 | - Introduction to Jupyter Book 13 | - Demo of The Turing Way repository and its Jupyter Book 14 | - [Module 3](./3-setup-jupyterbook.ipynb) 15 | - Creating the minimal version of a Jupyter Book 16 | - [Module 4](./4-config-jupyterbook.ipynb) 17 | - Configuring the layout and behavior of a Jupyter Book 18 | - [Module 5](./5-more-jupyterbook.ipynb) 19 | - Adding Jupyter Notebooks to a Jupyter Book 20 | - Making your Jupyter Book interactive 21 | - Using MyST in a Jupyter Book 22 | - [Module 6](./6-ci-jupyterbook.ipynb) 23 | - Continuous Integration (CI) and its role in reproducibility 24 | - Deploying Jupyter Book using GitHub Actions 25 | - Hosting a Jupyter Book online using GitHub Pages 26 | - [Module 7](./7-final-demo.ipynb) 27 | - Using Sphinx features in a Jupyter Book 28 | - Wrap up -------------------------------------------------------------------------------- /postBuild: -------------------------------------------------------------------------------- 1 | jupyter labextension install @jupyterlab/server-proxy --minimize=False -------------------------------------------------------------------------------- /presentation/README.md: -------------------------------------------------------------------------------- 1 | # Slides for the presentations used in this tutorial 2 | 3 | Each Jupyter Notebook comes with an introductory video, which is linked at the top of the Notebook. 4 | All the slides for those presentations are shared in this repository in the [presentation directory](../presentation) under the CC-BY 4.0 License.
5 | -------------------------------------------------------------------------------- /presentation/module_1_presentation.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/presentation/module_1_presentation.pdf -------------------------------------------------------------------------------- /presentation/module_2.1_presentation.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/presentation/module_2.1_presentation.pdf -------------------------------------------------------------------------------- /presentation/module_2.2_presentation.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/presentation/module_2.2_presentation.pdf -------------------------------------------------------------------------------- /presentation/module_3_presentation.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/presentation/module_3_presentation.pdf -------------------------------------------------------------------------------- /presentation/module_4_presentation.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/presentation/module_4_presentation.pdf -------------------------------------------------------------------------------- /presentation/module_5_presentation.pdf: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/presentation/module_5_presentation.pdf -------------------------------------------------------------------------------- /presentation/module_6_presentation.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/presentation/module_6_presentation.pdf -------------------------------------------------------------------------------- /presentation/module_7_presentation.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupytercon/2020-jupyterbook-with-turing-way/7d7baf2b8df420299e6ea0af054693680857a503/presentation/module_7_presentation.pdf -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | matplotlib 2 | numpy 3 | pandas 4 | jupyter-book==0.8.0 5 | jupytext 6 | ruamel.yaml 7 | git+https://github.com/choldgraf/jupyter-http-server -------------------------------------------------------------------------------- /slides/README_slides.md: -------------------------------------------------------------------------------- 1 | --- 2 | jupyter: 3 | jupytext: 4 | text_representation: 5 | extension: .md 6 | format_name: myst 7 | format_version: '1.1' 8 | jupytext_version: 1.1.0 9 | kernelspec: 10 | display_name: Python 3 11 | language: python 12 | name: python3 13 | --- 14 | 15 | +++ {"slideshow": {"slide_type": "slide"}} 16 | 17 | # Tutorial slides 18 | 19 | - Slides are optional (e.g., you may not use them if your presentation is via live coding). 
20 | - If the pre-recorded presentations will use slides, we request that you deposit the slides in this folder. 21 | 22 | +++ {"slideshow": {"slide_type": "slide"}} 23 | 24 | ## Use text-based source 25 | 26 | - We ask that you use text-based formats for your slides, e.g., markdown. 27 | - This markdown file is an example source for slides using `nbconvert` and Reveal. See the GitHub action '.github/workflows/slides.yml' in this repo to see how this markdown file is converted to an HTML slide show and published on GitHub Pages - https://jupytercon.github.io/tutorial2020/ 28 | 29 | +++ {"slideshow": {"slide_type": "subslide"}} 30 | 31 | ## An example sub-slide 32 | 33 | - Another option: you can write your slide content using markdown and use an app for slide design, like [Deckset](https://www.deckset.com) or similar. 34 | 35 | +++ {"slideshow": {"slide_type": "slide"}} 36 | 37 | ## Naming convention and file list 38 | 39 | - Use a **naming convention** where each file name starts with a number, reflecting the order of use in the presentation of the tutorial. 40 | - List your slide files in a markdown file, with a brief description. 41 | 42 | 43 | +++ {"slideshow": {"slide_type": "slide"}} 44 | 45 | ## License 46 | 47 | **Recommend** that slides be shared under a [CC-BY](https://creativecommons.org/licenses/by/4.0/) license. 48 | --------------------------------------------------------------------------------