Reproducible Research

Steps to Reproducible Research

Overview

Teaching: 30 min
Exercises: 10 min
Questions
  • What is reproducibility?

  • Why should I keep reproducibility in mind for my research?

  • How can I make my research reproducible?

Objectives
  • understand what it means for research to be reproducible

  • understand the steps to make research reproducible

What is Reproducibility?

In a single sentence, reproducibility is the ability to exactly re-create an earlier piece of research or analysis given the same data. This means that if I hand over a journal article that describes my research method, together with the input data, another person can arrive at exactly the same findings.
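
As a minimal sketch of what "same data, same findings" can look like in code, the example below fingerprints the input data and runs a small resampling analysis with a fixed random seed. The function names, the seed, and the data values are all made up for illustration; they are not part of any particular library or of this lesson's materials.

```python
import hashlib
import random
import statistics


def fingerprint(values):
    """Return a SHA-256 hex digest of the data, so two people can check
    that they are really starting from the same input."""
    payload = ",".join(repr(v) for v in values).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def bootstrap_mean(values, n_resamples=200, seed=42):
    """Estimate the mean by bootstrap resampling with a fixed seed."""
    rng = random.Random(seed)  # local generator, no hidden global state
    means = []
    for _ in range(n_resamples):
        sample = [rng.choice(values) for _ in values]
        means.append(statistics.fmean(sample))
    return statistics.fmean(means)


data = [2.1, 3.4, 1.9, 5.6, 4.2]  # hypothetical input data

first = bootstrap_mean(data)
second = bootstrap_mean(data)

# Same data and same seed: the two runs agree exactly.
print(fingerprint(data)[:12], first == second)
```

Without the fixed seed, each run would draw different resamples and the two results would generally differ slightly, which is exactly the kind of hidden variation that makes an analysis hard to reproduce.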

Reproducibility Spectrum

The image above shows the reproducibility spectrum. It ranges from research that is only described in a publication all the way to full replication: research with a publication, executable code, and linked data, like a frozen machine in the cloud that can be executed to re-run your whole analysis. We want neither extreme, but rather somewhere in the middle, depending on your field, your data, and your research.

Reproducibility Discussion

What measures do you take to ensure your analyses are:

https://etherpad.wikimedia.org/p/ghw2018-reproducible-discussion

PS: shamelessly copied from the awesome slides available at: https://github.com/oceanhackweek/ohw2018_tutorials/blob/master/day5/reproducible_research_and_tools/.

Simple steps to reproducible research

In this tutorial I’m going to cover how to test, document, and publish your code.

In research, experiments/results are not trusted unless:

So why would scientific software be any different?

Clear code is paramount!

Non-clean code

Good practice when creating the scientific environment is also important!

Confusing science env creation
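
One small habit that can help here is recording which interpreter and package versions an analysis actually ran with. The sketch below uses only the Python standard library; the function name `snapshot_environment` is a hypothetical helper, not part of any tool mentioned in this lesson.

```python
import json
import platform
import sys
from importlib import metadata


def snapshot_environment():
    """Collect the interpreter, platform, and installed package versions."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with missing or broken metadata
            packages[name] = dist.version
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": dict(sorted(packages.items())),
    }


if __name__ == "__main__":
    # Print the snapshot as JSON so it can be saved next to the results.
    print(json.dumps(snapshot_environment(), indent=2))
```

Saving this output alongside your results gives a later reader a starting point for rebuilding a comparable environment. It does not replace a proper environment file.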

Let me introduce you

The code test-document-publish cookie cutter!

(Yep! Another cookie cutter for Scientific Python packages!)

Standard proliferation

Example: https://nsls-ii.github.io/scientific-python-cookiecutter

What we will need

Hack session

  1. create a Python package
  2. choose a license (https://choosealicense.com/)
  3. write doctests
  4. bug? fix test / re-run
  5. set up Travis-CI / CircleCI
  6. set up AppVeyor
  7. upload the source distribution and docs
  8. create a DOI
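
To give a feel for the "write doctest" and "fix test / re-run" steps above, here is a minimal sketch of a function whose docstring doubles as its test. The function `celsius_to_kelvin` is a made-up example, not code from the cookie cutter.

```python
import doctest


def celsius_to_kelvin(temp_c):
    """Convert a temperature from degrees Celsius to kelvin.

    The examples below are run as tests by the doctest module.

    >>> celsius_to_kelvin(0)
    273.15
    >>> celsius_to_kelvin(-273.15)
    0.0
    """
    return temp_c + 273.15


if __name__ == "__main__":
    # Run every docstring example in this file and report the outcome.
    print(doctest.testmod())
```

If this is saved as a file, running it directly or with `python -m doctest yourfile.py -v` checks the examples. When an example fails, doctest prints the expected and actual output, so you can fix the code (or the example) and re-run.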

Key Points