diff --git a/Research-simulation.md b/Research-simulation.md
index 8535bad..a721cac 100644
--- a/Research-simulation.md
+++ b/Research-simulation.md
@@ -1,38 +1,45 @@
-# Principles of code and data management
-
-This page lists **mandatory** rules to be followed when working with code and
-data.
-
-## Working with code
-The objectives of these rules is to ensure:
-
-1. Code developped in the lab is preserved.
-2. Code can be easily shared in the lab and outside.
-3. Code can be easily reused by other people *(this includes future you!)*.
-4. Simulations are reproducible.
-
-### Tenets
-1. Every simulation and post-processing code must be versionned in a
-   [Git](https://git-scm.com/book/en/v2) (or similar) repository on
-   [git.dalembert.umpc.fr](https://git.dalembert.upmc.fr) (or similar).
-   [Commit](https://git-scm.com/book/en/v2/Git-Basics-Recording-Changes-to-the-Repository)
-   messages must be meaningful.
-2. Every repository must have a `README` file explaining what is the purpose of
-   the code, what are the dependencies, and how to run the code, what the code
-   produces, how to run tests.
-3. Every library repository must contain tests. Running the tests should be done
-   with a single command.
-4. Every library must have a documented API, i.e. each function of the API has a
-   basic description of what it does, along a description of inputs and outputs.
-5. Every library must have usage examples.
-6. Code versions used in a publication must be saved on
-   [SoftwareHeritage](https://archive.softwareheritage.org/save/) and the
-   resulting SWHID cited in the publication.
-
-## Working with data
-The objective of these rules is to enure:
-
-1. Data produced in the lab is preserved.
-2. Data can be easily shared in the lab and outside.
-3. Data can be easily resued by other people *(this includes future you!)*.
-4. Data origin can be traced.
+# Principles of code and data management
+
+This page lists **mandatory** rules to be followed when working with code and
+data.
+
+## Working with code
+The objective of these rules is to ensure:
+
+1. Code developed in the lab is preserved.
+2. Code can be easily shared in the lab and outside.
+3. Code can be easily reused by other people *(this includes future you!)*.
+4. Simulations are reproducible.
+
+### Tenets
+1. Every simulation and post-processing code must be versioned in a
+   [Git](https://git-scm.com/book/en/v2) (or similar) repository on
+   [git.dalembert.upmc.fr](https://git.dalembert.upmc.fr) (or similar).
+   [Commit](https://git-scm.com/book/en/v2/Git-Basics-Recording-Changes-to-the-Repository)
+   messages must be meaningful.
+2. Every repository must have a `README` file explaining the purpose of the
+   code, its dependencies, how to run it, what it produces, and how to run the
+   tests.
+3. Every library repository must contain tests. Running the tests should be done
+   with a single command.
+4. Every library must have a documented API, i.e. each function of the API has a
+   basic description of what it does, along with a description of inputs and outputs.
+5. Every library must have usage examples.
+6. Code versions used in a publication must be saved on
+   [SoftwareHeritage](https://archive.softwareheritage.org/save/) and the
+   resulting SWHID cited in the publication.
+
+## Working with data
+The objective of these rules is to ensure:
+
+1. Data produced in the lab is preserved.
+2. Data can be easily shared in the lab and outside.
+3. Data can be easily reused by other people *(this includes future you!)*.
+4. Data origin can be traced.
+
+### Tenets
+1. Your `$HOME` must have scheduled daily backups to an external drive or remote server. Periodically check that the backups work and can be restored.
+2. Simulation data for a workflow / pipeline / paper is grouped in a dataset. Datasets must be documented with a `README` file explaining what the data is, how it was generated, and how it can be used.
+3. Datasets must, as far as possible, be published to Zenodo at the time of submission, and the dataset DOI must be cited in the article.
+4. Open-source file formats must be used to store data and metadata.
+5. All datasets must be uploaded to ... when leaving the lab.
\ No newline at end of file
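
For illustration only, code tenets 3 to 5 (tests runnable with a single command, a documented API, usage examples) could look like the following in a Python library. This is a minimal sketch, not a lab-mandated layout: the module name `kinetic.py` and the function `kinetic_energy` are hypothetical and do not come from the page above.

```python
"""Hypothetical module `kinetic.py` (illustrative name, not from the lab wiki)."""


def kinetic_energy(mass: float, velocity: float) -> float:
    """Return the kinetic energy of a point mass.

    Parameters
    ----------
    mass : float
        Mass in kilograms. Must be non-negative.
    velocity : float
        Speed in metres per second.

    Returns
    -------
    float
        Kinetic energy in joules.

    Raises
    ------
    ValueError
        If `mass` is negative.

    Examples
    --------
    The example below doubles as a test when run through `doctest`.

    >>> kinetic_energy(2.0, 3.0)
    9.0
    """
    if mass < 0:
        raise ValueError("mass must be non-negative")
    return 0.5 * mass * velocity**2
```

Assuming the file is saved as `kinetic.py`, the single command `python -m doctest kinetic.py` runs the docstring example as a test and prints nothing when it passes. The docstring covers the API description (tenet 4) and the usage example (tenet 5) in one place; a larger library would likely add a dedicated test suite alongside it.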