Data management: Recommendations
It is very important that data are searchable, findable, reusable, attributable, readable, accurate, and complete. Take one of your published figures from five years ago. Are you able to find the raw data? How long will it take you? According to our survey, three quarters of CFs cannot do this…
Below are some recommendations on how this can be improved.
Use an electronic lab notebook (ELN) or a laboratory information management system (LIMS), or write a data management guideline for your users describing how data will be managed at the CF. Define who does what and how, regarding data naming, saving, transfer, long-term storage and analysis.
- Data naming:
- A unique and logical name is the first step to traceable data. Decide on a uniform and consistent convention that you impose on your users, while still allowing some personalisation:
- e.g. 201130-imk-experimentX, i.e. [date]-[user acronym]-[free text from the user]
- Data saving:
- Raw data files should be unmodifiable and should include metadata about the acquisition settings and the experimental conditions.
- Data transferring:
- Organise easy and practical data transfer to the user (e.g. over the network) to prevent data loss.
- Data storage:
- Define who will store the raw data long-term and where (see responsibility). The CF could additionally store raw data for the users on a server or repository, both as a backup and to prevent fraud.
- Data analysis:
- Analysed data must be given a different name from raw data (by adding a suffix). Ensure raw data are not modified or overwritten. Analysed data should be linked to raw data, as well as published figures.
- Data traceability:
- Data processing (from data acquisition, repeated datasets, data analysis up to published figures) needs to be properly documented to ensure traceability. Always use the same unique identifier throughout the experiment and modify only the end of the name. Ensure data are structured and complete.
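The naming and traceability recommendations above can be partly enforced in software. The following is a minimal sketch, not a prescribed implementation. It assumes that the date in the example is written as YYMMDD, that the user acronym is 2 to 5 lowercase letters, that the free text is alphanumeric, and that analysed data are marked with an underscore followed by a suffix. The function names and the pattern are hypothetical and should be adapted to the convention chosen at your CF.

```python
import re
from datetime import date

# Assumed convention: [date]-[user acronym]-[free text], e.g. 201130-imk-experimentX.
# The YYMMDD date format and the allowed characters are assumptions for this sketch.
NAME_PATTERN = re.compile(
    r"^(?P<date>\d{6})-(?P<user>[a-z]{2,5})-(?P<label>[A-Za-z0-9]+)$"
)


def make_raw_name(acquired: date, user: str, label: str) -> str:
    """Build a raw-data identifier from the date, user acronym and free text."""
    name = f"{acquired:%y%m%d}-{user.lower()}-{label}"
    if not NAME_PATTERN.match(name):
        raise ValueError(f"name does not follow the convention: {name!r}")
    return name


def make_analysed_name(raw_name: str, suffix: str) -> str:
    """Derive an analysed-data name by changing only the end of the raw name."""
    if not NAME_PATTERN.match(raw_name):
        raise ValueError(f"not a valid raw-data name: {raw_name!r}")
    if not re.fullmatch(r"[A-Za-z0-9]+", suffix):
        raise ValueError(f"suffix must be alphanumeric: {suffix!r}")
    return f"{raw_name}_{suffix}"


def raw_name_of(analysed_name: str) -> str:
    """Recover the raw-data identifier that an analysed-data name derives from."""
    raw_name, separator, _suffix = analysed_name.partition("_")
    if not separator or not NAME_PATTERN.match(raw_name):
        raise ValueError(f"cannot trace back to raw data: {analysed_name!r}")
    return raw_name


raw = make_raw_name(date(2020, 11, 30), "imk", "experimentX")
analysed = make_analysed_name(raw, "filtered")
print(raw)                    # 201130-imk-experimentX
print(analysed)               # 201130-imk-experimentX_filtered
print(raw_name_of(analysed))  # 201130-imk-experimentX
```

Because the suffix is only ever appended, the unique identifier stays intact and any analysed file can be traced back to its raw dataset by name alone.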
A data management plan (DMP) can also be implemented for each experiment. It is a written document that describes the data you expect to acquire or generate during the course of a research project, how you will manage, describe, analyse, and store those data, and what mechanisms you will use at the end of your project to share and preserve your data.
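As an illustration, the elements of a DMP listed above can be kept as a simple structured record so that missing sections are easy to spot. This is only a sketch: the class name, the field names and the example values are hypothetical, chosen to mirror the description above, and do not represent a standard DMP template.

```python
from dataclasses import dataclass, fields


@dataclass
class DataManagementPlan:
    """One free-text entry per element of the DMP described above (hypothetical layout)."""
    expected_data: str = ""             # data you expect to acquire or generate
    management: str = ""                # how the data will be managed
    description: str = ""               # how the data will be described (metadata)
    analysis: str = ""                  # how the data will be analysed
    storage: str = ""                   # how and where the data will be stored
    sharing_and_preservation: str = ""  # how data are shared and preserved at project end


def missing_sections(plan: DataManagementPlan) -> list[str]:
    """Return the names of the sections that are still empty."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name).strip()]


# Illustrative values only.
plan = DataManagementPlan(expected_data="image stacks", storage="CF server")
print(missing_sections(plan))
# ['management', 'description', 'analysis', 'sharing_and_preservation']
```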