Difference between revisions of "Data management: Recommendations"

From Q-CoFa
Line 1: Line 1:
 
It is very important that data are searchable, findable, reusable, attributable, readable, accurate, and complete. Take one of your published figures from 5 years ago. Are you able to find the raw data? How long will it take you? According to our [https://elifesciences.org/articles/62212 survey], 3 quarters of CFs cannot do this…
 
  
Use an [[Management: Recommendations#ELN|electronic lab book (ELN) or LIMS]] or write a [[:File:Data_management_guideline.docx|'''data management guideline''']] for your users about how data will be managed at the CF. [[:File:Responsibility distribution.docx|'''Define who does what''']] and how, regarding data naming, saving, transferring, long-term storage and analysis.  
+
Below are some recommendations how this can be improved. Use an [[Management: Recommendations#ELN|electronic lab book (ELN) or LIMS]] or write a [[:File:Data_management_guideline.docx|'''data management guideline''']] for your users about how data will be managed at the CF. [[:File:Responsibility distribution.docx|'''Define who does what''']] and how, regarding data naming, saving, transferring, long-term storage and analysis.  
  
 
* <span style="color:#0a42c5">Data naming:</span>
 
Line 14: Line 14:
 
::Define '''who''' will store the raw data long-term and where (see [[Responsibility: Recommendations|responsibility]]). CF could store raw data additionally for the users on a server/repository, as a backup and to prevent fraud.
 
 
* <span style="color:#0a42c5">Data analysis: </span>
 
::Analysed data must be given a different name from raw data (by adding a suffix). Ensure raw data are not modified or overridden. Analysed data should be linked to raw data, as well as published figures.
+
::Analysed data must be given a different name from raw data (by adding a suffix). Ensure raw data are not modified or overwritten. Analysed data should be linked to raw data, as well as published figures.
 
* <span style="color:#0a42c5">Data traceability</span>
 
::Data processing (from data acquisition, repeated datasets, data analysis up to published figures) needs to be properly documented to ensure traceability. Always keep the same unique identifier throughout and modify the only the end of the name. Ensure data are structured and complete.
+
::Data processing (from data acquisition, repeated datasets, data analysis up to published figures) needs to be properly documented to ensure traceability. Always use the same unique identifier throughout the experiment and modify only the end of the name. Ensure data are structured and complete.
  
  
  
A '''data management plan (DMP)''' can also be implemented for each experience. It is a written document that describes the data you expect to acquire or generate during the course of a research project, how you will manage, describe, analyse, and store those data, and what mechanisms you will use at the end of your project to share and preserve your data.  
+
A '''data management plan (DMP)''' can also be implemented for each experiment. It is a written document that describes the data you expect to acquire or generate during the course of a research project, how you will manage, describe, analyse, and store those data, and what mechanisms you will use at the end of your project to share and preserve your data.  
  
 
https://library.stanford.edu/research/data-management-services/data-management-plans
 
  
 
https://en.wikipedia.org/wiki/Data_management_plan
 

Revision as of 07:22, 16 March 2021

It is very important that data are searchable, findable, reusable, attributable, readable, accurate, and complete. Take one of your published figures from 5 years ago. Are you able to find the raw data? How long will it take you? According to our survey, three quarters of CFs cannot do this…

Below are some recommendations on how this can be improved. Use an electronic lab book (ELN) or LIMS, or write a data management guideline for your users about how data will be managed at the CF. Define who does what and how, regarding data naming, saving, transferring, long-term storage and analysis.

  • Data naming:
A unique and logical name is the first step to traceable data. Decide on a uniform and consistent convention to impose on your users, while still allowing some personalisation:
e.g. 201130-imk-experimentX, i.e. [date]-[user acronym]-[free text from the user]
  • Data saving:
Raw data files should be unmodifiable and should include metadata about the acquisition settings and the experimental conditions.
  • Data transferring:
Organize easy, practical data transfer to the user (e.g. over the network) to prevent data loss.
  • Data storage:
Define who will store the raw data long-term and where (see responsibility). The CF could additionally store raw data for the users on a server/repository, as a backup and to prevent fraud.
  • Data analysis:
Analysed data must be given a different name from raw data (by adding a suffix). Ensure raw data are not modified or overwritten. Analysed data should be linked to raw data, as well as published figures.
  • Data traceability:
Data processing (from data acquisition, repeated datasets, data analysis up to published figures) needs to be properly documented to ensure traceability. Always use the same unique identifier throughout the experiment and modify only the end of the name. Ensure data are structured and complete.
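
The naming, saving and analysis recommendations above could be enforced with a small script. What follows is only a minimal sketch in Python, not a tool used by any CF. The regular expression, the length of the user acronym, the restriction of the free text to letters and digits, the "_" separator for the suffix, and all function names are assumptions made for illustration.

```python
import re
import stat
from datetime import date
from pathlib import Path

# Assumed reading of the convention [date]-[user acronym]-[free text]:
# six-digit date (yymmdd), 2-4 lowercase letters, alphanumeric free text.
RAW_NAME = re.compile(r"^\d{6}-[a-z]{2,4}-[A-Za-z0-9]+$")


def make_raw_name(day: date, user: str, free_text: str) -> str:
    """Build a raw-data identifier such as 201130-imk-experimentX."""
    name = f"{day:%y%m%d}-{user.lower()}-{free_text}"
    if not RAW_NAME.match(name):
        raise ValueError(f"name does not follow the convention: {name!r}")
    return name


def make_analysed_name(raw_name: str, suffix: str) -> str:
    """Keep the unique identifier and change only the end of the name."""
    if not RAW_NAME.match(raw_name):
        raise ValueError(f"not a valid raw-data name: {raw_name!r}")
    if not suffix.isalnum():
        raise ValueError(f"suffix must be alphanumeric: {suffix!r}")
    # The "_" separator is an assumption; it keeps analysed names
    # distinguishable from raw names, which never contain "_".
    return f"{raw_name}_{suffix}"


def protect_raw_file(path: Path) -> None:
    """Remove write permission so the raw file is not overwritten by accident.

    This only clears the permission bits; it is not a substitute for a backup
    or for write-protected storage.
    """
    mode = path.stat().st_mode
    path.chmod(mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))


if __name__ == "__main__":
    raw = make_raw_name(date(2020, 11, 30), "imk", "experimentX")
    print(raw)
    print(make_analysed_name(raw, "analysed"))
```

Because the analysed name starts with the full raw name, the raw dataset behind any analysed file or figure can be found by stripping the suffix.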


A data management plan (DMP) can also be implemented for each experiment. It is a written document that describes the data you expect to acquire or generate during the course of a research project, how you will manage, describe, analyse, and store those data, and what mechanisms you will use at the end of your project to share and preserve your data.

https://library.stanford.edu/research/data-management-services/data-management-plans

https://en.wikipedia.org/wiki/Data_management_plan