

Introduction

Let’s talk about crafting a Data Management Plan (DMP). Think of it as your roadmap for handling all the data in your research project. It’s not just a bunch of formalities; it’s a practical guide that helps you sort out how you’ll collect, store, and share your data. Plus, it’s super handy for making sure your data is managed responsibly and effectively, right from the get-go.

NOTE:
ChatGPT: A Data Management Plan (DMP) is a formal document outlining how data will be handled both during and after a research project. It covers aspects like data collection, storage, sharing, and preservation. The importance of a DMP in a research project lies in ensuring that data is managed efficiently, ethically, and in compliance with legal and funding requirements. It helps to maximize data's usefulness, facilitates collaboration, and enhances the transparency and reproducibility of research.
PRO TIP:
Whether you're dealing with heaps of numbers, images, stacks of surveys, or any other kind of data, a good DMP keeps everything organized and makes your research life a whole lot easier. Besides, a well-structured data management plan that adheres to all TRUST, FAIR and CARE principles ensures responsible and effective data handling.

Let’s dive into how you can set up a DMP that really works for you and your research!

Data Management Plan (DMP)

Creating a data management plan (DMP) that complies with the TRUST, FAIR and CARE principles involves a comprehensive approach to handling, storing, and sharing research data. Below is a framework for a DMP that aligns with these principles, incorporating guidance from the Digital Curation Centre (DCC).

DMP Compliant with TRUST, FAIR and CARE

When putting together your Data Management Plan (DMP), give each step in the checklist a good look and answer it as it applies to your work. Covering all these bases makes sure your research data is not just well managed but also lines up with the TRUST, FAIR and CARE principles.

PRO TIP:
To create a thorough and robust Data Management Plan (DMP) that aligns with FAIR, CARE, and TRUST principles and adheres to best practices in data management, carefully address the included list of questions for each section of the DMP. Your detailed responses will ensure a holistic and effective approach to managing your research data throughout its lifecycle.

Nailing down what data you’ll collect and how you describe it makes sure everyone can understand and use your research data effectively.

PART 1: Data Collection and Documentation

1. Data Description
Clearly describe the data to be collected or generated.

2. Metadata Standards
Use standardized metadata to ensure that data is FAIR (Findable, Accessible, Interoperable, Reusable) and can be easily located and understood by others.
extended subject matter considerations
Data Description * What type of data will be collected or generated?
* What specific characteristics of the data should be documented (e.g., format, volume, quality)?
Data Collection * How will the data be collected or produced, and what standards will be used?
* Are there specific tools or software planned for data collection?
Metadata Standards * What metadata standards will be used to describe the data?
* How will metadata capture the context, quality, and condition of the data?
* Are there any discipline-specific metadata standards that need to be followed?
Data Formats * What file formats will be used for the data, and why have these formats been chosen?
* How do these formats support long-term accessibility and usability of the data?
Data Quality * What steps will be taken to ensure the accuracy and quality of the data during collection?
* How will data quality be monitored and maintained throughout the project?
Data Provenance * How will the origin and changes in the data be tracked and recorded?
* What documentation will be created to detail the data creation and processing workflow?
Data Documentation * What strategies will be used to document the data throughout the project?
* How will the documentation be maintained and updated?
Data Privacy * How will the data collection process address confidentiality and privacy concerns?
* Are there any legal or ethical considerations with the data being collected?
Collaborative Data * If data is being collected collaboratively, how will roles and responsibilities be defined and communicated?
* What mechanisms will be in place for data integration and consistency across different collectors?
Challenges and Mitigation * What potential challenges might arise during data collection and how can they be mitigated?
* Are there contingency plans for unforeseen issues in data collection?

Keeping your data use ethical and legal means you’re respecting people’s rights and staying on the right side of the law.

PART 2: Ethical and Legal Compliance

1. Data Sharing and Privacy
Address how the data will be shared while respecting privacy and ethical considerations, in line with TRUST principles.

2. Ethical Considerations
Address how the data will be shared while respecting ethical considerations and cultural sensitivities. This includes adhering to the CARE principles for data related to Indigenous or marginalized communities, ensuring informed consent, community engagement, and benefit sharing.

3. Intellectual Property Rights and Cultural Respect
Clarify intellectual property rights (IPR) and ensure respect for the cultural values and norms of communities involved in the research. Be mindful of the ethical implications of data use and sharing, especially in contexts involving sensitive or indigenous data.
extended subject matter considerations
Data Privacy * Are there any ethical considerations or privacy issues related to the data?
* How will the privacy of sensitive data be protected when shared?
Ethical Concerns * What ethical approvals are required for the data collection and use?
* How will the project address ethical concerns related to the data, especially if it involves human subjects?
Property Rights * How will intellectual property rights and data sharing agreements be managed?
* Are there any restrictions on the use or dissemination of the data based on intellectual property laws?
CARE Compliance * How will the project ensure respect for cultural sensitivities, especially when handling Indigenous or community data, in alignment with CARE principles?
* Are there specific community permissions or consultations required?
Data Use Agreements * What terms will be included in data use agreements to ensure ethical and legal compliance?
* How will these agreements be monitored and enforced?
Legal Standards * What legal standards and regulations (like GDPR) are applicable to the data, and how will compliance be ensured?
* Are there any cross-border data transfer issues to consider?
Informed Consent * How will informed consent be obtained and documented for data derived from individuals?
* Are there clear procedures for participants to understand how their data will be used?
Data Anonymization * What methods will be used to anonymize or de-identify sensitive data?
* How will these methods ensure the continued usefulness of the data?
Sensitive Data * How will sensitive data, such as health or personal information, be specifically managed to ensure compliance and privacy?
* Are additional security measures required for such data?
Ownership * Who will own the data, and how will contributors be credited?
* Are there clear guidelines for acknowledging data sources and contributions?

Picking the right place to store your data and keeping it safe means you won’t lose your hard work and can sleep easy knowing it’s secure.

PART 3: Data Storage and Security

1. Storage Solutions
Choose secure and reliable storage solutions, ensuring data integrity and preservation (TRUST).

2. Backup Strategies
Implement regular backups to prevent data loss.
extended subject matter considerations
Storage Solutions * Where will the data be stored during and after the research?
* What criteria are used to select data storage locations and platforms?
* How will the security of these storage solutions be ensured?
Data Integration * How will the data be integrated with existing data sets?
* What standards and protocols will be used for data integration to ensure compatibility and interoperability?
Data Formats and Compression * In which formats will the data be saved and stored? Will it be compressed?
* How do these formats and compression methods affect data quality and accessibility?
Backup Strategies * What backup procedures will be in place to prevent data loss?
* How frequently will backups be created, and where will backup data be stored?
Disaster Recovery * What disaster recovery plans are in place in case of data loss or corruption?
* How will these plans be tested and updated over time?
Post-Project Accessibility * How will data accessibility be maintained after the project ends?
* Are there plans for data migration to ensure long-term access?
Security Protocols * What security measures and protocols will be implemented to protect data from unauthorized access or breaches?
* How will data security be monitored and updated?
Data Encryption * Will the data be encrypted during storage and transfer?
* What encryption standards and tools will be used?
Version Control * How will different versions of the data be managed and stored?
* What system will be used for tracking changes and updates to the data?
Access Control and Authentication * What methods will be used to control access to the data?
* How will user authentication and authorization be managed?
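
The backup questions above are easier to answer when backups are checked regularly rather than assumed to work. The sketch below shows one possible check: it builds a checksum manifest for a primary data directory and for its backup copy, then reports files that are missing or differ. The function names (`file_sha256`, `build_manifest`, `compare_manifests`) and the directory layout are illustrative assumptions, not part of any particular backup tool.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file path (relative to root) to its SHA-256 checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): file_sha256(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_manifests(primary, backup):
    """Report files that are missing from, or differ in, the backup."""
    missing = sorted(name for name in primary if name not in backup)
    changed = sorted(
        name for name in primary
        if name in backup and primary[name] != backup[name]
    )
    return {"missing": missing, "changed": changed}
```

Running `compare_manifests(build_manifest(primary_dir), build_manifest(backup_dir))` on a schedule gives a simple record of whether the backup still matches the primary copy. An empty report means every primary file has an identical counterpart in the backup.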

Determining who can access your data and how ensures it’s used responsibly and effectively by those who need it, while maintaining privacy and control.

PART 4: Data Access and Sharing

1. Access Policies
Define how and under what conditions your data will be shared, considering both FAIR and TRUST principles.

2. Formats and Standards
Ensure that data formats are widely accessible and interoperable.
extended subject matter considerations
Data Sharing Protocols * How and with whom will the data be shared?
* What protocols will be used to securely share data with collaborators or the public?
Access Control * How will access to sensitive or confidential data be controlled and monitored?
* Will there be different levels of access for various user groups?
Data Transfer Methods * What secure methods will be used for transferring data to ensure integrity and confidentiality?
Data Usage Tracking * How will the usage of shared data be tracked or monitored?
Data Sharing Agreements * Are there any restrictions on data sharing?
* Are there any data sharing agreements or licenses that need to be established to protect sensitive data?
Data Anonymization * For datasets containing sensitive information, what anonymization or de-identification procedures will be implemented before sharing?
Compliance with Regulations * How will data sharing practices comply with relevant data protection laws and regulations?
Third-Party Data Sharing * If using third-party services for data sharing, how will their compliance with security standards be ensured?
Incident Response Plan * Is there an incident response plan in case of unauthorized access or data breaches?
End-User Agreement * Will end-users be required to agree to specific terms and conditions before accessing the data?

Planning for your data’s long-term future ensures it can be a valuable resource for years to come.

PART 5: Data Archiving and Preservation

1. Long-term Preservation
Outline plans for long-term preservation of data, ensuring sustainability and ongoing accessibility (TRUST).

2. Repository Selection
Choose a digital repository that adheres to TRUST and FAIR principles, like those with CoreTrustSeal certification.
extended subject matter considerations
Long-term Storage * What long-term preservation plans are in place for the data?
* How will these plans ensure the data remains accessible and usable over time?
* What formats and standards will be used for long-term preservation?
Repository Selection * Where will the data be archived for long-term access?
* What criteria are used to select an appropriate repository or archiving solution?
* Does the chosen repository align with FAIR and TRUST principles?
Preservation Strategies * What specific strategies will be employed to preserve different types of data (e.g., raw data, processed data, metadata)?
* How will these strategies address potential technological obsolescence?
Legal and Ethical * Are there any legal or ethical considerations to take into account in the data archiving process?
* How will the archive respect intellectual property rights and privacy concerns?
Access and Retrieval * How will data be accessed and retrieved from the archive?
* What mechanisms will be put in place to ensure ease of access for authorized users?
Redundancy and Fail-Safes * What redundancy measures will be implemented to safeguard against data loss?
* Are there fail-safe mechanisms in the event of a system failure or other disruptions?
Archive Health * How will the health and integrity of the archive be monitored over time?
* What processes will be in place for regular checks and maintenance?
Metadata for Archiving * What metadata will be included to describe and facilitate the discovery of archived data?
* How will metadata be maintained and updated in the archive?
Budgeting * What are the estimated costs for data archiving and preservation?
* How will these costs be covered, and how does this impact the overall project budget?
End-of-Project * What procedures will be followed to transition data to the archive at the end of the project?
* Who will be responsible for managing this transition?

Knowing who’s doing what in managing your data keeps things running smoothly and avoids any “I thought you were doing it” moments.

PART 6: Roles and Responsibilities

1. Data Stewardship
Assign roles for data management and stewardship within the project team.

2. Training and Support
Identify any training or support needed to follow these data management practices.
extended subject matter considerations
Data Management * Who is responsible for data management in the project?
* How are the roles and responsibilities for data management defined and distributed among the team?
* What mechanisms are in place for coordinating data management activities?
Team Training * Are any specific training or skills required for data management?
* How will the project provide or facilitate access to necessary training?
* What support structures are in place for ongoing data management needs?
Communication * How will communication regarding data management be maintained within the team?
* Are there collaboration tools or platforms that will be used to facilitate data management activities?
Collaboration * If external partners or collaborators are involved, what are their roles in data management?
* How will their contributions and responsibilities be coordinated and integrated?
Docs of Roles * How will the roles and responsibilities for data management be documented and made accessible to all team members?
* What is the process for updating this documentation as roles evolve?
Accountability * What accountability measures are in place to ensure that data management responsibilities are being met?
* How will the effectiveness of data management roles be evaluated?
Data Security * Who is responsible for data security, including access controls and data breach responses?
* What training is required for those handling sensitive or confidential data?
Succession * What is the plan for transferring data management responsibilities if key personnel leave the project?
* How will continuity of data management be ensured?
Resource Allocation * Who is responsible for allocating resources (time, budget, tools) for data management activities?
* How will these decisions be made and communicated?
Users Engagement * Who will be responsible for engaging with data users, addressing their queries, and collecting feedback?
* How will this feedback be used to improve data management?

Making sure you’ve got the cash and tools for managing your data means you won’t hit a roadblock just when things are getting good.

PART 7: Budget and Resources

1. Funding for Data Management
Allocate sufficient budget for data management activities, including data storage and preservation.

2. Resource Allocation
Ensure adequate resources are available for executing the data management plan.
extended subject matter considerations
Funding * Is there a budget allocated for data management activities?
* How will data management costs be estimated and tracked throughout the project?
* Are there dedicated funds for unforeseen data management expenses?
Resource Allocation * What resources are required for effective data management?
* How will resources like software, hardware, and human expertise be allocated?
* What is the process for adjusting resource allocation as the project evolves?
Costs of Storage * What are the projected costs for data storage and long-term preservation?
* How will these costs be incorporated into the overall project budget?
Costs of Sharing * What financial considerations are associated with data sharing and accessibility (e.g., repository fees, access licenses)?
* How will these be managed within the project’s budget?
Training Costs * What is the budget for training and capacity building in data management?
* How will these funds be distributed and utilized?
Costs of Security * What are the costs associated with implementing and maintaining data security measures?
* How will these be budgeted for and monitored?
Costs of Quality * How much budget is allocated for quality assurance of data?
* What processes or tools will this budget support?
Costs of Metadata * What resources will be dedicated to the creation and maintenance of metadata?
* How will this be factored into the overall project budget?
Budget Revisions * What procedures are in place for revising the data management budget?
* How and to whom will budget usage and adjustments be reported?
External Grants * Are there opportunities for external funding or grants specifically for data management activities?
* How will these opportunities be identified and pursued?

Regular check-ins and being ready to change things up keep your data management on point and up-to-date with the latest practices.

PART 8: Review and Adaptation

1. Periodic Review
Regularly review and update the data management plan to adapt to any changes in data management standards or project direction.

2. Adaptability
Be prepared to adapt the plan to new technologies and standards to maintain FAIR and TRUST compliance.
extended subject matter considerations
Periodic Review * How will the implementation of the DMP be monitored and reported?
* What are the intervals for reviewing the DMP, and who will be involved in the review process?
* What metrics or indicators will be used to assess the effectiveness of data management practices?
Adaptability * How will the DMP be adapted in response to changes in project scope, data needs, or technological advancements?
* What is the process for updating the DMP, and who is responsible for ensuring these updates occur?
Feedback * What mechanisms are in place for collecting feedback on data management practices?
* How will this feedback be incorporated into DMP revisions?
Compliance * How will compliance with data management standards (FAIR, CARE, TRUST) be continuously monitored and assessed?
* What steps will be taken if compliance issues are identified?
Tech Updates * How will new technologies or tools be evaluated and potentially integrated into the data management plan?
* Who is responsible for keeping abreast of technological advancements?
Risk Management * What risk management strategies are included in the DMP for potential data management issues?
* How will these risks be monitored and mitigated?
Stakeholders * How will stakeholders (e.g., project team, funders, data users) be involved in the review and adaptation process of the DMP?
* What is the communication plan for DMP updates?
Documentation * How will changes to the DMP be documented and communicated to relevant parties?
* Where will the history of DMP changes be stored and accessed?
Training needs * What training or development needs might arise from changes to the DMP?
* How will these needs be addressed and funded?
Data Lifecycle * How will changes in the data lifecycle (collection, storage, sharing, archiving) be reflected in DMP updates?
* What process is in place to ensure these lifecycle changes are appropriately captured?

Final documentation provides a comprehensive wrap-up of how data was managed and can serve as a reference for future projects. It aligns with the TRUST principle of transparency and responsibility.

PART 9: Post-Project Documentation (optional yet recommended)

1. Final Reporting
Create a comprehensive final report detailing how the data was managed throughout the project, including any challenges and solutions.

2. Archiving Documentation
Maintain a well-organized database or system for your final notes after project closing, where attributes like data location, metadata, related pipelines, and access procedures can be easily searched and retrieved. This ensures efficient navigation and use of information from past projects.
extended subject matter considerations
Final Reporting * What information will be included in the final report regarding data management?
* How will this report detail the challenges encountered and the solutions implemented?
* Who will be the audience for this report, and how will it be distributed?
Archiving Docs * How will the database or system for final notes be organized and maintained?
* What specific attributes (like data location, metadata, related pipelines, access procedures) will be documented?
* What strategies will be used to ensure this documentation is easily searchable and retrievable for future use?
Data Lifecycle * How will the entire lifecycle of the data be documented in the final report?
* What insights or lessons learned about data management will be included?
Docs Lifecycle * What measures will be taken to ensure the preservation and accessibility of this documentation over time?
* Who will be responsible for maintaining this documentation post-project?
Metadata Summary * How will a comprehensive summary of the metadata be compiled and presented in the final documentation?
* What role will this summary play in future data use and understanding?
Project Summary * What key information about the project and data management will be communicated to stakeholders in the final documentation?
* How will this summary facilitate stakeholder understanding and satisfaction?
Knowledge Transfer * How will knowledge and experiences from the project be documented for transfer to future projects or team members?
* What format and media will be used for this knowledge transfer?
Accessibility * What report will be created to detail the final state of data accessibility and sharing?
* How will this report aid in future data discoverability and use?
Archiving Strategy * What strategies will be employed to archive the project documentation effectively?
* How will these strategies ensure long-term access and usefulness?
Closing Review * What process will be in place to conduct a final review of data management practices?
* How will this review contribute to the final documentation and knowledge base?

Regular checks ensure the ongoing integrity and usability of the data. This practice aligns with the sustainability aspect of TRUST and the reusability aspect of FAIR, as it helps to maintain the data’s quality and relevance over time.

PART 10: Post-Project Data Monitoring (optional yet recommended)

1. Sustainability Assessments
Periodically conduct post-project checks to confirm the continued accessibility and integrity of the data. This includes verifying the ongoing viability of storage mediums and the accessibility of hosting platforms, ensuring the data remains intact and usable for future purposes.

extended subject matter considerations
Sustainability * What procedures will be in place for periodic checks of data accessibility and the integrity of storage mediums and hosting platforms post-project?
* How frequently will these assessments be conducted, and by whom?
Long-Term Usage * How will the usability of data be assessed over time to ensure it remains relevant and valuable?
* What criteria will be used to evaluate data usability?
Tech Obsolescence * What processes are in place to monitor and address technological obsolescence that may impact data accessibility?
* How will updates or migrations be managed if needed?
Data Format * How will data formats and standards be reviewed over time to ensure they remain current and accessible?
* What is the plan for updating or converting data formats as needed?
Repository Health * What checks will be performed to ensure continued repository health and compliance with TRUST and FAIR principles?
* Who will be responsible for conducting these checks?
Post-Project Security * What measures will be taken to continually ensure data security after the project’s completion?
* How will data security practices be adapted in response to emerging threats or vulnerabilities?
User Feedback * How will user feedback be collected and used to assess the ongoing relevance of the data?
* What mechanisms will be in place for users to report issues or suggest improvements?
System Updates * What is the plan for updating archival systems to maintain functionality and security?
* How will these updates be implemented without disrupting data accessibility?
Regulatory Compliance * How will ongoing regulatory compliance be ensured for the stored data?
* What is the process for staying updated with changing regulations?
Post-Project Data Lifecycle * What ongoing considerations will be given to the full data lifecycle, including potential future repurposing or deletion of the data?
* Who will make decisions regarding these lifecycle stages?

DMPs tailored to career stage and project scale

The career stage influences the complexity, scope, and resources available for data management. Senior researchers often have access to more resources and are involved in larger, more complex projects, while early career researchers and PhD students might work with more limited resources and focus on specific aspects of data management pertinent to their research stage.

*[Insights based on an analysis of real-life DMP examples available from: Wellcome Trust (n.d.), Guidance for Researchers: Developing a Data Management and Sharing Plan.]

Early Career
* Plans might focus more on individual projects, with a greater emphasis on specific data types relevant to their research.
* Might rely more on existing platforms and standard methods for data sharing and preservation.
Mid-Career
* Plans may include a mix of qualitative and quantitative data, with a focus on ethical considerations and participant consent.
* There’s an emphasis on making data accessible to a broader research community, including non-specialists.
Senior Researchers
* Tend to have more comprehensive plans, detailing complex data types like neuroimaging or genomic data.
* Plans often include developing new methods for data sharing or using established, sophisticated platforms.
PRO TIP:
Regardless of your career stage or the scale of your project, a Data Management Plan is valuable in its own right and can improve the chances that your research succeeds.

A well-crafted DMP ensures that data are rigorously collected, well-preserved, and include meaningful descriptors or metadata. This approach not only helps researchers achieve solid, meaningful results but also enhances the utility and reproducibility of their data, contributing significantly to the advancement of scientific knowledge.

NOTE:
Proper data management is essential across different fields of science and project sizes, and it plays a key role in collaborative research, open science, and ensuring the long-term value of data.
[further reading: Everyone needs a data-management plan, Nature 555, 286 (2018)]

Hands-on Case Study:
Data Management Plan for research project

We’ll walk through a realistic example of a bioinformatics research project to demonstrate the practical benefits of a robust Data Management Plan. The example highlights how the plan’s components guide a project’s successful execution and help navigate challenges along the way.

1. Project Overview

PROJECT TITLE: Genomic Analysis for Breast Cancer Treatment Personalization
FIELD: Bioinformatics / Personalized Medicine

BACKGROUND:
The project involves analyzing genomic data to identify genetic markers associated with breast cancer, particularly aiming to understand the variability in response to chemotherapy. The goal is to contribute to the field of personalized medicine, where treatments are tailored to the individual’s genetic makeup. The bioinformatics team will use large datasets from public genomic databases [X, Y, Z] and collaborate with medical institutions [X, Y, Z] for clinical data.

2. Data Management Plan

Following the initial conceptualization of your research project, the subsequent critical step is to develop a Data Management Plan (DMP). This plan serves as a roadmap for managing your project’s data efficiently. It details each stage of the data lifecycle, from collection to archiving, ensuring optimal data handling, compliance, and integrity. A well-structured DMP is key to navigating your project towards successful outcomes and maximizing the value of your research data.

PRO TIP:
In a Data Management Plan (DMP), answers should be detailed enough to provide clear guidance on how data will be handled, while remaining concise to ensure readability and practicality. The level of detail should adequately address specific project needs and comply with any relevant funder or institutional requirements.

This comprehensive DMP provides a structured approach to managing the genomic data crucial for advancing personalized medicine in breast cancer treatment, ensuring the data’s integrity, security, and long-term value.

PART 1: Data Collection and Documentation

1. Data Description
Clearly describe the data to be collected or generated.

The project will generate genomic data, specifically Whole Genome Sequencing (WGS) and Whole Exome Sequencing (WES) data, to identify breast cancer treatment markers. The data characteristics documented will include sequencing depth, coverage, and variant call quality metrics.

2. Metadata Standards
Use standardized metadata to ensure that data is FAIR (Findable, Accessible, Interoperable, Reusable) and can be easily located and understood by others.

Genomic metadata will follow the MIGS (Minimum Information about a Genome Sequence) checklist, with gene names given according to HUGO nomenclature. Clinical metadata will be handled in line with HIPAA requirements and will record anonymized patient identifiers and treatment details.
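
To make the idea of standardized metadata more concrete, here is a minimal sketch of a machine-readable metadata record with a completeness check. The class name `SampleMetadata` and all field names are hypothetical and chosen for illustration only. They are not the fields defined by any metadata standard, so a real project would map them to the checklist it has adopted.

```python
import json
from dataclasses import dataclass, asdict, field


# Hypothetical field names for illustration; a real project would map these
# to the fields required by its chosen metadata standard.
@dataclass
class SampleMetadata:
    sample_id: str            # anonymized identifier, never a patient name
    assay: str                # e.g. "WGS" or "WES"
    sequencing_depth: float   # mean depth, e.g. 30.0 for 30x coverage
    file_format: str          # e.g. "FASTQ" or "VCF"
    source: str               # where the data came from
    keywords: list = field(default_factory=list)


def missing_fields(record):
    """Return the names of empty fields so gaps are caught early."""
    return [
        name for name, value in asdict(record).items()
        if value in ("", None)
    ]


def to_json(record):
    """Serialize the record so it can be stored next to the data file."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)
```

Storing one such JSON file beside each dataset keeps the description and the data together, and `missing_fields` can be run before deposit to catch incomplete records.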

3. Data Collection Method
  - How will the data be collected or produced, and what standards will be used?
  - Are there specific tools or software planned for data collection?


Data will be collected using high-throughput sequencing techniques. NGS data will be sourced from public databases like gnomAD and TCGA. Clinical data from institutions like Johns Hopkins Hospital will be collected through collaborations, following standardized clinical data collection protocols.

4. Data Formats
  - What file formats will be used for the data, and why have these formats been chosen?
  - How do these formats support long-term accessibility and usability of the data?


Genomic data will be stored in standard formats like FASTQ for raw data and VCF for variant data. These formats are widely accepted in the genomics field and support long-term accessibility and interoperability.

5. Data Quality
  - What steps will be taken to ensure the accuracy and quality of the data during collection?
  - How will data quality be monitored and maintained throughout the project?


Quality control measures will include sequencing error checks, read quality assessments, and variant validation. Quality monitoring will be ongoing, using bioinformatics tools like FastQC and GATK.
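To make the idea of a read quality assessment concrete, the sketch below computes the mean Phred score of each read in a FASTQ stream. It is not a replacement for FastQC or GATK. It assumes simple four-line FASTQ records with Phred+33 quality encoding, and the threshold of 30 is an arbitrary value chosen for the example.

```python
import io


def mean_phred_per_read(handle, offset=33):
    """Yield (read_id, mean Phred score) for each four-line FASTQ record.

    Assumes Phred+33 encoding and no line wrapping inside a record.
    """
    while True:
        header = handle.readline().strip()
        if not header:
            break  # end of input
        sequence = handle.readline().strip()
        handle.readline()  # the '+' separator line is ignored
        quality = handle.readline().strip()
        if (not header.startswith("@") or not sequence
                or len(sequence) != len(quality)):
            raise ValueError(f"Malformed FASTQ record: {header!r}")
        scores = [ord(ch) - offset for ch in quality]
        yield header[1:], sum(scores) / len(scores)


# Two made-up reads: 'I' encodes Phred 40 and '!' encodes Phred 0.
fastq_text = "@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\n!!II\n"
results = list(mean_phred_per_read(io.StringIO(fastq_text)))
low_quality = [read_id for read_id, q in results if q < 30]
print(results)      # [('read1', 40.0), ('read2', 20.0)]
print(low_quality)  # ['read2']
```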

6. Data Provenance & Documentation
  - How will the origin and changes in the data be tracked and recorded?
  - What documentation will be created to detail the data creation and processing workflow?


  • Data provenance will be tracked using electronic lab notebooks (JupyterLab) and data processing logs (stored in GitHub repos). This will include details on data acquisition, processing steps, and analysis workflows.
  • Comprehensive documentation will be maintained for all datasets, including data processing steps and analysis methods. Documentation will be regularly updated in a project-specific GitHub repository.
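One possible shape for the data processing logs mentioned above is an append-only file with one JSON entry per processing step, each recording a checksum of its inputs. The sketch below is an assumption about how such a log could look, not the project's actual tooling; the function names, the step name `read_qc`, and the parameter `min_quality` are all hypothetical.

```python
import hashlib
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def log_step(log_path: Path, step: str, inputs: list[Path], params: dict) -> dict:
    """Append one provenance entry (as a JSON line) and return it."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "step": step,
        "inputs": {str(p): sha256_of(p) for p in inputs},
        "params": params,
    }
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry, sort_keys=True) + "\n")
    return entry


# Demo with a throwaway file standing in for real sequencing data.
with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "sample.fastq"
    raw.write_text("@read1\nACGT\n+\nIIII\n")
    log = Path(tmp) / "provenance.jsonl"
    entry = log_step(log, "read_qc", [raw], {"min_quality": 30})
    print(entry["step"], len(log.read_text().splitlines()))
```

Because each entry is a single line of JSON, the log diffs cleanly under version control and can be committed alongside the analysis code.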

    7. Challenges and Mitigation
      - What potential challenges might arise during data collection and how can they be mitigated?
      - Are there contingency plans for unforeseen issues in data collection?


    Potential challenges include data heterogeneity and integration issues. Mitigation strategies include using standardized data formats and protocols, and contingency plans will be in place for data integration challenges.

    PART 2: Ethical and Legal Compliance

    1. Data Sharing and Privacy
    Address how the data will be shared while respecting privacy and ethical considerations, in line with TRUST principles.

    Patient data will be anonymized to protect privacy. The project will comply strictly with GDPR and other relevant data protection regulations.

    2. Ethical Considerations
    Address how the data will be shared while respecting ethical considerations and cultural sensitivities. This includes adhering to the CARE principles.

  • The project will adhere to ethical guidelines for human subject research, ensuring informed consent and transparent data use communication.
  • Ethical approvals will be obtained from relevant institutional review boards:
      - .....................................
      - .....................................

    3. Intellectual Property Rights and Cultural Respect
    Clarify intellectual property rights (IPR) and ensure respect for the cultural values and norms of communities involved in the research.

    Data from public databases will be used in accordance with their terms of use. Original research findings will be subject to intellectual property rights held by the research institution.

    4. CARE Compliance
      - How will the project ensure respect for cultural sensitivities, especially when handling Indigenous or community data, in alignment with CARE principles?
      - Are there specific community permissions or consultations required?


    IMPORTANT! The data used in the project may include Indigenous community data. The project will respect cultural sensitivities and adhere to CARE principles. Community engagement and permissions will be sought where necessary.

    5. Legal Standards/Data Use Agreements
      - What legal standards and regulations are applicable to the data, and how will compliance be ensured? Are there any cross-border data transfer issues to consider?
      - What terms will be included in data use agreements to ensure ethical and legal compliance?


    The project will comply with legal standards like GDPR, particularly in handling personal health information. Cross-border data transfer will be managed in accordance with international data protection laws. Data use agreements will include terms that align with ethical and legal standards, ensuring responsible data sharing. These agreements will be regularly reviewed and monitored for compliance.

    6. Informed Consent & Data Anonymization
      - How will informed consent be obtained and documented for data derived from individuals?
      - Are there clear procedures for participants to understand how their data will be used?
      - What methods will be used to anonymize or de-identify sensitive data?
      - How will these methods ensure the continued usefulness of the data?


  • Informed consent will be obtained for all data derived from individuals. The consent form will include easy-to-understand language detailing the data use. Additionally, an informational website and pamphlets will be provided, offering more in-depth explanations about the project, data security measures, and contact information for queries or concerns. Participants will also have access to a dedicated helpline for any questions related to their participation.
  • Sensitive data will be de-identified in a two-step process that preserves the utility of the data while maintaining privacy. First, direct identifiers like names and social security numbers will be removed. Next, indirect identifiers will be altered using techniques like data masking and data perturbation. Unique pseudonymous participant IDs will replace personal identifiers so that datasets can be linked without revealing individual identities.
  • The anonymization methods are designed to preserve key data attributes and relationships, ensuring data utility for analysis. By maintaining the integrity of the data structure and statistical properties, researchers can draw valid conclusions from the anonymized datasets. Additionally, the use of unique participant IDs allows for data linkage without compromising privacy, enabling comprehensive analysis across multiple datasets.
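The two-step idea above (drop direct identifiers, then link records through a unique participant ID) can be sketched in Python using a keyed hash from the standard library. This is a minimal sketch under stated assumptions: the column names `name` and `ssn`, the `PID-` prefix, and the demo key are hypothetical, and the records are made up. In practice the key would be stored separately from the data under strict access control. Note also that pseudonymized data can still count as personal data under GDPR, so this step alone does not amount to full anonymization.

```python
import hashlib
import hmac

# Hypothetical names of columns holding direct identifiers.
DIRECT_IDENTIFIERS = {"name", "ssn"}


def pseudonym(identifier: str, secret_key: bytes) -> str:
    """Derive a stable participant ID from an identifier with HMAC-SHA256."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return "PID-" + digest.hexdigest()[:16]


def deidentify(record: dict, secret_key: bytes, id_field: str = "ssn") -> dict:
    """Remove direct identifiers and attach a pseudonymous participant ID."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["participant_id"] = pseudonym(str(record[id_field]), secret_key)
    return cleaned


# Made-up records for the same person in two different datasets.
key = b"demo-key-not-for-production"
clinical = {"name": "Jane Doe", "ssn": "000-00-0000", "treatment": "therapy A"}
genomic = {"name": "Jane Doe", "ssn": "000-00-0000", "gene": "BRCA1"}

clinical_clean = deidentify(clinical, key)
genomic_clean = deidentify(genomic, key)

# The same person receives the same ID in both datasets, so they stay linkable.
print(clinical_clean["participant_id"] == genomic_clean["participant_id"])  # True
print(sorted(clinical_clean))  # ['participant_id', 'treatment']
```

A keyed hash is used here instead of a plain hash so that someone who knows a participant's identifier cannot recompute the ID without the key.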

    7. Sensitive Data Management
      - How will sensitive data, such as health or personal information, be specifically managed to ensure compliance and privacy?
      - Are additional security measures required for such data?


    Sensitive data, including health and personal information, will be managed with rigorous security measures to ensure full compliance with privacy regulations and to maintain the confidentiality of participant data. Specific management strategies include:
  • Encryption: All sensitive data will be encrypted both in transit and at rest. Data in transit will be protected with TLS (the successor to the now-deprecated SSL protocols). Data at rest on secure servers will be encrypted with AES-256.
  • Access Controls: Access to sensitive data will be strictly controlled through a role-based access control system (RBAC). This means only authorized personnel, who require the data for specific research-related tasks, will have access, and their level of access will be based on their role in the project.
  • Data Masking: For any case where sensitive data needs to be used in a less secure environment (e.g., for certain types of analysis or testing), data masking techniques will be used to obscure the true data while preserving its usability for the required purpose.
  • Secure Data Transfer Protocols: When transferring data between entities, such as between the bioinformatics team and medical institutions, secure and encrypted data transfer protocols will be used. This includes using VPNs and SFTP for secure file transfers.
  • Regular Security Audits: The security measures in place will be regularly audited and updated to ensure they meet the latest standards and effectively protect the data against new threats. These audits will be conducted by an independent cybersecurity team.
  • Training and Awareness: All team members will receive regular training on data privacy and security protocols. This training will cover best practices for handling sensitive data, recognizing potential security threats, and understanding compliance requirements.

  • Data Breach Response Plan
    A comprehensive data breach response plan will be in place, outlining the steps to be taken in the event of a security breach.
    1. Isolate the affected system to prevent further data leakage. --> IT contact: john.X@it.miracle.com
    2. Initiate a secure backup protocol to safeguard unaffected data.
    3. Conduct a forensic analysis to determine the breach's source, scope, and impact.
    4. Identify the specific data and systems compromised.
    5. Report the breach to relevant regulatory authorities as required by law.
    6. Notify affected individuals, providing details about the nature of the breach and the data involved.
    7. Implement security patches or updates to address the vulnerability exploited in the breach.
    8. Enhance monitoring of network and data access.
    9. Conduct a thorough review of existing security protocols.
    10. Document the breach incident, response actions, and outcomes.
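The role-based access control item in the list above can be illustrated with a small lookup table that maps roles to permissions. The role names and permission names here are hypothetical examples, not the project's actual access policy, and a production system would rely on the institution's identity and access management tooling.

```python
from enum import Enum


class Permission(Enum):
    READ_DEIDENTIFIED = "read_deidentified"
    READ_IDENTIFIED = "read_identified"
    WRITE = "write"


# Hypothetical roles, for illustration only.
ROLE_PERMISSIONS = {
    "analyst": {Permission.READ_DEIDENTIFIED},
    "clinician": {Permission.READ_DEIDENTIFIED, Permission.READ_IDENTIFIED},
    "data_manager": set(Permission),
}


def is_allowed(role: str, permission: Permission) -> bool:
    """Return True only if the role is known and holds the permission."""
    return permission in ROLE_PERMISSIONS.get(role, set())


print(is_allowed("analyst", Permission.READ_DEIDENTIFIED))  # True
print(is_allowed("analyst", Permission.READ_IDENTIFIED))    # False
print(is_allowed("visitor", Permission.WRITE))              # False
```

Unknown roles fall through to an empty permission set, so access is denied by default.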

    PART 3: Data Storage and Security

    1. Storage Solutions
    Choose secure and reliable storage solutions, ensuring data integrity and preservation (TRUST).

    Data will be stored on secure, institution-approved servers with encryption protocols.

    2. Backup Strategies
    Implement regular backups to prevent data loss.

    Regular backups will be maintained on separate secure servers. A RAID system will be implemented for redundancy.

    PART 4: Data Access and Sharing

    1. Access Policies
    Define how and under what conditions your data will be shared, considering both FAIR and TRUST principles.

    Access to sensitive data will be restricted and controlled through secure login credentials.

    2. Formats and Standards
    Ensure that data formats are widely accessible and interoperable.

    ...

    3. Data Sharing Protocols
    ...

    Research findings will be shared through academic publications and data repositories, adhering to open-access policies.

    PART 5: Data Archiving and Preservation

    1. Long-term Preservation
    Outline plans for long-term preservation of data, ensuring sustainability and ongoing accessibility (TRUST).

    Final datasets will be archived in institutional repositories with a 10-year minimum retention policy.

    2. Repository Selection
    Choose a digital repository that adheres to TRUST and FAIR principles, like those with CoreTrustSeal certification.

    Data will be archived in repositories that adhere to FAIR and TRUST principles, such as Zenodo.

    PART 6: Roles and Responsibilities

    1. Data Stewardship
    Assign roles for data management and stewardship within the project team.

    A designated data manager will oversee data handling, storage, and sharing.

    2. Training and Support
    Identify any training or support needed to follow these data management practices.

    Team members will receive training in data management best practices and software tools used in the project.

    PART 7: Budget and Resources

    1. Funding for Data Management
    Allocate sufficient budget for data management activities, including data storage and preservation.

    The project budget includes specific allocations for data storage, backup solutions, and data management personnel.

    2. Resource Allocation
    Ensure adequate resources are available for executing the data management plan.

    Resources will be allocated for data security measures, software licenses, and server maintenance.

    PART 8: Review and Adaptation

    1. Periodic Review
    Regularly review and update the data management plan to adapt to any changes in data management standards or project direction.

    The DMP will be reviewed twice a year to ensure it remains current and effective.

    2. Adaptability
    Be prepared to adapt the plan to new technologies and standards to maintain FAIR and TRUST compliance.

    The DMP will be adapted in response to technological advancements and changes in data regulations.

    PART 9: Post-Project Documentation (optional yet recommended)

    1. Final Reporting
    Create a comprehensive final report detailing how the data was managed throughout the project, including any challenges and solutions.

    A comprehensive report detailing the data management processes and challenges encountered will be produced at the project's end.

    2. Archiving Documentation
    Maintain a well-organized database or system for your final notes after project closing, where attributes like data location, metadata, related pipelines, and access procedures can be easily searched and retrieved. This ensures efficient navigation and use of information from past projects.

    A database with final notes on data locations, metadata, and access procedures will be maintained for efficient future reference.

    PART 10: Post-Project Data Monitoring (optional yet recommended)

    1. Sustainability Assessments
    Periodically conduct post-project checks to confirm the continued accessibility and integrity of the data. This includes verifying the ongoing viability of storage mediums and the accessibility of hosting platforms, ensuring the data remains intact and usable for future purposes.

    Annual checks will be conducted to ensure the continued accessibility and integrity of the archived data.
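One way such an annual check could work is a fixity check: record a checksum manifest when the data is archived, then compare the archive against it at each review. The sketch below is a minimal illustration using only the Python standard library. The function names and file names are hypothetical, and a real archive may already provide its own fixity tooling.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def check_fixity(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the archive's current state against a stored manifest."""
    current = build_manifest(root)
    return {
        "missing": sorted(set(manifest) - set(current)),
        "changed": sorted(
            name for name in manifest
            if name in current and current[name] != manifest[name]
        ),
        "unexpected": sorted(set(current) - set(manifest)),
    }


# Demo on a throwaway directory standing in for the archive.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "variants.vcf").write_text("##fileformat=VCFv4.2\n")
    (root / "README.txt").write_text("archived dataset\n")
    manifest = build_manifest(root)  # recorded at archiving time

    (root / "README.txt").write_text("edited after archiving\n")
    (root / "variants.vcf").unlink()
    report = check_fixity(root, manifest)
    print(report)
    # {'missing': ['variants.vcf'], 'changed': ['README.txt'], 'unexpected': []}
```

An empty report (all three lists empty) indicates that every archived file is still present and unmodified.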



