Writing DMS Plans for FaceBase

On this page, we provide guidance for writing your Data Management and Sharing (DMS) Plan when sharing your data through FaceBase.

Almost every section of a project’s DMS plan will require information specific to that project, which can be provided only by the data contributor.

Some sections will additionally require some information about the repository; this guide includes some FaceBase-specific information and text snippets that can be useful in completing those sections.

This guide is based on the NIH optional DMS plan format document. We refer to section and sub-section titles from that document in bold, our comments appear in plain text, and our suggestions for text to include in your plan appear in italics.

On this page, we include only the sections relevant to FaceBase (for example, under “Element 1: Data Type”, we do not include sub-section “B. Scientific data that will be preserved and shared, and the rationale for doing so:”).

Slides

The guidance below is summarized from these detailed slides presented by Laura Pearlman at the 2023 FaceBase Community Forum, titled “How FaceBase Can Help with Data Management and Sharing Plans”.

Element 1: Data Type

A. Types and amount of scientific data expected to be generated in the project:

The Key Concepts for FaceBase Data Contributors guide provides an overview of the types and formats of data we accept. A more exhaustive list of supported data types can be found here. If you plan to submit a data type that isn’t listed, please contact the FaceBase Hub to discuss it.

C. Metadata, other relevant data, and associated documentation:

The basic structure of FaceBase data is described in the Key Concepts for FaceBase Data Contributors guide. FaceBase also has minimum metadata requirements for each type of data. The protocol and metadata requirements for each type of data are described in the Quality Control Rules section of the Key Concepts document. An example of a possible starting point for your metadata description is provided below:

The dataset will include information about the types of experiments performed, the species, developmental stages, and anatomy of biosamples collected, and protocols for each experiment.

FaceBase also requires further metadata specific to each type of experiment. For RNA-seq and scRNA-seq assays:

The metadata for each experiment must include strandedness and read number.

For ChIP-seq assays:

The metadata for each experiment must include strandedness, read number, target of assay, and (for non-control assays) a link to the record for the associated control assay.
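Before drafting this part of your plan, it can help to check your own experiment records against the requirements above. The following is a minimal sketch, not FaceBase software: the field names (strandedness, read_number, target_of_assay, control_assay_link) and the function missing_metadata are hypothetical and chosen only for illustration. FaceBase's actual schema may name these fields differently.

```python
# Required metadata per assay type, transcribed from the guidance above.
# NOTE: the key names are hypothetical, not FaceBase's actual column names.
REQUIRED_FIELDS = {
    "RNA-seq": {"strandedness", "read_number"},
    "scRNA-seq": {"strandedness", "read_number"},
    "ChIP-seq": {"strandedness", "read_number", "target_of_assay"},
}


def missing_metadata(assay_type, metadata, is_control=False):
    """Return a sorted list of required fields that are absent or empty."""
    required = set(REQUIRED_FIELDS.get(assay_type, set()))
    # Non-control ChIP-seq assays must also link to their control assay record.
    if assay_type == "ChIP-seq" and not is_control:
        required.add("control_assay_link")
    return sorted(field for field in required if not metadata.get(field))


if __name__ == "__main__":
    record = {
        "strandedness": "unstranded",
        "read_number": "single",
        "target_of_assay": "H3K27ac",
    }
    print(missing_metadata("ChIP-seq", record))
```

Assay types not listed in the table return an empty list, so the check only flags requirements stated in this guide.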

Additional standard metadata elements that we collect, and which may or may not be relevant to your project, include:

At the experiment level:

Molecule Type
Strandedness
RNA-seq Selection
Chromatin Modifier
Transcription Factor
Histone Modification

At the biosample level:

Specimen
Gene
Genotype
Strain
Mutation
Stage
Anatomy
Origin
Phenotype
Treatment
Sex
Litter
Collection Date

Please see the Key Concepts for FaceBase Data Contributors guide for more details.

Element 2: Related Tools, Software and/or Code

If relevant to your data types, you could include:

FaceBase provides web-based tools on its website, www.facebase.org, for visualizing and annotating 2-dimensional imaging data and for visualizing 3-dimensional imaging data, scRNA-seq data, and (through the UCSC Genome Browser) track data. All software developed for FaceBase is open source and hosted on GitHub.

Element 3: Standards

Ontologies used in FaceBase include the following; please list any that apply:

Anatomy: Uberon
Chromatin modifier: ZFIN, NGI, HGNC, Ensembl, MGI
Data type: OBI, SNOMEDCT, CHMO
Experiment type: MMO, ERO, CHMO, SCTID, OBI, STATO
Gene: NCBI
Phenotype: Chemical Methods (CHMO), Foundational Model of Anatomy (FMA), Craniofacial Mouse Malformation Ontology (CMMO), MP, HP, DOID
Sex: Uberon
Species: NCBI Taxon
Strain: MGI
Syndrome: MONDO
Transcription factor: MGI, ZFIN, Gene_ORFName, Ensembl, HGNC

FaceBase uses these reference genomes: human (hg19, hg38), mouse (mm9, mm10, mm39), and chimpanzee (panTro4).

Common data formats accepted by FaceBase include:

Sequencing Data: “raw” sequencing data (fastq files).
Processed Data: fastqc reports (.fastqc.tgz or .fastqc.zip), count files (.count, .tpm, .fpkm), measures in tab-separated format (.tsv), and alignment mapping files (.bam) and indexes (.bam.bai).
Track Data: BED (.bed), bigBed (.bb), and bigWig (.bw) files.
Array Data: “raw” microarray data (CEL files).
Imaging Data: high-resolution 3D or 2D imaging data, such as micro-CT in gzipped NIfTI format (.nii.gz), confocal or other microscopy sources in TIFF or OME-TIFF (.tiff or .ome.tiff), and other sources in JPEG (.jpg or .jpeg). Other formats may be considered on an as-needed basis.
Surface Model / Mesh Data: Wavefront OBJ format.
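When estimating the types and amount of data for your plan, you may want to sort an inventory of project files into the categories listed above. The following is a minimal sketch under stated assumptions: the extension map is transcribed from this list, the extensions .fastq, .cel, and .obj are assumed for the formats the list names without an extension, and the function categorize is a hypothetical helper rather than part of any FaceBase tool.

```python
# Suffix-to-category map based on the format list above.
# ASSUMPTION: ".fastq", ".cel", and ".obj" are inferred from the format names;
# the list itself gives no explicit extension for those formats.
FORMAT_CATEGORIES = {
    ".fastq": "Sequencing Data",
    ".fastqc.tgz": "Processed Data",
    ".fastqc.zip": "Processed Data",
    ".count": "Processed Data",
    ".tpm": "Processed Data",
    ".fpkm": "Processed Data",
    ".tsv": "Processed Data",
    ".bam": "Processed Data",
    ".bam.bai": "Processed Data",
    ".bed": "Track Data",
    ".bb": "Track Data",
    ".bw": "Track Data",
    ".cel": "Array Data",
    ".nii.gz": "Imaging Data",
    ".tiff": "Imaging Data",
    ".ome.tiff": "Imaging Data",
    ".jpg": "Imaging Data",
    ".jpeg": "Imaging Data",
    ".obj": "Surface Model / Mesh Data",
}


def categorize(filename):
    """Return the category for a file name, or None if the suffix is not listed."""
    name = filename.lower()
    # Check longer suffixes first so multi-part extensions match as a whole.
    for suffix in sorted(FORMAT_CATEGORIES, key=len, reverse=True):
        if name.endswith(suffix):
            return FORMAT_CATEGORIES[suffix]
    return None


if __name__ == "__main__":
    for example in ["sample1.fastq", "skull.nii.gz", "notes.docx"]:
        print(example, "->", categorize(example))
```

A file that returns None is not necessarily unacceptable; it simply falls outside the common formats listed here and is worth raising with the FaceBase Hub.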

Element 4: Data Preservation, Access, and Associated Timelines

B. How scientific data will be findable and identifiable:

Each data item in FaceBase is provided with a persistent unique record identifier (RID) on creation. In addition, each dataset can be assigned a DataCite DOI.

C. When and how long the scientific data will be made available:

FaceBase will make all data available for the life of the FaceBase project. In addition, FaceBase will make its best effort to preserve a read-only archive of all data other than protected human subjects data for at least five years after the last data has been deposited to FaceBase.

Element 5: Access, Distribution, or Reuse Considerations

B. Whether access to scientific data will be controlled:

With the exception of protected human subjects data, all data will be shared publicly after the associated studies are published. Protected human subjects data will be stored in a separate, secure location and shared only after a review is carried out by the FaceBase Data Access Committee and a Data Use Agreement is signed. More information about this process can be found here.

C. Protections for privacy, rights, and confidentiality of human research participants:

The review and approval process described above protects data housed at FaceBase. The Data Use Agreement protects data that has been shared by FaceBase (following the approval process described above).

Element 6: Oversight of Data Management and Sharing

FaceBase will perform its normal quality assurance processes to ensure that metadata is complete and that industry best practices are followed for data housed at FaceBase.
