Data workflow
Process
Activity
Relevant persistent identifier(s)
PID interaction
Plan and design
Researcher(s) and collaborators register with institution, partnership, or facility
Researcher signs in to their ORCID account; institution or facility requests permission to read information from, and add information to, their ORCID record
Plans for data creation/collection and management added to RAiD record for the project/activity/experiment
Researcher(s) define appropriate standards, protocols, methods, instruments, samples etc. for their investigation
DOIs captured for existing DMPs or registered for new DMPs AND DOIs added to RAiD record AND DOIs etc. used for references within DMPs
Method publication or protocol DOI used if available, AND/OR context-appropriate identifier (e.g. Handle or PURL) used to record/describe activities and any appropriate vocabularies etc.
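Where ORCID iDs are captured at registration, a system may want to check them before storing them. The sketch below validates the format and check character of an ORCID iD using the ISO 7064 MOD 11-2 checksum. The function names are illustrative only, and the example iD is the fictitious test record commonly used in ORCID documentation.

```python
import re

# Four hyphen-separated groups of four characters; the last may be "X".
ORCID_PATTERN = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the string is a well-formed ORCID iD with a correct checksum."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_character(digits[:15]) == digits[15]


print(is_valid_orcid("0000-0002-1825-0097"))
```

A checksum only catches typing errors; it does not show that the iD belongs to the person supplying it, which is why the workflow relies on the researcher signing in to ORCID.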
Data creation and collection
Research systems automatically record any equipment, instruments, standards, reagents, archives etc. used in data creation or collection
ORCID IDs for contributors/collaborators AND DOIs or Handles for instruments or other equipment, Handles for configurations, IGSNs for geosamples, RRIDs for reagents, InChI for chemicals etc. plus ROR for any host organisations/archives etc. added to RAiD record for project
Raw data versions automatically assigned a unique identifier
Physical samples or materials recorded
Re-use of any existing data captured via the PID for the dataset
ARK, Handle, or context-appropriate identifier used to record any data disposal or processing decisions and to trace data moved forward for analysis or publication
DOIs for samples, or accession numbers assigned to any physical materials collected or used to generate data
Relationships between datasets and associated entities captured and recorded
PIDs and metadata from previous stages included in metadata for the dataset
DOIs, ARKs, Handles, or context-appropriate identifier used to create a citation for any data re-used during the data creation process with provenance and relationship described
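The idea of attaching many kinds of PID to one project record can be sketched as a small data structure. This is a hypothetical in-memory model, not the RAiD metadata schema: the class names, the "role" vocabulary, and all identifier values are placeholders chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class PidLink:
    scheme: str  # e.g. "ORCID", "ROR", "IGSN", "RRID", "DOI", "Handle"
    value: str   # the identifier itself
    role: str    # how the entity relates to the project (hypothetical vocabulary)


@dataclass
class ProjectRecord:
    project_id: str
    links: List[PidLink] = field(default_factory=list)

    def add(self, scheme: str, value: str, role: str) -> None:
        """Attach a PID to the project, ignoring exact duplicates."""
        link = PidLink(scheme, value, role)
        if link not in self.links:
            self.links.append(link)

    def by_scheme(self, scheme: str) -> List[str]:
        """List every identifier of one scheme recorded for the project."""
        return [link.value for link in self.links if link.scheme == scheme]


# Placeholder identifiers only.
record = ProjectRecord(project_id="example-project")
record.add("ORCID", "0000-0002-1825-0097", "contributor")
record.add("DOI", "10.1234/example-instrument", "instrument")
record.add("DOI", "10.1234/example-reused-dataset", "reused-data")
record.add("DOI", "10.1234/example-instrument", "instrument")  # duplicate, ignored
print(record.by_scheme("DOI"))
```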
Analysis
Software, packages (e.g. R packages) and models (plus versions) used for data analysis recorded and assigned DOI
Software versions assigned a DOI (for example, via Zenodo linked to GitHub)
Virtual machines, trusted research environments etc. used to host analyses captured and identified
Workflows, protocols, and methods recorded via e-lab notebooks, or other appropriate documentation
Virtual machine instances and trusted research environments uniquely identified using DOIs, UUIDs etc. and tied to the organisation that provided them
DOIs for method publications, Handles or other domain-appropriate PID captured automatically and linked to dataset versions and analyses
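As one possible illustration of uniquely identifying an analysis environment and tying it to its provider, the sketch below mints a random UUID for each environment instance and stores it with the provider's ROR ID and the software versions used. The record layout and the function name are assumptions made for this example, and the ROR value is a placeholder.

```python
import uuid


def new_environment_record(provider_ror: str, software_versions: dict) -> dict:
    """Describe one analysis environment instance (hypothetical record layout)."""
    return {
        "environment_id": str(uuid.uuid4()),   # random UUID, unique per instance
        "provider_ror": provider_ror,          # organisation hosting the environment
        "software": dict(software_versions),   # package name -> version string
    }


env_a = new_environment_record("https://ror.org/EXAMPLE", {"R": "4.3.1"})
env_b = new_environment_record("https://ror.org/EXAMPLE", {"R": "4.3.1"})
print(env_a["environment_id"] != env_b["environment_id"])
```

A UUID is unique but not resolvable on its own; where the environment needs a landing page or citation, a DOI or Handle would be the more suitable choice.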
Ingest and archive
Datasets selected for archive or repository ingest, and decisions recorded
Select appropriate, trusted repository or archive for long-term storage/preservation/re-use
Researcher(s) and collaborators register with repository
Data transformation (e.g. format changes) for FAIRness and repository etc. compatibility
Acknowledge and credit authorship, contributions etc.
Assign DOI to stable OR citable OR sharable version and set appropriate access levels
Transmit captured relationships, acknowledgements etc. to a trusted repository
ARK, Handle, or context-appropriate identifier used to record any data disposal or processing decisions and to enable linking to archived/published versions
Use re3data.org, FAIRSharing, DataCite Commons, or other appropriate registry to assess the repository/archive offering
Researcher signs in to their ORCID account; repository requests permission to read information from, and add information to, their ORCID record
Record links to earlier versions and use FAIR metadata template if available
Record ORCID IDs for data creators and contributors plus contributor roles (e.g. CRediT) (N.B. these can either be authenticated at this stage or pulled in from RAiD etc. records) AND DOIs for funding awards AND RORs for facilities etc. for complete acknowledgements
Register DOI with appropriate platform or direct with registration agency and create metadata record for the dataset including access or licensing terms and links to related entities
Re-use RAiD metadata for associations and grant DOI for funding acknowledgements
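A metadata record of the kind described above might be assembled as follows. This is a minimal sketch: the property names follow the DataCite Metadata Schema as commonly serialised to JSON, but they should be checked against the current schema version before use. The helper names, all identifier values, and the use of awardUri to carry a grant DOI are assumptions made for illustration.

```python
def orcid_creator(name, orcid, ror_id=None, org_name=None):
    """Build a creator entry carrying an ORCID iD and, optionally, a ROR affiliation."""
    creator = {
        "name": name,
        "nameIdentifiers": [{
            "nameIdentifier": f"https://orcid.org/{orcid}",
            "nameIdentifierScheme": "ORCID",
            "schemeUri": "https://orcid.org",
        }],
    }
    if ror_id:
        creator["affiliation"] = [{
            "name": org_name or "",
            "affiliationIdentifier": ror_id,
            "affiliationIdentifierScheme": "ROR",
        }]
    return creator


def dataset_metadata(doi, title, publisher, year, creators, licence_url,
                     previous_version_doi=None, funder_name=None, grant_doi=None):
    """Assemble a dataset record with access terms and links to related entities."""
    record = {
        "doi": doi,
        "titles": [{"title": title}],
        "publisher": publisher,
        "publicationYear": year,
        "types": {"resourceTypeGeneral": "Dataset"},
        "creators": creators,
        "rightsList": [{"rightsUri": licence_url}],
        "relatedIdentifiers": [],
        "fundingReferences": [],
    }
    if previous_version_doi:
        # Link back to the earlier version of the dataset.
        record["relatedIdentifiers"].append({
            "relatedIdentifier": previous_version_doi,
            "relatedIdentifierType": "DOI",
            "relationType": "IsNewVersionOf",
        })
    if funder_name:
        funding = {"funderName": funder_name}
        if grant_doi:
            # Assumption: the grant DOI is carried as a resolvable award URI.
            funding["awardUri"] = f"https://doi.org/{grant_doi}"
        record["fundingReferences"].append(funding)
    return record


# Placeholder values throughout.
example = dataset_metadata(
    doi="10.1234/example-dataset-v2",
    title="Example dataset",
    publisher="Example Repository",
    year=2024,
    creators=[orcid_creator("Example, Researcher", "0000-0002-1825-0097",
                            ror_id="https://ror.org/EXAMPLE", org_name="Example University")],
    licence_url="https://creativecommons.org/licenses/by/4.0/",
    previous_version_doi="10.1234/example-dataset-v1",
    funder_name="Example Funder",
    grant_doi="10.1234/example-grant",
)
```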
Review (N.B. Could be post-publication)
Reviewer registers with archive or repository
Reviewers identified and checked by repository/publishing platform staff for COI etc.
Relevant information shared with reviewers
Recognise reviewer contributions
Recognise review panel membership or contribution
Reviewer signs in to their ORCID account; archive/repository requests permission to read information from, and add information to, their ORCID record to enable review service to be recognised and credited
Archive/repository system queries ORCID registry for common publications, grants etc. AND RAiD registry for common project activities AND ROR to disambiguate reviewer and investigator affiliations AND Crossref grant registry for shared grant awards
N.B. Data collected at previous steps can be reused and presented to the reviewer
Archive/repository adds credit for the peer review to the reviewer’s ORCID record AND includes DOI for the review if it is open
Archive/repository adds membership of the review panel to the reviewer’s ORCID record either during or after the ingest process is completed
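The conflict-of-interest check on common publications can be illustrated with a simple comparison of DOI lists. The sketch assumes the repository has already retrieved the DOIs of works listed on the reviewer's and the investigator's records; how they are retrieved is out of scope here, and the function names and DOIs are hypothetical.

```python
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(doi: str) -> str:
    """Reduce a DOI to a bare lowercase form so that variants compare equal."""
    doi = doi.strip().lower()  # DOI names are case-insensitive
    for prefix in DOI_PREFIXES:
        if doi.startswith(prefix):
            return doi[len(prefix):]
    return doi


def shared_works(reviewer_dois, investigator_dois) -> set:
    """Return the DOIs that appear on both records."""
    reviewer = {normalize_doi(d) for d in reviewer_dois}
    investigator = {normalize_doi(d) for d in investigator_dois}
    return reviewer & investigator


# Placeholder DOIs only.
overlap = shared_works(
    ["https://doi.org/10.1234/ABC", "10.1234/def"],
    ["doi:10.1234/abc", "10.1234/xyz"],
)
print(sorted(overlap))
```

A non-empty result only flags a possible conflict for staff to assess; it does not decide the matter.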
Publish and index
Data release created with PIDs for associated entities and reviews etc. embedded in its metadata
Update related project and grant ID metadata to enable efficient reporting
Link data to related content (e.g. journal articles)
DataCite detects the researcher’s ORCID iD in the dataset metadata and automatically updates the researcher’s ORCID record with the dataset citation.
Dataset metadata distributed to abstracting and indexing systems, citation databases, aggregators, analytics platforms etc.
ORCID IDs, ROR IDs etc. are included in dataset metadata registered with DataCite AND PIDs for previous versions, inputs, associated entities etc. when available AND licensing and conditions of access included in DOI metadata
RAiD records and funder reporting systems updated with dataset DOI and metadata
Connect DOIs for datasets to articles automatically, or add post hoc connections using Scholix or graphing tools
Researchers grant DataCite permission to update their ORCID record AND repositories include ORCID IDs in dataset metadata sent to DataCite during DOI registration
Version DOIs linked and used for comprehensive metadata and interaction analysis
Updates or corrections should be linked to original dataset
Repositories etc. register DOIs for updates and corrections AND include relationships to published versions in metadata AND update original dataset metadata with links to correction
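Linking an update or correction to the original dataset in both directions can be sketched as a pair of reciprocal relations. The relation type names below are taken from the DataCite relationType vocabulary as far as I know it; the function name and DOIs are hypothetical.

```python
# Each relation type mapped to the type that expresses the reverse direction.
INVERSE_RELATION = {
    "IsNewVersionOf": "IsPreviousVersionOf",
    "IsPreviousVersionOf": "IsNewVersionOf",
    "IsDerivedFrom": "IsSourceOf",
    "IsSourceOf": "IsDerivedFrom",
}


def reciprocal_links(new_doi: str, original_doi: str, relation: str = "IsNewVersionOf"):
    """Return (entry for the new record, entry for the original record)."""
    if relation not in INVERSE_RELATION:
        raise ValueError(f"No inverse known for relation type: {relation}")
    for_new_record = {
        "relatedIdentifier": original_doi,
        "relatedIdentifierType": "DOI",
        "relationType": relation,
    }
    for_original_record = {
        "relatedIdentifier": new_doi,
        "relatedIdentifierType": "DOI",
        "relationType": INVERSE_RELATION[relation],
    }
    return for_new_record, for_original_record


# Placeholder DOIs only.
forward, backward = reciprocal_links("10.1234/example-v2", "10.1234/example-v1")
print(forward["relationType"], backward["relationType"])
```

Writing both entries means a reader who lands on either version can find the other.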
Access and reuse
Researchers and interested parties access open datasets
Researchers and interested parties request access to the dataset if access is limited
Repositories record access requests (whether approved or rejected) and keep usage logs
Datasets cited in articles, datasets or other outputs that are derived from the original dataset (using Make Data Count Code of Practice to help track usage and impact)
If a subset of the data is requested, OR the data is filtered/anonymised, OR the data is merged into a larger dataset, a new data product is generated by the repository and given a new DOI linked to the original dataset and associated entities (captured at publishing and archiving stages), with the relationship made clear
Researcher gains access to openly available dataset by clicking DOI to download file(s)
Researcher requests access to dataset using its DOI AND includes their ORCID ID in the request AND provides their organisational affiliation using ROR
Access requests linked to dataset DOI for tracking, analysis, and reporting
DOI used to enable accurate, persistent citation
ORCID IDs, ROR IDs etc. are automatically included in dataset metadata registered with DataCite AND PIDs for previous versions, inputs, associated entities etc. when available AND licensing and conditions of access included in DOI metadata
Interrogate multiple datasets/sources (e.g. patents, citations, Event Data, social media) to infer impact
Funders, repositories, or institutions query DOI registries for content links, references, citations, licensing metadata etc. AND usage and impact indicators from event data etc.
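Linking access requests to the dataset DOI makes simple reporting possible. The sketch below is a hypothetical request log, not any repository's actual system: each request carries the dataset DOI, the requester's ORCID iD, and their affiliation's ROR ID, and outcomes are tallied per DOI. All values are placeholders.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    dataset_doi: str
    requester_orcid: str
    affiliation_ror: str
    approved: bool


def summarize(requests) -> dict:
    """Count approved and rejected requests for each dataset DOI."""
    summary = {}
    for request in requests:
        counts = summary.setdefault(request.dataset_doi, Counter())
        counts["approved" if request.approved else "rejected"] += 1
    return summary


# Placeholder identifiers only.
log = [
    AccessRequest("10.1234/example-dataset", "0000-0002-1825-0097", "https://ror.org/EXAMPLE", True),
    AccessRequest("10.1234/example-dataset", "0000-0002-1825-0097", "https://ror.org/EXAMPLE", False),
    AccessRequest("10.1234/other-dataset", "0000-0002-1825-0097", "https://ror.org/EXAMPLE", True),
]
report = summarize(log)
print(report["10.1234/example-dataset"]["approved"])
```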