Metroline Step: Apply (meta)data model
Status: In development
Renamed from “Metroline Step: Apply core metadata model” to “Metroline Step: Apply (meta)data model”
Short description
This step provides a comprehensive guide on how to apply a (meta)data model to research resources, e.g. datasets, making data more findable, accessible, interoperable and reusable (FAIR). The page outlines different tools and protocols (including step-by-step guides) that can be used to apply a (meta)data model, thereby increasing consistency and interoperability. It also provides links to more detailed external resources and examples of projects that successfully applied a metadata schema and/or data model to their data.
Why is this step important
Applying the data model to your data and the metadata model to your metadata is crucial for the next step: https://health-ri.atlassian.net/wiki/spaces/FSD/pages/277479473. It is a central step in the FAIRification process, in which your (meta)data is connected to the elements of your (semantic) (meta)data model, so that it becomes machine-readable and interoperable.
Metadata and data that are structured with ontologies and follow standard schemas make it easier for other systems to find your resource’s metadata and understand its data.
How to
Below we outline three ways to apply your data model, and one guide to apply a metadata model in the FAIR Data Point (FDP).
Implement a data model (in development)
A. FAIR-in-a-box
FAIR-in-a-box (adapted from CDE-in-a-box) is an automated tool that helps make your data FAIR: you provide a CSV containing your data structured according to the embedded CARE-SM model, and the tool transforms the CSV into RDF and places it in a triple store connected to a FAIR Data Point.
The tool is customisable: if you want to use another semantic model, you can edit the scripts and YARRRML mappings that transform the CSV into RDF with that model.
FAIR-in-a-box (or CDE-in-a-box) consists of several components (see also Fig. 1 of this paper):
A template-compliant CSV (generated by the data custodian, together with a FAIR data steward, in this paper)
A simple "preCARE" CSV file is created by the data owner
The preCARE.csv is automatically transformed into the final CARE.csv by the caresm toolkit (part of the docker-compose)
RML: RDF Mapping Language. RML provides mapping templates that enable the transformation of CSV (or other formats) into RDF.
The RML component in FAIR-in-a-box is designed to transform a CSV template containing data according to the CARE-SM model into RDF. If you wish to adapt the FAIR-in-a-box tool to a customised data model, you can use the YARRRML tool to generate a custom RML template (see also the sketch after this list).
Transforming a non-RDF data format into RDF with RML: two optional tools. SDM-RDFizer uses the RML template and the non-RDF data to transform your data into RDF.
RMLMapper is an alternative tool according to this paper.
The transformed data is fed into GraphDB (a triple store). Whenever the data is updated, the corresponding metadata is automatically updated as well.
Once finished, the metadata is automatically exposed to an FDP (see also the next Metroline step: https://health-ri.atlassian.net/wiki/spaces/FSD/pages/277479473).
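To make the CSV-to-RDF idea behind the RML step more concrete, the sketch below shows the same kind of transformation written by hand with Python and rdflib. It is not the FAIR-in-a-box pipeline itself (which uses YARRRML/RML mappings and SDM-RDFizer); the file name, column names, classes and properties are hypothetical.

```python
# A conceptual sketch (not the actual FAIR-in-a-box pipeline, which uses
# YARRRML/RML and SDM-RDFizer): read a template-compliant CSV and emit RDF.
# File name, column names, classes and properties below are hypothetical.
import csv
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, XSD

EX = Namespace("https://example.org/care-sm-demo/")   # hypothetical namespace

g = Graph()
g.bind("ex", EX)

with open("preCARE.csv", newline="") as f:
    for row in csv.DictReader(f):
        patient = URIRef(EX + row["patient_id"])       # hypothetical column
        g.add((patient, RDF.type, EX.Patient))
        g.add((patient, EX.hasDiagnosis,
               Literal(row["diagnosis_code"], datatype=XSD.string)))

# The resulting Turtle file can be loaded into a triple store such as GraphDB.
g.serialize(destination="patients.ttl", format="turtle")
```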
Sources: https://jbiomedsem.biomedcentral.com/articles/10.1186/s13326-022-00264-6
https://direct.mit.edu/dint/article/5/1/184/113181/The-FAIR-Data-Point-Interfaces-and-Tooling
https://faircookbook.elixir-europe.org/content/recipes/applied-examples/approach-cdisc.html
B. CastorEDC
Build an eCRF for data collection (previous step?)
Map the fields from the eCRF to the semantic model (step 7 of de novo) in the 'data transformation application' (automated or manual??)
eCRF values are linked to the ontology concepts used as the machine-readable representation of the value in the rendered RDF (sketched below)
Entered data is automatically converted into RDF in real time (next step) = de novo FAIRification
[+ metadata model for castor?]
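The core idea of this mapping, that a collected eCRF value is rendered in RDF with an ontology concept as its machine-readable representation, can be illustrated with a small rdflib sketch. The namespaces, record IRI and the use of a SNOMED CT code are illustrative assumptions, not Castor's actual output format.

```python
# Illustration of "eCRF values linked to ontology concepts": the ontology IRI
# (here a SNOMED CT code, chosen purely as an example) is the machine-readable
# representation of the collected value in the rendered RDF.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, RDFS

EX = Namespace("https://example.org/ecrf/")            # hypothetical namespace
SNOMED = Namespace("http://snomed.info/id/")

g = Graph()
record = URIRef(EX + "record/001")                     # hypothetical eCRF record
g.add((record, RDF.type, EX.Observation))
g.add((record, EX.hasValue, SNOMED["38341003"]))       # example concept: hypertension
g.add((SNOMED["38341003"], RDFS.label, Literal("Hypertension", lang="en")))

print(g.serialize(format="turtle"))
```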
C. Ontotext Refine
With this tool, you can manually map structured data to a locally stored RDF schema in GraphDB. To do so, you choose the right predicates and types, define datatypes and implement transformations. The tool is integrated in the GraphDB Workbench.
In short, the workflow consists of the following steps:
Load your ontology in GraphDB.
You can use Protégé to edit your ontology.
Connect Ontotext Refine to GraphDB (where your ontology is stored).
Load your data.
Transform the data to your needs (e.g. convert dates to a specific format).
Manually connect the variables of your data to the ontology (this is the actual step where you apply the data model to your data).
Export your linked data; a sketch of querying the result in GraphDB is shown below.
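A quick way to check the exported data is to query the GraphDB repository with SPARQL, for example with the SPARQLWrapper library. The endpoint URL and repository name below are assumptions; adjust them to your own GraphDB installation.

```python
# Query the GraphDB repository's SPARQL endpoint to inspect the mapped data.
# Endpoint URL and repository name are assumptions about a local GraphDB setup.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://localhost:7200/repositories/my-repo")
sparql.setReturnFormat(JSON)
sparql.setQuery("""
    SELECT ?s ?p ?o
    WHERE { ?s ?p ?o }
    LIMIT 10
""")

results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["s"]["value"], binding["p"]["value"], binding["o"]["value"])
```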
Implement a metadata schema (in development)
FAIR Data Point reference implementation (implements DCAT)
Health-RI FAIR Data Point (implements the Health-RI metadata schema)
With the SHACLs of the Health-RI schema you can deliver metadata according to the Health-RI schema, which is automatically transformed into RDF (a minimal DCAT example is sketched below).
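As an illustration of metadata expressed according to DCAT, the sketch below builds a minimal dcat:Dataset description with rdflib. The IRI and values are purely illustrative, and the Health-RI metadata schema requires considerably more properties than shown here.

```python
# Minimal, illustrative dcat:Dataset description in RDF.
# The Health-RI metadata schema requires more properties than shown here.
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCAT, DCTERMS, RDF

g = Graph()
dataset = URIRef("https://example.org/dataset/my-study")   # hypothetical IRI

g.add((dataset, RDF.type, DCAT.Dataset))
g.add((dataset, DCTERMS.title, Literal("My study dataset", lang="en")))
g.add((dataset, DCTERMS.description, Literal("Example dataset description.", lang="en")))

print(g.serialize(format="turtle"))
```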
Step-by-step for Health-RI
Importing SHACL files into the FDP (a validation sketch follows the steps below)
Log in as an admin in the FDP and go to “Metadata schemas” (top right corner)
Click on the metadata schema you want to update
Go to the GitHub page providing the SHACLs
Click on the class you want to update and copy the contents of the .ttl file
Go back to the FDP and paste it into “Form definition” (bottom of the page)
Click on “Save and release”
Update the version number
Click on “Release”
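Before releasing, you can check locally whether a metadata record conforms to a SHACL shape, for example with the pySHACL library. The file names below are hypothetical; the shape file would be the .ttl copied from the GitHub page mentioned above.

```python
# Validate a metadata record against a SHACL shape before releasing it.
# File names are hypothetical; use the .ttl shape copied from GitHub.
from pyshacl import validate

conforms, results_graph, results_text = validate(
    data_graph="dataset_metadata.ttl",      # your metadata record
    shacl_graph="hri_dataset_shape.ttl",    # SHACL shape for the class you update
    data_graph_format="turtle",
    shacl_graph_format="turtle",
)

print("Conforms:", conforms)
if not conforms:
    print(results_text)                     # human-readable validation report
```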
Metadata implementation (add link)
Mapping metadata schema page (add link)
The How to section should:
be split into easy-to-follow steps;
Step 1
Step 2
etc.
help the reader to complete the step;
aspire to be readable for everyone, but, depending on the topic, may require specialised knowledge;
be a general, widely applicable approach;
if possible / applicable, add (links to) the solution necessary for onboarding in the Health-RI National Catalogue;
aim to be practical and simple, while keeping in mind: if I would come to this page looking for a solution to this problem, would this How-to actually help me solve this problem;
contain references to solutions such as those provided by FAIR Cookbook, RDMkit, The Turing Way and FAIRsharing;
contain custom recipes/best-practices written by/together with experts from the field if necessary.
To check:
According to FAIRopoly, this should be step 8 in de novo (set up registry structure in FDP) and step 12 (??) in generic. What is the content of these steps?
Expertise requirements for this step
Describes the expertise that may be necessary for this step. Should be based on the expertise described in the Metroline: Build the team step.
FAIR expert/data steward: help with the tools.
Tip: EJPRD developed a metadata model; it may require a developer to implement it in your registry’s source code.
Practical examples from the community
Examples of how this step is applied in a project (link to demonstrator projects).
VASCA registry - implemented the CDE semantic data model, as well as the DCAT and EJPRD metadata schemas.
PRISMA - implemented the Health-RI metadata schema.
Training
Add links to training resources relevant for this step. Since the training aspect is still under development, currently many steps have “Relevant training will be added in the future if available.”
Suggestions
Visit our How to contribute page for information on how to get in touch if you have any suggestions about this page.