Metroline Step: Apply (meta)data model
Status: In development
Renamed: “Metroline Step: Apply core metadata model” to “Apply (meta)data model”
‘Start with a great quote from, for example, a paper, between single quotes, in italic.’ (source as a hyperlink between parentheses)
In layman’s terms (Jip en Janneke), add an easy to follow summary, using around three sentences.
Short description
This step provides a comprehensive guide on how to apply a (meta)data model to research resources, e.g. datasets, making data more findable, accessible, interoperable and reusable (FAIR). The page outlines different tools and protocols (including step-by-step guides) that can be used for applying a (meta)data model, thereby increasing consistency and interoperability. It also provides links to more detailed external resources and examples of projects that successfully applied a metadata schema and/or data model to their data.
Why is this step important
Applying the data model to your data and the metadata model to your metadata is crucial for the next step: Metroline Step: Transform and expose FAIR (meta)data. It is a central step in the FAIRification process, in which your (meta)data is connected to elements of your (semantic) (meta)data model so that it becomes machine-readable and interoperable.
Metadata and data that are structured with ontologies and follow standard schemas make it easier for other resources to find your resource’s metadata and understand its data.
How to
Below we outline ways to apply your (meta)data model.
1. FAIR-in-a-box
FAIR-in-a-box (adopted from CDE-in-a-box) is an automated tool to help make your data FAIR by enabling you to provide a CSV containing your data in accordance with the embedded CARE-SM model. The tool will transform your CSV into RDF and place it in a triple store connected to a FAIR data point.
The tool is customizable: if you want to use another semantic model, you could potentially edit the scripts and YARRRML that transform the CSV into RDF with that model.
The FAIR-in-a-box or CDE-in-a-box consists of several components (see also Fig. 1 of this paper):
A template-compliant CSV (generated by the data custodian together with a FAIR data steward, as described in this paper)
A simple "preCARE" CSV file is created by the data owner
The preCARE.csv is automatically transformed into the final CARE.csv by the caresm toolkit (part of the docker-compose)
RML: RML stands for RDF Mapping Language. RML templates enable the transformation of CSV (or other formats) into RDF.
The RML component in FAIR-in-a-box transforms a CSV template containing data according to the CARE-SM model into RDF. If you wish to adapt the FAIR-in-a-box tool to a customized data model, you can use the YARRRML tool to generate a custom RML template.
Transforming a non-RDF data format into RDF with RML: two tools are available. SDM-RDFizer uses the RML template and the non-RDF data to transform your data into RDF.
RMLMapper is an alternative tool, according to this paper.
The transformed data is fed into GraphDB (triple store). Whenever the data is updated, the corresponding metadata is automatically updated as well.
Once finished, the metadata is automatically exposed to an FDP (see also the next Metroline step: Metroline Step: Transform and expose FAIR (meta)data).
The FAIR Data Point: Interfaces and Tooling
Mapping of clinical trial data to CDISC-SDTM: a practical example based on APPROACH and ABIRISK
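To illustrate the kind of CSV-to-RDF transformation that the RML component performs, the sketch below turns CSV rows into N-Triples using a small template of predicate and column pairs. It is not the FAIR-in-a-box implementation and it does not use RML or YARRRML syntax. The namespace, the column names (`patient_id`, `birth_year`, `diagnosis`) and the predicates are hypothetical and are not taken from the CARE-SM model.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace and column names, chosen only for illustration;
# they are NOT taken from the CARE-SM model.
BASE = "https://example.org/"

# Template: one (predicate IRI, CSV column) pair per triple to emit for each row.
TEMPLATE = [
    (BASE + "vocab/birthYear", "birth_year"),
    (BASE + "vocab/diagnosis", "diagnosis"),
]


def escape_literal(value):
    """Escape characters that may not appear raw in an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_ntriples(csv_text, subject_column="patient_id"):
    """Turn each CSV row into one N-Triples line per filled template column."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Percent-encode the identifier so the subject stays a valid IRI.
        subject = "<" + BASE + "patient/" + quote(row[subject_column], safe="") + ">"
        for predicate, column in TEMPLATE:
            value = row.get(column) or ""
            if not value:
                continue  # skip empty cells rather than emit empty literals
            triples.append(f'{subject} <{predicate}> "{escape_literal(value)}" .')
    return triples


EXAMPLE_CSV = "patient_id,birth_year,diagnosis\np1,1980,asthma\np2,,eczema\n"

for line in csv_to_ntriples(EXAMPLE_CSV):
    print(line)
```

In a real deployment the template lives in an RML (or YARRRML) file and a tool such as SDM-RDFizer or RMLMapper performs the transformation. The sketch only shows the principle that each CSV column is bound to a fixed position in a triple pattern.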
2. Castor EDC
Build eCRF for data collection (previous step?)
Map fields from eCRF to semantic model (step 7 of de novo) in the 'data transformation application' (automated or manual??)
eCRF values are linked to the ontology concepts used as a machine-readable representation of the value in the rendered RDF
Entered data automatically converted into RDF in real time (next step) = de novo FAIRification
[+ metadata model for castor?]
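The idea of linking eCRF values to ontology concepts can be sketched as a lookup from (field, value) pairs to concept IRIs. This is only an illustration of the principle, not Castor's data transformation application. All field names, option values and IRIs below are hypothetical placeholders under example.org.

```python
# Hypothetical eCRF field names mapped to predicates of a semantic model.
FIELD_TO_PREDICATE = {
    "sex": "https://example.org/vocab/hasSex",
    "smoker": "https://example.org/vocab/hasSmokingStatus",
}

# Hypothetical eCRF option values mapped to ontology concept IRIs.
VALUE_TO_CONCEPT = {
    ("sex", "1"): "https://example.org/concept/Female",
    ("sex", "2"): "https://example.org/concept/Male",
    ("smoker", "yes"): "https://example.org/concept/CurrentSmoker",
    ("smoker", "no"): "https://example.org/concept/NonSmoker",
}


def record_to_triples(record_iri, record):
    """Return (triples, unmapped) for one eCRF record.

    triples is a list of (subject, predicate, object) IRI tuples;
    unmapped lists the (field, value) pairs that have no mapping yet.
    """
    triples = []
    unmapped = []
    for field, value in record.items():
        predicate = FIELD_TO_PREDICATE.get(field)
        concept = VALUE_TO_CONCEPT.get((field, str(value)))
        if predicate is None or concept is None:
            unmapped.append((field, value))
            continue
        triples.append((record_iri, predicate, concept))
    return triples, unmapped


example_record = {"sex": "1", "smoker": "no", "height_cm": 172}
triples, unmapped = record_to_triples("https://example.org/record/1", example_record)
print(triples)
print(unmapped)
```

Returning the unmapped pairs separately makes gaps in the mapping visible, so that fields without a concept are not silently dropped.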
3. Ontotext Refine
With this tool, you can manually map structured data to a locally stored RDF schema in GraphDB. To do so, you can choose the right predicates and types, define datatypes and implement transformations. The tool is integrated in the GraphDB Workbench.
In short, the workflow consists of the following steps:
Load your ontology in GraphDB.
You can use Protégé to edit your ontology.
Connect Ontotext Refine to GraphDB (where your ontology is stored).
Load your data.
Transform the data to your needs and wishes (e.g. convert dates to a specific format).
Connect the variables of your data to the ontology manually. (This is the actual step where you apply the data model to your data.)
Export your linked data.
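The data transformation step in the workflow above (for example, converting dates to a specific format) can also be done before loading the data. The sketch below normalises dates to YYYY-MM-DD, the lexical form of xsd:date. It uses plain Python rather than the transformation functions of the tool itself, and the list of input formats is an assumption that should be adapted to your data.

```python
from datetime import datetime

# Input formats to try, in order. Assumed for this sketch; adapt to your data.
# Ambiguous values such as 01/02/2020 are read day-first.
KNOWN_FORMATS = ("%d-%m-%Y", "%d/%m/%Y", "%Y-%m-%d")


def to_iso_date(raw):
    """Return the date as YYYY-MM-DD, or raise ValueError if no format fits."""
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # try the next known format
    raise ValueError(f"Unrecognised date format: {raw!r}")


for value in ("31-12-2023", " 01/02/2020 ", "2021-03-04"):
    print(value, "->", to_iso_date(value))
```

Raising an error on unknown formats, instead of passing the value through, prevents malformed dates from ending up in the RDF as if they were valid.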
4 A. FAIR data point reference implementation (implements DCAT)
A FAIR data point (FDP) exposes metadata according to the FAIR principles. The FDP reference implementation automatically comes with the DCAT metadata schema, so when deploying this standard FDP, there will be a metadata schema in place. However, if you wish to customize your FDP to a specific metadata schema, please refer to the next section.
You can find more information about FDPs here.
These are the steps to apply a metadata model to your metadata with an FDP.
Follow the instructions of the reference implementation to set up a FAIR data point.
After setting up the FDP with the metadata schema in place, the metadata has to be mapped to the metadata model. You can find more information on metadata mapping here.
After successful mapping, metadata can be entered into the FDP, either manually or automatically.
Manual entry:
Once you are logged in, the FDP user interface allows manual entry of metadata according to the implemented metadata schema. New instances of the (DCAT) classes can be created, and the respective properties of each class can be filled out manually.
Uploading metadata to FDP via SeMPyRO:
With the SeMPyRO package you can convert your metadata into FDP-compliant metadata. More information here.
Next, your metadata is transformed and exposed (Next Metroline step: Metroline Step: Transform and expose FAIR (meta)data).
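As an illustration of what mapping metadata to the DCAT metadata schema means, the sketch below renders a small metadata record as a `dcat:Dataset` in Turtle. It does not use SeMPyRO or the FDP API. The local field names (`title`, `description`, `keywords`) and the example IRI are hypothetical, and only three properties are mapped, so the output is not a complete FDP record.

```python
# Hypothetical local field names mapped to DCAT / Dublin Core Terms properties.
FIELD_TO_PROPERTY = {
    "title": "dcterms:title",
    "description": "dcterms:description",
    "keywords": "dcat:keyword",
}

PREFIXES = (
    "@prefix dcat: <http://www.w3.org/ns/dcat#> .\n"
    "@prefix dcterms: <http://purl.org/dc/terms/> .\n"
)


def escape(value):
    """Escape characters that may not appear raw in a short Turtle string."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def dataset_to_turtle(dataset_iri, metadata):
    """Render the mapped fields of one metadata record as a dcat:Dataset."""
    lines = [f"<{dataset_iri}> a dcat:Dataset"]
    for field, prop in FIELD_TO_PROPERTY.items():
        values = metadata.get(field)
        if not values:
            continue  # field is absent or empty, so emit nothing for it
        if isinstance(values, str):
            values = [values]
        objects = ", ".join(f'"{escape(v)}"' for v in values)
        lines.append(f"    {prop} {objects}")
    return PREFIXES + "\n" + " ;\n".join(lines) + " .\n"


example = {"title": "Example registry", "keywords": ["rare disease", "registry"]}
print(dataset_to_turtle("https://example.org/dataset/1", example))
```

A metadata schema typically requires more properties than shown here, and some of them take IRIs rather than literals, so a real mapping needs one entry per required property of the schema.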
4 B. Health-RI compliant FAIR data point (implements the Health-RI metadata schema)
You can customize your FDP to be compliant with a specific metadata schema. For example, by making the FDP compliant with the Health-RI metadata schema, you can ensure your metadata can be exposed to the National Health Data Catalogue in the correct format. To customize your FDP to the Health-RI metadata schema, follow the steps described here:
Step-by-step for implementing Health-RI metadata schema in FAIR Data Point
Importing SHACL files in the FDP
Log in as an admin in the FAIR Data Point.
Click on the user icon (top right corner) and click on “Metadata schemas”.
Select the metadata schema you want to update.
Go to the GitHub page providing the SHACL files.
Click on the class you want to update and copy the contents of the TTL file.
Go back to the FDP and paste the contents of the TTL file into the “Form definition” section at the bottom of the page.
Click on “Save and release”.
Update the version number.
Click on “Release”.
Expertise requirements for this step
Describes the expertise that may be necessary for this step. Should be based on the expertise described in the Metroline: Build the team step.
FAIR expert/data steward: A FAIR expert or data steward is necessary for several parts of this Metroline step and can help you with the following aspects:
Identify which tool is most suitable for your case and help with using the tool itself.
Help with (meta)data mapping to the (meta)data model.
IT expert or local IT department: Support from an IT expert or your local IT department is necessary to deploy the FDP and/or FAIR-in-a-box.
If you opt for Castor, you need an expert in this tool.
Tip: EJPRD developed a metadata model; implementing it in your registry source code may require a developer.
Practical examples from the community
Examples of how this step is applied in a project (link to demonstrator projects).
VASCA registry - implemented the CDE semantic data model, the DCAT metadata schema and the EJPRD metadata schema.
PRISMA - implemented Health-RI metadata schema (FAIR Data Point).
Training
If you have great suggestions for training material, add links to these resources here. Since the training aspect is still under development, currently many steps have “Relevant training will be added soon.”
Suggestions
This page is under construction. Learn more about the contributors here and explore the development process here. If you have any suggestions, visit our How to contribute page to get in touch.