...

This step provides a comprehensive guide on how to apply a (meta)data model to research resources, e.g. datasets. It emphasizes the importance of metadata in making data more findable, accessible and reusable. The page outlines the standards and protocols that should be followed to increase consistency and interoperability. It also includes a step-by-step implementation guide, detailing the tools and resources needed to apply the (meta)data model effectively, and provides links to external resources with step-by-step approaches and examples of projects that successfully applied a metadata schema and/or data model to their data.

...

Implement a data model (in development)

A. FAIR-in-a-box: evolved from CDE-in-a-box (a collection of software to create, store and publish Common Data Elements (CDEs)).

It comes with the CARE-SM model by default. If you want to use your own custom model, you have to adjust the YARRRML template (e.g. via the Matey editor).

Components/steps:

  1. RML: RDF Mapping Language. RML provides reusable templates that support not only CSV-to-RDF transformations, but also transformations from other formats. RML templates specify the individual triple patterns that should be created during a transformation. For example, the subject Uniform Resource Identifier (URI), predicate URI and object URI are represented as strings that may contain variables, where the variables are references to locations within the source document (e.g. the appropriate column header within a CSV file). During a transformation, every variable in an RML template is replaced by the value at that location within a single source record (e.g. a single row of a CSV file), and the source is iterated over all records to complete the transformation. RML templates are themselves represented in RDF and are therefore not always easily human-readable. To simplify the RML syntax, such that the EJP RD FAIRification stewards, or potentially the registry data custodians themselves, could edit the template if required, a second, related technology was identified: YARRRML.

  • YARRRML: linked-data generation rules, written in a human-readable way. YARRRML documents can be converted into an RML template, which is then applied to a CSV file to automate the transformation. You specify the rules for transforming data into linked data (i.e. triples); a template-compliant CSV is then mapped to RDF automatically according to those rules.

  • So you need a template-compliant CSV (generated by the data custodian, together with a FAIR data steward in this paper)

  1. Transforming a non-RDF data format into RDF with RML; two optional tools: SDM-RDFizer (alternative in this paper: RMLMapper). Covered in the next Metroline step.

  2. Fed into GraphDB (a triple store). Metadata is automatically updated when the data updates.

  3. Exposed via a FAIR Data Point (FDP) (by default; templates can be used for controlled input to the FDP). Covered in the next Metroline step.
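To illustrate the YARRRML rules mentioned above, a minimal YARRRML document might look as follows (hypothetical file name, column names and URIs; the actual CARE-SM templates are considerably more elaborate):

```yaml
prefixes:
  ex: "http://example.org/"

mappings:
  patient:
    sources:
      - ['patients.csv~csv']
    # Subject URI built from the patient_id column of each row
    s: ex:patient/$(patient_id)
    po:
      # Predicate-object pairs; $(...) refers to CSV column headers
      - [ex:hasBirthDate, $(birth_date)]
      - [ex:hasDiagnosis, $(diagnosis_code)]
```

Tools such as Matey or the yarrrml-parser convert a document like this into the equivalent, more verbose RML template.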

From the FAIR-in-a-box GitHub:

The EJP-RD CARE-SM Transformation process has three steps:

  1. A simple "preCARE" CSV file is created by the data owner (you must do this!)

  2. The preCARE.csv is transformed into the final CARE.csv (this is automated) by the caresm toolkit (part of the docker-compose)

  3. The final CARE.csv is processed by the YARRRML transformer, and RDF is output into the ./data/triples folder
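The core idea behind the transformation in step 3 — replacing the variables of a triple template with values from each CSV row — can be sketched in plain Python. This is a toy illustration with hypothetical column names and URIs; real transformers such as the YARRRML transformer, RMLMapper or SDM-RDFizer do far more (data types, joins, multiple sources):

```python
import csv
import io

# Triple templates: each $(column) placeholder is replaced per CSV row,
# mimicking how an RML template is instantiated for each source record.
TEMPLATES = [
    ("<http://example.org/patient/$(id)>",
     "<http://example.org/hasDiagnosis>",
     '"$(diagnosis)"'),
]

def transform(csv_text):
    """Apply every triple template to every row of the CSV, returning N-Triples lines."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for template in TEMPLATES:
            triple = template
            for column, value in row.items():
                triple = tuple(part.replace(f"$({column})", value) for part in triple)
            triples.append(" ".join(triple) + " .")
    return triples

demo = "id,diagnosis\n001,ORDO:98896\n002,ORDO:558\n"
for line in transform(demo):
    print(line)  # one N-Triples line per CSV row
```

The iteration over rows corresponds to RML's iteration over source records; adding more templates yields more triples per record.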

Sources: https://jbiomedsem.biomedcentral.com/articles/10.1186/s13326-022-00264-6

https://direct.mit.edu/dint/article/5/1/184/113181/The-FAIR-Data-Point-Interfaces-and-Tooling

https://faircookbook.elixir-europe.org/content/recipes/applied-examples/approach-cdisc.html

B. CastorEDC

  • Build eCRF for data collection (previous step?)

...

  • Map fields from the eCRF to the semantic model (step 7 of the de novo FAIRification workflow) in the 'data transformation application'

  • eCRF values are linked to the ontology concepts used as the machine-readable representation of the value in the rendered RDF

  • Entered data is automatically converted into RDF in real time (next step) = de novo FAIRification

C. Ontotext Refine: map structured data to a locally stored RDF schema in GraphDB. Choose the right predicates and types, define datatypes and implement transformations. Integrated into the GraphDB Workbench.

  1. Load ontology in GraphDB

  2. Connect Ontotext Refine to GraphDB (which holds the ontology)

  3. Load your data

  4. Transform data to your needs and wishes

  5. Connect data variables to ontology manually. Tadaah: RDF
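After step 5, the mapped triples live in GraphDB, where a quick SPARQL query in the Workbench can sanity-check the result (hypothetical class URI shown here; substitute an ontology term you actually mapped to):

```sparql
# Count the instances that were mapped to the ontology class
SELECT (COUNT(?s) AS ?mapped)
WHERE {
  ?s a <http://purl.obolibrary.org/obo/NCIT_C16960> .
}
```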

Protégé: ontology editor

Implement a metadata schema (in development)

-FAIR data point reference implementation (implements DCAT)

-Health-RI FAIR data point (implements Health-ri metadata schema)

  • with the SHACL files of the Health-RI (HRI) schema you can deliver metadata according to the HRI schema

  • automatically transformed to RDF

Step-by-step for Health-RI

Importing shacl files in the FDP

  1. Log in as an admin in the FDP and go to “Metadata schemas” (top right corner)

  2. Click on the metadata schema you want to update

  3. Go to the GitHub page providing the SHACL files

  4. Click on the class you want to update and copy the content of the .ttl file

  5. Go back to the FDP and paste the .ttl content in “Form definition” (bottom of the page)

  6. Click on “Save and release”

  7. Update the version number

  8. Click on “Release”
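For orientation, the .ttl content pasted in step 5 consists of SHACL shapes. A heavily simplified example of what such a shape can look like (illustrative only; always copy the actual Health-RI SHACL files from GitHub):

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/shapes/> .

# A dataset must have at least one title, entered as a string
ex:DatasetShape
    a sh:NodeShape ;
    sh:targetClass dcat:Dataset ;
    sh:property [
        sh:path dct:title ;
        sh:datatype xsd:string ;
        sh:minCount 1 ;
    ] .
```

The FDP renders such shapes as metadata entry forms, which is why updating the shape updates the form definition.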

Metadata implementation (add link)

...

Why is this step important 

Applying the data model to your data and the metadata model to your metadata is crucial for the next step: Metroline Step: Transform and expose FAIR (meta)data.

How to

The How to section should:

...

Describes the expertise that may be necessary for this step, based on the expertise described in the Metroline: Build the team step.

FAIR expert/data steward: can help with the tools.

Practical examples from the community 

...