Metroline Step: Apply (meta)data model

STATUS: IN DEVELOPMENT

Renamed: “Metroline Step: Apply core metadata model” to “Apply (meta)data model”

Short description

This step explains how to apply a metadata model to research resources such as datasets. Metadata makes data more findable, accessible and reusable, and following shared standards and protocols increases consistency and interoperability. The page includes a step-by-step implementation guide with the tools and resources needed to apply the metadata model. It also links to external resources with step-by-step approaches and to examples of projects that successfully applied a metadata schema and/or data model to their data.

Community examples:
VASCA registry - implemented the CDE semantic data model, the DCAT metadata schema and the EJP RD metadata schema.
PRISMA - implemented the Health-RI metadata schema.

Implement a data model (in development)

-FAIR in a box

-CastorEDC

-OpenRefine (manually)

Implement a metadata schema

-FAIR data point reference implementation (implements DCAT)

-Health-RI FAIR data point (implements Health-RI metadata schema)

Step-by-step for Health-RI

Metadata implementation (add link)

-mapping metadata schema page (add link)

Implement a data model (in development)

A. FAIR in a box: from CDE-in-a-box (a collection of software to create, store and publish Common Data Elements, CDEs)

It comes with the CARE-SM model. To use your own custom model, you have to adjust the YARRRML model (via Matey).

Components:

YARRRML: linked data generation rules. Specify rules to transform data to linked data (e.g. triples)
RML: RDF Mapping Language
GraphDB (triple store): the generated linked data is fed into GraphDB. Metadata is updated automatically when the data is updated
Exposed to an FDP (by default; templates can be used for controlled input to the FDP) - covered in the next Metroline step
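
To make the idea of linked data generation rules concrete, the sketch below mimics in plain Python what such a rule does: a subject template and a list of predicate/object templates are expanded for each row of a table to produce triples. This is not YARRRML or RML itself. Only the `$(column)` placeholder syntax is borrowed from YARRRML; the `example.org` IRIs, the column names and the `apply_rules` helper are assumptions made up for illustration.

```python
import re

# Hypothetical rule set: one subject template plus (predicate, object template) pairs.
RULES = {
    "subject": "https://example.org/patient/$(id)",
    "predicate_objects": [
        ("https://example.org/vocab/sex", "$(sex)"),
        ("https://example.org/vocab/birthYear", "$(birth_year)"),
    ],
}

# Matches placeholders of the form $(column_name).
PLACEHOLDER = re.compile(r"\$\(([^)]+)\)")


def expand(template, row):
    """Replace every $(column) placeholder with the row's value for that column."""
    return PLACEHOLDER.sub(lambda m: str(row[m.group(1)]), template)


def apply_rules(rules, rows):
    """Return a list of (subject, predicate, object) triples for all rows."""
    triples = []
    for row in rows:
        subject = expand(rules["subject"], row)
        for predicate, object_template in rules["predicate_objects"]:
            triples.append((subject, predicate, expand(object_template, row)))
    return triples


# Made-up example records standing in for a tabular data source.
rows = [
    {"id": "p1", "sex": "female", "birth_year": 1980},
    {"id": "p2", "sex": "male", "birth_year": 1975},
]

triples = apply_rules(RULES, rows)
for triple in triples:
    print(triple)
```

In FAIR in a box the equivalent rules are written in YARRRML and executed by the RML tooling; the sketch only shows the shape of the transformation, not that tooling.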

B. CastorEDC

Build eCRF for data collection (previous step?)

C. Ontotext Refine: map structured data to a locally stored RDF schema in GraphDB. Choose the right predicates and types, define datatypes and implement transformations. Integrated in the GraphDB Workbench.

Load ontology in GraphDB
Connect Ontotext Refine to GraphDB (which has the ontology)
Load your data
Transform the data to your needs and wishes
Connect the data variables to the ontology manually. The result is RDF
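
The manual mapping steps above boil down to three decisions per column: which predicate it maps to, which datatype the value gets, and which clean-up transformation is applied first. A minimal Python sketch of that idea, independent of Ontotext Refine and GraphDB, could look as follows. The column names, the `example.org` IRIs and the `COLUMN_MAPPING` table are hypothetical; in a real project the predicates would come from the ontology loaded in GraphDB.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical mapping: column -> (predicate IRI, XSD datatype, clean-up function).
COLUMN_MAPPING = {
    "age": ("https://example.org/vocab/age", XSD + "integer", lambda v: str(int(v))),
    "diagnosis": (
        "https://example.org/vocab/diagnosis",
        XSD + "string",
        lambda v: v.strip().lower(),
    ),
}


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def row_to_ntriples(subject_iri, row):
    """Serialise one record as N-Triples lines with typed literals."""
    lines = []
    for column, (predicate, datatype, transform) in COLUMN_MAPPING.items():
        value = escape_literal(transform(row[column]))
        lines.append(f'<{subject_iri}> <{predicate}> "{value}"^^<{datatype}> .')
    return lines


# Made-up record with untidy values, as they might arrive from a spreadsheet.
record = {"age": " 42 ", "diagnosis": "  Venous Malformation "}
ntriples = row_to_ntriples("https://example.org/patient/p1", record)
print("\n".join(ntriples))
```

Keeping the mapping in one table makes the choices reviewable by a domain expert before any RDF is generated, which is the same role the visual mapping plays in the GraphDB Workbench.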

Protégé: for editing the ontology

FAIRopoly

This step aims to implement the semantic model for the data through an automated tool, and the metadata model for the metadata. Data and metadata that are structured with ontologies and follow standard schemas make it easier for other resources, such as the EJP RD Virtual Platform, to find your resource’s metadata and understand its data.

Tip: EJP RD developed a metadata model; implementing it in your registry source code may require a developer.

To check:

According to FAIRopoly this should be step 8 in de novo (set up registry structure in FDP) and step 12 (??) in generic. What is the content of these steps?

De novo supplementary

Step 8 - Set up registry structure in the FAIR Data Point

The available semantic metadata model of the FAIR Data Point specification was used to describe the VASCA registry [4]. This model is based on the DCAT standard. The VASCA registry FAIR Data Point metadata is described in three layers: 1) catalog - a collection of datasets, 2) dataset - a representation of an individual dataset in the collection, and 3) distribution - a representation of an accessible form of a dataset, e.g. a downloadable file or a web service that gives access to the data for authorised users (Figure S2). A catalog may have multiple datasets, and a dataset may have multiple distributions. The VASCA registry described in this project (Registry of Vascular Anomalies - Radboud university medical center) is one of the datasets in the catalog (Registry of Vascular Anomalies). Other VASCA registries, from this or one of the other centers can also be described in this catalog. The semantic metadata model of the FAIR Data Point metadata specification was implemented in the Castor EDC’s FAIR Data Point. The metadata that describe the catalog, dataset, and distributions of the VASCA registry described in this project, are publicly available and licensed under the CC0 license.
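
The three-layer structure described above (catalog, dataset, distribution) can be sketched in a few lines of Python that emit DCAT metadata as Turtle. This is an illustration only and not the output of the Castor EDC FAIR Data Point: the `example.org` IRIs, the distribution title and the access URL are made up, and attaching the CC0 license to every layer is a simplifying assumption. The catalog and dataset titles are taken from the text, and the `dcat:` and `dct:` terms are standard DCAT and Dublin Core terms.

```python
from dataclasses import dataclass, field
from typing import List

CC0 = "https://creativecommons.org/publicdomain/zero/1.0/"

PREFIXES = (
    "@prefix dcat: <http://www.w3.org/ns/dcat#> .\n"
    "@prefix dct: <http://purl.org/dc/terms/> .\n"
)


@dataclass
class Distribution:
    iri: str
    title: str
    access_url: str


@dataclass
class Dataset:
    iri: str
    title: str
    distributions: List[Distribution] = field(default_factory=list)


@dataclass
class Catalog:
    iri: str
    title: str
    datasets: List[Dataset] = field(default_factory=list)


def to_turtle(catalog):
    """Serialise a catalog, its datasets and their distributions as Turtle.

    Titles are assumed to contain no double quotes (no escaping is done).
    """
    lines = [PREFIXES]

    def describe(iri, rdf_type, title):
        lines.append(f"<{iri}> a {rdf_type} .")
        lines.append(f'<{iri}> dct:title "{title}" .')
        lines.append(f"<{iri}> dct:license <{CC0}> .")

    describe(catalog.iri, "dcat:Catalog", catalog.title)
    for dataset in catalog.datasets:
        lines.append(f"<{catalog.iri}> dcat:dataset <{dataset.iri}> .")
        describe(dataset.iri, "dcat:Dataset", dataset.title)
        for dist in dataset.distributions:
            lines.append(f"<{dataset.iri}> dcat:distribution <{dist.iri}> .")
            describe(dist.iri, "dcat:Distribution", dist.title)
            lines.append(f"<{dist.iri}> dcat:accessURL <{dist.access_url}> .")
    return "\n".join(lines)


# Hypothetical IRIs and distribution; titles of catalog and dataset come from the text.
catalog = Catalog(
    iri="https://example.org/catalog/vasca",
    title="Registry of Vascular Anomalies",
    datasets=[
        Dataset(
            iri="https://example.org/dataset/vasca-radboudumc",
            title="Registry of Vascular Anomalies - Radboud university medical center",
            distributions=[
                Distribution(
                    iri="https://example.org/distribution/vasca-radboudumc-rdf",
                    title="RDF distribution (hypothetical)",
                    access_url="https://example.org/access/vasca-radboudumc",
                )
            ],
        )
    ],
)

turtle = to_turtle(catalog)
print(turtle)
```

Because a catalog holds a list of datasets and a dataset holds a list of distributions, registries from other centers could be added to the same catalog by appending further `Dataset` objects.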

Why is this step important

This section should explain why this step is crucial

How to

The How to section should:

be split into easy to follow steps;
- Step 1
- Step 2
- etc.
help the reader to complete the step;
aspire to be readable for everyone, but, depending on the topic, may require specialised knowledge;
be a general, widely applicable approach;
if possible / applicable, add (links to) the solution necessary for onboarding in the Health-RI National Catalogue;
aim to be practical and simple, while keeping in mind: if I came to this page looking for a solution to this problem, would this How-to actually help me solve it;
contain references to solutions such as those provided by FAIR Cookbook, RDMkit, The Turing Way and FAIRsharing;
contain custom recipes/best-practices written by/together with experts from the field if necessary.

Expertise requirements for this step

Describes the expertise that may be necessary for this step. Should be based on the expertise described in the Metroline: Build the team step.

Practical examples from the community

Examples of how this step is applied in a project (link to demonstrator projects).

Training

Add links to training resources relevant for this step. Since the training aspect is still under development, currently many steps have “Relevant training will be added in the future if available.”

Suggestions

Visit our How to contribute page for information on how to get in touch if you have any suggestions about this page.