...
In this section, we describe the process of metadata mapping and the steps you should take. This page is intended for data stewards, data experts, or equivalent roles. For a general overview, please refer to our general metadata mapping overview: 2. Metadata mapping
...
Next, you need to extract and curate the metadata from the dedicated databases at source. The output of this step is metadata that is sourced, cleaned, wrangled, and ready to go through the transformation pipeline.
Also, in this step you decide how each piece of data relates to RDF concepts like classes, properties, and entities.
Each file (e.g., CSV or JSON) describes a dataset, a resource, an image, or a sample.
Each row in the CSV can be mapped to the target properties and target classes in the Core Metadata Schema: https://github.com/Health-RI/health-ri-metadata/
2. Understand the ontology (
...
DCAT)
An ontology defines the vocabulary (classes, properties, etc.) used to describe your data in RDF.
In our case, we use DCAT v3 for transformation purposes (https://www.w3.org/TR/vocab-dcat-3/) and DCAT-AP for evaluation purposes. DCAT-AP is a constraint model, which helps you understand which fields are mandatory and which other constraints apply.
...
For instance, a column named "title" might map to a property such as dcterms:title on a resource of the dcat:Resource class.
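A minimal sketch of how such a column mapping could be recorded in Python, using only the standard library. The column names and the names `COLUMN_MAP`, `TARGET_CLASS`, and `expand` are hypothetical and chosen for illustration; the namespace IRIs are the published DCAT and Dublin Core Terms namespaces. Note that DCAT reuses the title property from Dublin Core Terms, so the sketch maps the "title" column to dcterms:title.

```python
# Namespace IRIs for the prefixes used in the mapping.
PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dcterms": "http://purl.org/dc/terms/",
}

# Hypothetical source columns mapped to target properties (prefixed names).
COLUMN_MAP = {
    "title": "dcterms:title",
    "description": "dcterms:description",
    "keyword": "dcat:keyword",
}

# Assumed target class for each row; dcat:Dataset is a subclass of dcat:Resource.
TARGET_CLASS = "dcat:Dataset"


def expand(curie: str) -> str:
    """Expand a prefixed name such as 'dcterms:title' to a full IRI."""
    prefix, _, local = curie.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown prefix: {prefix}")
    return PREFIXES[prefix] + local


if __name__ == "__main__":
    # Print the mapping with fully expanded property IRIs.
    for column, prop in COLUMN_MAP.items():
        print(f"{column} -> {expand(prop)}")
```

Keeping the mapping in one table like this makes it easy to review against the target schema before any transformation is run.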
5. Convert Values
Transform the values in each cell into RDF literals or resources, depending on their nature. For literal values (e.g., names, descriptions), you can use the cell's content directly. For values representing relationships or references to other entities, you will need to create URIs or use existing ones (e.g., by linking to controlled vocabularies).
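The distinction between literals and resources can be sketched as follows. This is an illustrative helper and not part of any Health-RI tooling: the function names are hypothetical, and the rule "anything that looks like an http(s) URL is a resource" is an assumption that a real mapping would replace with per-column decisions. Terms are written in N-Triples syntax.

```python
from typing import Optional
from urllib.parse import urlparse


def is_iri(value: str) -> bool:
    """Treat values with an http(s) scheme and a host as IRIs (an assumption)."""
    parts = urlparse(value.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def escape_literal(value: str) -> str:
    """Escape characters that may not appear raw inside an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_rdf_term(value: str, lang: Optional[str] = None) -> str:
    """Return the N-Triples form of a cell value: <iri> or "literal"[@lang]."""
    value = value.strip()
    if is_iri(value):
        return f"<{value}>"
    term = f'"{escape_literal(value)}"'
    return f"{term}@{lang}" if lang else term


if __name__ == "__main__":
    print(to_rdf_term("Cohort study", lang="en"))
    print(to_rdf_term("http://publications.europa.eu/resource/authority/language/ENG"))
```

In practice, the choice between a literal and a URI is usually fixed per column in the mapping rather than guessed from the cell content.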
6. Use a Mapping Language or Tool
Several languages and tools can automate the mapping process from CSV to RDF, such as:
RML (RDF Mapping Language): An extension of R2RML for mapping various file formats, including CSV, to RDF.
Tarql (SPARQL for Tables): A command-line tool for converting CSV to RDF using SPARQL 1.1 query syntax.
...
...
OpenRefine: A powerful tool for working with messy data, including features for converting data to RDF.
7. Create RDF Triples
Using the mappings you have defined, generate RDF triples for each row in your file. Each triple consists of a subject (the resource URI), a predicate (the property URI), and an object (the value or another resource URI).
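As a rough illustration of this step, the sketch below turns each CSV row into subject-predicate-object triples and writes them as N-Triples lines, using only the Python standard library. The base IRI, the sample data, the `id` column, and the column-to-property mapping are all assumptions made for the example; a real pipeline would more likely use one of the mapping tools listed above.

```python
import csv
import io

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCAT_DATASET = "http://www.w3.org/ns/dcat#Dataset"
DCT = "http://purl.org/dc/terms/"

# Hypothetical base IRI and column-to-property mapping.
BASE = "https://example.org/dataset/"
COLUMN_TO_PROPERTY = {
    "title": DCT + "title",
    "description": DCT + "description",
}

# Hypothetical input: one dataset per row.
SAMPLE_CSV = """id,title,description
ds1,Cohort study,Baseline measurements
ds2,Imaging set,
"""


def literal(value):
    """Format a plain string as an N-Triples literal."""
    escaped = (
        value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return f'"{escaped}"'


def rows_to_ntriples(csv_text):
    """Return one N-Triples line per generated triple."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{row['id']}>"  # subject: the resource URI
        lines.append(f"{subject} <{RDF_TYPE}> <{DCAT_DATASET}> .")
        for column, prop in COLUMN_TO_PROPERTY.items():
            value = (row.get(column) or "").strip()
            if value:  # empty cells produce no triple
                lines.append(f"{subject} <{prop}> {literal(value)} .")
    return lines


if __name__ == "__main__":
    print("\n".join(rows_to_ntriples(SAMPLE_CSV)))
```

Each row yields one type triple plus one triple per non-empty mapped column, so the second sample row produces no description triple.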
...
After converting your data, validate the RDF output to ensure it accurately represents your original data and adheres to the ontology's structure. You may need to refine your mappings or data to correct any issues.
The Health-RI RDF validator, which uses SHACL shapes, is available in this GitHub repository: https://github.com/Health-RI/metadata-shacl-validation (Note: this repository and all SHACL shapes are still under active development.)
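Before running full SHACL validation, a quick pre-check for missing mandatory properties can catch obvious mapping gaps. The sketch below is a simplified stand-in and is not SHACL: it only checks that each subject carries a set of required predicates. The `REQUIRED` set (title and description) is an assumption for illustration; the authoritative constraints are the SHACL shapes themselves.

```python
from collections import defaultdict

DCT = "http://purl.org/dc/terms/"

# Assumed required properties, for illustration only; the real constraints
# are defined by the SHACL shapes.
REQUIRED = {DCT + "title", DCT + "description"}


def missing_properties(triples):
    """Map each subject to the sorted list of required properties it lacks."""
    seen = defaultdict(set)
    for subject, predicate, _obj in triples:
        seen[subject].add(predicate)
    return {
        subject: sorted(REQUIRED - predicates)
        for subject, predicates in seen.items()
        if REQUIRED - predicates
    }


if __name__ == "__main__":
    # Hypothetical triples as (subject, predicate, object) tuples.
    example = [
        ("https://example.org/dataset/ds1", DCT + "title", "Cohort study"),
        ("https://example.org/dataset/ds1", DCT + "description", "Baseline"),
        ("https://example.org/dataset/ds2", DCT + "title", "Imaging set"),
    ]
    for subject, missing in missing_properties(example).items():
        print(subject, "is missing", missing)
```

A check like this does not cover cardinality, datatypes, or value ranges, which is why the output should still be validated against the SHACL shapes.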
9. Share and Publish your validated metadata graph as FDP
Once your RDF data is ready, consider how you will share or publish it to make it accessible to your community, for instance, through a FAIR Data Point (FDP). This might involve hosting it on a SPARQL endpoint, within a triple store, or through other data publishing platforms.
...