Page Properties

Name

Metadata mapping

Author

Hannah Neikes, Lucie Kulhankova

Version

v1.0.0

Description

Description of metadata mapping, the HRI metadata schema and the relevant terms (UML, classes, properties). Provides background information for the Mapping tutorial.

Status

Approved

Version

Date

Changes

Author

📌 Introduction

Before you can add your resource’s metadata to the National Health Data Catalogue, you need to know what metadata is, where your metadata is located and which metadata the Catalogue requires. Regardless of how you add your metadata to a FAIR Data Point (manually or automatically), you will need to map your metadata values to the Catalogue’s metadata schema.

...

If you already understand the schema and want to start mapping your metadata right away, you can follow this tutorial: https://health-ri.atlassian.net/wiki/spaces/FSD/pages/290291734/Mapping+tutorial?atlOrigin=eyJpIjoiNjZjNmYzNDczMThmNGQyMDgzZTQ3ODg0ODAxZTAyNWUiLCJwIjoiYyJ9

In the future, we hope to support you with scripts that can automatically transform metadata entered in a CSV template into RDF, ready to be added to the FDP.
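
To give an idea of what such a transformation could look like, here is a minimal Python sketch. It is not one of the Health-RI scripts. It assumes a hypothetical CSV template with the columns `uri`, `title` and `description`, and writes every row as a dcat:Dataset in Turtle using only the standard library. The column names, the example.org URI and the function names are illustrative assumptions, and a real record would need all mandatory properties of the schema.

```python
import csv
import io

# Namespace prefixes used in the output (standard DCAT and Dublin Core Terms URIs).
PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def turtle_literal(text: str) -> str:
    """Return text as a quoted Turtle string literal with basic escaping."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def csv_to_turtle(csv_text: str) -> str:
    """Map each CSV row (hypothetical columns: uri, title, description) to a dcat:Dataset."""
    lines = [f"@prefix {prefix}: <{ns}> ." for prefix, ns in PREFIXES.items()]
    for row in csv.DictReader(io.StringIO(csv_text)):
        lines.append("")
        lines.append(f"<{row['uri']}> a dcat:Dataset ;")
        lines.append(f"    dct:title {turtle_literal(row['title'])} ;")
        lines.append(f"    dct:description {turtle_literal(row['description'])} .")
    return "\n".join(lines) + "\n"


# Hypothetical template content, for illustration only.
EXAMPLE_CSV = (
    "uri,title,description\n"
    'https://example.org/dataset/1,Example cohort,"Questionnaire data, wave 1"\n'
)

if __name__ == "__main__":
    print(csv_to_turtle(EXAMPLE_CSV))
```

The sketch builds the Turtle text by hand to stay dependency-free. For anything beyond a small template, an RDF library that handles serialisation and datatypes would likely be the safer choice.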

...

🧠 What is metadata

Metadata is essentially data about data. It describes various aspects of your data, such as its content, its owner or its format. In other words, metadata helps you understand and manage data effectively by providing additional information about it.

In the National Health Data Catalogue, users rely on the provided metadata to find relevant datasets and judge their usability. As a data holder onboarding data, it is therefore essential to provide detailed and complete metadata about your dataset(s). This also satisfies principle F2 of the FAIR principles (data are described with rich metadata). If the metadata contains the right information, e.g. the type of cancer a dataset covers, a data user will be able to find relevant and interesting datasets in the catalogue.

...

For specific details on the schema, please visit the GitHub specifications, which are intended for data experts and data stewards. We are currently transitioning to a new version of the metadata schema, v2, which is available on GitHub. Specifications from the official v1 release are available there as well. Links to both versions are listed under Additional resources.

...

For an overview of the classes in metadata core v2 and their relations, see the figure below. Additionally, we provide some considerations and guidelines on mapping to the different classes on this Confluence page: https://health-ri.atlassian.net/wiki/spaces/FSD/pages/1020624897/Recommendations+on+mapping+to+classes+in+the+v2+core+metadata?atlOrigin=eyJpIjoiZjcxMGNkZmFjNzk1NDQ0MDliMDE2ODE4NTRiZjhmZjkiLCJwIjoiYyJ9

Overview of all core Health-RI classes and relations between classes

...

Info

Note that you will most likely NOT need all available classes in the Health-RI core metadata schema (v2). Some classes do not apply to every case. For example, an institute that only wants to describe its available datasets might use just the dcat:Catalog and dcat:Dataset classes.
More information and considerations/guidelines for different use cases are described here: https://health-ri.atlassian.net/wiki/spaces/FSD/pages/1020624897/Recommendations+on+mapping+to+classes+in+the+v2+core+metadata?atlOrigin=eyJpIjoiZjcxMGNkZmFjNzk1NDQ0MDliMDE2ODE4NTRiZjhmZjkiLCJwIjoiYyJ9

Properties

Each class has its own set of related metadata fields, so-called properties, that describe the entity (class) in more detail. For example, each dcat:Dataset contains the properties dct:title and dct:description, which are free-text fields that provide a title and a detailed description of the contents of the dataset. As another example, the class vcard:Kind (which is used to provide contact details of a resource) contains the property vcard:hasEmail to provide an email address in the metadata.
Each property has a number of attributes (i.e. requirement level, cardinality, range, property URI):

...

A UML diagram is a visual representation of a metadata schema. The UML diagram of the Health-RI v2 metadata schema is depicted below.
The diagram is divided into classes: each box represents one class of the schema. Within each box, the relevant properties are listed with their property URI, range, requirement level and cardinality.
For example, in the diagram below you see the box for dcat:Dataset (class), containing the mandatory property dct:title with range rdfs:Literal and cardinality [1..n]. The dcat:Dataset class also contains the property dcat:distribution with range dcat:Distribution (cardinality [0..n]). As the capital letter in the range indicates, this property points to another class (dcat:Distribution) that is also present in the diagram. The connection between these classes is also shown by the open arrow from the dcat:Dataset class to the dcat:Distribution class.
While open arrows indicate connections between classes, closed arrows indicate that a class inherits all properties from another class. For example, dcat:Dataset inherits from dcat:Resource, so all properties of dcat:Resource can also be used in dcat:Dataset. Note that only the ('empty') properties are inherited, not their values.
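
The relation between classes, properties, cardinality and inheritance described above can be sketched in a few lines of Python. This is an illustration only, not part of the Health-RI tooling: the names `PropertySpec`, `ClassSpec` and `check_cardinality` are hypothetical, and placing dct:description on dcat:Resource with cardinality [0..n] is an assumption made purely to demonstrate inheritance. The dct:title and dcat:distribution entries follow the example in the text.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class PropertySpec:
    uri: str                  # property URI, e.g. "dct:title"
    value_range: str          # range: a literal type or another class
    min_count: int            # lower bound of the cardinality
    max_count: Optional[int]  # upper bound; None stands for "n" (unbounded)


@dataclass
class ClassSpec:
    uri: str
    properties: list[PropertySpec] = field(default_factory=list)
    parent: Optional["ClassSpec"] = None  # closed arrow in the UML: inheritance

    def all_properties(self) -> dict[str, PropertySpec]:
        """Own properties plus the (empty) properties inherited from parent classes."""
        inherited = self.parent.all_properties() if self.parent else {}
        return {**inherited, **{p.uri: p for p in self.properties}}


def check_cardinality(cls: ClassSpec, record: dict[str, list[str]]) -> list[str]:
    """Return a message for every property whose number of values is out of bounds."""
    problems = []
    for uri, spec in cls.all_properties().items():
        count = len(record.get(uri, []))
        if count < spec.min_count:
            problems.append(f"{uri}: expected at least {spec.min_count}, found {count}")
        if spec.max_count is not None and count > spec.max_count:
            problems.append(f"{uri}: expected at most {spec.max_count}, found {count}")
    return problems


# Assumption for illustration: dct:description lives on dcat:Resource with [0..n].
RESOURCE = ClassSpec(
    "dcat:Resource",
    [PropertySpec("dct:description", "rdfs:Literal", 0, None)],
)
DATASET = ClassSpec(
    "dcat:Dataset",
    [
        PropertySpec("dct:title", "rdfs:Literal", 1, None),               # [1..n]
        PropertySpec("dcat:distribution", "dcat:Distribution", 0, None),  # [0..n]
    ],
    parent=RESOURCE,
)

if __name__ == "__main__":
    # A record without a title violates the [1..n] cardinality of dct:title.
    print(check_cardinality(DATASET, {"dct:description": ["No title given"]}))
```

Because `all_properties` merges the parent's properties with the class's own, dcat:Dataset can use dct:description without redefining it, while the values always come from the individual record.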


Nested classes:

A class can refer to (another instance of) the same class, e.g. dcat:Catalog pointing to itself via the property dct:hasPart. These kinds of nested structures can be used to describe the structure of an institution or infrastructure in more detail, for example if an institute (described by dcat:Catalog) is divided into several independent departments (each described with its own instance of dcat:Catalog) that produce and publish their own sets of dcat:Dataset.
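
As a rough illustration of such a nested structure, the Python sketch below models an institute catalogue with two department catalogues and flattens it into triples. The `Catalog` class, the helper functions and all example.org URIs are hypothetical. The link from a catalogue to its datasets uses dcat:dataset, the standard DCAT property for that relation, which this page does not otherwise discuss.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    uri: str
    datasets: list[str] = field(default_factory=list)     # URIs of dcat:Dataset instances
    parts: list["Catalog"] = field(default_factory=list)  # nested catalogues (dct:hasPart)


def to_triples(catalog: Catalog) -> list[tuple[str, str, str]]:
    """Walk the nested catalogues depth-first and emit (subject, predicate, object) triples."""
    triples = [(catalog.uri, "rdf:type", "dcat:Catalog")]
    for dataset_uri in catalog.datasets:
        triples.append((catalog.uri, "dcat:dataset", dataset_uri))
    for part in catalog.parts:
        triples.append((catalog.uri, "dct:hasPart", part.uri))
        triples.extend(to_triples(part))
    return triples


def nesting_depth(catalog: Catalog) -> int:
    """Number of catalogue layers, counting the top-level catalogue as 1."""
    return 1 + max((nesting_depth(part) for part in catalog.parts), default=0)


# Hypothetical institute with two departments, for illustration only.
INSTITUTE = Catalog(
    "https://example.org/catalog/institute",
    parts=[
        Catalog(
            "https://example.org/catalog/oncology",
            datasets=["https://example.org/dataset/1"],
        ),
        Catalog(
            "https://example.org/catalog/cardiology",
            datasets=["https://example.org/dataset/2"],
        ),
    ],
)

if __name__ == "__main__":
    for triple in to_triples(INSTITUTE):
        print(triple)
    print("layers:", nesting_depth(INSTITUTE))
```

A helper like `nesting_depth` can be handy for checking how many layers a planned structure has before publishing it.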

Info

Please note that the current implementation of the Health-RI core schema limits the theoretically unlimited flexibility that DCAT offers, especially since the National Health Data Catalogue cannot currently display these layers of nested structures. Read more about it here: https://health-ri.atlassian.net/wiki/spaces/FSD/pages/290291734/Mapping+tutorial#%F0%9F%9A%A7-Current-limitations-in-model-flexibility

UML diagram of the v2 core metadata schema of Health-RI

...

✅ Next steps

To map your metadata, you can follow the general tutorial: https://health-ri.atlassian.net/wiki/spaces/FSD/pages/290291734/Mapping+tutorial?atlOrigin=eyJpIjoiMWNhNTg5NzY5NzIyNGJkNzljMmY3Y2ZmYWI5YjUxNTciLCJwIjoiYyJ9
The metadata can then be transformed into RDF format and exposed using a FAIR Data Point. More information about this step can be found here: 4B Exposing metadata

Additional resources

Technical details on DCAT-AP and FAIR Data Points - YouTube video, Health-RI

HRI GitHub - You can find resources and examples on the Health-RI metadata GitHub.

Health-RI metadata schema v1: https://github.com/Health-RI/health-ri-metadata/tree/v1.0.1

Health-RI metadata schema v2: https://github.com/Health-RI/health-ri-metadata/tree/v2.0.0-beta.2

Resources from EU Open Data Explained, including a general training on metadata and basic and advanced resources on DCAT and DCAT-AP.

HealthDCAT-AP literacy portal

FAIR Metrolines (note: some pages are under development):

Metroline Step: Assess availability of your metadata

Metroline Step: Register resource level metadata

Metroline Step: Analyse data semantics

Metroline Step: Apply (meta)data model

Metroline Step: Create or reuse a semantic (meta)data model

...