Relation of the Health-RI core metadata schema to other DCAT application profiles
Introduction
Some background on the metadata standards that underlie the Health-RI core metadata schema:
Dublin Core (DC): Dublin Core is a widely used metadata standard designed to provide a simple and standardized way to describe digital resources such as documents, web pages, images, videos and other types of content on the internet.
DCAT: Data Catalog Vocabulary is a metadata standard specifically designed for describing datasets and data catalogs on the web. DCAT is based on RDF (Resource Description Framework), which is a standard model for representing and exchanging metadata and data on the web in a machine-readable format (i.e., data structured in a way processable by a computer).
DCAT-AP: DCAT Application Profile for Data Portals in Europe is a metadata standard developed by the European Commission to facilitate the interoperability of data catalogs and portals across European countries. It builds upon the DCAT (Data Catalog Vocabulary) standard and extends it with additional requirements and recommendations tailored to the European context.
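To make the RDF basis of DCAT more concrete, the sketch below builds a minimal dataset description as JSON-LD (one of the RDF serialisations) using only the Python standard library. The namespace IRIs for dcat and dct are the published ones; the function name describe_dataset, the example.org identifier and the title and description values are illustrative assumptions, not entries from the National Health Data Catalogue. A real DCAT-AP record needs more properties than are shown here.

```python
import json

# Published namespace IRIs for DCAT and Dublin Core Terms.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def describe_dataset(identifier: str, title: str, description: str) -> dict:
    """Build a minimal JSON-LD description of a dcat:Dataset.

    Hypothetical helper for illustration only; it covers just a title
    and a description, far fewer properties than a full profile asks for.
    """
    return {
        "@context": CONTEXT,
        "@id": identifier,
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
    }


# Made-up example values, not a real catalogue entry.
dataset = describe_dataset(
    "https://example.org/dataset/1",
    "Example cohort study",
    "Illustrative dataset description.",
)
print(json.dumps(dataset, indent=2))
```

Because the description is plain structured data with globally defined property names, any portal that understands DCAT can interpret it, which is what the application profiles build on.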
As of version 2 of the core metadata schema for the National Health Data Catalogue, two DCAT application profiles, DCAT-AP NL 3.0 and HealthDCAT-AP, have been implemented. Both are extensions of DCAT-AP v3, one country-specific and the other domain-specific.
DCAT-AP NL 3.0
DCAT-AP NL is developed in the Netherlands by Geonovum, in collaboration with the RIVM, CBS, Kadaster, Digitaal erfgoed, Rijkswaterstaat and Health-RI. The application profile is a country-specific specification of DCAT-AP v3 and is meant for sharing of metadata between Dutch data portals. By implementing DCAT-AP NL in the Health-RI metadata schema, metadata in the National Health Data Catalogue will eventually be findable in other Dutch data portals.
For more information and the latest version of the application profile, see https://docs.geostandaarden.nl/dcat/dcat-ap-nl30/.
HealthDCAT-AP
HealthDCAT-AP (draft, currently not officially finalized) is a health-specific extension of DCAT-AP v3 developed in the EU (as part of the TEHDAS2 project) in preparation for the EHDS (European Health Data Space). Among other things, it adds a number of health-specific properties to the schema that enable a more detailed metadata description of available health datasets. Implementing HealthDCAT-AP in the Health-RI metadata schema will allow sharing of metadata with other EU health data portals, as envisioned by the EHDS.
For the draft version of the application profile, see https://healthdcat-ap.github.io/. Additional information is available at the HealthDCAT-AP literacy portal.
Relation of DCAT-AP NL and HealthDCAT-AP and the Health-RI metadata schema
The schema below sketches the relationships of the different DCAT application profiles to each other, developed at different levels (EU or nationally).
Implementation of DCAT-AP NL and HealthDCAT-AP in Health-RI core metadata schema v2
v2 of the Health-RI core metadata schema adopts elements from (draft) HealthDCAT-AP and restrictions from DCAT-AP NL 3.0. This Excel sheet on GitHub documents, per property, the source of the constraint (cardinality), i.e., whether the constraint originates from DCAT-AP v3, DCAT-AP NL, HealthDCAT-AP or a combination of DCAT-AP NL and HealthDCAT-AP.
In the development phase of Health-RI core metadata v2, some deviations from the two application profiles have been included. These and other modelling decisions were documented in a decision log. Specifically:
Since HealthDCAT-AP has not been finalised yet, v2 of the Health-RI core metadata schema incorporates a snapshot of the schema from December 2024 (available at this commit on GitHub). See decision log entry here.
HealthDCAT-AP defines variants of differing strictness depending on the level of access rights. v2 of the Health-RI core metadata schema only incorporates the OPEN variant. See decision log entry here. The full schema will be incorporated once HealthDCAT-AP is finished and officially released.
For some properties, the Health-RI core metadata schema v2 deviates from HealthDCAT-AP or DCAT-AP NL. See the respective decision log entries here and here.
Specifically, some newly introduced properties from HealthDCAT-AP have not been added, for the following reasons:
Class Dataset:
healthdcatap:hdab: There is no HDAB appointed yet in NL.
healthdcatap:healthCategory: This field refers to the categories as defined in art. 51 EHDS. It requires the use of a controlled vocabulary that has not yet been created. See the table of HealthDCAT-AP vocabularies in the section linked here.
Class Agent:
healthdcatap:publisherType: This field refers to the types of publishers as discussed in the TEHDAS2 program. It requires the use of a controlled vocabulary that has not yet been created. See the table of HealthDCAT-AP vocabularies in the section linked here.
Furthermore, some deviations from HealthDCAT-AP and DCAT-AP NL have been decided on:
dct:creator (both Catalog and Data Service classes): DCAT-AP v3, DCAT-AP NL and HealthDCAT-AP restrict this to at most 1, but the HRI v2 model allows many.
dct:language (Distribution class): DCAT-AP NL restricts this to at most 1 language per Distribution. The HRI v2 model allows many, since multiple languages can be present in the same Distribution.
dcat:distribution (Dataset class): HealthDCAT-AP requires at least one distribution (also in the OPEN variant of the profile), but the HRI v2 model makes this property recommended. Not all datasets have a Distribution ready to be described, so it was decided not to make this property (and therefore the class Distribution) mandatory.
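The cardinality deviations listed above can be illustrated with a small sketch that compares the number of values a record holds for a property against the bounds in each model. The names Cardinality, UPSTREAM, HRI_V2 and conforms are hypothetical and do not come from the Health-RI repository. Only the bounds stated in the text are taken from the source; any bound the text does not mention is filled with an assumed default (minimum 0, maximum unbounded).

```python
from typing import NamedTuple, Optional


class Cardinality(NamedTuple):
    min_count: int            # 0 means optional or recommended
    max_count: Optional[int]  # None means unbounded ("many")


# Bounds as described for the upstream profiles (DCAT-AP v3, DCAT-AP NL,
# HealthDCAT-AP). Bounds not mentioned in the text are assumed defaults.
UPSTREAM = {
    ("Catalog", "dct:creator"): Cardinality(0, 1),
    ("DataService", "dct:creator"): Cardinality(0, 1),
    ("Distribution", "dct:language"): Cardinality(0, 1),
    ("Dataset", "dcat:distribution"): Cardinality(1, None),
}

# Bounds as described for the Health-RI core metadata schema v2.
HRI_V2 = {
    ("Catalog", "dct:creator"): Cardinality(0, None),
    ("DataService", "dct:creator"): Cardinality(0, None),
    ("Distribution", "dct:language"): Cardinality(0, None),
    ("Dataset", "dcat:distribution"): Cardinality(0, None),
}


def conforms(value_count: int, bounds: Cardinality) -> bool:
    """Return True if value_count lies within the given bounds."""
    if value_count < bounds.min_count:
        return False
    return bounds.max_count is None or value_count <= bounds.max_count


# A dataset without any distribution, and a distribution in two languages.
key = ("Dataset", "dcat:distribution")
print(conforms(0, HRI_V2[key]), conforms(0, UPSTREAM[key]))
key = ("Distribution", "dct:language")
print(conforms(2, HRI_V2[key]), conforms(2, UPSTREAM[key]))
```

In this sketch, a record that is valid under the HRI v2 model can still fail the stricter upstream bound, which is the practical consequence of these deviations when metadata is exchanged with portals that validate against the original profiles.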
Additional resources
Health-RI metadata schema v1: https://github.com/Health-RI/health-ri-metadata/tree/v1.0.1
Health-RI metadata schema v2: https://github.com/Health-RI/health-ri-metadata/tree/v2.0.0
Latest version HealthDCAT-AP: https://healthdcat-ap.github.io/
Latest version DCAT-AP NL: https://docs.geostandaarden.nl/dcat/dcat-ap-nl30/
DCAT-AP v3: https://semiceu.github.io/DCAT-AP/releases/3.0.0/
Resources from the EU Open Data Explained, including a general training on metadata and basic- and advanced-level resources on DCAT and DCAT-AP.
Technical details on DCAT-AP and FAIR Data Points: YouTube video, Health-RI
Image2Catalog: https://github.com/Health-RI/img2catalog This tool queries an XNAT instance and generates DCAT-AP 3.0 metadata.
Questions?
If you have questions about the onboarding process or would like to learn more, reach out to our service desk via https://www.health-ri.nl/health-ri-servicedesk or at servicedesk@health-ri.nl.