...

...

Details about this document

Editors

  • Bruna dos Santos Vieira

  • Dena Tahvidalri (until 29th Feb 2024)

  • Ana Konrad

Repository

Latest published version

https://github.com/Health-RI/health-ri-metadata/tree/master/Formalisation(shacl)/Core/PiecesShape

Purpose of this document

This document outlines the Plateau 1 Core Metadata Schema, detailing the classes and entities involved and offering implementation guidance (usage notes) for developers at regional nodes. It addresses the schema's design and application, and does not cover the national catalog, its onboarding process, or future schema expansions. Additional information and versions are available on GitHub: https://github.com/Health-RI/health-ri-metadata/

Intended Audience

This document is intended for a technical audience tasked with implementing the metadata schema, and for stakeholders interested in a detailed understanding of the core schema.

Terminology

According to DCAT-AP:

  • An Application Profile defines the mandatory, recommended, and optional components for a specific use case by leveraging terminology from foundational standards. Additionally, it suggests standardized vocabularies to maintain consistency in the use of terms and data.

  • A Dataset is a self-contained set of data produced by a specific organization, which can be accessed or downloaded for various uses.

  • A Data Portal is an online platform that offers a catalog of datasets and tools to help users locate and utilize these datasets effectively.

Introduction

Scope

To make it easier to share, find and reuse data, the Health-RI nodes decided to list resources in a national directory that can be accessed internationally. They all agreed on what basic information should be included, and that the catalog should be interoperable with other EU portals, which led to the creation of the Core Metadata Schema.

This schema describes the minimum amount of information that should be used to describe resources across Health-RI nodes through the national directory, which is in line with what Plateau 1 offers. The schema can be changed or extended to meet the needs of different areas, and new versions will be released in the future. Apart from this metadata documentation, users can look for the onboarding documents with details on how to implement and connect, and the Plateau 1 documents which explain what each feature can do.

Diagram

An overview of the core metadata schema is presented in the UML diagram below (Fig. 1). The diagram shows the primary classes (entities) and omits detailed definitions such as rdfs:label and rdfs:comment. Each block denotes a class and lists its attributes (properties). A closed arrow from one class to another indicates that the first class inherits all properties of the second. For example, dcat:DatasetSeries inherits from dcat:Dataset, which inherits from dcat:Resource. The other arrows represent relations. Each is labelled with the type of relation (for example, dcat:Dataset connects to dcat:DatasetSeries via the predicate dcat:inSeries) and with its cardinality (for example, a dcat:Dataset can be connected via dcat:inSeries to zero or more dcat:DatasetSeries).
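The inheritance reading of the closed arrows can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the subclass chain is the one named in the paragraph above, while the property names prefixed with `ex:` and the helper functions `ancestors` and `all_properties` are hypothetical placeholders, not part of the schema.

```python
# Subclass relations named in the text (closed arrows in the diagram).
SUBCLASS_OF = {
    "dcat:DatasetSeries": "dcat:Dataset",
    "dcat:Dataset": "dcat:Resource",
}

# Hypothetical placeholder properties, one per class, for illustration only.
OWN_PROPERTIES = {
    "dcat:Resource": ["ex:resourceProp"],
    "dcat:Dataset": ["ex:datasetProp"],
    "dcat:DatasetSeries": ["ex:seriesProp"],
}


def ancestors(cls):
    """Return the superclasses of cls, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def all_properties(cls):
    """Return inherited properties first, then the class's own."""
    props = []
    for c in reversed([cls] + ancestors(cls)):
        props.extend(OWN_PROPERTIES.get(c, []))
    return props


print(ancestors("dcat:DatasetSeries"))
print(all_properties("dcat:DatasetSeries"))
```

Under this reading, a dcat:DatasetSeries carries every property declared on dcat:Dataset and dcat:Resource in addition to its own.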

...

Mandatory and Recommended

Following the DCAT-AP specification, we categorize components into 'mandatory' and 'recommended' classes and properties. A potential third category, 'Optional,' may be introduced in the future.

In the context of data exchange:

  • Mandatory Class: Senders MUST provide information about instances of the class; Receivers MUST process information about instances of the class.

  • Recommended Class: Senders SHOULD provide information about instances of the class if available; Receivers MUST process information about instances of the class.

  • Optional Class: Senders MAY provide the information but are not obliged to do so; Receivers MUST process information about instances of the class.

  • Mandatory property: Senders MUST provide the information for that property; Receivers MUST process the information for that property.

  • Recommended property: Senders SHOULD provide the information if available; Receivers MUST process the information for that property.

  • Optional property: Senders MAY provide the information but are not obliged to do so; Receivers MUST process the information for that property.
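On the sender side, the obligation levels above can be read as a simple completeness check. The following is a minimal sketch, not part of the schema: the entries in `OBLIGATIONS` and the function `check_record` are hypothetical and chosen only for illustration. The actual assignment of properties to levels is defined per class by the schema itself.

```python
# Hypothetical obligation levels for a few properties of one class.
# The real assignment is defined by the schema, not by this sketch.
OBLIGATIONS = {
    "dct:title": "mandatory",
    "dct:description": "mandatory",
    "dcat:keyword": "recommended",
    "dct:language": "optional",
}


def check_record(record):
    """Return (errors, warnings) for a sender-side record.

    A missing mandatory property is an error (MUST), a missing
    recommended property is a warning (SHOULD), and a missing
    optional property is not reported (MAY).
    """
    errors, warnings = [], []
    for prop, level in OBLIGATIONS.items():
        if record.get(prop):  # present and non-empty
            continue
        if level == "mandatory":
            errors.append(prop)
        elif level == "recommended":
            warnings.append(prop)
    return errors, warnings


errors, warnings = check_record({"dct:title": "Example dataset"})
print("errors:", errors)
print("warnings:", warnings)
```

Receivers are not covered by this check: at every level they MUST process whatever information the sender provides.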

Used Prefixes

| Prefix | Namespace IRI | Source |
| --- | --- | --- |
| dcat | http://www.w3.org/ns/dcat# | [VOCAB-DCAT] |
| dct | http://purl.org/dc/terms/ | [DCT] |
| foaf | http://xmlns.com/foaf/0.1/ | [FOAF] |
| owl | http://www.w3.org/2002/07/owl# | [OWL2-SYNTAX] |
| rdf | http://www.w3.org/1999/02/22-rdf-syntax-ns# | [RDF-SYNTAX-GRAMMAR] |
| rdfs | http://www.w3.org/2000/01/rdf-schema# | [RDF-SCHEMA] |
| skos | http://www.w3.org/2004/02/skos/core# | [SKOS-REFERENCE] |
| time | http://www.w3.org/2006/time# | [OWL-TIME] |
| xsd | http://www.w3.org/2001/XMLSchema# | [XMLSCHEMA11-2] |
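Prefixed names used throughout this document, such as dcat:Dataset, are shorthand for full IRIs built from the namespaces listed above. As a minimal sketch of that expansion (the function name `expand` is hypothetical and used here only for illustration):

```python
# Prefix-to-namespace mapping, copied from the prefix table.
PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "time": "http://www.w3.org/2006/time#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand(name):
    """Expand a prefixed name such as 'dcat:Dataset' to a full IRI."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {name!r}")
    return PREFIXES[prefix] + local


print(expand("dcat:Dataset"))
print(expand("dct:title"))
```

Names with a prefix that is not in the table raise an error rather than being passed through, so typos in prefixes surface early.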

Main Classes

Mandatory Classes

| Class name | Definition | Usage Note | URI |
| --- | --- | --- | --- |
| Dataset | A resource type. A collection of data, published or curated by a single agent, and available for access or download in one or more representations. | Used to describe one or more datasets, including their details. A single dataset can be made available to potential users in different ways; how the data in a dataset can be accessed is defined in the Distribution. | dcat:Dataset |
| Catalog | A catalog that is listed in the national catalog. | Used to describe a bundle of datasets, data services, biobanks, patient registries, or guidelines together under a single title. | dcat:Catalog |
| Agent | An entity that is associated with Catalogs and/or Datasets. | If the Agent is an organisation, the use of the Organization Ontology is recommended. | foaf:Agent |
| Resource | A resource published or curated by a single agent. | This is an abstract class. It is not used directly; its specializations (e.g. Dataset) are used instead. It mainly serves high-level grouping and the reuse of properties. | dcat:Resource |
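As a small illustration of how the class table might be applied when typing a resource, the sketch below maps the class names to their URIs and refuses to instantiate the abstract Resource class directly, as the usage note requires. The function `type_triple`, the `ABSTRACT` set, and the subject identifier `ex:dataset1` are hypothetical and serve only as an example.

```python
# Class names and URIs from the Mandatory Classes table.
# DCAT spells the catalog class IRI with a capital C: dcat:Catalog.
CLASS_URIS = {
    "Dataset": "dcat:Dataset",
    "Catalog": "dcat:Catalog",
    "Agent": "foaf:Agent",
    "Resource": "dcat:Resource",
}

# Resource is abstract: only its specializations are instantiated.
ABSTRACT = {"Resource"}


def type_triple(subject, class_name):
    """Return an (subject, rdf:type, class URI) triple for a class."""
    if class_name not in CLASS_URIS:
        raise KeyError(f"unknown class: {class_name}")
    if class_name in ABSTRACT:
        raise ValueError(f"{class_name} is abstract; use a specialization")
    return (subject, "rdf:type", CLASS_URIS[class_name])


print(type_triple("ex:dataset1", "Dataset"))
```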

...

| Class name | Definition | Usage Note | URI |
| --- | --- | --- | --- |
| Resource | The class resource, everything. | This class is for grouping and class hierarchy relation purposes. | dcat:Resource |

...

Main Properties per Class

Catalog

A curated collection of metadata about resources. A web-based data catalog is typically represented as a single instance of this class.

...