
Status: in development


📌 Introduction

In this section, we describe the basics of metadata and explain what metadata mapping is. We also look at the Health-RI Core Metadata Schema and the metadata standards it builds upon. This page is intended for a general audience. For details on the standards and the schema, see the GitHub specifications aimed at data experts and data stewards: https://github.com/Health-RI/health-ri-metadata/

...

Metadata mapping is the process of linking corresponding metadata fields or values across different systems. In simple terms, it ensures that the metadata schema you use for your data is correctly transformed to the Health-RI (HRI) metadata schema. Each piece of metadata in one system is identified and linked to its equivalent element in the other system. This keeps separate datasets and databases consistent with each other, which makes data integration and interoperability possible and supports accurate data discovery, retrieval, and interpretation.
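As a rough illustration of the idea, the sketch below renames the fields of a local record to target properties using a lookup table. The local field names (`study_name`, `summary`, `released_on`, `internal_code`), the value `Q-01`, and the helper `map_record` are hypothetical and chosen only for this example; a real mapping depends on your own schema.

```python
# Hypothetical lookup table: local field name -> target property.
# The local names are invented for illustration; the targets are
# Dublin Core Terms properties used in DCAT-based schemas.
FIELD_MAP = {
    "study_name": "dct:title",
    "summary": "dct:description",
    "released_on": "dct:issued",
}


def map_record(record, field_map):
    """Split a local record into mapped properties and unmapped field names."""
    mapped = {}
    unmapped = []
    for field, value in record.items():
        target = field_map.get(field)
        if target is None:
            unmapped.append(field)  # no equivalent chosen yet
        else:
            mapped[target] = value
    return mapped, unmapped


# Hypothetical local record; "internal_code" has no entry in FIELD_MAP.
local_record = {
    "study_name": "PRISMA Questionnaire data",
    "summary": "The questionnaire data in CSV format.",
    "released_on": "2024-07-02T10:49:07",
    "internal_code": "Q-01",
}

mapped, unmapped = map_record(local_record, FIELD_MAP)
print(mapped)
print(unmapped)
```

Fields that end up in `unmapped` are the ones that still need a mapping decision, for example by a data steward.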

Below is an example of simple metadata for a blood sample. It describes the key information about the sample, including the sample ID, the patient ID, and a diagnosis:

...

Below is metadata from the PRISMA study. It describes the data that is available:

Class	Property	Property Label	Example
dcat:Catalog	dct:description	Description	The primary aim of the PRISMA study is to investigate the potential value of risk-tailored versus traditional breast cancer screening protocols in the Netherlands. Data collection took place between 2014-2019, resulting in ∼67,000 mammograms, ∼38,000 surveys, ∼10,000 blood samples and ∼600 saliva samples.
	dct:publisher	Publisher	foaf:Agent
	dct:title	Title	Personalised RISk-based MAmmascreening Study (PRISMA)
dcat:Dataset	dcat:contactPoint	Contact Point	vcard:Kind
	dct:creator	Creator	foaf:Agent
	dct:description	Description	The extensive questionnaire covers a number of potential breast cancer risk predictors such as demographics, personal characteristics, reproductive characteristics, medication, lifestyle, health status, family history, psychosocial characteristics.
	dct:issued	Release date	2024-07-02T10:49:07
	dct:identifier	Identifier	https://fdp.radboudumc.nl/dataset/37d6ad17-aa35-425c-946e-855838d3f9cc
	dct:modified	Modified	2024-09-09T08:54:32
	dct:publisher	Publisher	foaf:Agent
	dcat:theme	Theme	http://publications.europa.eu/resource/authority/data-theme/HEAL
	dct:title	Title	PRISMA Questionnaire data
	dct:license	License	https://data.ru.nl/doc/dua/RUMC-RA-DUA-1.0.html
dcat:Distribution	dcat:accessURL	Access URL	DOI (not yet available)
	dcat:mediaType	Format	https://www.iana.org/assignments/media-types/text/csv
	dct:title	Title	PRISMA Questionnaire data - CSV format
	dct:description	Description	The questionnaire data in CSV format.
foaf:Agent	foaf:name	name	Radboudumc (Publisher)
foaf:Agent	dct:identifier	identifier	https://ror.org/05wg1m734 (Publisher)
vcard:Kind	vcard:hasEmail	has email	firstname.lastname@radboudumc.nl
vcard:Kind	vcard:fn	formatted name	J. Doe
foaf:Agent	foaf:name	name	J. Doe (Creator)
foaf:Agent	dct:identifier	identifier	https://orcid.org/0000-0000-0000-0000 (Creator)

Here is the same data mapped to the Health-RI core metadata schema. It contains the same information, but it is now machine-readable and in a format that is widely used on the web.

Code Block

@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix vcard: <http://www.w3.org/2006/vcard/ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# Catalog description
<https://fdp.radboudumc.nl/catalogue/prisma>
    a dcat:Catalog ;
    dct:title "Personalised RISk-based MAmmascreening Study (PRISMA)" ;
    dct:description "The primary aim of the PRISMA study is to investigate the potential value of risk-tailored versus traditional breast cancer screening protocols in the Netherlands. Data collection took place between 2014-2019, resulting in ∼67,000 mammograms, ∼38,000 surveys, ∼10,000 blood samples and ∼600 saliva samples." ;
    dct:publisher [ a foaf:Agent ; foaf:name "Radboudumc (Publisher)" ; dct:identifier <https://ror.org/05wg1m734> ] ;
    dcat:dataset <https://fdp.radboudumc.nl/dataset/37d6ad17-aa35-425c-946e-855838d3f9cc> .

# Dataset description
<https://fdp.radboudumc.nl/dataset/37d6ad17-aa35-425c-946e-855838d3f9cc>
    a dcat:Dataset ;
    dct:title "PRISMA Questionnaire data" ;
    dct:description "The extensive questionnaire covers a number of potential breast cancer risk predictors such as demographics, personal characteristics, reproductive characteristics, medication, lifestyle, health status, family history, psychosocial characteristics." ;
    dct:issued "2024-07-02T10:49:07"^^xsd:dateTime ;
    dct:modified "2024-09-09T08:54:32"^^xsd:dateTime ;
    dct:identifier <https://fdp.radboudumc.nl/dataset/37d6ad17-aa35-425c-946e-855838d3f9cc> ;
    dct:creator [ a foaf:Agent ; foaf:name "J. Doe (Creator)" ; dct:identifier <https://orcid.org/0000-0000-0000-0000> ] ;
    dct:publisher [ a foaf:Agent ; foaf:name "Radboudumc (Publisher)" ; dct:identifier <https://ror.org/05wg1m734> ] ;
    dcat:theme <http://publications.europa.eu/resource/authority/data-theme/HEAL> ;
    dct:license <https://data.ru.nl/doc/dua/RUMC-RA-DUA-1.0.html> ;
    dcat:distribution <https://fdp.radboudumc.nl/distribution/csv> ;
    dcat:contactPoint [
        a vcard:Kind ;
        vcard:hasEmail <mailto:firstname.lastname@radboudumc.nl> ;
        vcard:fn "J. Doe"
    ] .

# Distribution details (CSV)
<https://fdp.radboudumc.nl/distribution/csv>
    a dcat:Distribution ;
    # Placeholder: the DOI is not yet available
    dcat:accessURL <doi:not_yet_available> ;
    dcat:mediaType <https://www.iana.org/assignments/media-types/text/csv> ;
    dct:title "PRISMA Questionnaire data - CSV format" ;
    dct:description "The questionnaire data in CSV format." .

# Agent description (Publisher)
<https://ror.org/05wg1m734>
    a foaf:Agent ;
    foaf:name "Radboudumc (Publisher)" ;
    dct:identifier <https://ror.org/05wg1m734> .

# Agent description (Creator)
<https://orcid.org/0000-0000-0000-0000>
    a foaf:Agent ;
    foaf:name "J. Doe (Creator)" ;
    dct:identifier <https://orcid.org/0000-0000-0000-0000> .

To map your metadata, you first need to understand its structure and semantic meaning, as well as the ontology (vocabulary) used to describe your data in Resource Description Framework (RDF) format, in our case DCAT V3. The general outline of the mapping pipeline can be found here: https://health-ri.atlassian.net/wiki/spaces/FSD/pages/edit-v2/290291734?draftShareId=ff45a2e2-80ee-49aa-b6d6-c04dedb6f9f8
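To make the last part of that step more concrete, here is a minimal sketch of how already-mapped values could be written out as Turtle, one of the RDF serialisations. It uses only the Python standard library, and the helpers `literal` and `dataset_to_turtle` are hypothetical names introduced for this example. It is not the Health-RI mapping pipeline and covers only a few properties of a single dataset.

```python
# Prefixes taken from the Turtle example on this page.
PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def literal(value, datatype=None):
    """Format a Turtle string literal, escaping backslashes, quotes and newlines."""
    escaped = (
        value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    suffix = f"^^{datatype}" if datatype else ""
    return f'"{escaped}"{suffix}'


def dataset_to_turtle(subject_iri, properties):
    """Render one dcat:Dataset as Turtle from (property, object) pairs."""
    lines = [f"@prefix {p}: <{iri}> ." for p, iri in PREFIXES.items()]
    lines.append("")
    lines.append(f"<{subject_iri}>")
    body = ["    a dcat:Dataset"] + [f"    {prop} {obj}" for prop, obj in properties]
    lines.append(" ;\n".join(body) + " .")
    return "\n".join(lines)


turtle = dataset_to_turtle(
    "https://fdp.radboudumc.nl/dataset/37d6ad17-aa35-425c-946e-855838d3f9cc",
    [
        ("dct:title", literal("PRISMA Questionnaire data")),
        ("dct:issued", literal("2024-07-02T10:49:07", "xsd:dateTime")),
        ("dcat:theme", "<http://publications.europa.eu/resource/authority/data-theme/HEAL>"),
    ],
)
print(turtle)
```

In practice an RDF library such as rdflib would usually be a safer choice than building strings by hand, because it validates IRIs and handles serialisation details; the string-based version is shown here only to keep the example dependency-free.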

...

Once your RDF data is ready, you can publish it to a FAIR Data Point, where it can be harvested by the Catalogue. More information about this step can be found here: 3. Exposing metadata