Status: in development

📌 Introduction

In this section, we describe the basics of metadata and explain what metadata mapping is. We also look at the Health-RI Core Metadata Schema and the metadata standards it builds upon. This page is intended for a general audience. Data experts and data stewards can find the detailed specifications of the standards and the schema on GitHub: https://github.com/Health-RI/health-ri-metadata/ .

...

Metadata mapping is the process of establishing connections between corresponding metadata values or fields across different systems. In simple terms, it ensures that the metadata schema you use for your data is correctly transformed to the Health-RI (HRI) metadata schema. It involves identifying pieces of metadata in one system and linking them to the equivalent elements in another system. This keeps otherwise disparate datasets or databases consistent with each other, so that they can be integrated and can interoperate. Once equivalent values or fields are associated, systems can exchange information reliably, and data can be discovered, retrieved, and interpreted accurately.
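The idea of linking a field in one system to the equivalent field in another can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the local field names (`study_name`, `summary`, `released_on`, `owner`, `internal_code`), the value `Q-17`, and the helper `map_record` are hypothetical and are not part of the Health-RI schema or tooling. Only the target property names (`dct:title`, `dct:description`, `dct:issued`, `dct:publisher`) come from the example on this page.

```python
# Hypothetical local field names (left) linked to the target properties (right).
FIELD_MAP = {
    "study_name": "dct:title",
    "summary": "dct:description",
    "released_on": "dct:issued",
    "owner": "dct:publisher",
}


def map_record(record, field_map):
    """Rename the fields of a local record to their target properties.

    Returns a tuple (mapped, unmapped): a dict keyed by target property,
    and a list of local fields that have no entry in field_map.
    """
    mapped = {}
    unmapped = []
    for field, value in record.items():
        target = field_map.get(field)
        if target is None:
            unmapped.append(field)  # needs a mapping decision by a person
        else:
            mapped[target] = value
    return mapped, unmapped


# A made-up local record; "internal_code" deliberately has no mapping.
local_record = {
    "study_name": "PRISMA Questionnaire data",
    "summary": "The questionnaire data in CSV format.",
    "released_on": "2024-07-02T10:49:07",
    "internal_code": "Q-17",
}

mapped, unmapped = map_record(local_record, FIELD_MAP)
print(mapped)
print(unmapped)
```

Fields without a counterpart are returned separately rather than silently dropped, because deciding what to do with them is usually the part of mapping that needs human judgement.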

Below is an example of metadata from the PRISMA study. It contains information about the data available:

| Class | Property | Property Label | Example |
|---|---|---|---|
| dcat:Catalog | dcat:dataset | dataset | Personalised RISk-based MAmmascreening Study (PRISMA) |
| | dct:description | Description | The primary aim of the PRISMA study is to investigate the potential value of risk-tailored versus traditional breast cancer screening protocols in the Netherlands. Data collection took place between 2014-2019, resulting in ∼67,000 mammograms, ∼38,000 surveys, ∼10,000 blood samples and ∼600 saliva samples. |
| | dct:publisher | Publisher | foaf:Agent |
| | dct:title | Title | Personalised RISk-based MAmmascreening Study (PRISMA) |
| dcat:Dataset | dcat:contactPoint | Contact Point | vcard:Kind |
| | dct:creator | Creator | foaf:Agent |
| | dct:description | Description | The extensive questionnaire covers a number of potential breast cancer risk predictors such as demographics, personal characteristics, reproductive characteristics, medication, lifestyle, health status, family history, psychosocial characteristics. |
| | dct:issued | Issued | 2024-07-02T10:49:07 |
| | dct:identifier | Identifier | https://fdp.radboudumc.nl/dataset/37d6ad17-aa35-425c-946e-855838d3f9cc |
| | dct:modified | Modified | 2024-09-09T08:54:32 |
| | dct:publisher | Publisher | foaf:Agent |
| | dcat:theme | Theme | http://publications.europa.eu/resource/authority/data-theme/HEAL |
| | dct:title | Title | PRISMA Questionnaire data |
| | dct:license | License | https://data.ru.nl/doc/dua/RUMC-RA-DUA-1.0.html |
| dcat:Distribution | dcat:accessURL | Access URL | DOI (not yet available) |
| | dcat:mediaType | Format | https://www.iana.org/assignments/media-types/text/csv |
| | dct:title | Title | PRISMA Questionnaire data - CSV format |
| | dct:description | Description | The questionnaire data in CSV format. |
| foaf:Agent | foaf:name | name | Radboudumc (Publisher) |
| | dct:identifier | identifier | https://ror.org/05wg1m734 (Publisher) |
| vcard:Kind | vcard:hasEmail | has email | firstname.lastname@radboudumc.nl |
| | vcard:hasName | has name | J. Doe |
| foaf:Agent | foaf:name | name | J. Doe (Creator) |
| | dct:identifier | identifier | https://orcid.org/0000-0000-0000-0000 (Creator) |

Here is the same data mapped to the Health-RI core metadata schema. It contains the same information, but it is now machine readable and in a format that is widely used on the web.

Code Block
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix vcard: <http://www.w3.org/2006/vcard/ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# Catalog description
<https://fdp.radboudumc.nl/catalogue/prisma>
    a dcat:Catalog ;
    dct:title "Personalised RISk-based MAmmascreening Study (PRISMA)" ;
    dct:description "The primary aim of the PRISMA study is to investigate the potential value of risk-tailored versus traditional breast cancer screening protocols in the Netherlands. Data collection took place between 2014-2019, resulting in ∼67,000 mammograms, ∼38,000 surveys, ∼10,000 blood samples and ∼600 saliva samples." ;
    dct:publisher [ a foaf:Agent ; foaf:name "Radboudumc (Publisher)" ; dct:identifier <https://ror.org/05wg1m734> ] ;
    dcat:dataset <https://fdp.radboudumc.nl/dataset/37d6ad17-aa35-425c-946e-855838d3f9cc> .

# Dataset description
<https://fdp.radboudumc.nl/dataset/37d6ad17-aa35-425c-946e-855838d3f9cc>
    a dcat:Dataset ;
    dct:title "PRISMA Questionnaire data" ;
    dct:description "The extensive questionnaire covers a number of potential breast cancer risk predictors such as demographics, personal characteristics, reproductive characteristics, medication, lifestyle, health status, family history, psychosocial characteristics." ;
    dct:issued "2024-07-02T10:49:07"^^xsd:dateTime ;
    dct:modified "2024-09-09T08:54:32"^^xsd:dateTime ;
    dct:identifier <https://fdp.radboudumc.nl/dataset/37d6ad17-aa35-425c-946e-855838d3f9cc> ;
    dct:creator [ a foaf:Agent ; foaf:name "J. Doe (Creator)" ; dct:identifier <https://orcid.org/0000-0000-0000-0000> ] ;
    dct:publisher [ a foaf:Agent ; foaf:name "Radboudumc (Publisher)" ; dct:identifier <https://ror.org/05wg1m734> ] ;
    dcat:theme <http://publications.europa.eu/resource/authority/data-theme/HEAL> ;
    dct:license <https://data.ru.nl/doc/dua/RUMC-RA-DUA-1.0.html> ;
    dcat:distribution <https://fdp.radboudumc.nl/distribution/csv> ;
    dcat:contactPoint [
        a vcard:Kind ;
        vcard:hasEmail <mailto:firstname.lastname@radboudumc.nl> ;
        vcard:fn "J. Doe"
    ] .

# Distribution details (CSV)
<https://fdp.radboudumc.nl/distribution/csv>
    a dcat:Distribution ;
    dcat:accessURL <doi:not_yet_available> ;
    dcat:mediaType <https://www.iana.org/assignments/media-types/text/csv> ;
    dct:title "PRISMA Questionnaire data - CSV format" ;
    dct:description "The questionnaire data in CSV format." .

# Agent description (Publisher)
<https://ror.org/05wg1m734>
    a foaf:Agent ;
    foaf:name "Radboudumc (Publisher)" ;
    dct:identifier <https://ror.org/05wg1m734> .

# Agent description (Creator)
<https://orcid.org/0000-0000-0000-0000>
    a foaf:Agent ;
    foaf:name "J. Doe (Creator)" ;
    dct:identifier <https://orcid.org/0000-0000-0000-0000> .

To map your metadata, you first need to understand its structure and semantic meaning, as well as the ontology (vocabulary) used to describe your data in Resource Description Framework (RDF) format, in our case DCAT V3. The general outline of the mapping pipeline can be found here: https://health-ri.atlassian.net/wiki/spaces/FSD/pages/edit-v2/290291734?draftShareId=ff45a2e2-80ee-49aa-b6d6-c04dedb6f9f8

...