📌 Introduction
In this section, we describe the basics of metadata and explain what metadata mapping is. We also look at the Health-RI Core Metadata Schema and the metadata standards it builds upon. This page is intended for a general audience. For details on the standards and the schema, data experts and data stewards can consult the specifications on GitHub: https://github.com/Health-RI/health-ri-metadata/ .
...
Below is an example of metadata from the PRISMA study. It contains information about the data available:
Class | Property | Property Label | Example |
---|---|---|---|
dcat:Catalog | dcat:dataset | dataset | Personalised RISk-based MAmmascreening Study (PRISMA) |
dcat:Catalog | dct:description | Description | The primary aim of the PRISMA study is to investigate the potential value of risk-tailored versus traditional breast cancer screening protocols in the Netherlands. Data collection took place between 2014-2019, resulting in ∼67,000 mammograms, ∼38,000 surveys, ∼10,000 blood samples and ∼600 saliva samples. |
dcat:Catalog | dct:publisher | Publisher | foaf:Agent |
dcat:Catalog | dct:title | Title | Personalised RISk-based MAmmascreening Study (PRISMA) |
dcat:Dataset | dcat:contactPoint | Contact Point | vcard:Kind |
dcat:Dataset | dct:creator | Creator | foaf:Agent |
dcat:Dataset | dct:description | Description | The extensive questionnaire covers a number of potential breast cancer risk predictors such as demographics, personal characteristics, reproductive characteristics, medication, lifestyle, health status, family history, psychosocial characteristics. |
dcat:Dataset | dct:issued | Issued (release date) | 2024-07-02T10:49:07 |
dcat:Dataset | dct:identifier | Identifier | |
dcat:Dataset | dct:modified | Modified | 2024-09-09T08:54:32 |
dcat:Dataset | dct:publisher | Publisher | foaf:Agent |
dcat:Dataset | dcat:theme | Theme | |
dcat:Dataset | dct:title | Title | PRISMA Questionnaire data |
dcat:Dataset | dct:license | License | Not yet available |
dcat:Distribution | dcat:accessURL | Access URL | DOI (not yet available) |
dcat:Distribution | dcat:mediaType | Format | |
dcat:Distribution | dct:title | Title | PRISMA Questionnaire data - CSV format |
dcat:Distribution | dct:description | Description | The questionnaire data in CSV format. |
foaf:Agent | foaf:name | name | Radboudumc (Publisher) |
foaf:Agent | dct:identifier | identifier | https://ror.org/05wg1m734 (Publisher) |
vcard:Kind | vcard:hasEmail | has email | firstname.lastname@radboudumc.nl |
vcard:Kind | vcard:hasName | has name | J. Doe |
foaf:Agent | foaf:name | name | J. Doe (Creator) |
foaf:Agent | dct:identifier | identifier | https://orcid.org/0000-0000-0000-0000 (Creator) |
Here is the same metadata mapped to the Health-RI Core Metadata Schema. It contains the same information, but it is now machine-readable: a computer can process it easily, and it is in a format that is widely used on the web.
```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix vcard: <http://www.w3.org/2006/vcard/ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# Catalog description
<https://fdp.radboudumc.nl/catalogue/prisma> a dcat:Catalog ;
    dct:title "Personalised RISk-based MAmmascreening Study (PRISMA)" ;
    dct:description "The primary aim of the PRISMA study is to investigate the potential value of risk-tailored versus traditional breast cancer screening protocols in the Netherlands. Data collection took place between 2014-2019, resulting in ∼67,000 mammograms, ∼38,000 surveys, ∼10,000 blood samples and ∼600 saliva samples." ;
    dct:publisher [
        a foaf:Agent ;
        foaf:name "Radboudumc (Publisher)" ;
        dct:identifier <https://ror.org/05wg1m734>
    ] ;
    dcat:dataset <https://fdp.radboudumc.nl/dataset/37d6ad17-aa35-425c-946e-855838d3f9cc> .

# Dataset description
<https://fdp.radboudumc.nl/dataset/37d6ad17-aa35-425c-946e-855838d3f9cc> a dcat:Dataset ;
    dct:title "PRISMA Questionnaire data" ;
    dct:description "The extensive questionnaire covers a number of potential breast cancer risk predictors such as demographics, personal characteristics, reproductive characteristics, medication, lifestyle, health status, family history, psychosocial characteristics." ;
    dct:issued "2024-07-02T10:49:07"^^xsd:dateTime ;
    dct:modified "2024-09-09T08:54:32"^^xsd:dateTime ;
    dct:identifier <https://fdp.radboudumc.nl/dataset/37d6ad17-aa35-425c-946e-855838d3f9cc> ;
    dct:creator [
        a foaf:Agent ;
        foaf:name "J. Doe (Creator)" ;
        dct:identifier <https://orcid.org/0000-0000-0000-0000>
    ] ;
    dct:publisher [
        a foaf:Agent ;
        foaf:name "Radboudumc (Publisher)" ;
        dct:identifier <https://ror.org/05wg1m734>
    ] ;
    dcat:theme <http://publications.europa.eu/resource/authority/data-theme/HEAL> ;
    dct:license <https://data.ru.nl/doc/dua/RUMC-RA-DUA-1.0.html> ;
    dcat:distribution <https://fdp.radboudumc.nl/distribution/csv> ;
    dcat:contactPoint [
        a vcard:Kind ;
        vcard:hasEmail <mailto:firstname.lastname@radboudumc.nl> ;
        vcard:fn "J. Doe"
    ] .

# Distribution details (CSV)
<https://fdp.radboudumc.nl/distribution/csv> a dcat:Distribution ;
    dcat:accessURL <doi:not_yet_available> ;
    dcat:mediaType <https://www.iana.org/assignments/media-types/text/csv> ;
    dct:title "PRISMA Questionnaire data - CSV format" ;
    dct:description "The questionnaire data in CSV format." .

# Agent description (Publisher)
<https://ror.org/05wg1m734> a foaf:Agent ;
    foaf:name "Radboudumc (Publisher)" ;
    dct:identifier <https://ror.org/05wg1m734> .

# Agent description (Creator)
<https://orcid.org/0000-0000-0000-0000> a foaf:Agent ;
    foaf:name "J. Doe (Creator)" ;
    dct:identifier <https://orcid.org/0000-0000-0000-0000> .
```
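The prefixed names in the Turtle above, such as `dct:title` or `dcat:Dataset`, are shorthand for full IRIs built from the `@prefix` declarations. The following minimal Python sketch illustrates that expansion. The prefix map is copied from the example; the function name `expand` is hypothetical and chosen only for illustration, and a real pipeline would normally leave this step to an RDF library.

```python
# Prefix declarations copied from the Turtle example above.
PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "vcard": "http://www.w3.org/2006/vcard/ns#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand(prefixed_name: str) -> str:
    """Expand a prefixed name such as 'dct:title' into a full IRI.

    Hypothetical helper for illustration only.
    """
    prefix, separator, local_name = prefixed_name.partition(":")
    if not separator or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {prefixed_name!r}")
    return PREFIXES[prefix] + local_name


if __name__ == "__main__":
    for name in ("dcat:Dataset", "dct:title", "foaf:Agent"):
        print(name, "->", expand(name))
```

Because every property resolves to a globally unique IRI, two catalogs that both use `dct:title` are guaranteed to mean the same thing by it, which is what makes the mapped metadata interoperable.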
To map your metadata, you first need to understand its structure and semantic meaning, as well as the ontology (vocabulary) used to describe your data in Resource Description Framework (RDF) format; in our case, this is DCAT V3. The general outline of the mapping pipeline can be found here: https://health-ri.atlassian.net/wiki/spaces/FSD/pages/edit-v2/290291734?draftShareId=ff45a2e2-80ee-49aa-b6d6-c04dedb6f9f8
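As a rough illustration of what such a mapping involves, the Python sketch below translates a flat local record into a Turtle description of a `dcat:Dataset`. It is not the Health-RI pipeline: the local field names (`study_name`, `summary`, `released`, `last_changed`), the function names, and the `https://example.org/` IRI are all assumptions made for this example. Only the target properties come from the example on this page. `json.dumps` is used as a convenient approximation of Turtle string escaping.

```python
import json
from typing import Dict, List, Tuple

# Hypothetical local field names (left) mapped to properties used in the
# example on this page (right). The left-hand keys are invented.
FIELD_TO_PROPERTY = {
    "study_name": "dct:title",
    "summary": "dct:description",
    "released": "dct:issued",
    "last_changed": "dct:modified",
}

# Properties whose values are timestamps rather than plain strings.
DATETIME_PROPERTIES = {"dct:issued", "dct:modified"}


def literal(prop: str, value: str) -> str:
    """Render a value as a Turtle literal, typed when it is a timestamp."""
    quoted = json.dumps(value, ensure_ascii=False)  # escapes quotes and backslashes
    if prop in DATETIME_PROPERTIES:
        return f"{quoted}^^xsd:dateTime"
    return quoted


def map_record(subject_iri: str, record: Dict[str, str]) -> Tuple[str, List[str]]:
    """Map a flat record to Turtle; also return the fields with no mapping."""
    lines = [f"<{subject_iri}> a dcat:Dataset"]
    unmapped = []
    for field, value in record.items():
        prop = FIELD_TO_PROPERTY.get(field)
        if prop is None:
            unmapped.append(field)  # needs a decision from a data steward
            continue
        lines.append(f"    {prop} {literal(prop, value)}")
    return " ;\n".join(lines) + " .", unmapped


if __name__ == "__main__":
    record = {
        "study_name": "PRISMA Questionnaire data",
        "released": "2024-07-02T10:49:07",
        "internal_code": "Q-01",  # no counterpart in the mapping table
    }
    turtle, unmapped = map_record("https://example.org/dataset/1", record)
    print(turtle)
    print("Unmapped fields:", unmapped)
```

The output omits the `@prefix` declarations, which would have to be prepended to produce a complete Turtle file. Reporting unmapped fields separately, rather than dropping them silently, reflects the point made above: each local field has to be understood before it can be assigned a property from the vocabulary.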
...