📌 Introduction
In this section, we describe the basics of metadata and explain what metadata mapping is. We also look at the Health-RI Core Metadata Schema and the metadata standards it builds upon. This page is intended for a general audience. For details on the standards and the schema, see the GitHub specifications written for data experts and data stewards: https://github.com/Health-RI/health-ri-metadata/tree/master
🧠 What is metadata
Metadata is essentially data about data. It describes various aspects of your data, such as what the data contains, who owns it, and what type of data it is. In other words, metadata helps you understand and manage data effectively by providing additional information about it.
...
| Class | Property | Property Label | Description/Example |
|---|---|---|---|
| dcat:Catalog | dcat:dataset | dataset | https://fdp.radboudumc.nl/dataset/37d6ad17-aa35-425c-946e-855838d3f9cc |
| | dct:description | Description | The primary aim of the PRISMA study is to investigate the potential value of risk-tailored versus traditional breast cancer screening protocols in the Netherlands. Data collection took place between 2014-2019, resulting in ∼67,000 mammograms, ∼38,000 surveys, ∼10,000 blood samples and ∼600 saliva samples. |
| | dct:publisher | Publisher | foaf:Agent |
| | dct:title | Title | Personalised RISk-based MAmmascreening Study (PRISMA) |
| dcat:Dataset | dcat:contactPoint | Contact Point | vcard:Kind |
| | dct:creator | Creator name | foaf:Agent |
| | dct:description | Description | The extensive questionnaire covers a number of potential breast cancer risk predictors such as demographics, personal characteristics, reproductive characteristics, medication, lifestyle, health status, family history, psychosocial characteristics. |
| | dct:issued | Release date | 2024-07-02T10:49:07 |
| | dct:identifier | Identifier | https://fdp.radboudumc.nl/dataset/37d6ad17-aa35-425c-946e-855838d3f9cc |
| | dct:modified | Modified | 2024-09-09T08:54:32 |
| | dct:publisher | Publisher | foaf:Agent |
| | dcat:theme | Theme | http://publications.europa.eu/resource/authority/data-theme/HEAL |
| | dct:title | Title | PRISMA Questionnaire data |
| | dct:license | License | https://data.ru.nl/doc/dua/RUMC-RA-DUA-1.0.html |
| dcat:Distribution | dcat:accessURL | Access URL | DOI (not yet available) |
| | dcat:mediaType | Format | https://www.iana.org/assignments/media-types/text/csv |
| | dct:title | Title | PRISMA Questionnaire data - CSV format |
| | dct:description | Description | The questionnaire data in CSV format. |
| foaf:Agent | foaf:name | name | Radboudumc (Publisher) |
| | dct:identifier | identifier | https://ror.org/05wg1m734 (Publisher) |
| vcard:Kind | vcard:hasEmail | has email | firstname.lastname@radboudumc.nl |
| | vcard:hasName | has name | J. Doe |
| foaf:Agent | foaf:name | name | J. Doe (Creator) |
| | dct:identifier | identifier | https://orcid.org/0000-0000-0000-0000 (Creator) |
Here is the same metadata mapped to the Health-RI core metadata schema. It contains the same information, but it is now machine readable and in a format that is widely used on the web.
```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix vcard: <http://www.w3.org/2006/vcard/ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# Catalog description
<https://fdp.radboudumc.nl/catalogue/prisma> a dcat:Catalog ;
    dct:title "Personalised RISk-based MAmmascreening Study (PRISMA)" ;
    dct:description "The primary aim of the PRISMA study is to investigate the potential value of risk-tailored versus traditional breast cancer screening protocols in the Netherlands. Data collection took place between 2014-2019, resulting in ∼67,000 mammograms, ∼38,000 surveys, ∼10,000 blood samples and ∼600 saliva samples." ;
    dct:publisher [
        a foaf:Agent ;
        foaf:name "Radboudumc (Publisher)" ;
        dct:identifier <https://ror.org/05wg1m734>
    ] ;
    dcat:dataset <https://fdp.radboudumc.nl/dataset/37d6ad17-aa35-425c-946e-855838d3f9cc> .

# Dataset description
<https://fdp.radboudumc.nl/dataset/37d6ad17-aa35-425c-946e-855838d3f9cc> a dcat:Dataset ;
    dct:title "PRISMA Questionnaire data" ;
    dct:description "The extensive questionnaire covers a number of potential breast cancer risk predictors such as demographics, personal characteristics, reproductive characteristics, medication, lifestyle, health status, family history, psychosocial characteristics." ;
    dct:issued "2024-07-02T10:49:07"^^xsd:dateTime ;
    dct:modified "2024-09-09T08:54:32"^^xsd:dateTime ;
    dct:identifier <https://fdp.radboudumc.nl/dataset/37d6ad17-aa35-425c-946e-855838d3f9cc> ;
    dct:creator [
        a foaf:Agent ;
        foaf:name "J. Doe (Creator)" ;
        dct:identifier <https://orcid.org/0000-0000-0000-0000>
    ] ;
    dct:publisher [
        a foaf:Agent ;
        foaf:name "Radboudumc (Publisher)" ;
        dct:identifier <https://ror.org/05wg1m734>
    ] ;
    dcat:theme <http://publications.europa.eu/resource/authority/data-theme/HEAL> ;
    dct:license <https://data.ru.nl/doc/dua/RUMC-RA-DUA-1.0.html> ;
    dcat:distribution <https://fdp.radboudumc.nl/distribution/csv> ;
    dcat:contactPoint [
        a vcard:Kind ;
        vcard:hasEmail <mailto:firstname.lastname@radboudumc.nl> ;
        vcard:fn "J. Doe"
    ] .

# Distribution details (CSV)
<https://fdp.radboudumc.nl/distribution/csv> a dcat:Distribution ;
    dcat:accessURL <doi:not_yet_available> ;
    dcat:mediaType <https://www.iana.org/assignments/media-types/text/csv> ;
    dct:title "PRISMA Questionnaire data - CSV format" ;
    dct:description "The questionnaire data in CSV format." .

# Agent description (Publisher)
<https://ror.org/05wg1m734> a foaf:Agent ;
    foaf:name "Radboudumc (Publisher)" ;
    dct:identifier <https://ror.org/05wg1m734> .

# Agent description (Creator)
<https://orcid.org/0000-0000-0000-0000> a foaf:Agent ;
    foaf:name "J. Doe (Creator)" ;
    dct:identifier <https://orcid.org/0000-0000-0000-0000> .
```
To map your metadata, you first need to understand its structure and semantic meaning, as well as the ontology (vocabulary) used to describe your data in Resource Description Framework (RDF) format, in our case DCAT version 3. The general outline of the mapping pipeline can be found here: https://health-ri.atlassian.net/wiki/spaces/FSD/pages/edit-v2/290291734?draftShareId=ff45a2e2-80ee-49aa-b6d6-c04dedb6f9f8
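To make the idea of mapping more concrete, here is a minimal sketch in Python that turns a flat local metadata record into Turtle. It is an illustration only and not the Health-RI mapping pipeline. The local field names (`title`, `release_date`, and so on), the `FIELD_TO_PROPERTY` table, the helper functions and the `https://example.org/dataset/1` identifier are all assumptions made for this example.

```python
# Hypothetical mapping from local field names to DCAT / Dublin Core properties.
FIELD_TO_PROPERTY = {
    "title": "dct:title",
    "description": "dct:description",
    "release_date": "dct:issued",
    "theme": "dcat:theme",
}

PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def escape_literal(text: str) -> str:
    """Escape backslashes, double quotes and newlines for a Turtle string."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_turtle_object(field: str, value: str) -> str:
    """Render a value as a typed literal, an IRI or a plain string literal."""
    if field == "release_date":
        return f'"{escape_literal(value)}"^^xsd:dateTime'
    if field == "theme":
        return f"<{value}>"
    return f'"{escape_literal(value)}"'


def map_record(dataset_iri: str, record: dict) -> str:
    """Map a flat record to a dcat:Dataset description in Turtle."""
    lines = [f"@prefix {prefix}: <{iri}> ." for prefix, iri in PREFIXES.items()]
    lines.append("")
    # Fields without a known mapping are skipped rather than guessed.
    statements = [
        f"    {FIELD_TO_PROPERTY[field]} {to_turtle_object(field, value)}"
        for field, value in record.items()
        if field in FIELD_TO_PROPERTY
    ]
    if not statements:
        lines.append(f"<{dataset_iri}> a dcat:Dataset .")
    else:
        lines.append(f"<{dataset_iri}> a dcat:Dataset ;")
        lines.append(" ;\n".join(statements) + " .")
    return "\n".join(lines)


record = {
    "title": "PRISMA Questionnaire data",
    "release_date": "2024-07-02T10:49:07",
    "theme": "http://publications.europa.eu/resource/authority/data-theme/HEAL",
    "internal_note": "not part of the core schema",
}
turtle = map_record("https://example.org/dataset/1", record)
print(turtle)
```

The sketch separates the two things you need to know before mapping: which local field corresponds to which property (the lookup table) and what kind of value each property expects (a date, a link or plain text). A real mapping would also cover nested classes such as the publisher and contact point.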
...