...
Metadata mapping is the process of establishing connections between corresponding metadata fields or values across different systems. In simple terms, it ensures that the metadata schema of your data is transformed correctly to the Health-RI (HRI) metadata schema. It involves identifying equivalent pieces of metadata in one system and linking them to the relevant fields in another. This keeps disparate datasets or databases consistent with each other, which allows data to be integrated and exchanged between systems and supports accurate data discovery, retrieval, and interpretation.
Below is an example of metadata from the PRISMA study. It contains information about the data available:
| Class | Property | Property Label | Description/Example |
|---|---|---|---|
| dcat:Catalogue | dcat:dataset | dataset | Personalised RISk-based MAmmascreening Study (PRISMA) |
| | dct:description | Description | This catalog describes the core metadata of Radboudumc datasets |
| | dct:publisher | Publisher | |
| | dct:title | Title | Radboudumc Core Metadata |
| dcat:Dataset | dcat:ContactPoint | Contact Point | foaf:agent |
| | dct:creator | Creator name | foaf:agent |
| | dct:description | Description | The primary aim of the PRISMA study was to investigate the potential value of risk-tailored versus traditional breast cancer screening protocols in the Netherlands. Data collection took place between 2014-2019, resulting in ∼67,000 mammograms, ∼38,000 surveys, ∼10,000 blood samples and ∼600 saliva samples. |
| | dct:issued | Issued | 15/01/2024 |
| | dct:identifier | Identifier | https://fdp.radboudumc.nl/dataset/8793226e-9a7c-4e8c-9cef-fce41ef0b865 |
| | dct:modified | Modified | 15/01/2024 |
| | dct:publisher | Publisher | foaf:agent |
| | dcat:theme | Theme | |
| | dct:title | Title | Personalised RISk-based MAmmascreening Study (PRISMA) |
| | dct:type | Type | |
| | dct:license | License | Not yet available |
| dcat:Distribution | dcat:accessURL | Access URL | DOI (not yet available) |
| | dcat:MediaType | Format | text/csv |
| | dcat:title | Title | PRISMA Questionnaire data |
| | dcat:description | Description | The extensive questionnaire covers different topics such as demographics, personal characteristics, reproductive characteristics, medication, lifestyle, health status, family history, psychosocial characteristics. |
| foaf:Agent | foaf:name | name | Radboudumc |
| | dct:identifier | identifier | |
Here is the same metadata mapped to the Health-RI metadata core. It contains the same information, but it is now machine-readable and expressed in RDF (serialised as Turtle), a format that is widely used on the web.
```turtle
@prefix dcat:  <http://www.w3.org/ns/dcat#> .
@prefix dct:   <http://purl.org/dc/terms/> .
@prefix foaf:  <http://xmlns.com/foaf/0.1/> .
@prefix vcard: <http://www.w3.org/2006/vcard/ns#> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .

# The catalog that lists the dataset
<https://radboudumc.nl/catalogue>
    a dcat:Catalog ;
    dct:title "Radboudumc Core Metadata" ;
    dct:description "This catalog describes the core metadata of Radboudumc datasets" ;
    dct:publisher <https://ror.org/05wg1m734> ;
    dcat:dataset <https://fdp.radboudumc.nl/dataset/8793226e-9a7c-4e8c-9cef-fce41ef0b865> .

# The dataset, with its contact point and distribution
<https://fdp.radboudumc.nl/dataset/8793226e-9a7c-4e8c-9cef-fce41ef0b865>
    a dcat:Dataset ;
    dct:title "Personalised RISk-based MAmmascreening Study (PRISMA)" ;
    dct:description "The primary aim of the PRISMA study was to investigate the potential value of risk-tailored versus traditional breast cancer screening protocols in the Netherlands. Data collection took place between 2014-2019, resulting in ∼67,000 mammograms, ∼38,000 surveys, ∼10,000 blood samples and ∼600 saliva samples." ;
    dct:issued "2024-01-15"^^xsd:date ;
    dct:identifier <https://fdp.radboudumc.nl/dataset/8793226e-9a7c-4e8c-9cef-fce41ef0b865> ;
    dct:modified "2024-01-15"^^xsd:date ;
    dct:publisher [
        a foaf:Agent ;
        foaf:name "Radboudumc" ;
        dct:identifier <https://ror.org/05wg1m734>
    ] ;
    dcat:theme <http://purl.obolibrary.org/obo/MONDO_0007254>,
               <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#C20116> ;
    dct:type <http://purl.org/dc/dcmitype/Dataset> ;
    dct:license "Not yet available" ;
    dcat:contactPoint [
        a foaf:Agent ;
        foaf:name "Radboudumc" ;
        vcard:hasEmail <mailto:contact@radboudumc.nl>
    ] ;
    dcat:distribution [
        a dcat:Distribution ;
        dct:title "PRISMA Questionnaire data" ;
        dct:description "The extensive questionnaire covers different topics such as demographics, personal characteristics, reproductive characteristics, medication, lifestyle, health status, family history, psychosocial characteristics." ;
        # Placeholder until a DOI has been assigned
        dcat:accessURL <doi:not_yet_available> ;
        dcat:mediaType "text/csv"
    ] .

# The publishing organisation, identified by its ROR ID
<https://ror.org/05wg1m734>
    a foaf:Agent ;
    foaf:name "Radboudumc" ;
    dct:identifier <https://ror.org/05wg1m734> .
```
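The date fields show one transformation that a mapping has to perform: the source table stores `15/01/2024`, while the mapped version uses the `xsd:date` literal `2024-01-15`. A minimal Python sketch of that step, using only the standard library, could look as follows. The helper names `to_xsd_date` and `to_turtle_literal` are hypothetical and not part of any Health-RI tooling, and the day/month/year input format is an assumption based on the example values.

```python
from datetime import datetime
from typing import Optional


def to_xsd_date(value: str) -> str:
    """Convert a DD/MM/YYYY string to ISO 8601 (YYYY-MM-DD)."""
    # Assumption: source dates are day/month/year, as in the example table.
    return datetime.strptime(value, "%d/%m/%Y").date().isoformat()


def to_turtle_literal(value: str, datatype: Optional[str] = None) -> str:
    """Serialise a string as a Turtle literal, optionally with a datatype.

    Only backslashes and double quotes are escaped here; a real
    serialiser would also have to handle newlines and other characters.
    """
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    literal = f'"{escaped}"'
    return f"{literal}^^{datatype}" if datatype else literal


issued = to_turtle_literal(to_xsd_date("15/01/2024"), "xsd:date")
print(f"dct:issued {issued} ;")
```

For anything beyond a quick check, an RDF library is likely a safer choice than building Turtle strings by hand, because it takes care of escaping and datatypes.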
To map your metadata, you first need to understand its structure and semantic meaning, as well as the ontology (vocabulary) used to describe your data in Resource Description Framework (RDF) format, in our case DCAT v3. The general outline of the mapping pipeline can be found here: https://health-ri.atlassian.net/wiki/spaces/FSD/pages/edit-v2/290291734?draftShareId=ff45a2e2-80ee-49aa-b6d6-c04dedb6f9f8
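One way to start is to write the mapping down as a lookup table from your own field names to DCAT properties, and then check which fields are still unmapped. The sketch below illustrates this in Python. The field labels and property names are taken from the example on this page, while `FIELD_MAP`, `split_by_mapping`, and the `Cohort size` field are hypothetical and chosen purely for illustration; this is not the Health-RI mapping pipeline.

```python
# Hypothetical lookup table: source field label -> target property.
FIELD_MAP = {
    "Title": "dct:title",
    "Description": "dct:description",
    "Issued": "dct:issued",
    "Modified": "dct:modified",
    "Theme": "dcat:theme",
    "License": "dct:license",
}


def split_by_mapping(record):
    """Split a source record into mapped and unmapped parts.

    Returns (mapped, unmapped): `mapped` is keyed by target property,
    `unmapped` lists the source fields that have no entry in FIELD_MAP.
    """
    mapped = {}
    unmapped = []
    for field, value in record.items():
        target = FIELD_MAP.get(field)
        if target is None:
            unmapped.append(field)
        else:
            mapped[target] = value
    return mapped, unmapped


record = {
    "Title": "Personalised RISk-based MAmmascreening Study (PRISMA)",
    "Issued": "15/01/2024",
    "Cohort size": "unknown",  # made-up field to show an unmapped case
}
mapped, unmapped = split_by_mapping(record)
print(mapped)    # values keyed by target property
print(unmapped)  # fields that still need a mapping decision
```

Every field reported as unmapped needs a decision: find a suitable property in the target vocabulary, or leave the field out deliberately.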
...