
STATUS: IN DEVELOPMENT

Short description 

“metadata is the descriptor, and data is the thing being described” [https://doi.org/10.1162/dint_r_00024]

Metadata refers to the contextual information about a resource (e.g. a dataset), often described as “data about data”. Metadata comes in many types and forms. The type you are probably most familiar with is the descriptive metadata collected in repositories such as Zenodo (see the example of how Zenodo describes the resources in its repository). This generic metadata includes details on what the resource is about (e.g. data from patient health records), who created it (e.g. a research team at Radboudumc) and when it was collected. Typically, it also states how the resource may be used (e.g. the applicable license) and any access restrictions (e.g. publicly available or restricted access). Other commonly used types of metadata are:

  • Provenance metadata: This refers to how the resource came to be, what protocols were followed, and what tools were used. The purpose of this metadata is to ensure that you, your colleagues or others can reproduce the initial research.

  • Structural metadata: Depending on the type of resource, this refers to a detailed description of your resource that goes beyond the generic information explained above. For instance, for a dataset containing data collected from a questionnaire, structural metadata could include the questions asked and the allowed range of values.

  • Codebooks: A detailed document that provides information about the structure, content, and organization of a dataset. A codebook usually describes information such as variable names, measurement methods, and units.
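As an illustration of the structural metadata and codebook types above, the sketch below represents a small codebook in Python and uses it to check one questionnaire record. The variable names (`age`, `pain_score`), the questions, the units and the allowed ranges are all hypothetical and only serve as an example. A real codebook would follow your own project documentation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variable:
    """One codebook entry: what a variable means and which values are allowed."""
    name: str
    question: str          # the question asked in the questionnaire
    unit: Optional[str]    # measurement unit, or None if not applicable
    minimum: float         # lowest allowed value
    maximum: float         # highest allowed value


# Hypothetical codebook for a two-question questionnaire (assumed values).
CODEBOOK = {
    "age": Variable("age", "What is your age?", "years", 18, 120),
    "pain_score": Variable("pain_score", "How much pain do you feel?", None, 0, 10),
}


def out_of_range(record: dict) -> list:
    """Return the names of variables that are missing or outside the allowed range."""
    problems = []
    for name, variable in CODEBOOK.items():
        value = record.get(name)
        if value is None or not (variable.minimum <= value <= variable.maximum):
            problems.append(name)
    return problems
```

For example, `out_of_range({"age": 42, "pain_score": 11})` reports `pain_score`, because 11 lies outside the range documented in the codebook.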

In this step, the focus will be on assessing the availability of your metadata. This step is a good starting point and a common first step for multiple objectives <point towards FAIR objectives>, whether you aim to:

  • gain a clear view of what metadata currently describes your resource

  • expand your current metadata

  • ensure compliance with requirements to publish it in a metadata catalogue <Point to Register resource level metadata>

  • follow a semantic model to describe your metadata

This step involves identifying and collecting all types of metadata gathered for your resource, checking their quality and ensuring they are as accurate and complete as possible.

Why is this step important 

Generally:

To register resource-level metadata, you first need to make sure that you have it or can collect it.

With respect to Health-RI (HRI):

Health-RI is in the process of defining a metadata scheme for onboarding in the Health-RI metadata portal. To allow for onboarding of a dataset, the minimal metadata set must be provided. It is therefore essential that you assess whether this minimal set is collected/available or whether additional metadata needs to be collected. 

Beneficial for you and your team: Comprehensive and detailed metadata ensures that anyone, including yourself, can understand and work with the data effectively, even when some time has passed since collection. This is good data management practice: it helps data remain usable and meaningful over time and saves time when setting up new projects.

Beneficial for the organisation: Well-curated metadata increases the reuse of datasets. It also increases interoperability between systems: complete and error-free metadata makes it easier to migrate between systems (for example, when newer software becomes available).

Good image: Good metadata records reflect well on the organisation. Reusers of the data may be put off by documentation issues and use the data less (this may also apply to researchers).

Improves the quality of your data: Good metadata should describe the data accurately and unambiguously, which in turn improves the overall quality of the data and enhances transparency and reproducibility. This enables others to verify results and build upon them.

Helps with data discovery: Complete metadata improves the ability for you and your team to locate and retrieve data quickly. Additionally, if this metadata is published, it can boost reuse of data, lead to new collaborations and enhance recognition of existing work.

Complies with funders’ and journals’ requirements: Many funding agencies and publishers now require metadata to be published to increase the efficiency and visibility of the research they support.

How to 

Step 1: Find out where metadata about your research is already being collected. Check that it is still up to date and still represents your project accurately.

Step 2: Ask yourself whether your metadata is sufficient for others to understand the resource. To guide yourself, create specific competency questions that the metadata should be able to answer.

Step 3: Answer all of those competency questions.
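Steps 2 and 3 can be sketched in code as a simple check of a metadata record against a list of questions. This is a minimal sketch under stated assumptions: the record, its field names (`title`, `creator`, `license`, `access_rights`, `collection_date`) and the questions themselves are hypothetical examples, not a prescribed set.

```python
# Hypothetical metadata record; field names and values are illustrative only.
metadata = {
    "title": "Patient questionnaire data",
    "creator": "Research team at Radboudumc",
    "license": "",
    "collection_date": "2023-05-01",
}

# Each question is mapped to the metadata fields needed to answer it (assumed mapping).
COMPETENCY_QUESTIONS = {
    "What is the resource about?": ["title"],
    "Who created the resource?": ["creator"],
    "Under which conditions can it be reused?": ["license", "access_rights"],
    "When was the data collected?": ["collection_date"],
}


def unanswered_questions(record: dict) -> dict:
    """Map each question that cannot be answered to its missing or empty fields."""
    gaps = {}
    for question, fields in COMPETENCY_QUESTIONS.items():
        missing = []
        for field in fields:
            value = record.get(field)
            # A field counts as missing when it is absent, None, or blank.
            if value is None or not str(value).strip():
                missing.append(field)
        if missing:
            gaps[question] = missing
    return gaps


gaps = unanswered_questions(metadata)
for question, fields in gaps.items():
    print(f"Cannot answer '{question}': missing {', '.join(fields)}")
```

Questions that remain unanswered point to the metadata that still needs to be collected before moving on to Step 4.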

Step 4: Store the metadata in an appropriate location where it can be useful for you and other people on your team - ask your data stewards which location this is.

At Radboudumc (RUMC), researchers can store documentation about their project in the RDR under a Research documentation collection. This collection is not meant to be shared with the public.

Step 5: Decide what to do next. The following pages can help:

  • Publish the metadata in a data catalogue, if you want others to find the dataset

  • Start a metadata schema or reuse an existing one - step

  • Expand/create a sunflower, if you want to work with HRI to create a petal

[FAIRopoly] → Doesn’t really sound like “assess” though?

Usually, terms from upper ontologies can be used to describe metadata. For example, use dcat:Dataset from Data Catalog Vocabulary (DCAT) [DCAT] to describe the type of any rare disease dataset and dct:creator from DCMI Metadata Terms (DCT) to indicate the relationship between a dataset and its creator. 
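To make this concrete, the sketch below builds a small JSON-LD description that types a resource as `dcat:Dataset` and links it to its creator with `dct:creator`. It uses only the Python standard library. The function name `describe_dataset`, the `https://example.org/...` identifiers and the title are placeholders assumed for illustration. Only the two vocabulary namespaces are taken from DCAT and DCMI Metadata Terms.

```python
import json


def describe_dataset(dataset_iri: str, title: str, creator_iri: str) -> dict:
    """Build a minimal JSON-LD description of a dataset and its creator."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",   # Data Catalog Vocabulary
            "dct": "http://purl.org/dc/terms/",     # DCMI Metadata Terms
        },
        "@id": dataset_iri,
        "@type": "dcat:Dataset",                    # the type of the resource
        "dct:title": title,
        "dct:creator": {"@id": creator_iri},        # relation between dataset and creator
    }


# Placeholder identifiers; replace them with the real identifiers of your resource.
record = describe_dataset(
    "https://example.org/dataset/1",
    "Example rare disease dataset",
    "https://example.org/team/1",
)
print(json.dumps(record, indent=2))
```

Writing the description with shared vocabulary terms, rather than free-text field names, is what allows a catalogue or another system to interpret the type and the creator relation without project-specific knowledge.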

The How to section should:

  • be split into easy to follow steps;

    • Step 1

    • Step 2

    • etc.

  • help the reader to complete the step;

  • aspire to be readable for everyone, but, depending on the topic, may require specialised knowledge;

  • be a general, widely applicable approach;

  • if possible / applicable, add (links to) the solution necessary for onboarding in the Health-RI National Catalogue;

  • aim to be practical and simple, while keeping in mind: if I would come to this page looking for a solution to this problem, would this How-to actually help me solve this problem;

  • contain references to solutions such as those provided by FAIR Cookbook, RDMkit, The Turing Way and FAIRsharing;

  • contain custom recipes/best-practices written by/together with experts from the field if necessary.

Expertise requirements for this step 

Experts that may need to be involved, as described in Metroline Step: Build the Team, are listed below.

  • Data manager/Data steward/Researcher (Scientist) or someone else who knows the context and content of the project.

Practical examples from the community 

This section should show the step applied in a real project. Links to demonstrator projects. 

Training

https://carpentries-incubator.github.io/scientific-metadata/instructor/data-metadata.html#types-of-metadata

Suggestions

Visit our How to contribute page for information on how to get in touch if you have any suggestions about this page.

 
