Metadata is essential for describing information about your resource, whether it is a dataset, article, software, report or other project output. In this chapter, we explain how to make metadata about your resources, such as data, available online so others can find it. As explained in A Generic Workflow for the Data FAIRification Process, this step will help you make your data resources more Findable by registering them in a searchable repository, such as a metadata catalogue.

Metadata catalogues are platforms that store and help you find information about various resources. They allow you to search for existing data relevant for your research, saving time in data collection, and enable others to find your work, thereby increasing collaboration opportunities. Examples of metadata catalogues include the Health-RI Data Catalogue for healthcare and life sciences data and the BBMRI-NL catalogue for biosample information.

Unlike data repositories such as Zenodo or DANS, metadata catalogues do not store the actual resource, but just information about it. Metadata catalogues can link directly to the resource’s location, for example by linking your metadata catalogue entry to your DANS entry through its URL, or let others request access via a contact point or data access forms. Many data repositories also act as metadata catalogues, blending the functions of both. For example, when you publish data in DANS, you provide metadata (Title, Description, Keywords) that helps catalogue and find entries within the DANS portal. This blurs the line between metadata catalogues and data repositories (see figure below). Both concepts can also be illustrated by Google Scholar, which works as a metadata catalogue by indexing information about publications and linking each entry to external repositories, such as Elsevier or PLOS, where the actual publications can be accessed.

...

The key advantage of using metadata catalogues is that you don’t need to publish your resource, such as data, beforehand. This can be very useful if your project has just started data collection or if you have very restrictive data access conditions, but do wish for others to be able to find you. For example, registries keeping data about Rare Disease patients may want to be contacted for the purposes of diagnostic and therapy discovery, without making their actual data available in a repository. If you later decide to publish your resource in a (data) repository for long-term preservation and archiving, you can update the metadata catalogue entry with this new information.

There are other advantages to using metadata catalogues, which we’ll explore in the next section. We’ll also explain why this step is important and how to choose the right metadata catalogue for your resources.

Why is this step important?

Metadata catalogues are critical for making research resources, such as data, more visible and accessible. They offer a range of benefits to data holders, users and the broader scientific community. Here's how:

Benefits for data holders:

  • Increases discoverability. If you register metadata in catalogues, your data becomes more easily discoverable by others online.

  • Facilitates collaboration. Making your metadata available increases the likelihood of collaboration with other researchers who find your work through the catalogue.

  • Control over data use. Metadata catalogues allow you to specify how your data can be accessed and reused, ensuring that you retain control over its distribution.

  • Efficient compliance. Publishing metadata in catalogues is a low-effort, high-impact step that covers Findability, Accessibility, Interoperability and Reusability aspects for your data, which are now essential for meeting the requirements of numerous grants and institutions.

Benefits for data users:

  • Efficient data search. Instead of searching across various platforms, metadata catalogues provide a centralised, searchable repository for relevant data.

  • Time-saving. Reusing already available data saves significant time that would otherwise be spent on new data collection planning and approval.

  • Simplified access requirements. Clear access protocols provided through metadata reduce the complexity and time involved in requesting data.

Benefits for the scientific community:

  • Prevention of redundancy. Metadata catalogues reduce duplication of research efforts by making existing data more visible and accessible.

  • Community building. Catalogues promote the adoption of shared data standards, fostering collaboration and coherence within research communities.

  • Improved transparency. Clear documentation of data in metadata catalogues ensures research integrity and openness, which promotes trust in scientific findings.

  • Monitoring research impact. Cataloguing metadata allows for easier tracking of how data is used, cited, and repurposed, providing insights into the broader impact of research efforts.

How to 

Registering resource-level metadata depends on the context of your project and your expertise in metadata and FAIR principles. Here, we present a generic workflow applicable to most scenarios, but it is still advisable to customise this workflow to accommodate your context. This workflow emphasises selecting appropriate metadata catalogues for resources, rather than the technical aspects of metadata schemas.

Step 1 - Inventorise resource types

The first step is to identify and categorise the specific types of resources you are managing. While there is no universally accepted standard list, typical examples of resource types include datasets, code and articles. Within the category of datasets, there are further distinctions such as sociodemographic data, clinical data, imaging data, omics data, and biobank data. The type of resource impacts your choice of metadata catalogue.

Outcome: a list of relevant resources along with their respective types.

...

  • Resource type 1: Dataset

    1. Dataset type 1: Biosample data

    2. Dataset type 2: Questionnaire data

Step 2 - Determine metadata elements for each resource type

In this step, you need to define the conceptual units of information, known as metadata elements, and collect those elements in a spreadsheet per resource type. Below is an example spreadsheet to capture the resource (sub)type and metadata element with description.

| Resource type | Resource subtype | Metadata | Description |
| Dataset | Lab data | Collection methods | Description of the method or instruments used to collect the data. |
| Dataset | Lab data | Data sources | Information about where or from whom the data was collected. |
| Python code | | Contributors | Names or IDs of other individuals who contributed to the code. |
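If you prefer to keep this spreadsheet under version control, the same rows can be held as plain CSV. The following is a minimal sketch using only the Python standard library; the column names, the helper names `group_by_type` and `to_csv`, and the placement of "Python code" in the resource type column are assumptions made for illustration, not part of any prescribed template.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names for the spreadsheet; adapt them to your own template.
COLUMNS = ["resource_type", "resource_subtype", "metadata_element", "description"]

# Rows mirror the example table: (type, subtype, element, description).
ELEMENTS = [
    ("Dataset", "Lab data", "Collection methods",
     "Description of the method or instruments used to collect the data."),
    ("Dataset", "Lab data", "Data sources",
     "Information about where or from whom the data was collected."),
    ("Python code", "", "Contributors",
     "Names or IDs of other individuals who contributed to the code."),
]


def group_by_type(rows):
    """Return a dict mapping each resource type to its metadata element names."""
    grouped = defaultdict(list)
    for resource_type, _subtype, element, _description in rows:
        grouped[resource_type].append(element)
    return dict(grouped)


def to_csv(rows):
    """Serialise the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(COLUMNS)
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(group_by_type(ELEMENTS))
    print(to_csv(ELEMENTS))
```

Grouping by resource type gives a quick overview of which elements have been collected for each type before any catalogue is chosen.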

...

  • what metadata can be utilised to make resources more Findable?

  • what metadata are already in use by others in the same field and can be reused?

Outcome: a list of metadata elements tailored for each resource type.

Example

Having categorised the resource as a dataset, Eva sought to determine which metadata elements would benefit each of her resource types and subtypes. She created a table to organise this information:

| Resource type | Resource subtype | Metadata |
| Dataset | | Title; Description; Keywords about datasets; Associated project; Contact point to grant access to datasets |
| Dataset | Biosample | Type of study; Disease studied; Material collected; Number of donors; Associated publications; Associated biobanks; Access policy of samples |
| Dataset | Questionnaire data | Information about tools or questionnaires for data collection; Collection mode (face-to-face, telephone, or online); Sample design (random, stratified, or cluster); Time period; Population studied; Survey questions |

To ensure comprehensive metadata, she referred to existing standards:

  • for generic information, she consulted DCAT and Dublin Core;

  • for biosamples, she used the guidelines from the Metadata Group on Biobank and Collections in Health-RI;

  • for questionnaire data, she selected information she deemed relevant, as she was unable to find established community standards.
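To give an impression of what the generic elements could look like once expressed with DCAT and Dublin Core terms, here is a minimal sketch that builds a JSON-LD style record in Python. The function name `build_dataset_record`, the example values and the choice of properties are illustrative assumptions; a given catalogue may require a specific application profile with additional or different properties, so always check the catalogue's own documentation.

```python
import json


def build_dataset_record(title, description, keywords, contact_email):
    """Build a JSON-LD style dictionary describing a dataset.

    Uses the public DCAT, Dublin Core Terms and vCard namespaces. Which
    properties a catalogue actually requires is an assumption here.
    """
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
            "vcard": "http://www.w3.org/2006/vcard/ns#",
        },
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        "dcat:keyword": list(keywords),
        # Contact point through which others can request access.
        "dcat:contactPoint": {
            "@type": "vcard:Kind",
            "vcard:hasEmail": f"mailto:{contact_email}",
        },
    }


if __name__ == "__main__":
    # All values below are made up for illustration.
    record = build_dataset_record(
        title="Example questionnaire dataset",
        description="Placeholder description of the dataset.",
        keywords=["questionnaire", "example"],
        contact_email="data-access@example.org",
    )
    print(json.dumps(record, indent=2))
```

Elements without an obvious counterpart in these vocabularies, such as the associated project, are left out of the sketch on purpose; how to express them depends on the profile your chosen catalogue uses.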

Step 3 - Search for metadata catalogues per resource type

The next step is to identify metadata catalogues. Platforms like FAIRsharing help researchers locate appropriate metadata catalogues.

...

Step 4 - Determine metadata catalogues

In this step, you will evaluate the pros and cons of the candidate metadata catalogues and choose the ones most appropriate for your context. The general suggestion is to prioritise community standards (FAIR principle R1.3): is there a metadata catalogue that is widely used in your community?

...

  • Biosample data. BBMRI-ERIC data catalogue.

  • Questionnaire data. Metadata about the questionnaire data will first be made available on the Health-RI data catalogue together with high-level information about the study. When the questionnaire data is published in a data repository this entry will be updated to reflect the new location of the dataset.

Step 5 - Enter resource metadata required in the selected metadata catalogues

The final step involves entering the metadata for each resource into the chosen metadata catalogues, following the specific instructions provided by each metadata catalogue. If a resource is registered in multiple metadata catalogues, ensure that the metadata is consistent across all platforms and that the metadata sets are interlinked where possible. Automated updates of metadata are recommended when available.
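When a resource is registered in more than one catalogue, a small script can help spot fields that have drifted apart. The sketch below assumes you have already exported each catalogue entry into a simple dictionary of field names and text values; the catalogue names, field names and the helper `find_inconsistencies` are hypothetical and do not correspond to any catalogue's API.

```python
def find_inconsistencies(records):
    """Compare the same resource's metadata across catalogues.

    `records` maps a catalogue name to a dict of field name -> text value.
    Returns a dict of field name -> {catalogue: value} for every field whose
    values differ between the catalogues that carry it. Values are compared
    after trimming surrounding whitespace.
    """
    all_fields = set()
    for entry in records.values():
        all_fields.update(entry)

    conflicts = {}
    for field in sorted(all_fields):
        values = {
            catalogue: entry[field].strip()
            for catalogue, entry in records.items()
            if field in entry
        }
        if len(set(values.values())) > 1:
            conflicts[field] = values
    return conflicts


if __name__ == "__main__":
    # Made-up entries for one resource registered in two catalogues.
    entries = {
        "catalogue_a": {"title": "Example study", "contact": "a@example.org"},
        "catalogue_b": {"title": "Example study ", "contact": "b@example.org"},
    }
    for field, values in find_inconsistencies(entries).items():
        print(field, values)
```

Fields that exist in only one catalogue are not reported, since catalogues often ask for different sets of metadata; only shared fields with differing values are flagged.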

...