Disclaimer: This FAIR Metroline Step focuses solely on the registration of metadata. It does not cover the technical details of metadata schemas or FAIR Data Point, both of which will be detailed in subsequent FAIR Metroline Steps.

‘Perfectly good data resources may go unused simply because no one knows they exist. There are many ways in which digital resources can be made discoverable, including indexing.’ (GO FAIR)

To make your resource (e.g. data) available for reuse, you can publish its metadata in a catalogue. This step helps you find a catalogue in which to register your resource metadata and explains why adding your resource to such a catalogue is important.

Short description 

Metadata describes your resource, whether it is a dataset, article, software, report or other project output. In this step, we explain how to make metadata about your resources available online so others can find them. This helps you make your resources more Findable by registering them in a searchable repository, such as a metadata catalogue.

Metadata catalogues are platforms that store and help you find information about various resources. They allow you to search for existing data relevant for your research, saving time in data collection, and enable others to find your work, thereby increasing collaboration opportunities. Examples of metadata catalogues include the Health-RI Data Catalogue for healthcare and life sciences data and the BBMRI-NL catalogue for biosample information.

Unlike data repositories such as Zenodo or the DANS data stations, metadata catalogues do not store the actual resource, only information about it. A metadata catalogue can link directly to the resource’s location, for example by linking your catalogue entry to your DANS entry through its URL, or it can let others request access via a contact point or data access form. Many data repositories also act as metadata catalogues. For example, when you publish data in DANS, you provide metadata (Title, Description, Keywords) that helps others find your entry within the DANS portal. This blurs the line between metadata catalogues and data repositories (see figure below). Google Scholar illustrates both concepts: it works as a metadata catalogue by indexing information about publications and linking each entry to an external repository, such as Elsevier or PLOS, where the actual publication can be accessed.
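As a rough illustration of the two access routes described above, a catalogue entry can be pictured as a small record that holds descriptive metadata plus either a link to the resource or a contact point. This is only a sketch: the class `CatalogueEntry`, its field names, and the example URL and e-mail address are hypothetical and do not correspond to the schema of any particular catalogue.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CatalogueEntry:
    """Hypothetical metadata record; the catalogue stores this, not the data."""
    title: str
    description: str
    landing_page: Optional[str] = None   # URL of the resource in a repository, if published
    contact_point: Optional[str] = None  # whom to ask for access, if not published

    def how_to_access(self) -> str:
        # Prefer a direct link; fall back to a contact point.
        if self.landing_page:
            return f"Available at {self.landing_page}"
        if self.contact_point:
            return f"Request access via {self.contact_point}"
        return "No access information registered"


# Illustrative entries with made-up values.
published = CatalogueEntry(
    title="Example dataset",
    description="Published in a repository.",
    landing_page="https://example.org/dataset/123",
)
restricted = CatalogueEntry(
    title="Example registry",
    description="Data not published; access on request.",
    contact_point="data-access@example.org",
)

print(published.how_to_access())
print(restricted.how_to_access())
```

If the resource is later published in a repository, only the `landing_page` field of the existing entry needs to be filled in.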

For more information about Data Repositories, see Archiving data | Health-RI and Open Science | ERC (europa.eu).

[Figure: Venn diagram of metadata catalogues and data repositories]

Why is this step important 

The key advantage of using metadata catalogues is that you don’t need to publish your resource, such as data, beforehand. This is useful if your project has just started data collection, or if you have very restrictive data access conditions but still want others to be able to find you. For example, registries holding data about rare disease patients may want to be contacted for diagnostic and therapy discovery purposes without making their actual data available in a repository. If you later decide to publish your resource in a (data) repository for long-term preservation and archiving, you can update the metadata catalogue entry with this new information. More generally, metadata catalogues make research resources, such as data, more visible and accessible.

Metadata catalogues offer a range of benefits to data holders, users and the broader scientific community.

Benefits for data holders.

Benefits for data users.

Benefits for the scientific community.

How to 

How you register resource-level metadata depends on the context of your project and your expertise in metadata and the FAIR principles. Here, we present a generic workflow applicable to most scenarios, but it is advisable to customise it to your context. The workflow emphasises selecting appropriate metadata catalogues for your resources, rather than the technical aspects of metadata schemas.

Step 1 - Inventorise resource types

The first step is to identify and categorise the specific types of resources you are managing. While there is no universally accepted standard list, typical examples of resource types include datasets, code and articles. Within the category of datasets, there are further distinctions such as sociodemographic data, clinical data, imaging data, omics data, and biobank data. The type of resource impacts your choice of metadata catalogue.

Outcome: a list of relevant resources along with their respective types.
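The inventory from this step could be kept in a spreadsheet, but a minimal sketch in Python may help to show its shape. The `Resource` class, its field names and the example entries below are all hypothetical; the type and subtype labels follow the examples mentioned in the text above.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resource:
    """One project output to be registered (field names are illustrative)."""
    name: str
    resource_type: str             # e.g. "dataset", "code", "article"
    subtype: Optional[str] = None  # e.g. "clinical data", "imaging data"


def group_by_type(resources):
    """Group resource names by (type, subtype) to form the Step 1 inventory."""
    inventory = defaultdict(list)
    for resource in resources:
        inventory[(resource.resource_type, resource.subtype)].append(resource.name)
    return dict(inventory)


# Hypothetical example entries, not taken from a real project.
resources = [
    Resource("Baseline questionnaire", "dataset", "sociodemographic data"),
    Resource("MRI scans", "dataset", "imaging data"),
    Resource("Analysis scripts", "code"),
]

inventory = group_by_type(resources)
for (resource_type, subtype), names in inventory.items():
    print(resource_type, subtype or "-", names)
```

Grouping by type and subtype up front makes it easier to see, in the later steps, which resources can share a metadata catalogue.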

Example

Researcher Eva wants to document metadata for her resource, the PRISMA study, and decides to follow the steps on this page. After reviewing the first step, she identified and categorised her resource types as follows:

Step 2 - Determine metadata elements for each resource type

In this step, you need to define the conceptual units of information, known as metadata elements, and collect those elements in a spreadsheet per resource type. Below is an example spreadsheet to capture the resource (sub)type and metadata element with description.

| Resource type | Resource subtype | Metadata element | Description |
| --- | --- | --- | --- |
| Dataset | Lab data | Collection methods | Description of the method or instruments used to collect the data. |
| Dataset | Lab data | Data sources | Information about where or from whom the data was collected. |
| Python code | | Contributors | Names or IDs of other individuals who contributed to the code. |

Questions to consider:

For guidance on this process, read Metroline Step: Assess availability of your metadata.

Outcome: a list of metadata elements tailored for each resource type.
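If you prefer to keep the spreadsheet from this step in a machine-readable form, the sketch below writes the example rows to CSV and summarises the elements per resource type. The column names (`resource_type`, `resource_subtype`, and so on) are assumptions chosen for illustration, and leaving the subtype empty for "Python code" is likewise an assumption; the row contents come from the example spreadsheet above.

```python
import csv
import io

# Assumed column names for the Step 2 spreadsheet.
FIELDS = ["resource_type", "resource_subtype", "metadata_element", "description"]

rows = [
    {"resource_type": "Dataset", "resource_subtype": "Lab data",
     "metadata_element": "Collection methods",
     "description": "Description of the method or instruments used to collect the data."},
    {"resource_type": "Dataset", "resource_subtype": "Lab data",
     "metadata_element": "Data sources",
     "description": "Information about where or from whom the data was collected."},
    {"resource_type": "Python code", "resource_subtype": "",  # subtype assumed empty
     "metadata_element": "Contributors",
     "description": "Names or IDs of other individuals who contributed to the code."},
]


def to_csv(rows):
    """Serialise the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def elements_per_type(rows):
    """Return {resource type: [metadata elements]} in input order."""
    result = {}
    for row in rows:
        result.setdefault(row["resource_type"], []).append(row["metadata_element"])
    return result


csv_text = to_csv(rows)
print(csv_text)
print(elements_per_type(rows))
```

The per-type summary is the outcome of this step: a list of metadata elements tailored to each resource type.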

Example

Having categorised the resource as a dataset, Eva sought to determine which metadata elements would benefit each of her resource types and subtypes. She created a table to organise this information:

| Resource type | Resource subtype | Metadata |
| --- | --- | --- |
| Dataset | | Title; Description; Keywords about datasets; Associated project; Contact point to grant access to datasets |
| Dataset | Biosample | Type of study; Disease studied; Material collected; Number of donors; Associated publications; Associated biobanks; Access policy of samples |
| Dataset | Questionnaire data | Information about tools or questionnaires for data collection; Collection mode (face-to-face, telephone, or online); Sample design (random, stratified, or cluster); Time Period; Population studied; Survey questions |

To ensure comprehensive metadata, she referred to existing standards:

Step 3 - Search for metadata catalogues per resource type 

The next step is to identify metadata catalogues. Platforms like FAIRsharing help researchers locate appropriate metadata catalogues.

Considerations for selecting a metadata catalogue are described below.

More criteria can be found on:

Outcome: a list of candidate metadata catalogues for each resource type.

Example

After determining the necessary metadata elements for the PRISMA data in step 2, Eva consulted a data steward in her department. Together, they compiled a list of available metadata catalogues. Because many repositories also offer cataloguing functionality, existing repositories were considered as well.

Step 4 - Determine metadata catalogues 

In this step, you will weigh the pros and cons of the candidate metadata catalogues and choose those most appropriate for your context. The general suggestion is to prioritise community standards (FAIR principle R1.3): is there a metadata catalogue that is widely used in your community?

Outcome: a finalised list of metadata catalogues for each resource type.
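One possible way to make the comparison of candidates explicit is a simple weighted checklist. The sketch below is only an illustration: the criteria, their weights and the candidate names are hypothetical, with the highest weight given to community use in line with the suggestion above. Adjust all of them to your own context.

```python
# Hypothetical criteria and weights; community use is weighted highest.
WEIGHTS = {
    "community_use": 3,
    "fits_resource_type": 2,
    "supports_access_requests": 1,
}

# Hypothetical candidates, scored 0 (no) or 1 (yes) per criterion.
candidates = {
    "Catalogue A": {"community_use": 1, "fits_resource_type": 1, "supports_access_requests": 0},
    "Catalogue B": {"community_use": 0, "fits_resource_type": 1, "supports_access_requests": 1},
}


def score(answers, weights=WEIGHTS):
    """Weighted sum of the criteria a candidate satisfies."""
    return sum(weight * answers.get(criterion, 0) for criterion, weight in weights.items())


def rank(candidates):
    """Candidate names sorted from highest to lowest score."""
    return sorted(candidates, key=lambda name: score(candidates[name]), reverse=True)


ranking = rank(candidates)
for name in ranking:
    print(name, score(candidates[name]))
```

A score like this is a discussion aid rather than a decision rule; the final choice still depends on the pros and cons that matter in your project.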

Example

Eva selected the appropriate metadata catalogues for each type of dataset.

Step 5 - Enter resource metadata required in the selected metadata catalogues

The final step involves entering the metadata for each resource into the chosen metadata catalogues, following the specific instructions provided by each metadata catalogue. If a resource is registered in multiple metadata catalogues, ensure that the metadata is consistent across all platforms and that the metadata sets are interlinked where possible. Automated updates of metadata are recommended when available.

Outcome: successfully registered resource-level metadata in a FAIR manner, ensuring the resources are Findable, Accessible, Interoperable, and Reusable.
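When a resource is registered in more than one catalogue, a small script can help spot fields that have drifted apart. The following is a minimal sketch, assuming you can export each catalogue entry as a flat dictionary of field names and values; the catalogue names, field names and values are made up for illustration.

```python
def normalise(value):
    """Ignore case and extra whitespace when comparing values."""
    return " ".join(str(value).split()).casefold()


def find_inconsistencies(records):
    """Return fields whose values differ between catalogues.

    records maps a catalogue name to a {field: value} dictionary.
    Fields present in only one catalogue are not reported.
    """
    by_field = {}
    for catalogue, record in records.items():
        for field, value in record.items():
            by_field.setdefault(field, {})[catalogue] = value
    return {
        field: values
        for field, values in by_field.items()
        if len({normalise(v) for v in values.values()}) > 1
    }


# Hypothetical exports of the same resource from two catalogues.
records = {
    "Catalogue A": {"title": "Example study", "contact": "data-steward@example.org"},
    "Catalogue B": {"title": "Example Study ", "contact": "helpdesk@example.org",
                    "keywords": "cohort"},
}

conflicts = find_inconsistencies(records)
for field, values in conflicts.items():
    print(field, values)
```

In this example only the contact field is reported, because the titles differ only in capitalisation and trailing whitespace.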

Example

Eva followed the instructions for onboarding data in the Health-RI data catalogue to register the metadata, which is now available.

Eva downloaded the BBMRI Catalogue Form, entered the biosample metadata, and submitted the form to the Health-RI Service Desk. The metadata for the biosample data were successfully registered, see https://directory.bbmri-eric.eu/ERIC/directory/#/collection/bbmri-eric:ID:NL_RB:collection:155?search=PRISMA.

Expertise requirements for this step 

The level of expertise required for this step will depend on several factors:

Depending on these factors, selecting the appropriate metadata catalogue may be straightforward or may require input from multiple experts. The experts who may need to be involved, following FAIR Metroline Step: Build the Team, are listed below.

Practical examples from the community 

The Netherlands ME/CFS Cohort and Biobank Consortium

The Netherlands ME/CFS Cohort and Biobank (NMCB) consortium, in partnership with patient organisations, is leading the way for the development of a national research infrastructure for Myalgic Encephalomyelitis and Chronic Fatigue Syndrome (ME/CFS).

The current choices of metadata catalogues for NMCB are as follows.

These decisions are described in the first version of a FAIR Implementation Profile, the NMCB FIP; the next version is scheduled for release in October 2024.

Training

Relevant training will be added soon.

Suggestions

This page is under construction. Learn more about the contributors here and explore the development process here. If you have any suggestions, visit our How to contribute page to get in touch.