...

While performing this step, keep your FAIRification goals in mind, since, for example, selecting a relevant subset of the data and defining the driving user question(s) both depend on a thorough understanding of the data.

Let’s say we have a dataset containing patient information with the following metadata:

| Metadata Field | Value |
|---|---|
| Dataset Name | Health Data |
| Date of Upload | 01/02/2023 |
| Keywords | BP, HR, Conditions |
| Creator | Dr. Smith |
| Description | Patient health data including BP and HR |
| Format | CSV |
| Source | Hospital A |
| Rights | Open |

Step 1

Compile all available information about the data elements, data values, and data structure. Examine the data in whatever format and structure it is available. This step helps to identify inconsistencies, ambiguities, and errors in the data.

  1. If you are FAIRifying existing (already collected) data, locate all relevant sources in which the data is stored. Compile information about the following:

  • Which variables are present in the data (e.g. in the eCRFs)?

  • What are the value ranges for each variable?

In our example, we are FAIRifying the existing metadata of a dataset. Let’s compile and examine the information we have.

The variables and value ranges of our metadata are as follows:

  • Dataset Name: The name of the dataset. Range: Text.

  • Date of Upload: The date on which the dataset was uploaded. Range: Date values in the format mm/dd/yyyy.

  • Keywords: Terms that describe the main topics of the dataset. Range: Text.

  • Creator: The person or organisation that created the dataset. Range: Text, in our example title and last name.

  • Description: A brief summary of the dataset. Range: Text.

  • Format: The file format of the dataset. Range: Text, in our example a short string indicating the file format.

  • Source: The origin of the dataset. Range: Text, in our case the name of the institution.

  • Rights: The usage rights or licence of the dataset. Range: Text.
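The inventory above can also be compiled programmatically. The following is a minimal sketch, assuming the example metadata is available as a simple key-value record; the helper names `describe_range` and `compile_inventory` are hypothetical and only illustrate the idea of listing each variable with a coarse value range.

```python
from datetime import datetime

# Example metadata record, copied from the table in this section.
metadata = {
    "Dataset Name": "Health Data",
    "Date of Upload": "01/02/2023",
    "Keywords": "BP, HR, Conditions",
    "Creator": "Dr. Smith",
    "Description": "Patient health data including BP and HR",
    "Format": "CSV",
    "Source": "Hospital A",
    "Rights": "Open",
}

# Assumption: dates follow mm/dd/yyyy, as stated for "Date of Upload".
DATE_FORMAT = "%m/%d/%Y"


def describe_range(value: str) -> str:
    """Return 'Date' if the value parses as mm/dd/yyyy, otherwise 'Text'."""
    try:
        datetime.strptime(value, DATE_FORMAT)
        return "Date"
    except ValueError:
        return "Text"


def compile_inventory(record: dict) -> dict:
    """Map every variable in the record to its coarse value range."""
    return {field: describe_range(value) for field, value in record.items()}


inventory = compile_inventory(metadata)
for field, value_range in inventory.items():
    print(f"{field}: {value_range}")
```

A sketch like this only distinguishes dates from free text. Finer ranges, such as "title and last name" for Creator, still need to be documented by hand.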

  2. If you are aiming to collect FAIR data from the start:

  • Which data elements/variables are you planning to collect? For this, the driving user question(s) might provide some guidance.

  • If possible, determine the value range for each data element (e.g. for ‘biological sex at birth’, values could be ‘male’ or ‘female’, while for ‘age’, the value range might be 0-110).
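Once value ranges are defined, they can be written down in a machine-checkable form. The following is a minimal sketch using the two example elements from the text; the dictionaries `ALLOWED_VALUES` and `NUMERIC_RANGES` and the function `check_record` are hypothetical names, and the ranges themselves are only the illustrative ones given above.

```python
# Hypothetical value ranges, taken from the examples in the text.
ALLOWED_VALUES = {"biological sex at birth": {"male", "female"}}
NUMERIC_RANGES = {"age": (0, 110)}


def check_record(record: dict) -> list:
    """Return a list of human-readable problems found in one record."""
    problems = []
    # Categorical elements: the value must be one of the allowed values.
    for name, allowed in ALLOWED_VALUES.items():
        if name in record and record[name] not in allowed:
            problems.append(f"{name}: unexpected value {record[name]!r}")
    # Numeric elements: the value must lie within the inclusive range.
    for name, (low, high) in NUMERIC_RANGES.items():
        if name in record and not (low <= record[name] <= high):
            problems.append(f"{name}: {record[name]} outside {low}-{high}")
    return problems


valid = check_record({"biological sex at birth": "female", "age": 42})
invalid = check_record({"biological sex at birth": "female", "age": 130})
print(valid)
print(invalid)
```

Keeping the ranges in plain data structures, separate from the checking logic, makes it easy to extend the list as more data elements are defined.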

...