
...

In this step, the aim is to gain more insight into the existing data, or the data that you aim to collect. Clearly defining the meaning (semantics) of the data is an important step for creating the semantic model, as well as for data collection via, for example, electronic case report forms (eCRFs).

To understand the semantics, several aspects of the data elements/variables should be analysed:

The definition/description of data elements. For example, a variable called “sex” could refer to “Biological sex” or “Administrative gender”.
Values for choices. For example, in system A, sex allows for male and female, while in system B, sex also allows for intersex. Such a difference reflects a gap in their semantics.
Relationship between data elements. For example, the “sex” variable is one attribute of the “patient” profile, which may imply that the semantics of this “sex” variable is “sex of patient”.
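
The difference in permitted values between two systems can be made explicit with a small comparison. The following is a minimal sketch in Python, not a prescribed method: the value sets mirror the system A / system B example above, and the function name `compare_value_sets` is hypothetical.

```python
def compare_value_sets(values_a, values_b):
    """Return the values shared by both systems and those unique to each."""
    a, b = set(values_a), set(values_b)
    return {
        "shared": sorted(a & b),
        "only_in_a": sorted(a - b),
        "only_in_b": sorted(b - a),
    }


# Permitted values for "sex" in the two example systems.
system_a_sex = ["male", "female"]
system_b_sex = ["male", "female", "intersex"]

gap = compare_value_sets(system_a_sex, system_b_sex)
print(gap)
```

Any value that appears under `only_in_a` or `only_in_b` signals a semantic gap that should be resolved before the two variables are treated as equivalent.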

The outcome of this step should be a set of data elements (variables) with clear and unambiguous semantics (a codebook), which reflect the information you want to collect or share. Be aware that finding machine-actionable items from ontologies for the data elements is not yet part of this step, but is described in Create or reuse a semantic (meta)data model.

[SdR, how about:]

Understanding and clearly defining the meaning (semantics) of (meta)data is an important step for creating the semantic model, as well as for data collection via, for example, electronic case report forms (eCRFs). In this step, the aim is to ensure you gain a clear and unambiguous understanding of the (meta)data. The step provides guidance for both existing data and data that must still be collected.

To illustrate the issue, consider the example where you receive a dataset with a variable called “sex”. Without clearly defined semantics, it is unclear whether this means “biological sex at birth”, “phenotypic sex”, or “administrative gender”. This must be resolved before you can start with the semantic (meta)data model.

Thus, the outcome of this step is a set of data elements (variables) with clear and unambiguous semantics, known as a codebook. Note that finding machine-actionable items from ontologies for the data elements is not yet part of this step, but is described in Create or reuse a semantic (meta)data model.

...

While performing this step, keep your FAIRification goals in mind. If you have a clear idea of your FAIRification goals, it might be easier to define what elements should be present in your (meta)data and how these elements should be represented.

...

For analysing data semantics:

[Image]

For analysing metadata semantics:

[Image]

For easier understanding, we will follow the example dataset containing patient information with the following metadata:

...

Metadata field / Variable	Description of the field	Value range
Dataset Name	The name of the dataset.	Text
Date of Upload	The date on which the dataset was uploaded	Date values in the format MM/DD/YYYY
Keywords	Terms that describe the main topics of the dataset	Text
Creator	The person or organisation that created the dataset	Text, in our example title and last name
Description	A brief summary of the dataset	Text
Format	The file format of the dataset	Text, in our example a short string indicating the file format
Source	The origin of the dataset	Text, in our case the name of the institution
Rights	The usage rights or licence of the dataset	Text
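
As an illustration, the value ranges in the table above could be checked programmatically. The sketch below is one possible approach in Python, under stated assumptions: the function names `is_valid_upload_date` and `check_metadata` are hypothetical, every field is treated as required (the table does not say so), and the sample record contains invented placeholder values.

```python
import re
from datetime import datetime

# Field names taken from the metadata table; treating all as required is an assumption.
REQUIRED_FIELDS = [
    "Dataset Name", "Date of Upload", "Keywords", "Creator",
    "Description", "Format", "Source", "Rights",
]


def is_valid_upload_date(value):
    """Check that a string is a real calendar date written as MM/DD/YYYY."""
    if not re.fullmatch(r"\d{2}/\d{2}/\d{4}", value):
        return False
    try:
        datetime.strptime(value, "%m/%d/%Y")
    except ValueError:
        return False
    return True


def check_metadata(record):
    """Return a list of human-readable problems found in a metadata record."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{field}: missing or empty")
    date_value = record.get("Date of Upload")
    if isinstance(date_value, str) and date_value.strip():
        if not is_valid_upload_date(date_value):
            problems.append("Date of Upload: expected MM/DD/YYYY")
    return problems


# Invented placeholder record, for illustration only.
sample_record = {
    "Dataset Name": "Example patient dataset",
    "Date of Upload": "03/15/2024",
    "Keywords": "patients, example",
    "Creator": "Dr. Example",
    "Description": "Placeholder description.",
    "Format": "CSV",
    "Source": "Example Institution",
    "Rights": "Placeholder licence",
}

print(check_metadata(sample_record))
```

A record with the date written as `2024-03-15` would be reported as a problem, which is the kind of ambiguity this step aims to surface early.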

b) In case you are aiming to collect FAIR (meta)data from the start:

Which data elements/variables are you planning to collect? For this, the competency questions (CQs) might provide some guidance.
If possible, determine the value range for each data element (e.g. for ‘biological sex at birth’, the values could be ‘male’ and ‘female’, while for ‘age’, the value range might be 0-110).
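
The planned data elements and their value ranges can be recorded in a simple codebook structure. The following Python sketch shows one way this might look; the class name `DataElement`, its fields, and the definitions given for each element (including age being in years) are assumptions made for illustration, while the value ranges come from the example above.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class DataElement:
    """One codebook entry: a name, a definition, and its permitted values."""
    name: str
    definition: str
    allowed_values: Optional[FrozenSet[str]] = None
    numeric_range: Optional[Tuple[int, int]] = None

    def accepts(self, value):
        """Return True if the value falls within this element's value range."""
        if self.allowed_values is not None:
            return value in self.allowed_values
        if self.numeric_range is not None:
            low, high = self.numeric_range
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            return is_number and low <= value <= high
        return True  # no value range recorded yet


# Definitions are illustrative assumptions; value ranges follow the example above.
codebook = {
    "biological sex at birth": DataElement(
        name="biological sex at birth",
        definition="Biological sex of the patient as recorded at birth.",
        allowed_values=frozenset({"male", "female"}),
    ),
    "age": DataElement(
        name="age",
        definition="Age of the patient in years.",
        numeric_range=(0, 110),
    ),
}

print(codebook["age"].accepts(42))
print(codebook["biological sex at birth"].accepts("intersex"))
```

Writing the value range down next to the definition keeps both in one place, so later checks on collected data can refer to a single source.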

Step 2 - Check for an existing standard/codebook

...
