Short description 

‘… selecting a relevant subset of the data and defining driving user questions(s) are highly relying on being familiar with the data’ (Generic)

The aim of this step is to gain more insight into the existing data, or the data that you aim to collect. Clearly defining the meaning (semantics) of the data is an important step for creating the semantic model, as well as for data collection via, for example, electronic case report forms.

To understand the semantics of the data, analyse the data values (i.e. the meaning of data elements), the data representation (format), and the structural information (i.e. relationships between data elements).

The goal is a set of data elements with clear and unambiguous semantics, which reflect the information you want to collect or share.

Why is this step important

Even though this step has no clearly defined deliverable, several of the steps that follow rely on being familiar with your data. For example, in order to create or reuse your semantic (meta)data model, it is important to understand the elements and structure of your existing data, or data to be collected. Furthermore, a good understanding of your data is closely connected to the FAIRification goals, since these can depend on the data elements.

How to

While performing this step, keep your FAIRification goals in mind: selecting a relevant subset of the data and defining the driving user question(s), for example, both depend on a thorough understanding of the data.

Step 1

Check the data in whatever format and structure it is available.

Step 2

Check which data elements are present and how they relate to each other. For example, if the dataset is stored in a relational database, the relational schema provides information about the structure of the dataset, the fields involved (their names and types), cardinality, etc.
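As an illustration of reading structure from a relational schema, the sketch below lists the tables, columns and foreign keys of an SQLite database using Python's standard sqlite3 module. The `patient` and `visit` tables and the helper name `describe_schema` are hypothetical and only serve as an example; other database systems expose the same information through their own catalogue views.

```python
import sqlite3

def describe_schema(conn):
    """Return, per table, its columns and its foreign keys."""
    schema = {}
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name")]
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = [(r[1], r[2], bool(r[3]))
                   for r in conn.execute(f'PRAGMA table_info("{table}")')]
        # PRAGMA foreign_key_list rows:
        # (id, seq, table, from, to, on_update, on_delete, match)
        foreign_keys = [(r[3], r[2], r[4])
                        for r in conn.execute(f'PRAGMA foreign_key_list("{table}")')]
        schema[table] = {"columns": columns, "foreign_keys": foreign_keys}
    return schema

# Hypothetical example database, created in memory for the demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient (
    patient_id TEXT PRIMARY KEY,
    birth_year INTEGER NOT NULL
);
CREATE TABLE visit (
    visit_id   INTEGER PRIMARY KEY,
    patient_id TEXT NOT NULL REFERENCES patient(patient_id),
    visit_date TEXT
);
""")

schema = describe_schema(conn)
for table, info in schema.items():
    print(table, info)
```

The foreign key from `visit.patient_id` to `patient.patient_id` shows that one patient can have many visits, which is the kind of cardinality information this step is after.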

Step 3

Check the data semantics. Is the meaning of the data elements clear and unambiguous?
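If a data dictionary or codebook is available, part of this check can be supported by a small script. The sketch below assumes a hypothetical dictionary layout (keys `type`, `definition`, `unit`, `allowed_values`) and flags elements whose meaning is likely to be ambiguous. It does not replace a review by a domain expert.

```python
def find_unclear_elements(data_dictionary):
    """Return {element name: [problems]} for elements with unclear semantics."""
    issues = {}
    for name, entry in data_dictionary.items():
        problems = []
        # An element without a written definition cannot be interpreted reliably.
        if not (entry.get("definition") or "").strip():
            problems.append("no definition")
        # A number without a unit is ambiguous (kg or lb? cm or m?).
        if entry.get("type") == "number" and not entry.get("unit"):
            problems.append("numeric without unit")
        # A coded element needs the list of codes and their meaning.
        if entry.get("type") == "code" and not entry.get("allowed_values"):
            problems.append("coded without value list")
        if problems:
            issues[name] = problems
    return issues

# Hypothetical data dictionary, for illustration only.
data_dictionary = {
    "weight": {"type": "number", "definition": "Body weight at inclusion", "unit": "kg"},
    "height": {"type": "number", "definition": "Body height at inclusion"},
    "sex": {"type": "code", "definition": "", "allowed_values": {"1": "male", "2": "female"}},
    "status": {"type": "code", "definition": "Vital status at last follow-up"},
}

issues = find_unclear_elements(data_dictionary)
for name, problems in issues.items():
    print(name, problems)
```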

Step 4

Check whether the data representation (format) is clear and unambiguous. Investigate which types of data are present, for example numbers, dates, coded values or free text.
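One way to investigate the representation is to profile the raw values of each data element. The sketch below is a minimal example that assumes the values are available as strings (for instance read from a CSV export). The two date formats it recognises are an arbitrary choice for the illustration, and the function names are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Date formats to recognise; an assumption for this example, extend as needed.
DATE_FORMATS = (("date (YYYY-MM-DD)", "%Y-%m-%d"), ("date (DD/MM/YYYY)", "%d/%m/%Y"))

def classify(value):
    """Classify one raw string value by how it is represented."""
    text = value.strip()
    if text == "":
        return "missing"
    try:
        int(text)
        return "integer"
    except ValueError:
        pass
    try:
        float(text)
        return "decimal"
    except ValueError:
        pass
    for label, fmt in DATE_FORMATS:
        try:
            datetime.strptime(text, fmt)
            return label
        except ValueError:
            pass
    return "text"

def profile_column(values):
    """Count how often each representation occurs in a column."""
    return Counter(classify(v) for v in values)

# Hypothetical columns with mixed representations.
visit_dates = ["2021-03-04", "04/03/2021", "", "unknown"]
weights = ["70", "71.5", "72,3"]

date_profile = profile_column(visit_dates)
weight_profile = profile_column(weights)
print(date_profile)
print(weight_profile)
```

A column that falls into more than one category, such as the two date formats or the decimal comma in `72,3` above, is a sign that the representation is not yet unambiguous.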

Step 5

In addition, check whether the data already contains FAIR features, such as persistent unique identifiers for data elements (for more information, see pre-FAIR assessment).
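A first, rough check of identifiers can be automated. The sketch below assumes the identifiers of a dataset are available as a list of strings and reports missing values, duplicates, and identifiers that do not look like globally resolvable HTTP(S) IRIs. Whether an identifier is actually persistent cannot be derived from the data itself and still needs to be verified with the identifier provider. The function name and example values are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse

def looks_global(identifier):
    """True if the identifier is an http(s) IRI with a host name."""
    parsed = urlparse(identifier)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)

def assess_identifiers(identifiers):
    """Summarise missing, duplicated and local-only identifiers in a list."""
    present = [i.strip() for i in identifiers if i and i.strip()]
    counts = Counter(present)
    return {
        "missing": len(identifiers) - len(present),
        "duplicates": sorted(i for i, n in counts.items() if n > 1),
        "not_global": sorted(i for i in counts if not looks_global(i)),
    }

# Hypothetical identifier column.
identifiers = [
    "https://example.org/patient/1",
    "https://example.org/patient/2",
    "P003",
    "P003",
    "",
]

report = assess_identifiers(identifiers)
print(report)
```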

Step 6

Define or align common data elements (CDEs). For a new data collection, define CDEs whose semantics are clear and unambiguous; for an existing dataset, align the existing data elements to CDEs.
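For an existing dataset, the alignment can be recorded as an explicit mapping from local field names to CDE names. The sketch below applies such a mapping to a single record and reports the fields that have no CDE yet. All field and CDE names are hypothetical, and the sketch only renames elements; aligning the values themselves (for example code lists or units) is a separate task.

```python
def align_to_cdes(record, mapping):
    """Rename the fields of a record to CDE names; list fields without a CDE."""
    aligned, unmapped = {}, []
    for field, value in record.items():
        cde = mapping.get(field)
        if cde is None:
            unmapped.append(field)
        else:
            aligned[cde] = value
    return aligned, unmapped

# Hypothetical mapping from local field names to CDE names.
mapping = {"dob": "date_of_birth", "gender": "sex"}

# Hypothetical record from the existing dataset.
record = {"dob": "1980-01-01", "gender": "F", "remark": "first visit"}

aligned, unmapped = align_to_cdes(record, mapping)
print(aligned)
print(unmapped)
```

Fields that end up in `unmapped` either need a new CDE, or can be left out if they are not relevant to the FAIRification goals.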

Expertise requirements for this step 

The following experts may need to be involved, as described in Metroline Step: Build the Team.

  • Data specialist. Specialists who can help to understand the structure of the data.

  • Domain expert. Specialists who can help to understand the data elements.

Practical examples from the community 

Examples of how this step is applied in a project (link to demonstrator projects).  

Tools and resources on this page

None.

Training

Relevant training will be added in the future if available.

Suggestions

Visit our How to contribute page for information on how to get in touch if you have any suggestions about this page.
