Short Description 

‘… selecting a relevant subset of the data and defining driving user questions(s) are highly relying on being familiar with the data’ (Generic)

In this step, the aim is to gain more insight into the existing data, or into the data that you aim to collect. Clearly defining the meaning (semantics) of the data is an important step towards creating the semantic model, as well as for data collection via, e.g., electronic case report forms (eCRFs).

...

  • Data specialist: can help with understanding of the data structure.

  • Domain expert: can help with understanding of data elements.

How to 

When analyzing the data semantics of an existing data set or setting up a new data collection, consider the following:

...

Format and Structure: What is the format in which the data is available? What is the structure of the data?

...


While performing this step, keep your FAIRification goals in mind, since, for example, selecting a relevant subset of the data and defining driving user question(s) depend on a thorough understanding of the data.

Step 1

Check the data in whatever format and structure it is available.

Step 2

Check which data elements are present and how they relate to one another. For example, if the dataset is stored in a relational database, the relational schema provides information about the structure of the dataset, the fields involved (their names and types), cardinality, etc.
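As an illustration of reading structure from a relational schema, the sketch below lists tables, columns, declared types, and foreign keys. It assumes the dataset lives in a SQLite database; the `patient` and `visit` tables and the `describe_schema` helper are hypothetical and only serve the example. Other database systems expose the same information through their own catalog views.

```python
import sqlite3


def describe_schema(conn):
    """Return, per table, its columns and its foreign-key relations."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # table_info rows: (cid, name, type, notnull, dflt_value, pk)
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        # foreign_key_list rows: (id, seq, table, from, to, ...)
        fks = conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()
        schema[table] = {
            "columns": [(c[1], c[2], bool(c[5])) for c in cols],
            "foreign_keys": [(fk[3], fk[2], fk[4]) for fk in fks],
        }
    return schema


# Hypothetical example database, created in memory for the demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE patient (
        patient_id INTEGER PRIMARY KEY,
        birth_year INTEGER
    );
    CREATE TABLE visit (
        visit_id   INTEGER PRIMARY KEY,
        patient_id INTEGER REFERENCES patient(patient_id),
        visit_date TEXT
    );
    """
)

schema = describe_schema(conn)
for table, info in schema.items():
    print(table, info)
```

The foreign keys show how data elements relate (here, each visit refers to one patient), which is the kind of structural information a data specialist can help interpret.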

...

Data Representation: Is the data format clear and unambiguous? What are the data types?

...

Step 3

Check the data semantics. Is the meaning of the data elements clear and unambiguous?

Step 4

Check whether the data representation is clear and unambiguous. Investigate which types of data are present.
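One way to investigate which types of data are present is to classify every value in a tabular export and look for columns that mix representations. The sketch below assumes the data is available as CSV; the sample rows, the `classify` rules, and the `summarise_types` helper are hypothetical and deliberately simple.

```python
import csv
import io
from datetime import date


def classify(value):
    """Assign a coarse type label to a single text value."""
    v = value.strip()
    if v == "":
        return "missing"
    try:
        int(v)
        return "integer"
    except ValueError:
        pass
    try:
        float(v)
        return "float"
    except ValueError:
        pass
    try:
        date.fromisoformat(v)
        return "date"
    except ValueError:
        pass
    return "text"


def summarise_types(rows):
    """Map each column name to the set of type labels found in it."""
    summary = {}
    for row in rows:
        for column, value in row.items():
            summary.setdefault(column, set()).add(classify(value))
    return summary


# Hypothetical sample data with inconsistent representations.
SAMPLE = """patient_id,birth_year,weight,visit_date
P001,1980,70.5,2021-03-01
P002,1975,unknown,2021-03-04
P003,,82,04/03/2021
"""

type_summary = summarise_types(csv.DictReader(io.StringIO(SAMPLE)))

# Columns with more than one non-missing type need a closer look.
# Note: integers and floats count as different types in this sketch.
ambiguous = sorted(
    column
    for column, kinds in type_summary.items()
    if len(kinds - {"missing"}) > 1
)
print(ambiguous)
```

In this sample, `weight` mixes numbers with free text and `visit_date` mixes ISO dates with another date notation, so both would need clarification with a domain expert before modelling.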

Step 5

In addition, check whether the data already contains FAIR features, such as persistent unique identifiers for data elements (for more information, see pre-FAIR assessment).
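A first, purely syntactic check for identifiers can be sketched as follows. It assumes that an identifier should be present, unique, and written as an HTTP(S) IRI; that criterion, the example values, and the helper names are assumptions made for this illustration. Such a check cannot establish that an identifier is actually persistent, which depends on the policy of whoever issues it.

```python
from collections import Counter
from urllib.parse import urlparse


def looks_like_iri(value):
    """True if the value parses as an http(s) IRI with a host part."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def identifier_report(values):
    """Summarise missing, duplicated, and non-IRI identifier values."""
    cleaned = [v.strip() for v in values]
    counts = Counter(v for v in cleaned if v)
    return {
        "missing": sum(1 for v in cleaned if not v),
        "duplicates": sorted(v for v, n in counts.items() if n > 1),
        "not_iri": sorted(v for v in counts if not looks_like_iri(v)),
    }


# Hypothetical identifier column taken from a dataset.
identifiers = [
    "https://example.org/patient/1",
    "https://example.org/patient/2",
    "https://example.org/patient/2",
    "P003",
    "",
]

id_report = identifier_report(identifiers)
print(id_report)
```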

Step 6

Define or align common data elements (CDEs). For a new data collection: define CDEs whose semantics are clear and unambiguous; for an existing data set, existing data elements can be aligned to CDEs.
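For an existing data set, the alignment can be recorded as an explicit mapping from local field names to CDEs, which also makes unmapped fields visible. The sketch below is illustrative only: the CDE names and definitions, the local field names, and the `align` helper are hypothetical and not taken from any published CDE set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonDataElement:
    name: str
    definition: str
    datatype: str


# Hypothetical CDEs with explicit definitions and data types.
CDES = {
    "birth_date": CommonDataElement(
        "birth_date", "Date on which the person was born", "date"
    ),
    "sex": CommonDataElement(
        "sex", "Sex of the person as recorded at birth", "code"
    ),
}

# Hypothetical mapping from local field names (lower case) to CDE keys.
FIELD_TO_CDE = {"dob": "birth_date", "sex": "sex", "gender": "sex"}


def align(fields, field_to_cde, cdes):
    """Split local fields into those aligned to a CDE and those unmapped."""
    aligned, unmapped = {}, []
    for field in fields:
        key = field_to_cde.get(field.lower())
        if key in cdes:
            aligned[field] = cdes[key]
        else:
            unmapped.append(field)
    return aligned, unmapped


aligned, unmapped = align(["DOB", "Gender", "notes"], FIELD_TO_CDE, CDES)
print({field: cde.name for field, cde in aligned.items()})
print(unmapped)
```

Fields that remain unmapped, such as `notes` here, either need a new CDE or a decision that they fall outside the subset of data being FAIRified.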

References & Further reading

...