Status: On Hold
On 17-9-2024 it was decided to put this page on hold and focus on describing the petal process first. When that part is finished, parts of the information (see e.g. step 4) will be generalised for this page.

Short description 

`Generating a semantic model is often the most time-consuming step of data FAIRification. However, we expect the modelling effort to diminish as more and more models are made available for reuse over time, especially if such models are treated as FAIR digital objects themselves. Thus, it is important to first check whether a semantic model already exists for the data and the metadata that may be reused. For cases where no semantic model is available a new one needs to be generated.` (Generic)

...

Semantic modelling makes your data and metadata machine-actionable, which enables secondary use of your data. After performing this step, your data is represented as FAIR digital objects (FDOs). FDOs are digital objects identified by a Globally Unique, Persistent and Resolvable IDentifier (GUPRID) and described by metadata. This enables the transformed FAIR data set to be efficiently incorporated in other systems, analysis workflows, and unforeseen future applications.

Expertise requirements for this step 

Experts that may need to be involved, as described in Metroline Step: Build the Team, include:

  • Semantic data modelling specialist: creates a new (meta)data model or applies an existing one, ensures that the semantic representation correctly represents the domain knowledge.

  • Domain expert: makes sure that the exact meaning of the data is understood by the modeller.

In the BEAT-COVID project, ontological models for data records were developed in collaboration with data collectors, data managers, data analysts and medical doctors [BEAT-COVID paper].

How to 

(I) Reusing a semantic (meta)data model

...

  • list the main concepts (classes) of the data elements to be FAIRified;

  • identify the relationships between the data elements.
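
As an illustration, the result of this inventory can be written down as a small set of classes and the relationships between them. The sketch below is hypothetical: the class names (Patient, Sample, Measurement, Cytokine) and relationship labels are assumptions chosen for illustration and are not taken from an existing model. It also reports classes that are not yet connected to anything, which is often a sign that a relationship still needs to be discussed with the domain expert.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relationship:
    subject: str
    predicate: str
    obj: str


@dataclass
class ConceptualModel:
    classes: set = field(default_factory=set)
    relationships: list = field(default_factory=list)

    def add_relationship(self, subject, predicate, obj):
        # Both ends of a relationship must be declared concepts.
        for name in (subject, obj):
            if name not in self.classes:
                raise ValueError(f"Unknown class: {name}")
        self.relationships.append(Relationship(subject, predicate, obj))

    def unconnected_classes(self):
        # Classes that do not take part in any relationship yet.
        used = {r.subject for r in self.relationships}
        used |= {r.obj for r in self.relationships}
        return sorted(self.classes - used)


# Hypothetical concepts and relationships, for illustration only.
model = ConceptualModel(classes={"Patient", "Sample", "Measurement", "Cytokine"})
model.add_relationship("Sample", "taken from", "Patient")
model.add_relationship("Measurement", "performed on", "Sample")

print(model.unconnected_classes())  # prints ['Cytokine']
```

In this example "Cytokine" is listed as a concept but has no relationship yet, so the model is incomplete on that point.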

It is important that both the data representation (format) and the meaning of the data elements (the data semantics) are clear and unambiguous (see Analyse data semantics).

To help you understand what you would like to include in your model, you can start by creating a list of questions that the data should be able to answer (competency questions). These can serve as a guide to identify the most relevant (meta)data elements to model.

...

...

...

https://github.com/LUMC-BioSemantics/beat-covid/blob/master/fair-data-model/brainstorming/docs/brainstorming_models.pptx

Step 2: Search for ontology terms

...

Ontologies for different purposes can also be found in the FAIR cookbook, as well as on this page.

When choosing an ontology, several selection criteria might apply (from FAIR cookbook):

...

Finally, combine the conceptual model and the ontology terms to create the detailed semantic data model. This model distinguishes between the data items (instances and their values) and their types (classes), is an exact representation of the data and exposes the meaning of the data in machine-readable terms. 
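
To make the distinction between data items and their types concrete, the following minimal sketch writes a few statements as subject-predicate-object triples and serialises them in N-Triples syntax. It uses only the Python standard library. The `https://example.org/` namespace and every term under it (`onto/Measurement`, `onto/hasValue`, `onto/performedOn`, `onto/Sample`) are hypothetical placeholders. In a real model these would be replaced by the ontology terms selected in the previous step.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"
EX = "https://example.org/"  # hypothetical namespace, for illustration only


def iri(value):
    # Wrap an IRI in angle brackets, as N-Triples requires.
    return f"<{value}>"


def literal(value, datatype):
    # A typed literal: the value itself plus its datatype IRI.
    return f'"{value}"^^<{datatype}>'


triples = [
    # The instance measurement-1 has the type (class) Measurement.
    (iri(EX + "data/measurement-1"), iri(RDF_TYPE), iri(EX + "onto/Measurement")),
    # The value of the instance is stored as a typed literal.
    (iri(EX + "data/measurement-1"), iri(EX + "onto/hasValue"), literal(12.5, XSD_DOUBLE)),
    # A relationship between two instances.
    (iri(EX + "data/measurement-1"), iri(EX + "onto/performedOn"), iri(EX + "data/sample-1")),
    (iri(EX + "data/sample-1"), iri(RDF_TYPE), iri(EX + "onto/Sample")),
]


def to_ntriples(rows):
    # One statement per line, each terminated by " ."
    return "\n".join(f"{s} {p} {o} ." for s, p, o in rows)


print(to_ntriples(triples))
```

In practice a dedicated RDF library would normally handle serialisation; the hand-written version here is only meant to show how instances, values and classes appear as separate parts of the model.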

[Image: Screenshot 2024-06-04 at 14.14.37.png]

[Image: ontological_model-20240429-143058.png]

https://github.com/LUMC-BioSemantics/beat-covid/blob/master/fair-data-model/cytokine/model-triples/ontological_model.png

...

Repeat this step until the model no longer shows major errors when checked against the competency questions.

[Optional] Step 5: Evaluation of semantic (meta)data models

To verify the semantic model, competency questions (CQs) can be used. CQs are an efficient way of testing models, since they are based on real questions. A CQ is evaluated by means of the query used to answer it. In other words, if it is possible to write a query that returns proper answers to the question, then the CQ is validated.

In the BEAT-COVID project, the ontological models were evaluated using competency questions based on realistic questions posed by data model users. These questions serve to verify the scope of the model (e.g., what is relevant to solve the challenges) and the relationships between concepts (e.g., to check for missing or redundant relationships). A preliminary set of CQs from meetings with domain experts is available on GitHub: https://github.com/LUMC-BioSemantics/beat-covid/tree/master/fair-data-model/cytokine/competency-questions
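
The idea of validating a CQ through its query can be sketched as follows. This is a simplified illustration in plain Python, not the approach used in any specific project: the triples, the identifiers (`measurement-1`, `sample-1`, `patient-1`) and the helper functions `match`, `measurements_for_patient` and `evaluate_cq` are all hypothetical. In practice the model would typically be queried with a query language such as SPARQL. The sketch also treats "returns at least one answer" as a stand-in for "returns proper answers", and whether the answers are actually correct still has to be judged by a domain expert.

```python
# Hypothetical data as (subject, predicate, object) triples.
TRIPLES = [
    ("measurement-1", "type", "Measurement"),
    ("measurement-1", "performedOn", "sample-1"),
    ("measurement-2", "type", "Measurement"),
    ("measurement-2", "performedOn", "sample-2"),
    ("sample-1", "takenFrom", "patient-1"),
]


def match(triples, s=None, p=None, o=None):
    # Return all triples that fit the pattern; None acts as a wildcard.
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def measurements_for_patient(triples, patient):
    # Query for the CQ "Which measurements exist for a given patient?"
    samples = {s for s, _, _ in match(triples, p="takenFrom", o=patient)}
    return sorted(s for s, _, o in match(triples, p="performedOn") if o in samples)


def evaluate_cq(question, query):
    # Simplification: a CQ counts as answerable if its query returns anything.
    answers = query()
    return {"question": question, "answers": answers, "answerable": bool(answers)}


result = evaluate_cq(
    "Which measurements exist for patient-1?",
    lambda: measurements_for_patient(TRIPLES, "patient-1"),
)
print(result["answers"], result["answerable"])  # prints ['measurement-1'] True
```

Running the same query for a patient that is not linked to any sample returns no answers, which points to a missing relationship in the data or the model.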

...

Practical examples from the community

This section should show the step applied in a real project. Links to demonstrator projects. 

...

[BEAT-COVID project] https://jbiomedsem.biomedcentral.com/articles/10.1186/s13326-022-00263-7

Authors / Contributors 

Experts whom you can contact for further information 

Tools and resources on this page

Add the tools and resources mentioned on this page. This should be a list of usable content and does not include textual resources such as journal references.

Training

Relevant training will be added in the future if available.

Suggestions

Visit our How to contribute page for information on how to get in touch if you have any suggestions about this page.