Metroline Step: Assess FAIRness


Status: In development

Short description 

In this step of the post-FAIRification phase, check whether you have reached the FAIRification goals you set initially [Generic]. If your goals have not yet been met, you may need to revisit some of the earlier steps or, if that is not feasible, adjust your goals. You can also check the FAIRness of your (meta)data using FAIRness assessment tooling. If a pre-FAIR assessment was done initially (step 5), you can compare the results of the pre- and post-FAIRification assessments. Such tooling can also help you track the progress of your data towards FAIRness [FAIRopoly].

Why is this step important 

You set out with FAIR goals in mind. This step helps you assess whether you reached them and pinpoint where further work is necessary.

The FAIR Assessment Tools section of the FAIR Guidance mentions the following (see How to):

With the encouragement of journal editors and other stakeholders who have a need to evaluate author/researcher claims regarding the FAIRness of their outputs, a group consisting of FAIR experts, journal editors, data repository hosts, internet researchers, and software developers assembled to jointly define a set of formal metrics that could be applied to test the FAIRness of a resource.   

Q: is there movement towards journals demanding some form of formal statistics when claims towards FAIRness of data are made? If so, that should be added here, since you’d need to objectively assess FAIRness for publications.  

How to 

Using FAIR Implementation Profiles (FIPs) to assess FAIRness?

 

The FAIR Cookbook has a chapter on this topic: https://faircookbook.elixir-europe.org/content/recipes/assessing-fairness.html

(Still need to check the actual content)

Furthermore: FAIR Evaluator (FAIRopoly and FAIR Guidance) – text copied below.

Also, perhaps FIPs?

 

FAIRopoly 

  • As a task under the objectives of the EJP RD, we created a set of software packages – The FAIR Evaluator – that coded each Metric into an automatable software-based test, and created an engine that could automatically apply these tests to the metadata of any dataset, generating an objective, quantitative score for the ‘FAIRness’ of that resource, together with advice on what caused any failures (https://www.nature.com/articles/s41597-019-0184-5).  With this information, a data owner would be able to create a strategy to improve their FAIRness by focusing on “priority failures”.  The public version of The FAIR Evaluator (https://w3id.org/AmIFAIR) has been used to assess >5500 datasets.  Within the domain of rare disease registries, a recent publication about the VASCA registry shows how the Evaluator was used to track their progress towards FAIRness (https://www.medrxiv.org/content/10.1101/2021.03.04.21250752v1.full.pdf). To date, no resource – public or private – has ever passed all 22 tests, showing that FAIR assessment is able to provide guidance to even highly-FAIR resources. 

  • The FAIR evaluation results can serve as a pointer to where your FAIRness can be improved. 
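As a rough illustration of using evaluation results as such a pointer, the sketch below compares a pre-FAIRification and a post-FAIRification assessment. It assumes each assessment has already been reduced to a simple mapping from test name to pass/fail. That mapping, the test names and the helper functions are hypothetical and do not reflect the actual output format of the FAIR Evaluator.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Comparison:
    improved: list[str]       # failed before, passes now
    regressed: list[str]      # passed before, fails now
    still_failing: list[str]  # failed in both assessments
    score_before: float       # fraction of shared tests passed before
    score_after: float        # fraction of shared tests passed after


def pass_fraction(results: dict[str, bool]) -> float:
    """Fraction of tests that passed (0.0 for an empty result set)."""
    return sum(results.values()) / len(results) if results else 0.0


def compare_assessments(pre: dict[str, bool], post: dict[str, bool]) -> Comparison:
    """Compare two assessments on the tests that appear in both."""
    shared = sorted(pre.keys() & post.keys())
    return Comparison(
        improved=[t for t in shared if not pre[t] and post[t]],
        regressed=[t for t in shared if pre[t] and not post[t]],
        still_failing=[t for t in shared if not pre[t] and not post[t]],
        score_before=pass_fraction({t: pre[t] for t in shared}),
        score_after=pass_fraction({t: post[t] for t in shared}),
    )


# Hypothetical test names and outcomes, for illustration only.
pre_fair = {"F1_identifier": False, "F2_metadata": False,
            "A1_protocol": True, "R1_license": False}
post_fair = {"F1_identifier": True, "F2_metadata": False,
             "A1_protocol": True, "R1_license": True}

result = compare_assessments(pre_fair, post_fair)
print("Improved:", result.improved)
print("Still failing:", result.still_failing)
print(f"Score: {result.score_before:.2f} -> {result.score_after:.2f}")
```

The "still failing" and "regressed" lists are the natural candidates for the next round of FAIRification work.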

 

FAIR Guidance [https://www.ejprarediseases.org/fair_guidance/]

FAIR Assessment Tools 

There is growing interest in the degree to which digital resources adhere to the goals of FAIR – that is, to be Findable, Accessible, Interoperable, and Reusable by both humans and, more importantly, by machines acting on behalf of their human operator. Unfortunately, the path to FAIRness was left undefined by the original FAIR Principles paper, which chose to remain agnostic about which technologies or approaches were appropriate. As such, until recently, it has been impossible to make objectively valid statements about the degree to which a data object exhibits “FAIRness”.

With the encouragement of journal editors and other stakeholders who have a need to evaluate author/researcher claims regarding the FAIRness of their outputs, a group consisting of FAIR experts, journal editors, data repository hosts, internet researchers, and software developers assembled to jointly define a set of formal metrics that could be applied to test the FAIRness of a resource.  The first edition of these metrics was aimed at self-assessment, in the form of a questionnaire; however, upon review of the validity of several completed self-assessments by data owners, we determined that the questions were often answered inconsistently, or incorrectly (knowingly or unknowingly), and often the data provider did not know enough about the data publishing environment to answer the questions at all.  As such, a smaller group of FAIR experts created a second generation of FAIR Metrics that aimed to be fully automatable.  The result was a set of 22 Metrics spanning most FAIR principles and sub-principles, which explicitly describe what is being tested, which FAIR Principle it applies to, why it is important to test this (meta)data feature, exactly how the test will be conducted, and what will be considered a successful result.  

 

As a task under the objectives of the EJP RD, we created a set of software packages – The FAIR Evaluator – that coded each Metric into an automatable software-based test, and created an engine that could automatically apply these tests to any dataset, generating an objective, quantitative score for the ‘FAIRness’ of that dataset, together with advice on what caused any failures (https://www.nature.com/articles/s41597-019-0184-5).  With this information, a data owner would be able to create a strategy to improve their FAIRness by focusing on “priority failures”.  The public version of The FAIR Evaluator (https://w3id.org/AmIFAIR) has been used to assess >5500 datasets.  Within the domain of rare disease registries, a recent publication about the VASCA registry shows how the Evaluator was used to track their progress towards FAIRness (https://www.medrxiv.org/content/10.1101/2021.03.04.21250752v1.full.pdf).  To date, no resource – public or private – has ever passed all 22 tests, showing that FAIR assessment is able to provide guidance to even highly-FAIR resources.

 

Generic 

If driving user question(s) were defined in Step 1, they should be “answered” in this step. The answers are obtained by processing the FAIR machine-readable data. If RDF is the machine-readable format used, the data are stored in RDF data stores (triple stores), and SPARQL queries are used to retrieve the data required to answer the driving user question(s).
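The sketch below illustrates what retrieving such an answer could look like. It uses only the Python standard library and the standard SPARQL 1.1 Protocol and JSON results format. The endpoint URL, the vocabulary and the driving user question ("how many patients are there per diagnosis?") are hypothetical placeholders. No request is actually sent: a hand-written sample response stands in for the triple store.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint and vocabulary; replace with your own triple store and model.
ENDPOINT = "https://example.org/sparql"

QUERY = """
PREFIX ex: <https://example.org/vocab/>
SELECT ?diagnosis (COUNT(?patient) AS ?patients)
WHERE { ?patient ex:hasDiagnosis ?diagnosis . }
GROUP BY ?diagnosis
"""


def build_query_url(endpoint: str, query: str) -> str:
    """Build a SPARQL Protocol GET URL; the query goes in the 'query' parameter."""
    return f"{endpoint}?{urlencode({'query': query})}"


def rows_from_results(results_json: str) -> list:
    """Flatten a SPARQL JSON results document into one plain dict per row."""
    doc = json.loads(results_json)
    return [
        {var: binding[var]["value"] for var in doc["head"]["vars"] if var in binding}
        for binding in doc["results"]["bindings"]
    ]


# Hand-written sample in the SPARQL 1.1 JSON results format (made-up values).
sample_response = json.dumps({
    "head": {"vars": ["diagnosis", "patients"]},
    "results": {"bindings": [
        {"diagnosis": {"type": "uri", "value": "https://example.org/diagnosis/A"},
         "patients": {"type": "literal", "value": "12",
                      "datatype": "http://www.w3.org/2001/XMLSchema#integer"}},
        {"diagnosis": {"type": "uri", "value": "https://example.org/diagnosis/B"},
         "patients": {"type": "literal", "value": "3",
                      "datatype": "http://www.w3.org/2001/XMLSchema#integer"}},
    ]},
})

url = build_query_url(ENDPOINT, QUERY)
rows = rows_from_results(sample_response)
for row in rows:
    print(row["diagnosis"], row["patients"])
```

In practice the URL would be requested with an `Accept: application/sparql-results+json` header, and the response body passed to `rows_from_results`.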

 

FAIRPlus (https://www.nature.com/articles/s41597-023-02167-2)

In parallel to the development of the FAIRification framework, we also developed a FAIR DataSet Maturity (FAIR-DSM) Model (https://fairplus.github.io/Data-Maturity/). This allowed us to assess the maturity of the datasets used to validate the FAIRification process prior to and following any FAIRification work. In the initial stages of the framework development, we made use of existing approaches including the RDA indicators and the FAIRsFAIR indicators to evaluate FAIRification improvements. While these alternative models demonstrated satisfactory results, they generally treat each element or principle of FAIR as a stand-alone element. The FAIR-DSM, on the other hand, evaluates a dataset as a whole, providing a more balanced view of its overall maturity in terms of content, representation and hosting.  The FAIR-DSM is described in detail elsewhere but briefly, it consists of 5 maturity levels characterised by increasing requirements across all aspects of FAIR, plus a reference level, referred to as “level 0”, representing a state of data that is missing most or all fundamental FAIR requirements. The model considers 3 categories of requirements: content-related requirements; representation and format requirements; and hosting environment capabilities.  In order to conform to a given level of the model, a dataset needs to fulfil a set of indicators covering the requirements for each of the 3 categories at this level.
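The conformance rule described above can be sketched as follows. This is a minimal illustration, not the FAIR-DSM itself: the indicator names are placeholders, only two of the five levels are shown, and the sketch assumes that levels are cumulative (a dataset reaches a level only if it also conforms to every lower level).

```python
from __future__ import annotations

CATEGORIES = ("content", "representation", "hosting")


def conforms(level_requirements: dict[str, set[str]], fulfilled: set[str]) -> bool:
    """True if every indicator in all three categories of one level is fulfilled."""
    return all(level_requirements[c] <= fulfilled for c in CATEGORIES)


def maturity_level(model: dict[int, dict[str, set[str]]], fulfilled: set[str]) -> int:
    """Highest level reached, assuming cumulative levels; 0 is the reference level."""
    reached = 0
    for level in sorted(model):
        if not conforms(model[level], fulfilled):
            break
        reached = level
    return reached


# Placeholder indicators for two levels only; the real model defines five levels
# with its own indicator set.
model = {
    1: {"content": {"content-1a"}, "representation": {"repr-1a"}, "hosting": {"host-1a"}},
    2: {"content": {"content-2a"}, "representation": {"repr-2a"}, "hosting": {"host-2a"}},
}

# This dataset meets everything at level 1 but lacks the level 2 hosting indicator.
fulfilled = {"content-1a", "repr-1a", "host-1a", "content-2a", "repr-2a"}

level = maturity_level(model, fulfilled)
print("Maturity level:", level)
```

Running the same check before and after FAIRification gives a single before/after level for the dataset as a whole, rather than a per-principle score.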

 

 

FAIRPlus [https://faircookbook.elixir-europe.org/content/recipes/introduction/fairification-process.html]

Phase 3: assess, design, implement, repeat 

Following the selection of the “action” team, an iterative cycle of assessment, design, and implementation is put in place.

Assessment: Prior to starting the work, the goals are assessed to ensure that everyone in the action team is up to date and clear on the FAIRification goals formulated by the data owners. This assessment is carried out by a review team, which could be an independent team or individuals from the technical team who are not involved in the action team. The assessment results in a binary decision of “GO” or “NO GO” based on the FAIRification goals and the catalog provided. At this stage, the reviewers can also provide suggestions, based on their experience, on the resources, tools, or goals.

Design: Once the action team receives a “GO” decision from the review team, it starts by listing the steps that need to be performed to achieve the goal. For each task, the resources, an estimated duration, and a responsible person are selected.

Implementation: Once the tasks have been selected and assigned, the actual work begins. To ensure that the action team is working smoothly, weekly or bi-weekly meetings are recommended so that the team is aware of the progress.

Once the tasks listed in the design phase have been implemented, the action team assesses the work done and checks that it is aligned with the FAIRification goal. If more tasks are needed to achieve the goal, a second round of the assess-design-implement cycle takes place as described above, with the FAIRification goals, the completed tasks and the proposed tasks as the starting point.

This phase is usually run in short sprints of 3 months.

 

 

The How to section should:

  • be split into easy to follow steps;

    • Step 1

    • Step 2

    • etc.

  • help the reader to complete the step;

  • aspire to be readable for everyone, but, depending on the topic, may require specialised knowledge;

  • be a general, widely applicable approach;

  • if possible / applicable, add (links to) the solution necessary for onboarding in the Health-RI National Catalogue;

  • aim to be practical and simple, while keeping in mind: if I would come to this page looking for a solution to this problem, would this How-to actually help me solve this problem;

  • contain references to solutions such as those provided by the FAIR Cookbook, RDMkit, The Turing Way and FAIRsharing;

  • contain custom recipes/best-practices written by/together with experts from the field if necessary. 

 

Expertise requirements for this step 

Describes the expertise that may be necessary for this step. Should be based on the expertise described in the Metroline: Build the team step.

Practical examples from the community 

Examples of how this step is applied in a project (link to demonstrator projects).  

Training

Add links to training resources relevant for this step. Since the training aspect is still under development, currently many steps have “Relevant training will be added in the future if available.”

Suggestions

Visit our How to contribute page for information on how to get in touch if you have any suggestions about this page.