Metroline Step: Build the Team

Short Description

To reach your FAIRification goals, it is important to have a team with the right skillset [FAIRopoly]. The composition of the team depends on the exact goals, and different skills may be necessary in different phases of the process [FAIRinAction]. The core of the team may be formed by one or more data stewards with expertise in the FAIRification process in general and knowledge of the local environment [Generic]. The team may, furthermore, contain (part-time) advisors with, for example, domain expertise [FAIRopoly], as well as data managers, software developers, research scientists, project managers and legal support [FAIRinAction].

[Health-RI_FAIRification_Step_Report] In this section we describe the expertise needed for making data more FAIR. In general, FAIRification work requires consultation with:

Domain experts who know the domain-specific data: the meaning of the data, but also their provenance and relations to other data.
FAIR experts or project managers who have conducted a FAIRification project before (and who know how to interpret and implement the FAIR principles).

In addition, depending on your FAIRification goals, you might need more specific experts. To help you identify which expertise is required and available (or not) in your team, we present below a list of common roles and resources involved in the FAIRification process, by expertise and by FAIR principle. For items where you lack the expertise, please contact your local data stewards, other data management services or Health-RI to discuss a plan of action.

[Jolanda Summary] To reach your FAIRification goals, it is important to have a team with the right skillset [FAIRopoly]. The composition of the team depends on the exact goals, and different skills may be necessary in different phases of the process [FAIRinAction].
To help you identify which expertise is required and available (or missing) in your team, we present a list of common roles and resources involved in the FAIRification process listed by expertise and by FAIR principle.

[Sander] Because a FAIRification steward is essential for reaching the FAIRification goals, a full page has been dedicated to this role. See “Metroline Step: Have a FAIRification steward on board” for details.

Why is this step important

FAIRification is a complicated process and requires expertise from a variety of fields. Hence, assembling the right team is essential to meet your goals.

Expertise requirements for this step

[Fieke] The data steward profile is often described according to three roles (policy, research and infrastructure) and eight task areas (policy & strategy; compliance; FAIR data; services; infrastructure; knowledge management; network; data archiving). A single data steward can be responsible for all task areas, but tasks can also be divided among central and embedded / domain data stewards. Each task area requires different competencies. The EMBL-EBI competency hub describes activities, KSAs (knowledge, skills & abilities) and learning objectives for each role and task area.

How to

[Another RDMkit page on this: https://rdmkit.elixir-europe.org/dm_coordination ]

[Sander] Would it make sense that, if we mention roles in this section in other pages, these roles are actually specified in this page’s How to? We could even create hyperlinks to this page.

RDMkit has a nice section about Roles in Data Management (with more details than I copied below) [Mijke coordinated/wrote most of this]

In this section, information is organised based on the different roles a professional can have in research data management. You will find:

A description of the main tasks usually handled by each role.
A collection of research data management responsibilities for each role.
Links to RDMkit guidelines and advice on useful information for getting started with data management specific to each role.

Roles:

Data Steward: Data stewardship is a relatively new profession and a catch-all term for numerous support functions, roles and activities. It implies professional and careful treatment of data throughout all stages of a research process.
Policy maker: As a policy maker, you are responsible for the development of a strategic data management framework and the coordination and implementation of research data management guidelines and practices.
Principal Investigator: As a Principal Investigator (PI), you may have recently acquired project funding. More and more funders require data management plans (DMP), stimulating the researcher to consider, from the beginning of a project, all relevant aspects of data management.
Researcher: Your research data is a major output of your research project; it supports your research conclusions and guides you and others towards future research. Therefore, managing the data well throughout the project, and sharing it, is a crucial aspect of research.
Research Software Engineer: Research software engineers (RSE) in the life sciences design, develop and maintain software systems that help researchers manage their software and data. The RSE’s software tools and infrastructure are critical in enabling scientific research to be conducted effectively.
Trainer: As a trainer, you design and deliver training courses in research data management with a focus on bioinformatics data. Your audience is mainly people in biomedical sciences: PhD students, postdocs, researchers, technicians and PIs.

[Generic]

Data FAIRification requires different types of expertise and should therefore be carried out in a multidisciplinary team guided by FAIR data steward(s). The different sets of expertise are on i) the data to be FAIRified and how they are managed, ii) the domain and the aims of the data resource within it, iii) architectural features of the software that is (or will be) used for managing the data, iv) access policies applicable to the resource, v) the FAIRification process (guiding and monitoring it), vi) FAIR software services and their deployment, vii) data modelling, viii) global standards applicable to the data resource, and ix) global standards for data access. A good working approach is to organize a team that contains or has access to the required expertise. The core of such a team may be formed by data stewards, with at least expertise of the local environment and of the FAIRification process in general.

[RDMkit]

Perhaps: https://rdmkit.elixir-europe.org/dm_coordination

[Health-RI_FAIRification_Step_Report]

Expertise and Example Experts - Source: [De Novo]

	Expertise/Knowledge	Example Experts
a	On the data to be FAIRified and how they are managed	Local data steward; FAIR data steward; Data manager; EDC system specialist; Clinicians specialised in the domain; Patient advocate for the domain
b	On the domain and on what a data resource is used for	Clinicians specialised in the domain; Patient advocate for the domain
c	On architectural features of the software that is (or will be) used for managing the data	EDC system specialist; Software developer
d	On access policies applicable to the resource	Local data steward; Clinicians specialised in the domain; Institutional Ethical Review Board
e	On the FAIRification process (guiding and monitoring it)	Local data stewards; FAIR data stewards
f	On FAIR software services and their deployment	EDC system specialist; Software developer; Health-RI expert team
g	On semantic data modelling	Local and FAIR data stewards; Semantic data modelling specialists; Clinicians specialised in the domain
h	On global standards applicable to the data resource interoperability	Local and FAIR data stewards; EDC system specialist; Senior healthcare interoperability expert
i	On global standards for data access	Local and FAIR data stewards; EDC system specialist; Senior expert of standards for automated access protocols and privacy preservation
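
The table above can be used as a checklist for spotting missing expertise. The following is a minimal sketch in Python, not part of any Health-RI tooling: the area codes a to i follow the table, while the function name `find_expertise_gaps`, the shortened area labels and the example team are hypothetical and only serve as illustration.

```python
# Expertise areas a-i, labelled after the rows of the table above (labels shortened).
EXPERTISE_AREAS = {
    "a": "Data to be FAIRified and how they are managed",
    "b": "Domain and what the data resource is used for",
    "c": "Architecture of the data management software",
    "d": "Access policies applicable to the resource",
    "e": "FAIRification process (guiding and monitoring)",
    "f": "FAIR software services and their deployment",
    "g": "Semantic data modelling",
    "h": "Global standards applicable to the data resource",
    "i": "Global standards for data access",
}


def find_expertise_gaps(team, required=None):
    """Compare the expertise in a team with the required expertise areas.

    team:     dict mapping a team member (or role) to a set of area codes.
    required: iterable of area codes needed for your goals (default: all areas).
    Returns (covered, missing): covered maps each covered area code to a sorted
    list of members; missing is a sorted list of uncovered area codes.
    """
    required = set(EXPERTISE_AREAS) if required is None else set(required)
    unknown = required - set(EXPERTISE_AREAS)
    if unknown:
        raise ValueError(f"Unknown expertise areas: {sorted(unknown)}")

    members_per_area = {area: [] for area in sorted(required)}
    for member, areas in team.items():
        for area in areas:
            if area in members_per_area:
                members_per_area[area].append(member)

    missing = [area for area, members in members_per_area.items() if not members]
    covered = {area: sorted(m) for area, m in members_per_area.items() if m}
    return covered, missing


# Hypothetical example team, loosely based on the example experts in the table.
example_team = {
    "Local data steward": {"a", "d", "e", "g", "h", "i"},
    "Clinician specialised in the domain": {"a", "b", "d", "g"},
}

covered, missing = find_expertise_gaps(example_team)
for area in missing:
    print(f"No expertise yet for ({area}): {EXPERTISE_AREAS[area]}")
```

In this made-up team, nobody covers the software-related areas (c and f), which would be a reason to involve an EDC system specialist or software developer, or to contact your local data management services.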

FAIR Principles and Example Resources

#	FAIR Principle	Example resource
F1	Globally unique and persistent identifiers	DOI; ORCID; EUPID
F2	Metadata about data	DCAT (standard); FAIR data point (former DTL metadata editor) (tool); ISA Framework
F3	Metadata clearly and explicitly include the identifier of the data they describe	FAIRifier tool; FAIR data point
F4	Indexing or registering metadata and data in a searchable resource	FAIR data point
A1	Metadata and data can be retrieved by their identifier via a protocol (making explicit the contact protocol to access the data)	HTTP / FTP. In case of sensitive data, add to the metadata the contact info (email / telephone) of the person with whom to discuss data access, and a clear protocol for such access requests.
A1.1	Open, free and universally implementable protocols	Email / phone; HTTP / FTP / SMTP
A1.2	Protocol that allows for authentication / authorization when necessary	(set user rights, register users in repository)
A2	Metadata remain available even when the data are no longer available (see F4)	FAIR data point
I1	Metadata and data use a proper language for knowledge representation, including (1) commonly used controlled vocabularies, ontologies and thesauri (having resolvable globally unique and persistent identifiers, see F1) and (2) a good data model (a well-defined framework to describe and structure (meta)data).	RDF (ttl, rdfs, rdfxml, shex, shacl); Dublin Core / DCAT; OWL; DAML+OIL; JSON-LD; Semantic data models
I2	The controlled vocabulary used to describe datasets needs to be documented and resolvable using globally unique and persistent identifiers. This documentation needs to be easily findable and accessible by anyone who uses the dataset.	FAIR data point
I3	The goal is to create as many meaningful links as possible between (meta)data resources to enrich the contextual knowledge about the data.	FAIR data point; http://www.uniprot.org/uniprot/C8V1L6.rdf
R1
R1.1
R1.2
R1.3

Resource glossary

Tool/Standard # can be used to #

Goal Modelling (see link) is a standard that can be used to represent goals that are connected to each other. It helps to define clear FAIRification objectives from both the research question and the process perspective.
FAIR data point (see link) is a tool that supports many FAIR principles and can be used to describe metadata in accordance with the DCAT standard. You can create and publish metadata in the FAIR data point, which is a searchable and indexable resource (see FAIR data index; every FAIR data point is indexed in the FAIR data index).
DCAT (see link) is a standard for describing metadata at three levels, from most specific to most general: distribution, dataset and catalogue.
RDF (see link) is an extensible knowledge representation model that can be used to describe and structure datasets.
Smart Guidance (see link) is a tool that defines the specific steps for the FAIRification of rare disease (RD) registry data.

Semantic data models (e.g. the data model for a set of common data elements for rare disease registration, data models for Omics data, the data model for the WHO Rapid COVID CRF, and the data models from EBI in the ‘documentation’ links on this page: http://www.ebi.ac.uk/rdf/)
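
To make the three DCAT levels from the glossary concrete, the following Python sketch renders a catalogue, a dataset and a distribution as RDF in Turtle syntax. It is only an illustration under stated assumptions: the helper `describe`, all `https://example.org/...` identifiers and the titles are hypothetical, the sketch does not use or imitate the FAIR data point software, and a real FAIR data point will require more metadata than shown here. The DCAT and Dublin Core terms used (`dcat:Catalog`, `dcat:Dataset`, `dcat:Distribution`, `dcat:dataset`, `dcat:distribution`, `dcat:accessURL`, `dct:title`) are taken from the published vocabularies.

```python
# Namespace prefixes for the DCAT and Dublin Core Terms vocabularies.
PREFIXES = (
    "@prefix dcat: <http://www.w3.org/ns/dcat#> .\n"
    "@prefix dct: <http://purl.org/dc/terms/> .\n"
)


def describe(iri, rdf_type, title, links=()):
    """Return a Turtle description of one resource.

    links is a sequence of (predicate, target_iri) pairs, for example
    ("dcat:dataset", "https://example.org/fdp/dataset/1").
    """
    # Escape backslashes and double quotes so the title is a valid Turtle string.
    safe_title = title.replace("\\", "\\\\").replace('"', '\\"')
    lines = [f"<{iri}> a {rdf_type} ;", f'    dct:title "{safe_title}"']
    for predicate, target in links:
        lines[-1] += " ;"
        lines.append(f"    {predicate} <{target}>")
    lines[-1] += " ."
    return "\n".join(lines)


# Hypothetical identifiers, used for illustration only.
base = "https://example.org/fdp"

catalogue = describe(
    f"{base}/catalog/1", "dcat:Catalog", "Example registry catalogue",
    [("dcat:dataset", f"{base}/dataset/1")],
)
dataset = describe(
    f"{base}/dataset/1", "dcat:Dataset", "Example registry dataset",
    [("dcat:distribution", f"{base}/distribution/1")],
)
distribution = describe(
    f"{base}/distribution/1", "dcat:Distribution", "Example CSV export",
    [("dcat:accessURL", f"{base}/files/export.csv")],
)

turtle = PREFIXES + "\n" + "\n\n".join([catalogue, dataset, distribution]) + "\n"
print(turtle)
```

The catalogue points to the dataset and the dataset points to the distribution, which mirrors the levels described for DCAT in the glossary. In practice, a team member with semantic data modelling expertise would typically produce such metadata with dedicated tooling rather than by hand.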

Practical Examples from the Community

This section should show the step applied in a real project. Links to demonstrator projects.

References & Further reading

Mijke Jetten, Marjan Grootveld, Annemie Mordant, Mascha Jansen, Margreet Bloemers, Margriet Miedema, & Celia W.G. van Gelder. (2021). Professionalising data stewardship in the Netherlands. Competences, training and education. Dutch roadmap towards national implementation of FAIR data stewardship (1.1). Zenodo. https://doi.org/10.5281/zenodo.4623713

Salome Scholtens, Mijke Jetten, Jasmin Böhmer, Christine Staiger, Inge Slouwerhof, Marije van der Geest, & Celia W.G. van Gelder. (2022). Final report: Towards FAIR data steward as profession for the lifesciences. Report of a ZonMw funded collaborative approach built on existing expertise (Version 4). Zenodo. https://doi.org/10.5281/zenodo.7225070

Toolkit for building your dream team: “a resource intended to make it as easy as possible to organise a workshop aimed at raising awareness of and facilitating discussion around the diversity of roles that contribute to research”. […] “[t]he knowledge sector is now looking towards a team-based approach bringing together more overtly diverse team members with specific skills in funding, research design, data analysis, data management, software development, research ethics, political relationships, dealing with business, interdisciplinarity, communications etc.” https://research-dream-team-toolkit.readthedocs.io/en/latest/index.html

[FAIRopoly] https://www.ejprarediseases.org/fairopoly/

[FAIRinAction] https://www.nature.com/articles/s41597-023-02167-2

[Generic] https://direct.mit.edu/dint/article/2/1-2/56/9988/A-Generic-Workflow-for-the-Data-FAIRification

Authors / Contributors

HRI FAIR TEAM (Jolanda, Bruna, Fieke, Sander)

EJPRD STEWARDS TEAM (Shuxin, Alberto, Ines, Bruna, Cesar, Joeri)