Step 2. Collect and analyze metadata user requirements

status: in Review

Short description

This step is about understanding what users and stakeholders from your domain need in order to quickly discover datasets of interest in a catalogue. Users and stakeholders can be, for example, researchers or data scientists from your domain (or, more generally, anyone from your domain who will use the catalogue). To approach this from a user-centered perspective, formulate these needs as short user stories and/or competency questions (CQs, see Examples step 2). Key metadata elements (requirements) can then be extracted from these stories or CQs. Analyzing these requirements helps you decide which are most important when creating domain-specific metadata fields.

Note that all kinds of domain metadata schemas, information models, ontologies and vocabularies may already exist. Analyzing these is part of the next step. The current step is important for understanding the needs of your domain’s users, so that they can be incorporated effectively into the modeling in later steps.

A clear and explicit scope statement is important to stay focused on the initial goals while modeling and to communicate to stakeholders what will be and won’t be covered by the schema.

Deliverables

Deliverable

Description

Requirements document

A comprehensive document outlining the metadata requirements gathered from stakeholders, in the form of user stories and/or competency questions.

Scope statement

A clear statement defining the boundaries of the domain-specific metadata schema.

How

See the example page for some concrete examples: Examples step 2

  1. Gather requirements. Conduct interviews, focus groups, surveys or workshops to understand the metadata needs and user stories of the catalogue users and stakeholders you identified. Interviews, focus groups and workshops usually work better than sending out a form, because catalogue users may not be used to thinking in terms of the queries they might run in a catalogue. An interview or workshop lets you guide the process and ask follow-up questions. If you end up with many competency questions or requirements, you will need to extract the main points and prioritize the most important ones.

  2. Define scope. Establish and document the boundaries of what the schema will cover. Look at the Health-RI metadata schema and HealthDCAT-AP to see what is already covered for your domain. For examples of scope statements, see the domain progress page. The introduction also gives some generic hints for a scope statement, for example that the schema is about metadata for the discoverability and reuse of datasets. More domain-specific hints could be, for example, that an omics metadata schema initially targets only genomics, or that an imaging metadata schema mainly concerns PET, CT and MRI scans.
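The extraction and prioritization mentioned under step 1 could be sketched roughly as follows. This is only an illustration, not a prescribed Health-RI method: the `CompetencyQuestion` class, the `prioritize_elements` function, the example questions and the element names (`modality`, `age_group`, and so on) are all hypothetical. The idea is to record, for each competency question, the metadata elements needed to answer it, and then to count how many questions depend on each element.

```python
from dataclasses import dataclass, field


@dataclass
class CompetencyQuestion:
    """One competency question and the metadata elements needed to answer it."""
    text: str
    stakeholder: str
    elements: list[str] = field(default_factory=list)


def prioritize_elements(questions):
    """Return (element, question_count) pairs, most frequently needed first."""
    counts = {}
    for cq in questions:
        # dict.fromkeys removes duplicates within one question, keeping order
        for element in dict.fromkeys(cq.elements):
            counts[element] = counts.get(element, 0) + 1
    # sorted() is stable, so tied elements keep their first-seen order
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


# Hypothetical competency questions for an imaging catalogue.
questions = [
    CompetencyQuestion(
        "Which datasets contain MRI scans of adult patients?",
        "researcher",
        ["modality", "age_group"],
    ),
    CompetencyQuestion(
        "Which datasets contain PET scans acquired after 2020?",
        "data scientist",
        ["modality", "acquisition_date"],
    ),
    CompetencyQuestion(
        "Which CT datasets cover the lungs of adult patients?",
        "researcher",
        ["modality", "body_part", "age_group"],
    ),
]

ranking = prioritize_elements(questions)
for element, question_count in ranking:
    print(f"{element}: needed by {question_count} question(s)")
```

A count like this is only a starting point for discussion with stakeholders. An element needed by a single question can still be essential, so the ranking should inform the prioritization and not replace it.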

HRI hub involvement in this step

  • The HRI Data Team can provide further advice on user stories and competency questions.

Further reading