Introduction

For a dataset to be findable in the National Health Data Catalogue, it needs to be described well. The information that describes a dataset is called ‘metadata’; the way the metadata is structured, together with the terms used, is called the ‘metadata schema’. The HRI core metadata schema, which is based on DCAT-AP 3.0, covers the fundamental elements, but it may fall short of fully describing a dataset. Even after this schema has been extended with health-related terms through integration with HealthDCAT-AP and DCAT-AP NL, it may not meet the needs of specific domains. To improve dataset discovery within a discipline, domain-specific metadata may be needed (see also recommendation 4.4 of the Research Data Alliance document on dataset discoverability).
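The difference between core and domain-specific metadata can be made concrete with a small sketch. The example below is illustrative only and is not part of the HRI schema: the `ex:tumourSite` property, the sample records, and the `find_datasets` helper are all hypothetical, and the metadata is held in plain Python dictionaries rather than in RDF. The core property names (`dct:title`, `dct:description`, `dcat:keyword`) are the usual prefixed names of DCAT and Dublin Core terms.

```python
# Two dataset descriptions. The "dct:" and "dcat:" keys stand for core
# properties; "ex:tumourSite" is a hypothetical domain-specific property
# invented for this sketch.
RECORDS = [
    {
        "dct:title": "Lung cancer imaging cohort",
        "dct:description": "CT scans with clinical follow-up.",
        "dcat:keyword": ["cancer", "imaging"],
        "ex:tumourSite": "lung",
    },
    {
        "dct:title": "Breast cancer registry extract",
        "dct:description": "Registry records with treatment data.",
        "dcat:keyword": ["cancer"],
        "ex:tumourSite": "breast",
    },
]


def find_datasets(records, prop, value):
    """Return the titles of records whose property `prop` equals or contains `value`."""
    hits = []
    for record in records:
        found = record.get(prop)
        # Treat single values and lists of values the same way.
        values = found if isinstance(found, list) else [found]
        if value in values:
            hits.append(record["dct:title"])
    return hits


# A search on a core property cannot tell the two datasets apart.
by_keyword = find_datasets(RECORDS, "dcat:keyword", "cancer")

# A search on the domain-specific property narrows the result to one dataset.
by_site = find_datasets(RECORDS, "ex:tumourSite", "lung")

print(by_keyword)
print(by_site)
```

In this sketch both datasets match the generic keyword, while only the domain-specific property separates them, which is the kind of disciplinary discovery that a domain-specific schema is meant to support.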

This document is a guide for working groups from different domains that want to develop their own domain-specific metadata schemas for the National Health Data Catalogue. The guide is structured as a series of process steps (see figure below): first building the team, then collecting requirements from the domain, and finally turning these into domain-specific metadata schemas that are in line with, and extend, the HRI core metadata schema. Each process step is described in more detail in the subpages, with deliverables and examples.

We encourage working groups to actively provide feedback on the process, including what worked, what didn’t, and any additional steps that may be needed. For each step we will collect examples or prototypes of the artifacts produced.

Audience

This document is intended for working groups from different domains that would like to develop their own domain-specific metadata schema (or ‘petal’) and need guidance in navigating the process.

Scope and schema considerations

Prerequisites

Process overview

[Figure: process overview (image-20240809-151944.png)]

Although depicted as a linear sequence, the process is often nonlinear in practice. The steps serve as a guide to the activities you carry out, and they may run in parallel. Agreeing on definitions, modeling the semantics, and getting community endorsement can be very cumbersome, so working on a schema in repeated cycles (iteratively) and starting small (incrementally) may be more efficient than trying to be perfect and complete from the start.

Timelines

[Add picture that Hannah uses + explain alternative design sprint approach].

Contributors and contributing

Authors

Reviewers

Contributing

There are different ways in which you can get involved in developing this method and these pages, ranging from minimal to maximal involvement:

Ideally, this documentation is developed in parallel with the actual schema development, so that we can learn from practice and adapt the process steps accordingly. That said, we encourage and value any type of contribution.

Sources and further reading

The process and steps are partly based on: