Health-RI wiki v4.0 -> consultation (open until 03-12-2024)


Metadata regarding terms of use, authentication/authorization and ELSI aspects

DATE: 14-11-2024 STATUS: FOR REVIEW

This article describes a proposal for 'ELSI metadata': metadata about the conditions and restrictions that apply to the reuse of data, and about the ELSI elements that matter when submitting and when approving a data request, with a focus on authentication and authorization.

Metadata for authorization and ELSI aspects are important in different parts of the Health-RI infrastructure. We looked at the following user stories:

  1. Finding datasets

  2. Requesting datasets

  3. Analyzing datasets

  4. Publishing datasets in the catalogue

Legal obligations at both national and European level must be taken into account. We mainly examined the following laws and regulations: GDPR, WMO, EHDS and the Code of Conduct for Health Research. In addition, we examined the NFU template MTA/DTA and the consent forms of various hospitals.

We propose the following metadata elements and, where possible, give a suggestion for the appropriate ontology systems.

Catalogue

The catalogue should state the conditions for use, so that a user can assess at an early stage whether a request for a dataset is likely to succeed. The process also needs information about who can decide on an application.

Conditions for use derive from laws and regulations (UAVG, Data Governance Act, Wet hergebruik overheidsinformatie, Databankenwet, AVG/GDPR, EHDS, Data Act, etc.), from an assessment of intellectual property (e.g. copyright and database rights), and from the rights of the citizen (consent (opt-in) and/or objection (opt-out)). Many of these elements (in bold) are already defined in HealthDCAT-AP, which leaves us only limited room for our own interpretation.

 

| Metadata-element | Explanation | Recommended coding | Examples/comments |
| --- | --- | --- | --- |
| dct:publisher (Publisher) | Who is the provider of the data? | foaf:Agent | The hospital; this will usually also be the data controller. |
| dpv:hasLegalBasis (Legal basis) | The category of legal basis on which the data was collected (GDPR art. 6.1 a to f). | dpv:LegalBasis | For example, consent (e.g. in the case of scientific research) or legal obligation (e.g. in the case of quality registers). |
| dcatap:applicableLegislation or dpv:hasApplicableLaw (Legal basis) | If the legal basis (see the previous row) is a legal obligation, this field indicates the legislation on which it rests. | Persistent URL for the law | dpv:hasApplicableLaw is an alternative, but dcatap:applicableLegislation is part of HealthDCAT-AP. |
| dpv:hasJurisdiction (Jurisdiction) | Jurisdiction under which the dataset falls. | dpv:Location | Within the Netherlands, this will probably be the Netherlands and the EU. |
| dpv:hasPersonalData (Contains personal data) | Categories of data (which can be linked to a person) that this dataset contains. | dpv:PersonalData | Location, SexualOrientation, MedicalHistory |
| dct:accessRights (Access rights) | Indication of whether a dataset is open or not. If a dataset is not publicly accessible, the fields below can be used to further specify the conditions or restrictions. | dct:RightsStatement | Restricted or sensitive (in the case of identifying data); in that case, use the fields below to specify conditions/restrictions. Public (in the case of anonymous data). In HealthDCAT-AP, limited to the options from the EU Vocabularies access rights vocabulary. |
| Conditions of use | The main conditions, encoded in a machine-readable format. Can be used to filter or to ask additional questions during the application. For example, whether costs will be charged. | DUO, DUC/CCE or ODRL | DUO or DUC/CCE can also serve as syntactic sugar and be converted to ODRL under the hood. Predicate has yet to be defined. |
| Link to terms and conditions document | For example, an MTA or consent forms. | Persistent URL | Predicate has yet to be defined. Possibly use foaf:page? |
| Contains intellectual property | Does the dataset contain intellectual property that needs to be protected? | Boolean | This will almost always be no, but it is defined in the EHDS because measures may be taken to protect intellectual property. This may be important later in the application process. Predicate has yet to be defined. |
| healthdcatap:hdab (Health Data Access Body) | Health Data Access Body that can give approval for secondary use under the terms of the EHDS. | foaf:Agent | This field will only be implemented once the HDAB is in place. Until then, the data controller is the contact for assessment. |
| Linking keys | Keys that are available to link with other datasets. | | These keys may exist alongside the dataset but are not necessarily delivered with it. E.g. direct or indirect identifiers, such as name and address details. Predicate and range for the object still need to be defined. |
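To illustrate how the catalogue fields listed above could fit together, the following minimal Python sketch builds one catalogue record as a JSON-LD-style dictionary and converts DUO-coded conditions of use into an ODRL-style policy. It is a sketch under assumptions, not part of the proposal: the `example.org` IRIs, the `ex:conditionsOfUse` predicate (the real predicate has yet to be defined), the plain string used for the access right, and the DUO-to-constraint mapping are all hypothetical. The DUO identifiers should be checked against the ontology before use.

```python
import json

# Hypothetical catalogue record. Prefixed names follow the catalogue fields;
# all example.org IRIs and the ex:conditionsOfUse predicate are placeholders.
record = {
    "@id": "https://example.org/dataset/1",
    "dct:publisher": {"@id": "https://example.org/org/hospital-a"},
    "dpv:hasLegalBasis": {"@id": "dpv:Consent"},
    "dct:accessRights": "RESTRICTED",  # shorthand for an access-rights code
    "ex:conditionsOfUse": ["DUO:0000042", "DUO:0000018"],
}

# Assumed, illustrative mapping from DUO codes to ODRL constraints.
DUO_TO_ODRL = {
    "DUO:0000042": {  # general research use
        "leftOperand": "purpose",
        "operator": "eq",
        "rightOperand": "general research use",
    },
    "DUO:0000018": {  # not for profit, non-commercial use only
        "leftOperand": "purpose",
        "operator": "neq",
        "rightOperand": "commercial use",
    },
}


def duo_to_odrl(dataset_iri, duo_codes):
    """Build an ODRL-style policy (as a dict) from a list of DUO codes."""
    unmapped = [code for code in duo_codes if code not in DUO_TO_ODRL]
    if unmapped:
        raise ValueError(f"No ODRL mapping defined for: {unmapped}")
    return {
        "@context": "http://www.w3.org/ns/odrl.jsonld",
        "@type": "Set",
        "uid": dataset_iri + "/policy",
        "permission": [{
            "target": dataset_iri,
            "action": "use",
            "constraint": [DUO_TO_ODRL[code] for code in duo_codes],
        }],
    }


policy = duo_to_odrl(record["@id"], record["ex:conditionsOfUse"])
print(json.dumps(policy, indent=2))
```

Keeping the short DUO codes in the catalogue and generating the ODRL policy on demand would let curators work with a small code list while downstream tooling consumes a single policy format.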

Request procedure

For the request procedure, we tried to define a logical workflow of the questions to be asked (not published here because it is currently outdated). Based on this preliminary workflow, we examined possible metadata elements.

| Metadata-element | Explanation | Recommended coding |
| --- | --- | --- |
| Purpose limitation | Purpose for which the data is requested. | DPV. Categories from EHDS art. 34 and GDPR art. 5.1, plus free text for purposes outside those. The purpose limitation can be used to assess whether the EHDS may be applicable. |
| Objectives/research question | Description of the purpose/research question for which the data is requested. | Text or a document |
| Applicant | Data of the person (see metadata description below). | |
| Request from organization or individual | It is relevant to know who is authorized to sign. | Boolean value |
| Organization | If the application is made within an organization by virtue of the requestor's role. | ROR identifier, foaf:Agent |
| Position of requestor | Function of the person within the organization (see metadata description below). | |
| Personal data of the contact person who can verify the authority of the requestor | Data of the person (see metadata description below). | |
| Principal Investigator | Data of the person (see metadata description below). | |
| dpv:hasImpactAssessment (DPIA) | Data Protection Impact Assessment; only necessary if the requested dataset contains personal data within the meaning of the GDPR (requested in EHDS). | dpv:PIA |
| Data management plan | Data management plan (requested in EHDS). | |
| Desired format | Requested by the EHDS. | |
| Specification | Requested by the EHDS. Approximate number of people and their distribution; variables. | |
| Financial data | Additional questions asked if costs are charged (to be specified in the catalogue under conditions of use). | |
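The conditional elements of the request (a DPIA only when personal data is involved, financial data only when costs are charged) can be expressed as a simple completeness check. The sketch below is a minimal illustration in Python. The class name, field names and the assumption that a data management plan is always required are ours and are not prescribed by the proposal.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataRequest:
    """Hypothetical container for a subset of the request elements."""
    purpose: str                          # category or free text
    research_question: str
    applicant_id: str
    on_behalf_of_organization: bool
    organization_ror: Optional[str] = None
    dpia_link: Optional[str] = None
    data_management_plan_link: Optional[str] = None
    financial_details: Optional[str] = None


def missing_items(request: DataRequest,
                  dataset_has_personal_data: bool,
                  costs_charged: bool) -> List[str]:
    """Return the request elements that are still missing."""
    missing = []
    if request.on_behalf_of_organization and not request.organization_ror:
        missing.append("organization")
    # A DPIA is only needed when the dataset contains personal data.
    if dataset_has_personal_data and not request.dpia_link:
        missing.append("DPIA")
    # Assumption: the data management plan is always required.
    if not request.data_management_plan_link:
        missing.append("data management plan")
    # Financial data is only asked for when the catalogue says costs apply.
    if costs_charged and not request.financial_details:
        missing.append("financial data")
    return missing


request = DataRequest(
    purpose="scientific research",
    research_question="Placeholder research question",
    applicant_id="person-1",
    on_behalf_of_organization=True,
    organization_ror="https://ror.org/EXAMPLE",  # placeholder, not a real ROR
)
print(missing_items(request, dataset_has_personal_data=True, costs_charged=False))
```

The two boolean arguments would come from the catalogue record of the requested dataset, which is why they are passed in rather than stored on the request.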

 

Authentication and authorization metadata

It is important to unambiguously determine who is requesting the data and who has access to data. There are several options for establishing identity. What applies depends on the degree of certainty that is required and is a consideration that must be made by Health-RI and the data holders involved. The table below defines a number of metadata fields that can be used to maintain control over this. We emphasize that this list is not a data model. When modeling, one should take into account that people may have different roles in different requests and may authenticate themselves in different ways depending on the role.

Authentication metadata

| Metadata-element | Explanation | Recommended coding |
| --- | --- | --- |
| Full name | Full name of the person in the usual spelling. Note: names are not unique and may change over time. | |
| Identifier | Unique identifier of the person within the system. | |
| Role | Role of the person within the process, e.g. Requester, PI, Data Manager, Privacy Officer. The EHDS requires that the people with access to the data are listed in the application and the data permit. | DPV; may need extensions |
| E-mail address | Contact e-mail address of the person. This must be a personal address, not a group address. | FOAF or vCard; DCAT uses both in different places |
| Telephone number | Contact phone number, preferably a direct number. | |
| Authentication method | How the person logged in for this session (e.g. username/password, federated login). | |
| Time of login | The moment of login. | ISO 8601 |
| Method of establishing identity | Method by which the identity is established, e.g. online methods such as federated login, or offline methods such as passport checks. | A code list must be established for this. |
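The remark that people may hold different roles in different requests suggests keeping the person, the role assignment and the login event as separate records. The following Python sketch shows one possible arrangement. All class and field names are hypothetical and it is not a proposed data model. The login time is written as an ISO 8601 timestamp in UTC.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class Person:
    identifier: str   # unique within the system
    full_name: str    # not unique, may change over time
    email: str        # personal address, not a group address


@dataclass(frozen=True)
class RoleAssignment:
    person_id: str
    request_id: str
    role: str         # e.g. "Requester", "PI", "Data Manager"


@dataclass(frozen=True)
class LoginEvent:
    person_id: str
    authentication_method: str     # how the person logged in this session
    identity_proofing_method: str  # how the identity was established
    login_time: str                # ISO 8601 timestamp


def record_login(person: Person, authentication_method: str,
                 identity_proofing_method: str) -> LoginEvent:
    """Create a login event stamped with the current UTC time."""
    now = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return LoginEvent(person.identifier, authentication_method,
                      identity_proofing_method, now)


def roles_for(assignments: List[RoleAssignment], person_id: str,
              request_id: str) -> List[str]:
    """Roles a person holds within one specific request."""
    return sorted(a.role for a in assignments
                  if a.person_id == person_id and a.request_id == request_id)


person = Person("person-1", "Example Person", "person@example.org")
assignments = [
    RoleAssignment("person-1", "request-A", "PI"),
    RoleAssignment("person-1", "request-B", "Data Manager"),
]
event = record_login(person, "federated login", "passport check")
print(roles_for(assignments, "person-1", "request-A"), event.login_time)
```

Separating the login event from the person also allows the same person to authenticate in different ways depending on the role they act in.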

Authorization/data permit

If a person is authorized to use the requested data, this can be recorded in a data permit (EHDS). Article 46.6 of the EHDS describes what a data permit must contain. We propose to always use this structure, mutatis mutandis.

| Metadata-element | Explanation | Recommended coding |
| --- | --- | --- |
| Categories | The EHDS has established a list of categories of health data. | |
| Specification | Specification of the data: selection criteria, variables and number of data points. Possibly also the distribution. | |
| Format | Format in which the data is made available. | |
| Pseudonymized data? | Is it identifying data, or pseudonymized or anonymized data? | Antoinette Vlieger proposed to start from four categories: (1) non-identifying, (2) probably not identifying, (3) probably identifying, (4) certainly identifying. The category depends on the context in which the data is used: data can be identifying in the context of a hospital but not in the context of research, because the linking key is not accessible there. |
| Description of the purpose | A description of the purpose for which the data was requested; this can be taken from the application itself. | |
| Authorized persons | The EHDS requires a list of all authorized persons who have access to the secure processing environment; listing the PI is a good start. | |
| Length of time that the data permit is valid | | ISO 8601 start/end date, or duration from the date of agreement |
| Information about the tools available in the Secure Processing Environment and their characteristics | The EHDS asks for details of the environment, but initially it seems sufficient to indicate which SPE is used, possibly with a reference to a description of the SPE elsewhere. Tools need to be listed explicitly only if additional tools are installed for the user. | |
| Cost | | |
| Additional terms and conditions | Agreed conditions, for example from an MTA/DTA or DAA. | |
| Link to MTA/DTA/DAA | Link to the documents. | |
| Date of agreement | Date on which the permit is approved by all parties. | |
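A data permit with the elements listed above could be represented roughly as follows. This Python sketch is a minimal illustration under assumptions: the class and field names are hypothetical, only a few permit elements are included, and the duration is simplified to a number of days counted from the date of agreement rather than a full ISO 8601 duration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import List, Optional


class Identifiability(Enum):
    """The four proposed categories, judged in the context of use."""
    NON_IDENTIFYING = 1
    PROBABLY_NOT_IDENTIFYING = 2
    PROBABLY_IDENTIFYING = 3
    CERTAINLY_IDENTIFYING = 4


@dataclass
class DataPermit:
    purpose: str
    identifiability: Identifiability
    authorized_person_ids: List[str]
    agreement_date: date
    valid_until: Optional[date] = None    # explicit end date, or ...
    duration_days: Optional[int] = None   # ... duration from agreement

    def end_date(self) -> date:
        """Resolve the last day on which the permit is valid."""
        if self.valid_until is not None:
            return self.valid_until
        if self.duration_days is not None:
            return self.agreement_date + timedelta(days=self.duration_days)
        raise ValueError("permit needs an end date or a duration")

    def allows(self, person_id: str, on: date) -> bool:
        """True if the person is listed and the permit is valid on that day."""
        return (person_id in self.authorized_person_ids
                and self.agreement_date <= on <= self.end_date())


permit = DataPermit(
    purpose="Placeholder purpose taken from the application",
    identifiability=Identifiability.PROBABLY_NOT_IDENTIFYING,
    authorized_person_ids=["pi-1"],
    agreement_date=date.fromisoformat("2025-01-01"),
    duration_days=365,
)
print(permit.end_date().isoformat(), permit.allows("pi-1", date(2025, 6, 1)))
```

Storing the identifiability category on the permit rather than on the dataset reflects the point that the same data can be identifying in one context and not in another.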

 

Resources