Main storyline research, policy and innovation

DATE: 23-08-2024 STATUS: ADOPTED

In response to feedback on the previous version, the research, policy and innovation storyline has been rewritten in this article.

In this storyline we now assume the following types of research:

  • a study that collects, processes, analyzes and publishes new data

  • a study that collects new data, combines it with existing data, conducts analyses and publishes the results

  • a study that collects, processes, analyzes and publishes existing data

There are more, but what all studies have in common is that they are prepared and that the research and data are eventually published.

This article was written in the context of the data life cycle (DLC) and the HORA business processes for research.

This main storyline is an essential case that the National Health Data Infrastructure for research, policy and innovation must support. Because it touches on almost all aspects of the infrastructure, it must also be supported by the various working groups that are active on the Health-RI wiki. Several working groups have provided feedback on this storyline, which still needs to be coordinated. This feedback will be incorporated in the next version to arrive at a broadly supported definition.

In this article, the research, policy and innovation storyline is divided into five phases. The description of each phase relates it to the HORA business processes and the data life cycle stages.

  1. Initiation of research and planning

  2. Research preparation

  3. Research implementation

  4. Research publication

  5. Research conclusion

Figure: HORA 2.1 Business processes research

 

| DLC stage | Description | Actor |
|---|---|---|
| Plan | Define research question and research design, obtain ethical approval, draw up data management plan | Researcher |
| Collect | Collecting data (both generating new data and requesting existing data) | Researcher |
| Process | Data cleansing and FAIRification | Researcher, data holder |
| Analyse | Carrying out the research | Researcher |
| Preserve | Storing and archiving data | Researcher, data holder |
| Share | Prepare data and make it available | Researcher |
| Reuse | Making data available to other researchers | Researcher |

There are different types of research. In this article we distinguish the following variants.

Research variants and the data life cycle stages they involve

| Variant | Description | Plan | Collect 1: generate new data | Collect 2: request data | Process by researcher | Process by data holder | Analyse | Preserve | Share & Reuse |
|---|---|---|---|---|---|---|---|---|---|
| A | The researcher will collect data with, for example, an EDC tool, process and analyze the input and make the result available | Yes | Yes | No | Yes | No | Yes | Yes | Yes |
| B | The researcher will collect data with an EDC tool, combine the input with other data, process and analyze the whole and make the result available | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| C | The researcher will collect data from one or more sources, perform analyses and make the results available | Yes | No | Yes | No | Yes | Yes | Yes | Yes |
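
The variant table above can also be written down as a small lookup structure, which makes it easy to check which stages apply to a given variant. The following is a minimal sketch in Python; the names `STAGES`, `VARIANT_STAGES`, `stages_for` and `needs_data_request` are illustrative assumptions and are not part of HORA or the Health-RI infrastructure.

```python
# Stage columns in the order in which they appear in the variants table.
STAGES = [
    "Plan",
    "Collect 1 (generate new data)",
    "Collect 2 (request data)",
    "Process by researcher",
    "Process by data holder",
    "Analyse",
    "Preserve",
    "Share & Reuse",
]

# One row per research variant; True corresponds to "Yes" in the table.
VARIANT_STAGES = {
    "A": [True, True, False, True, False, True, True, True],
    "B": [True, True, True, True, True, True, True, True],
    "C": [True, False, True, False, True, True, True, True],
}


def stages_for(variant: str) -> list[str]:
    """Return the names of the stages that apply to the given variant."""
    flags = VARIANT_STAGES[variant]
    return [stage for stage, applies in zip(STAGES, flags) if applies]


def needs_data_request(variant: str) -> bool:
    """True when the variant requests existing data (Collect 2)."""
    return "Collect 2 (request data)" in stages_for(variant)
```

With this structure, `needs_data_request("A")` is false while it is true for variants B and C, which matches the split between the variant A path and the variants B and C path in the preparation phase.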

Phase 1: initiation of research and planning

Objective: to obtain a mandate to conduct a study

HORA processes

  • setting up research collaboration

  • draw up research proposal

  • recruit research resources

 

DLC stage: Plan, Collect, Process, Analyse, Preserve, Share and reuse

Trigger: a researcher has a research question and wants to conduct a study.

  1. The researcher, possibly together with research partners, drafts a research proposal containing the following components:

    1. a research design

    2. a research plan

    3. a research question

    4. the required resources

    5. an initial draft of a data management plan including:

      1. the data sources to be used

      2. the code tables and metadata templates to be used (prescribed by a data governance committee)

    6. a description of the required research environment and additional tools

    7. if applicable, the form of control by data subjects (e.g. informed consent or opt-out) that is suitable for the research

  2. The researcher identifies the sources to be used for the research (via Storyline: Search data in metadata)

    1. the researcher searches the catalog for suitable sources

    2. the researcher filters the results if desired to achieve a more accurate result

    3. if this is necessary to find the required data, the researcher authenticates with the catalog

    4. to verify that the data is suitable, the researcher can put questions directly to the data provider

    5. the researcher includes the identifiers of suitable sources to be used in the project proposal and data management plan.

  3. The researcher makes a funding application for the research proposal and submits it to a funder if applicable.

  4. The funder assesses the application and approves it

  5. The researcher completes the research proposal and submits it (to the local research review committee) for assessment. The assessment may include:

    1. a legal review

    2. an ethical assessment

    3. an assessment of social aspects

    4. an assessment by a Technology Transfer Office

  6. The local research review committee assesses the research proposal itself, has the necessary assessments carried out and, if all of them are positive, approves the research proposal.

Milestone products:

  • an approved research proposal

  • first draft of a data management plan

  • financial coverage for the study
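
Step 2 of this phase (search the catalog, filter the results, authenticate if needed) can be illustrated with a small sketch. Everything in it is a hypothetical simplification: the `CatalogRecord` fields, the `restricted` flag and the `search` function are assumptions made for illustration and do not describe the actual catalog or its API.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogRecord:
    """Hypothetical, simplified metadata record for one data source."""
    identifier: str
    title: str
    keywords: set[str] = field(default_factory=set)
    restricted: bool = False  # assumed: only visible after authentication


def search(catalog, term, required_keywords=frozenset(), authenticated=False):
    """Find records matching `term`, then narrow them by `required_keywords`.

    Restricted records are skipped unless the researcher is authenticated.
    """
    term = term.lower()
    hits = []
    for record in catalog:
        if record.restricted and not authenticated:
            continue
        keywords = {keyword.lower() for keyword in record.keywords}
        text_match = term in record.title.lower() or term in keywords
        if text_match and set(required_keywords) <= record.keywords:
            hits.append(record)
    return hits


# Made-up example catalog.
catalog = [
    CatalogRecord("src-001", "Diabetes cohort registry", {"diabetes", "cohort"}),
    CatalogRecord("src-002", "Diabetes lab measurements",
                  {"diabetes", "laboratory"}, restricted=True),
    CatalogRecord("src-003", "Cardiology imaging archive", {"cardiology", "imaging"}),
]

# Identifiers of suitable sources go into the proposal and data management plan.
identifiers = [r.identifier for r in search(catalog, "diabetes", authenticated=True)]
```

The identifiers collected at the end correspond to step 2.5, in which the researcher includes the identifiers of suitable sources in the project proposal and data management plan.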

Phase 2: research preparation

Objective: prepare all components for carrying out the research

HORA processes

  • (re)use research data

DLC stage: Plan, Collect, Process, Analyse, Preserve, Share and reuse

Trigger: the research proposal has been approved.

Variant A: the researcher will collect, process and analyze new data

  1. The researcher requests a research environment and additional tools from the tool supplier in order to:

    1. maintain a data management plan for reusable research results

    2. establish the citizen's say (informed consent / opt-out)

    3. generate and manage data

  2. The supplier makes the research environment and additional tools available to the researcher.

  3. The researcher configures the additional tools as desired (e.g. the researcher designs an eCRF (electronic case report form) to collect data).

  4. The researcher completes the data management plan, using code tables and metadata templates prescribed by the data governance committee.

  5. The Local Data Access Committee of the relevant data holder indicates under what conditions of use data may be recorded at the data source for multiple use.

Variants B and C: the researcher needs existing data for the research. The researcher submits a request for access to the sources to be used for the research (Storyline: Request data)

  1. The researcher logs in to the National Health Data Portal and completes a request form for the intended sources, stating:

    1. Identity of the intended data user(s)

    2. Indication of the desired data sets or the data providers

    3. The research design and rationale (including, for example, the duration of the research) for using the desired data sets

    4. The desired analysis on the desired data sets in combination with other data required for the analysis

    5. The secure processing environment and additional tools to be used, as provided by the tool supplier

    6. Approval from the local research review committee

  2. The data request service facilitates and administers the process for the data user to request selected data set(s) from the central data access committee.

  3. The central data access committee verifies whether the application is complete and whether the desired analysis to be performed meets the conditions of use of the requested dataset(s).

  4. The data request service records the response from the central data access committee and facilitates and administers the process for the data user to request selected data set(s) from the data provider(s).

  5. The approached data provider verifies the following and records the result of the assessment with the data request service:

    1. that the request meets the applicable conditions of use of the requested data

    2. that the local data access committee has not vetoed the request

  6. The data request service feeds back the results of the assessments to the data user

  7. The data request service sends the contracts (e.g. data access agreements) to the parties involved.

  8. The parties involved sign the contracts

  9. After signing the contracts, the data request service sends a notification to the data provider(s) that the data can be made available on the requested secure processing environment or can be made available for a federated analysis.

  10. The secure processing environment supplier installs and configures the secure processing environment with the tools that the data user specified in the request.

  11. Every data provider makes the requested data available (Storyline: Data delivery)

    1. in the desired secure processing environment (for the duration of the research as agreed in the data request or as determined by law).

      1. The data provider minimizes the data set.

      2. The data provider consults the control register to suppress data for which no permission has been given or for which an objection has been made (if necessary)

      3. The data provider carries out (if necessary) a pseudonymization (by means of the generic pseudonymization service) on the requested dataset.

      4. The data provider makes the data available for the desired secure processing environment.

      5. The data provider reports to the localization service which data has been made available for which research

      6. The data provider notifies the data request service that the desired dataset has been made available so that the status of the request can be updated

      7. The data provider guarantees the reproducibility of the dataset, for example to be able to repeat the data release at a later time.

    2. for the purpose of federated analysis

      1. Every data provider involved in this federated analysis minimizes the data set.

      2. Each data provider involved in this federated analysis consults the control register to suppress data for which there is no appropriate control (if necessary).

      3. Each data provider involved in this federated analysis carries out the desired or required pseudonymization (by means of the generic pseudonymization service) on the requested dataset.

      4. Each data provider involved in this federated analysis makes the data available for delivery to the data processor designated for this federated analysis.

      5. Each data provider involved in this federated analysis reports to the localization service which data has been made available for which research.

      6. Each data provider involved in this federated analysis reports to the data request service that the desired dataset has been made available so that the request status can be updated.

      7. Every data provider involved in this federated analysis guarantees the reproducibility of the dataset, for example to be able to repeat the data release at a later time.
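
The first three delivery steps (minimize, suppress on the basis of the control register, pseudonymize) form a small pipeline. The sketch below illustrates that order under stated assumptions: records are plain dictionaries with a `subject_id` field, the control register is reduced to a set of subject identifiers to suppress, and a keyed hash stands in for the generic pseudonymization service. None of these choices reflect how the actual services are implemented.

```python
import hashlib
import hmac


def minimize(records, requested_fields):
    """Keep only the subject identifier and the fields in the approved request."""
    keep = {"subject_id", *requested_fields}
    return [{k: v for k, v in record.items() if k in keep} for record in records]


def suppress(records, suppressed_subjects):
    """Drop records of subjects without permission or with an objection."""
    return [r for r in records if r["subject_id"] not in suppressed_subjects]


def pseudonymize(records, secret_key: bytes):
    """Replace subject_id with a keyed hash (assumed stand-in for the real service)."""
    result = []
    for record in records:
        digest = hmac.new(secret_key, record["subject_id"].encode(),
                          hashlib.sha256).hexdigest()
        pseudonymized = {k: v for k, v in record.items() if k != "subject_id"}
        pseudonymized["pseudonym"] = digest[:16]
        result.append(pseudonymized)
    return result


def prepare_delivery(records, requested_fields, suppressed_subjects, secret_key):
    """Apply the three steps in the order given in the storyline."""
    minimized = minimize(records, requested_fields)
    permitted = suppress(minimized, suppressed_subjects)
    return pseudonymize(permitted, secret_key)


# Made-up source data; the address field is not part of the request.
source = [
    {"subject_id": "p1", "age": 54, "hba1c": 48, "address": "Street 1"},
    {"subject_id": "p2", "age": 61, "hba1c": 53, "address": "Street 2"},
]
delivery = prepare_delivery(source, ["age", "hba1c"],
                            suppressed_subjects={"p2"},
                            secret_key=b"demo-key-not-for-real-use")
```

Because the keyed hash is deterministic for a given key, running the pipeline again on the same source yields the same pseudonyms, which is one simple way to support the reproducibility requirement in step 7.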

Milestone products:

  • a secure processing environment

  • installed tools

  • dataset made available

  • updated localization data

  • archived dataset

  • an updated data management plan
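
The request form for variants B and C (step 1) and the statuses that the data request service records along the way (steps 2 to 9) can be sketched as follows. The field names, the status labels and the helper functions are assumptions chosen to mirror the steps above; they are not the actual data model of the National Health Data Portal or the data request service.

```python
from dataclasses import dataclass, fields


@dataclass
class DataRequest:
    """Hypothetical request form with one field per item in step 1."""
    data_users: list[str]
    datasets: list[str]
    design_and_rationale: str
    desired_analysis: str
    processing_environment: str
    review_committee_approval: str


def missing_items(request: DataRequest) -> list[str]:
    """Return the names of the form items that are still empty."""
    return [f.name for f in fields(request) if not getattr(request, f.name)]


# Assumed ordered statuses, derived from steps 2 to 9.
STATUSES = [
    "submitted",
    "approved by central data access committee",
    "approved by data provider(s)",
    "contracts signed",
    "data made available",
]


def next_status(current: str) -> str:
    """Advance one step; the final status is terminal."""
    index = STATUSES.index(current)
    return STATUSES[min(index + 1, len(STATUSES) - 1)]


# Made-up request in which the desired analysis has not been filled in yet.
request = DataRequest(
    data_users=["researcher-1"],
    datasets=["src-001"],
    design_and_rationale="Cohort study, duration 24 months",
    desired_analysis="",
    processing_environment="secure processing environment with statistics tools",
    review_committee_approval="approval reference from the local committee",
)
incomplete = missing_items(request)
```

A completeness check of this kind corresponds to the first part of step 3, in which the central data access committee verifies whether the application is complete before assessing the conditions of use.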

Phase 3: research implementation

Objective: to arrive at research results

HORA processes

  • create new research data

  • process and analyze research data

  • produce research results

DLC stage: Plan, Collect, Process, Analyse, Preserve, Share and reuse

Trigger: there is a notification that the requested data and/or the research environment is available (if applicable)

Variant A: Generating and processing/analyzing new research data (Storyline: Generating research data)

  1. The data source starts generating new data points.

  2. When the data is generated, the data user receives the generated data

  3. The data user uses the tools in the research environment to further process and analyze the generated data.

Variant B: Generate new research data and combine it with requested data (https://health-ri.atlassian.net/wiki/spaces/VD/pages/155751332)

  1. The data source starts generating new data points.

  2. When the data is generated, the data user receives the generated data

  3. The data user uses the tools in the research environment to further process and analyze the generated data.

  4. The data user combines the generated data with requested data: the result is the input data for further analysis.

  5. The data user instructs the research environment to perform the analysis determined and approved by the data user on the input data.

  6. The secure processing environment performs the analysis determined and approved by the data user on the input data.

  7. The data user generates the final research results using the code tables and metadata templates that are prescribed by the data governance committee and that have been, or are being, recorded in the data management plan.

Variant C: Analyze requested data (https://health-ri.atlassian.net/wiki/spaces/VD/pages/155751332)

  1. The data user instructs the research environment to perform the analysis determined and approved by the data user on the input data.

  2. The secure processing environment performs the analysis determined and approved by the data user on the input data.

  3. The data user generates the final research results using the code tables and metadata templates that are prescribed by the data governance committee and that have been, or are being, recorded in the data management plan.

Milestone products:

  • dataset with research results

  • metadata of the dataset with research results

  • developed/trained algorithms (optional)

  • analysis scripts or software code
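
Step 4 of variant B, combining generated data with requested data into the input data for analysis, can be illustrated with a small join. The sketch assumes that both datasets carry the same `pseudonym` key for each subject, which the storyline does not state explicitly; the `combine` and `mean` helpers and the sample values are made up for illustration.

```python
def combine(generated, requested, key="pseudonym"):
    """Inner join: keep subjects present in both datasets and merge their fields."""
    requested_by_key = {row[key]: row for row in requested}
    combined = []
    for row in generated:
        match = requested_by_key.get(row[key])
        if match is not None:
            # Fields from the generated data win if both rows share a field name.
            combined.append({**match, **row})
    return combined


def mean(rows, field_name):
    """Average of one numeric field over all rows."""
    values = [row[field_name] for row in rows]
    return sum(values) / len(values)


# Made-up data: newly generated scores and requested ages.
generated = [
    {"pseudonym": "a1", "score": 7},
    {"pseudonym": "b2", "score": 5},
    {"pseudonym": "c3", "score": 9},
]
requested = [
    {"pseudonym": "a1", "age": 54},
    {"pseudonym": "b2", "age": 61},
]

input_data = combine(generated, requested)
average_score = mean(input_data, "score")
```

Subject `c3` has no counterpart in the requested data and therefore drops out of the input data. Whether unmatched subjects should be dropped or kept is a study design decision; an inner join is used here only to keep the example short.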

Phase 4: research publication

Objective: to publish a manuscript and make the research results available.

HORA processes:

  • disseminate research results

  • preserve research results and research data

  • guarantee findability of research data

  • guarantee accessibility of research data

  • guarantee reusability of research data

DLC stage: Plan, Collect, Process, Analyse, Preserve, Share and reuse

Trigger: the research has been completed and there are research results to publish

  1. In this phase the researcher acts in the role of data provider. Some of the activities can also be carried out by the research institute.

  2. The researcher publishes the conclusions and results in a manuscript

  3. The researcher ensures that the research data and research results are archived

  4. The researcher asks the Local Data Access Committee whether and under what conditions of use the research results may be recorded for multiple use.

  5. The Local Data Access Committee of the data holder in question indicates under which conditions of use the research results may be recorded for multiple use.

  6. Optionally, the researcher makes the research results suitable for multiple use by applying the code tables and metadata templates that are prescribed by the data governance committee and that have been or will be recorded in the data management plan.

  7. The researcher documents the dataset with research results and makes the dataset available

  8. The researcher notifies the localization service about the published data if necessary

  9. The researcher creates an up-to-date metadata file for the research results dataset, then registers and publishes the metadata at a FAIR data point of their choice. The FAIR data point may be a separate application or be integrated into an application with data repository or metadata catalog functionality.

  10. The researcher registers the FAIR data point used with a FAIR data point register (see, for example, FAIR Data Point) if this has not been done before. The catalog is responsible for retrieving the latest status of the metadata from the FAIR data points, periodically and/or when triggered by a search query.

Milestone products:

  • manuscript

  • published research results dataset

  • metadata published on an FDP

  • published workflows and/or algorithms (optional)
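
As an illustration of the metadata file mentioned in step 9, the sketch below builds a minimal dataset description using terms from the DCAT, Dublin Core and FOAF vocabularies and serializes it as JSON-LD. It assumes that the chosen FAIR data point accepts DCAT-style metadata; the required fields and the submission mechanism depend on the FAIR data point and the applicable metadata templates, and all identifiers and URLs below are placeholders.

```python
import json


def dataset_metadata(identifier, title, description, publisher_name,
                     license_url, access_url, keywords):
    """Build a minimal DCAT-style dataset description as a JSON-LD dictionary."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
            "foaf": "http://xmlns.com/foaf/0.1/",
        },
        "@type": "dcat:Dataset",
        "dct:identifier": identifier,
        "dct:title": title,
        "dct:description": description,
        "dct:publisher": {"@type": "foaf:Agent", "foaf:name": publisher_name},
        "dct:license": {"@id": license_url},
        "dcat:keyword": sorted(keywords),
        "dcat:distribution": [
            {"@type": "dcat:Distribution", "dcat:accessURL": {"@id": access_url}}
        ],
    }


# Placeholder values for illustration only.
record = dataset_metadata(
    identifier="example-results-2024",
    title="Research results dataset (example)",
    description="Pseudonymized analysis results of an example study.",
    publisher_name="Example Research Institute",
    license_url="https://example.org/licenses/example-license",
    access_url="https://example.org/datasets/example-results-2024",
    keywords={"example", "research results"},
)

# Serialized form that could be stored as the metadata file.
metadata_file_content = json.dumps(record, indent=2)
```

Keeping the metadata in a structured form like this makes it straightforward to regenerate the file whenever the dataset changes, so that the catalog retrieves an up-to-date description.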

Phase 5: research conclusion

Objective: to formally conclude the research and clean up the data and environment used

HORA processes

  • archive research results

DLC stage: Plan, Collect, Process, Analyse, Preserve, Share and reuse

Trigger: the research has been completed and the research results have been published

  1. The researcher ensures that the research data, research results and documentation are archived.

  2. The researcher sets up the management of the research results dataset and the metadata of the dataset

  3. The researcher ensures that the research environment, research data and research results are cleaned up.

Milestone products:

  • archived data and documentation

  • established data management process

 

 
