Main storyline research, policy and innovation
In response to feedback on the previous version, the research, policy and innovation storyline has been rewritten in this article.
In this storyline we now assume the following types of research:
a study that collects, processes, analyzes and publishes new data
a study that collects new data, combines it with existing data, conducts analyses and publishes the results
a study that collects, processes, analyzes and publishes existing data
There are more types, but what all studies have in common is that they are prepared and that the research and data are eventually published.
This article was written in the context of the data life cycle (DLC) and the HORA business processes for research.
This main storyline is an essential case study that the National Health Data Infrastructure for research, policy and innovation must support. Because it touches on almost all aspects of that infrastructure, it must also be supported by the various working groups that are active on the Health-RI wiki. Several working groups have provided feedback on this storyline, which still needs to be coordinated. This feedback will be included in the next version to arrive at a broadly supported definition.
In this article, the research, policy and innovation storyline is divided into five phases. The description of each phase relates it to the HORA business processes and the data life cycle stages.
DLC Stage | Description | Actor |
Define research question and research design, obtain ethical approval, draw up data management plan | Researcher | |
Collecting data (both generating new data and requesting existing data) | Researcher | |
Data cleansing and FAIRification | Researcher, data holder | |
Carrying out the research | Researcher | |
Storing and archiving data | Researcher, data holder | |
Prepare data and make it available | Researcher | |
Making data available to other researchers | Researcher |
There are different types of research. In this article we distinguish the following variants.
Research variants | Data life cycle stages | ||||||||
 |  | Plan | Collect 1 generate new data | Collect 2 request data | Process by researcher | Process by data holder | Analyse | Preserve | Share & Reuse |
A | The researcher will collect data with, for example, an EDC tool, process and analyze the input and make the result available | Yes | Yes | No | Yes | No | Yes | Yes | Yes |
B | The researcher will collect data with an EDC tool, combine the input with other data, process and analyze the whole and make the result available | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
C | The researcher will collect data from one or more sources, perform analyses and make the results available | Yes | No | Yes | No | Yes | Yes | Yes | Yes |
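The variant table above can be read as a lookup from variant to the stages it passes through. The following is a minimal sketch of that reading; the names `Stage`, `SKIPPED` and `stages_for` are illustrative assumptions and do not come from the storyline itself.

```python
from enum import Enum


class Stage(Enum):
    """Data life cycle stages, in the column order of the variant table."""
    PLAN = "Plan"
    COLLECT_NEW = "Collect 1: generate new data"
    COLLECT_REQUEST = "Collect 2: request data"
    PROCESS_RESEARCHER = "Process by researcher"
    PROCESS_DATA_HOLDER = "Process by data holder"
    ANALYSE = "Analyse"
    PRESERVE = "Preserve"
    SHARE_REUSE = "Share & Reuse"


# Stages each variant skips, transcribed from the "No" cells of the table.
SKIPPED = {
    "A": {Stage.COLLECT_REQUEST, Stage.PROCESS_DATA_HOLDER},
    "B": set(),
    "C": {Stage.COLLECT_NEW, Stage.PROCESS_RESEARCHER},
}


def stages_for(variant: str) -> list[Stage]:
    """Return the stages a variant passes through, in table order."""
    skipped = SKIPPED[variant]
    return [stage for stage in Stage if stage not in skipped]


if __name__ == "__main__":
    for variant in sorted(SKIPPED):
        print(variant, [stage.value for stage in stages_for(variant)])
```

Storing only the skipped stages keeps the sketch close to the table: variant B skips nothing, while A and C each skip one collection stage and one processing stage.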
Phase 1: initiation of research and planning
Objective: to obtain a mandate to conduct a study
HORA processes
setting up research collaboration
draw up research proposal
recruit research resources
DLC stage
Plan | Collect | Process | Analyse | Preserve | Share and reuse |
---|---|---|---|---|---|
 |  |  |  |  |  |
Trigger: a researcher has a research question and wants to conduct a study.
The researcher (possibly with research partners) draws up a research proposal containing the following components:
a research design
a research plan
a research question
the required resources
an initial draft of a data management plan including:
the data sources to be used
the code tables and metadata templates to be used (prescribed by a data governance committee)
a description of the required research environment and additional tools
if applicable, the form of data subject control (for example consent or opt-out) suitable for the research
The researcher identifies the sources to be used for the research (via Storyline: Search data in metadata)
the researcher searches the catalog for suitable sources
the researcher filters the results if desired to achieve a more accurate result
if this is necessary to find the required data, the researcher identifies themselves to the catalog
To verify that the data is suitable, the researcher can ask questions directly to the data provider
the researcher includes the identifiers of suitable sources to be used in the project proposal and data management plan.
The researcher prepares a funding application for the research proposal and, if applicable, submits it to a funder.
The funder assesses the application and approves it
The researcher completes the research proposal and submits it (to the local research review committee) for assessment. The assessment may include:
a legal review
an ethical assessment
an assessment of social aspects
an assessment by a Technology Transfer Office
The local research review committee assesses the research proposal itself, has the necessary assessments carried out and, if all of them are positive, approves the research proposal
Milestone products:
an approved research proposal
first draft of a data management plan
financial coverage for the study
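The catalog search in this phase (search, optionally filter, identify if needed, then record the identifiers of suitable sources) can be sketched as follows. This is only an illustration under stated assumptions: the `CatalogEntry` fields, the `restricted` flag and the `search` function are hypothetical and do not describe the actual catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    identifier: str            # hypothetical identifier format
    title: str
    keywords: frozenset[str]
    restricted: bool = False   # assumption: only shown to identified users


def search(entries, term, *, keyword=None, identified=False):
    """Return entries whose title contains `term`, optionally filtered."""
    term = term.lower()
    hits = []
    for entry in entries:
        if entry.restricted and not identified:
            continue  # the researcher must identify to see this entry
        if term not in entry.title.lower():
            continue
        if keyword is not None and keyword not in entry.keywords:
            continue  # optional filter for a more accurate result
        hits.append(entry)
    return hits


if __name__ == "__main__":
    catalog = [
        CatalogEntry("ds-001", "Diabetes cohort", frozenset({"diabetes", "cohort"})),
        CatalogEntry("ds-002", "Diabetes registry", frozenset({"diabetes", "registry"}), restricted=True),
        CatalogEntry("ds-003", "Asthma cohort", frozenset({"asthma", "cohort"})),
    ]
    suitable = search(catalog, "diabetes", identified=True)
    # Identifiers to copy into the project proposal and data management plan.
    print([entry.identifier for entry in suitable])
```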
Phase 2: research preparation
Objective: prepare all components for carrying out the research
HORA processes
(re)use research data
DLC stage
Plan | Collect | Process | Analyse | Preserve | Share and reuse |
---|---|---|---|---|---|
 |  |  |  |  |  |
Trigger: the research proposal has been approved.
Variant A: the researcher will collect, process and analyze new data
The researcher requests a research environment and additional tools from the tool supplier in order to:
maintain a data management plan for reusable research results
establish the citizen's say (informed consent / opt-out)
generate and manage data
The supplier makes the research environment and additional tools available to the researcher.
The researcher configures the additional tools as desired (e.g. the researcher designs an eCRF (electronic case report form) to collect data).
The researcher completes the data management plan, using code tables and metadata templates prescribed by the data governance committee
The Local Data Access Committee of the relevant data holder indicates under what conditions of use data may be recorded at the data source for multiple use.
Variants B and C: the researcher needs existing data for the research. The researcher requests that the sources to be used for the research be made available (Storyline: Request data)
The researcher logs in to the National Health Data Portal and completes a request form for the intended sources, stating:
Identity of the intended data user(s)
Indication of the desired data sets or the data providers
The research design and rationale (including, for example, the duration of the research) for using the desired data sets
The desired analysis on the desired data sets in combination with other data required for the analysis
The secure processing environment and additional tools to be used, as provided by the tool supplier
Approval from the local research review committee
The data request service facilitates and administers the process for the data user to request selected data set(s) from the central data access committee.
The central data access committee verifies whether the application is complete and whether the desired analysis to be performed meets the conditions of use of the requested dataset(s).
The data request service records the response from the central data access committee and facilitates and administers the process for the data user to request selected data set(s) from the data provider(s).
The approached data provider verifies
that the request meets the applicable conditions of use of the requested data
that the local data access committee has not vetoed the request; the data provider then records the result of the assessment with the data request service
The data request service feeds back the results of the assessments to the data user
The data request service sends the contracts (e.g. data access agreements) to the parties involved.
The parties involved sign the contracts
After the contracts have been signed, the data request service notifies the data provider(s) that the data can be made available in the requested secure processing environment or for a federated analysis.
The secure processing environment supplier installs and configures the desired secure processing environment with the tools specified by the data user in the request.
Every data provider makes the requested data available (Storyline: Data delivery), in one of two ways:
in the desired secure processing environment (for the duration of the research as agreed in the data request or as determined by law):
The data provider minimizes the data set.
The data provider consults the control register to suppress data for which no permission has been given or for which an objection has been made (if necessary)
The data provider carries out (if necessary) a pseudonymization (by means of the generic pseudonymization service) on the requested dataset.
The data provider makes the data available for the desired secure processing environment.
The data provider reports to the localization service which data has been made available for which research
The data provider notifies the data request service that the desired dataset has been made available so that the status of the request can be updated
The data provider guarantees the reproducibility of the dataset, for example to be able to repeat the data release at a later time.
for the purpose of federated analysis:
Every data provider involved in this federated analysis minimizes the data set.
Each data provider involved in this federated analysis consults the control register to suppress data for which there is no appropriate control (if necessary).
Each data provider involved in this federated analysis carries out the desired or required pseudonymization (by means of the generic pseudonymization service) on the requested dataset.
Each data provider involved in this federated analysis makes the data available for delivery to the data processor designated for this federated analysis.
Each data provider involved in this federated analysis reports to the localization service which data has been made available for which research.
Each data provider involved in this federated analysis reports to the data request service that the desired dataset has been made available so that the request status can be updated.
Every data provider involved in this federated analysis guarantees the reproducibility of the dataset, for example to be able to repeat the data release at a later time.
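The preparation steps a data provider performs before delivery (minimize, suppress on the basis of the control register, pseudonymize) can be sketched as a small pipeline. This is a sketch under assumptions: the record layout, the `subject_id` field and the set of objections are hypothetical, and a keyed HMAC-SHA256 hash merely stands in for the generic pseudonymization service, whose actual method is not described in this article.

```python
import hashlib
import hmac


def minimize(records, fields):
    """Keep only the listed fields of each record."""
    return [{k: r[k] for k in fields if k in r} for r in records]


def suppress(records, objections, id_field="subject_id"):
    """Drop records of subjects listed in the (hypothetical) control register."""
    return [r for r in records if r[id_field] not in objections]


def pseudonymize(records, secret, id_field="subject_id"):
    """Replace the identifier with a keyed hash (stand-in for the real service)."""
    result = []
    for r in records:
        digest = hmac.new(secret, str(r[id_field]).encode(), hashlib.sha256).hexdigest()
        result.append({**r, id_field: digest})
    return result


def prepare_delivery(records, requested_fields, objections, secret, id_field="subject_id"):
    """Apply the three steps in the order listed in the storyline."""
    minimal = minimize(records, [id_field, *requested_fields])
    permitted = suppress(minimal, objections, id_field)
    return pseudonymize(permitted, secret, id_field)


if __name__ == "__main__":
    source = [
        {"subject_id": "p1", "age": 40, "address": "Street 1"},
        {"subject_id": "p2", "age": 50, "address": "Street 2"},
    ]
    print(prepare_delivery(source, ["age"], objections={"p2"}, secret=b"demo-key"))
```

Because the same key and input always yield the same pseudonym, repeating `prepare_delivery` on an unchanged source reproduces the same dataset, which is one simple way to think about the reproducibility requirement.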
Milestone products:
a secure processing environment
installed tools
dataset made available
updated localization data
archived dataset
an updated data management plan
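The completeness check that the central data access committee performs on a data request in this phase can be illustrated with a simple field check. The field names below are assumptions derived from the request form items listed earlier in this phase; they are not an actual schema of the National Health Data Portal.

```python
# Hypothetical field names, one per item of the request form.
REQUIRED_FIELDS = (
    "data_users",                  # identity of the intended data user(s)
    "datasets",                    # desired data sets or data providers
    "research_design",             # design and rationale, incl. duration
    "desired_analysis",            # analysis to run on the data sets
    "processing_environment",      # secure environment and tools
    "review_committee_approval",   # approval from the local committee
)


def missing_fields(request: dict) -> list[str]:
    """Return the required fields that are absent or empty in the request."""
    return [name for name in REQUIRED_FIELDS if not request.get(name)]


def is_complete(request: dict) -> bool:
    """A request is complete when no required field is missing."""
    return not missing_fields(request)


if __name__ == "__main__":
    draft = {"data_users": ["researcher-1"], "datasets": ["ds-001"]}
    print(missing_fields(draft))
```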
Phase 3: research implementation
Objective: to arrive at research results
HORA processes
create new research data
process and analyze research data
produce research results
DLC stage
Plan | Collect | Process | Analyse | Preserve | Share and reuse |
---|---|---|---|---|---|
 |  |  |  |  |  |
Trigger: there is a notification that the requested data and/or the research environment is available (if applicable)
Variant A: Generating and processing/analyzing new research data (Storyline: Generating research data)
The data source starts generating new data points.
When the data is generated, the data user receives the generated data
The data user uses the tools in the research environment to further process and analyze the generated data.
Variant B: Generate new research data and combine it with requested data (https://health-ri.atlassian.net/wiki/spaces/VD/pages/155751332)
The data source starts generating new data points.
When the data is generated, the data user receives the generated data
The data user uses the tools in the research environment to further process and analyze the generated data.
The data user combines the generated data with requested data: the result is the input data for further analysis.
The data user instructs the research environment to perform the analysis determined and approved by the data user on the input data.
The secure processing environment performs the analysis determined and approved by the data user on the input data.
The data user generates the final research results using the code tables and metadata templates that are prescribed by the data governance committee and that have been, or are being, recorded in the data management plan.
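The combination step in Variant B, where generated data is merged with requested data to form the input for analysis, can be sketched as a join on a shared key. This assumes that both datasets carry the same pseudonym for the same subject; the `pseudonym` field name and the `combine` function are illustrative, not part of the storyline.

```python
def combine(generated, requested, key="pseudonym"):
    """Inner join: keep generated rows that have a matching requested row."""
    requested_by_key = {row[key]: row for row in requested}
    combined = []
    for row in generated:
        match = requested_by_key.get(row[key])
        if match is not None:
            # Fields from the generated row win if both rows share a name.
            combined.append({**match, **row})
    return combined


if __name__ == "__main__":
    generated = [
        {"pseudonym": "a", "score": 1},
        {"pseudonym": "b", "score": 2},
    ]
    requested = [{"pseudonym": "a", "age": 30}]
    print(combine(generated, requested))
```

An inner join is only one possible choice: subjects without a match in the requested data are dropped here, whereas a study might instead need to keep them with missing values.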
Variant C: Analyze requested data (https://health-ri.atlassian.net/wiki/spaces/VD/pages/155751332)
The data user instructs the research environment to perform the analysis determined and approved by the data user on the input data.
The secure processing environment performs the analysis determined and approved by the data user on the input data.
The data user generates the final research results using the code tables and metadata templates that are prescribed by the data governance committee and that have been, or are being, recorded in the data management plan.
Milestone products:
dataset with research results
metadata of the dataset with research results
developed/trained algorithms (optional)
analysis scripts or software code
Phase 4: research publication
Objective: to publish a manuscript and make the research results available.
HORA processes:
disseminate research results
preserve research results and research data
guarantee findability of research data
guarantee accessibility of research data
guarantee reusability of research data
DLC stage
Plan | Collect | Process | Analyse | Preserve | Share and reuse |
---|---|---|---|---|---|
 |  |  |  |  |  |
Trigger: the research has been completed and there are research results to publish
The researcher here acts in the role of data provider. A number of activities can also be carried out by the research institute.
The researcher publishes the conclusions and results in a manuscript
The researcher ensures that the research data and research results are archived
The researcher asks the Local Data Access Committee whether and under what conditions of use the research results may be recorded for multiple use.
The Local Data Access Committee of the data holder in question indicates under which conditions of use the research results may be recorded for multiple use.
The researcher uses code tables and metadata templates that are prescribed by the data governance committee and that have already been or will be recorded in the data management plan to make the research results suitable for multiple use (optional)
The researcher documents the dataset with research results and makes the dataset available
The researcher notifies the localization service about the published data if necessary
The researcher creates an up-to-date metadata file about the research results dataset and registers and publishes the metadata at a FAIR data point of his or her choice, using a separate application or integrated into an application with data repository or metadata catalog functionality.
The researcher registers the FAIR data point used with a FAIR data point register (see, for example, FAIR Data Point) if this has not been done before. It is the responsibility of the catalog to retrieve the latest status of the metadata from the FAIR data points, regularly and/or when triggered by a search query.
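The metadata file that the researcher creates for the research results dataset can be pictured as a small structured record. The sketch below assumes the metadata is expressed with DCAT and Dublin Core terms and serialized as JSON-LD; the article does not prescribe a vocabulary, and the `build_metadata` function and all example values are hypothetical.

```python
import json


def build_metadata(identifier, title, description, publisher, keywords, access_rights):
    """Build a JSON-LD style record using DCAT and Dublin Core terms."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@id": identifier,
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        "dct:publisher": publisher,
        "dcat:keyword": list(keywords),
        "dct:accessRights": access_rights,  # conditions of use, in free text here
    }


if __name__ == "__main__":
    record = build_metadata(
        identifier="https://example.org/dataset/results-1",  # placeholder URL
        title="Research results dataset",
        description="Results of an example study.",
        publisher="Example research institute",
        keywords=["example", "research results"],
        access_rights="Available on request",
    )
    print(json.dumps(record, indent=2))
```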
Milestone products:
manuscript
published research results dataset
metadata published on an FDP
published workflows and/or algorithms (optional)
Phase 5: research conclusion
Objective: to formally conclude the research and clean up the data and environment used
HORA processes
archiving research results
DLC stage
Plan | Collect | Process | Analyse | Preserve | Share and reuse |
---|---|---|---|---|---|
 |  |  |  |  |  |
Trigger: the research has been completed and the research results have been published
The researcher ensures that the research data, research results and documentation are archived.
The researcher sets up the management of the research results dataset and the metadata of the dataset
The researcher ensures that the research environment, research data and research results are cleaned up.
Milestone products:
archived data and documentation
established data management process