Metroline Step: Define access conditions
status: Future work
Short description
With the data now FAIRified, you need to decide the conditions under which researchers can get access to the data. You also need to decide whether they are allowed to re-share your data, use the data commercially, etc. [ru_accesslevel].
Data should be shared as open as possible and as closed as necessary [H2020] – closed, to safeguard the privacy of the subjects, and open, to allow the data to be reused [AofFAIR]. If data are not open access, access to the data can be governed by a Data Access Committee (DAC), a formal or informal group of individuals with the responsibility of reviewing and assessing data access requests [DAC]. The access conditions are described in the data access policy [FAIRopoly].
The conditions for reuse of the data are specified in either a license or data usage agreement (DUA). A license specifies a standard set of terms and conditions under which data can be shared and reused, whereas a DUA can be customised with specific conditions [ru_dua].
Access can be layered. For example, in a dataset, the metadata could be open and available for reuse under a CC0 license, but access to the data could require explicit approval – see [De Novo]. Ideally, accessibility is specified in such a way that a machine can automatically understand the requirements and perform an appropriate action [GOFAIR_R1]. Requiring users to create a user account for a repository makes it easier to control access to a dataset.
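As a minimal sketch of layered access in machine-readable form, the record below gives the metadata layer a CC0 license and marks the data layer as requiring approval. The field names (`metadata_license`, `data_access`, `request_contact`), the contact address and the helper `next_action` are hypothetical and chosen only for illustration; a real catalogue would rely on an established metadata vocabulary instead.

```python
# Hypothetical record describing layered access for one dataset.
# Field names are illustrative assumptions, not a published schema.
dataset = {
    "title": "Example cohort",
    "metadata_license": "https://creativecommons.org/publicdomain/zero/1.0/",  # CC0
    "data_access": "approval_required",  # the data layer needs explicit approval
    "request_contact": "dac@example.org",  # placeholder address
}


def next_action(record: dict, layer: str) -> str:
    """Return the action a client could take for the given layer."""
    if layer == "metadata":
        return f"reuse under {record['metadata_license']}"
    if layer == "data":
        if record["data_access"] == "open":
            return "download"
        return f"send access request to {record['request_contact']}"
    raise ValueError(f"unknown layer: {layer}")


print(next_action(dataset, "metadata"))
print(next_action(dataset, "data"))
```

Because the conditions are stored as data rather than free text, a client can decide per layer whether to reuse, download or file a request.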
We can also add something about different types of access (e.g. open access, registered access, controlled access, …); see https://rdmkit.elixir-europe.org/human_data and https://libguides.library.usyd.edu.au/datapublication/access
[AofFAIR] https://direct.mit.edu/dint/article/2/1-2/47/9998/The-A-of-FAIR-As-Open-as-Possible-as-Closed-as
[DAC] https://bmcmedethics.biomedcentral.com/articles/10.1186/s12910-020-0453-z
[De Novo] https://ojrd.biomedcentral.com/articles/10.1186/s13023-021-02004-y
[FAIRopoly] https://www.ejprarediseases.org/fairopoly/
[GOFAIR_R1] https://www.go-fair.org/fair-principles/r1-1-metadata-released-clear-accessible-data-usage-license/
[ru_accesslevel] https://data.ru.nl/doc/help/helppages/best-practices/bp-selecting-dua.html?0
[ru_dua] https://www.ru.nl/rdm/vm/licenses-data-use-agreements/
Why is this step important
Accessibility in FAIR implies that one should provide the exact conditions under which the data are accessible. These exact conditions should be clear to machines and humans. By completing this step, access to the data should be well defined.
How to
This could be interesting:
Ontology-based Access Control for FAIR Data
[HANDS]
Establishing an access policy for your data is an important aspect of data stewardship. The data access and sharing policy of your study should be tailored to your project and it should take the Data Governance Policy of your UMC into account.
Most UMCs are currently in the process of setting up a Data Governance Policy or Procedure, often in collaboration with their university. This Data Governance Policy may include regulations on internal access to research data and re-use of data, including authorisations. In addition, it may recommend installing one or more Data Access Boards or Committees that play a role in granting permission to share data with third parties.
Be sure that:
you take the Data Governance Policy or Procedure of your institute into account when writing your data management plan (DMP);
the data sharing plan in your DMP is approved by your institute’s Data Access Board, if necessary.
For collaborations with third parties, be sure to draw up a legal agreement that is approved by your institute (i.e., a Research Collaboration Agreement and often a Data Transfer Agreement or Data Sharing Agreement). This agreement should state which party is responsible for the data and it should describe access rights within the collaboration, for instance:
in research consortia;
in community databases (e.g., reference data);
when patient organisations are co-owners;
when your data is on an external server or in an external database.
Frequently Asked Questions
What is Data Governance?
Data Governance is a system of decision rights and accountabilities for information-related processes, executed according to agreed-upon models which describe who can take what actions with what information, and when, under what circumstances, using what methods. (Source: Do’s and don’ts for Informed Consent for Sharing Data, UU, A. vd Kuil).
What are internal and external access policies?
Having access policies for your data is an important aspect of data stewardship. Your access policies should establish who is authorised to access the data:
who gets access to your data (e.g., researchers, data managers, ICT staff, administrative staff);
to which data these people get access;
what type of access they get (e.g., read only, edit).
This includes:
internal access policies (i.e., for yourself and your colleagues, for instance when you need remote access to your data);
external access policies (e.g., in case you are sharing files with others as part of a new research project).
Access policies are part of your data management plan. It is your responsibility to describe them before you start collecting data. In case of a clinical trial, a substantial change in access policies should lead to an amendment of your ethical protocol.
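The three questions above (who, which data, what type of access) can be sketched as a small rule table. This is an illustrative sketch only: the class `AccessRule`, the list `POLICY`, the function `is_allowed`, and the role and dataset names are all hypothetical, and the choice that "edit" implies "read" is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRule:
    role: str     # who gets access, e.g. "researcher" or "data_manager"
    dataset: str  # which data the rule covers
    access: str   # type of access: "read" or "edit"


# Hypothetical policy; roles and dataset names are placeholders.
POLICY = [
    AccessRule("researcher", "research_data", "read"),
    AccessRule("data_manager", "research_data", "edit"),
]


def is_allowed(role: str, dataset: str, requested: str) -> bool:
    """Allow a request only if a rule grants it; 'edit' also implies 'read'."""
    for rule in POLICY:
        if rule.role == role and rule.dataset == dataset:
            if rule.access == requested:
                return True
            if rule.access == "edit" and requested == "read":
                return True
    return False  # anything not explicitly granted is denied


print(is_allowed("researcher", "research_data", "edit"))
```

Denying by default means that a role missing from the policy, such as ICT staff in this sketch, gets no access until a rule is added.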
Important aspects are:
never allowing access to personal or clinical data to unauthorised people (this includes colleagues from your research group who are not involved in the project);
under no circumstances granting access to (in)directly identifiable data via computer accounts shared by multiple persons;
not providing more information in a data extraction than needed for a particular analysis;
making sure that access to the database is logged properly (i.e., who accesses the system for what purpose and who retrieves which data elements);
preferably verifying the identity of the user logging into a database with (in)directly identifiable data by at least one method in addition to a password (“2-factor authentication”);
preferably using a one-time password generating tag or a message to your phone as the second factor.
Any access outside the authorisations in the access policies should be considered unauthorised access. You should be able to detect unauthorised access in a timely manner, whether it comes from inside or outside. Note that there is a legal obligation to report personal data leaks in most countries.
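Two of the aspects listed above, limiting an extraction to the fields needed for an analysis and logging who retrieves which data elements, can be sketched as follows. The records, field names, user name and the function `extract` are hypothetical placeholders; a real system would log to a protected audit store rather than the console.

```python
import logging

logging.basicConfig(level=logging.INFO)
access_log = logging.getLogger("access")

# Hypothetical records; field names and values are placeholders.
RECORDS = [
    {"id": 1, "age": 54, "diagnosis": "A", "postcode": "1234AB"},
    {"id": 2, "age": 61, "diagnosis": "B", "postcode": "5678CD"},
]


def extract(user, purpose, fields):
    """Return only the requested fields and log who retrieved what, and why."""
    access_log.info("user=%s purpose=%s fields=%s", user, purpose, ",".join(fields))
    return [{field: row[field] for field in fields} for row in RECORDS]


# Only the fields needed for the analysis leave the database.
subset = extract("researcher_01", "age analysis", ["id", "age"])
print(subset)
```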
Who can access contact information? (PATIENT contact information)
In cohort studies, contact data of study subjects are usually registered. Access rules should differentiate between those having access to research data and those having access to these contact data. In principle, one person should not have access to both, unless the researcher is also the treating physician. An exception can only be made for smaller projects that have a limited period during which data are created, processed and analysed. In your Data Management Plan, you will have to argue why this exception applies to your research project (i.e., explain why it is necessary for staff members to access both research data and contact data).
Why do I already have to decide on access policies at the start of my study?
In principle, your access policies should be described at the start of your project. One reason for this is that, in many cases, patients have to give informed consent on data sharing before you start collecting data. Yet, there should be sufficient room for change, following from the principle of responsible data sharing, for instance because:
new funders may require new access and sharing conditions;
your project may lead to unforeseen data, which generate unforeseen requests for those data.
https://rdmkit.elixir-europe.org/human_data
Selecting suitable access modes for sharing human data:
Human data often carries restrictions on its use and needs to be shared in a manner that respects those restrictions. There are three access modes for sharing research data:
Open access: Data is shared publicly. Open access is rarely used for sharing human data. To use open access, researchers need to ensure that the shared data cannot be traced back to individual study participants. In other words, the data needs to be anonymised, which is difficult in practice.
Registered access: Data is shared with researchers whose “researcher” status has been vouched for by their institution and who agree to abide by the data usage policies of the repositories that serve the shared data. Datasets shared via registered access typically have no restrictions besides the condition that the data is used for research.
Controlled access: Data can only be shared with researchers whose research is reviewed and approved by a data access committee (DAC). Typically, researchers who were involved in the primary collection of the data form the DAC. Use conditions for controlled access can be numerous and include allowed research topics, allowed geographical regions and allowed recipients (e.g. non-profit organisations).
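The three access modes can be sketched as a simple decision function. This is a hedged illustration, not how any particular repository works: the names `AccessMode` and `may_access` and the three boolean inputs are assumptions that stand in for the checks described above.

```python
from enum import Enum


class AccessMode(Enum):
    OPEN = "open"
    REGISTERED = "registered"
    CONTROLLED = "controlled"


def may_access(mode, vouched_researcher=False, accepted_policy=False, dac_approved=False):
    """Decide whether a request can proceed under the given access mode."""
    if mode is AccessMode.OPEN:
        return True  # data is shared publicly
    if mode is AccessMode.REGISTERED:
        # status vouched for by the institution, and usage policy accepted
        return vouched_researcher and accepted_policy
    if mode is AccessMode.CONTROLLED:
        return dac_approved  # the DAC has reviewed and approved the research
    raise ValueError(f"unknown access mode: {mode}")


print(may_access(AccessMode.REGISTERED, vouched_researcher=True, accepted_policy=True))
```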
https://libguides.library.usyd.edu.au/datapublication/access
Open access – There are no restrictions on access to the data; anyone can view and download a copy.
Embargoed access – A description of your dataset is published, including information such as the dataset title, who created it, and what the data are; however, the dataset is inaccessible until after a specified period of time has elapsed. At the end of the embargo period, data will become available by either open or mediated access, depending on the option that you’ve selected.
Mediated access – A description of your dataset is published, including information such as the dataset title, who created it, and what the data are; however, others won’t be able to access the data until after they apply and have their application approved. Conditions of access are usually set by the owner or submitter of the data and may include providing proof that the requester is a genuine researcher and that they have ethical approval from their own institution to undertake the research.
Closed access – A description of your dataset is published, including information such as the dataset title, who created it, and what the data are; however, the dataset is inaccessible and there is no process in place to allow others to apply for access to it.
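How an embargo resolves over time can be sketched as below. The function `effective_access` and its parameters are hypothetical, and the sketch assumes that `available_from` is the first date on which the data become available; the level names mirror the four options above.

```python
from datetime import date
from typing import Optional


def effective_access(level: str, today: date,
                     available_from: Optional[date] = None,
                     after_embargo: str = "open") -> str:
    """Return the access level that applies on the given day."""
    if level == "embargoed":
        if available_from is None:
            raise ValueError("embargoed access needs an end date")
        # After the embargo, the data fall back to open or mediated access.
        return after_embargo if today >= available_from else "inaccessible"
    return level  # open, mediated and closed do not change over time


print(effective_access("embargoed", date(2024, 1, 1), date(2025, 1, 1)))
```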
The How to section should:
be split into easy to follow steps;
Step 1
Step 2
etc.
help the reader to complete the step;
aspire to be readable for everyone, but, depending on the topic, may require specialised knowledge;
be a general, widely applicable approach;
if possible / applicable, add (links to) the solution necessary for onboarding in the Health-RI National Catalogue;
aim to be practical and simple, while keeping in mind: if I would come to this page looking for a solution to this problem, would this How-to actually help me solve this problem;
contain references to solutions such as those provided by FAIR Cookbook, RDMkit, The Turing Way and FAIRsharing;
contain custom recipes/best-practices written by/together with experts from the field if necessary.
Expertise requirements for this step
Describes the expertise that may be necessary for this step. Should be based on the expertise described in the Metroline: Build the team step.
Practical examples from the community
Examples of how this step is applied in a project (link to demonstrator projects).
Training
Add links to training resources relevant for this step. Since the training aspect is still under development, currently many steps have “Relevant training will be added in the future if available.”
Suggestions
Visit our How to contribute page for information on how to get in touch if you have any suggestions about this page.