...

Step 2 - Check for an existing standard/code book

a) For existing (meta)data: check whether it comes with a codebook or metadata standard. If it does and it is clear, you can use it for your (meta)data and this step is done.
If the codebook is not helpful, contact the owner of the data to get the semantics cleared up, so you don’t misinterpret the data. If you still need to do additional work to make the data clearer, follow the steps below.

b) For new (meta)data: check whether there is a codebook or metadata standard you can use. A domain expert (for data) or FAIR data steward/semantic expert (for metadata) can help you find out if and where a codebook or standard might be available.
If there is a codebook or standard, you can use it. If there is no codebook or standard available, proceed to step 3.

If you find a codebook or metadata standard that fits partially, use it for the elements that are included and follow the steps below for the remaining elements.
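
The partial-fit case can be sketched as follows. This is a minimal illustration, assuming the codebook can be read as a mapping from element names to definitions; the element names and definitions are hypothetical and not taken from any real codebook.

```python
def split_by_codebook(elements, codebook):
    """Split element names into those defined in the codebook and the rest."""
    covered = [name for name in elements if name in codebook]
    remaining = [name for name in elements if name not in codebook]
    return covered, remaining


# Hypothetical codebook and element list, for illustration only.
codebook = {
    "age": "Age of the subject in years at inclusion",
    "sex": "Biological sex at birth",
}
elements = ["age", "sex", "heart_rate", "blood_pressure"]

covered, remaining = split_by_codebook(elements, codebook)
print("Use the codebook for:", covered)
print("Follow the steps below for:", remaining)
```

Only the elements in the second list need the additional steps; the others can take their definitions directly from the codebook.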

Info

Health-RI, together with domain representatives, aims to develop domain-specific national data standards in the future.

You can find more about metadata standards and ontologies at the following link: https://howtofair.dk/links-additional-reading/#more-on-metadata-standards-and-ontologies-

...

Check the data semantics. Is the meaning of the data elements clear and unambiguous? For data elements with ambiguous meaning, try to improve their definition. For this, it might help to examine the intended value range of a variable: is the exact range known, and is it clear enough?

In the example of collecting data on a patient’s 'sex', it might be unclear whether it means 'biological sex at birth' or 'gender'. In another example, the 'age' of a subject can be expressed in years, but in some cases (e.g. studies with small children) it could also be expressed in months. It should therefore be clearly stated whether the value range for age is expressed in years or months.
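
The 'age' example can be made concrete with a small sketch. It assumes a data element definition records a unit and an intended value range; the class name, field names and ranges below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementDefinition:
    name: str
    description: str
    unit: str
    minimum: float
    maximum: float

    def accepts(self, value):
        """Return True if the value lies within the intended value range."""
        return self.minimum <= value <= self.maximum


# Hypothetical definitions; the ranges are assumptions for illustration only.
age_years = ElementDefinition("age", "Age of the subject at inclusion", "years", 0, 120)
age_months = ElementDefinition("age", "Age of the child at inclusion", "months", 0, 60)

value = 18
print(age_years.accepts(value))   # valid as 18 years
print(age_months.accepts(value))  # equally valid as 18 months
```

A bare value of 18 passes both definitions, so the value alone cannot tell a reader which one was meant. Stating the unit in the definition removes that ambiguity.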

The spreadsheet below shows the issues with our current metadata and the suggested improvements that make their meaning clearer.

...

| Metadata Field | Value | Issue | Suggested variable description | Suggested Value Range | Suggested Value |
| --- | --- | --- | --- | --- | --- |
| Dataset Name | Health Data | Generic and not descriptive. | The name of the dataset. | Text | Patient Health Records 2023 |
| Date of Upload | 01/02/2023 | Ambiguous format (MM/DD/YYYY or DD/MM/YYYY). | Date when the dataset was uploaded, in ISO 8601 format (YYYY-MM-DD). | Date in ISO 8601 format (YYYY-MM-DD) | 2023-01-02 |
| Keywords | BP, HR, Conditions | Abbreviations used without context. | Keywords describing the main topics covered by the dataset. | Text | Blood Pressure, Heart Rate, Hypertension |
| Creator | Dr. Smith | Generic name without additional identifying information. | Full name and affiliation of the dataset creator, as well as ORCID. | Text and ORCID identifier | Dr. John Smith, Hospital A, ORCID: 0001-0002-3456-7890 |
| Description | Patient health data including BP and HR | Lacks detail. | Extended description providing context and details about the dataset. | Text | Detailed patient health records including measurements of blood pressure (BP) and heart rate (HR), along with diagnosed medical conditions and prescribed medications. |
| Format | CSV | Broad category, can be more detailed. | Data format and version. | Text | CSV, version 1.0 |
| Source | Hospital A | Lacks detail, too generic. | Specific department and institution where the data was sourced. | Text and ROR identifier | Hospital A, Department of Cardiology, https://ror.org/example |
| Rights | Open | Too broad. | Licensing terms specifying the rights for data usage. | URL to CC License | CC BY 4.0 |
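
The ambiguity of the 'Date of Upload' value above can be shown in a few lines. This is a minimal sketch using only the Python standard library; the helper name `is_iso_date` is hypothetical and stands in for whatever check a metadata pipeline might apply.

```python
from datetime import datetime

raw = "01/02/2023"

# The same string yields two different dates depending on the assumed format.
as_month_first = datetime.strptime(raw, "%m/%d/%Y").date()
as_day_first = datetime.strptime(raw, "%d/%m/%Y").date()

print(as_month_first.isoformat())
print(as_day_first.isoformat())


def is_iso_date(text):
    """Return True if text is a valid calendar date written as YYYY-MM-DD."""
    try:
        parsed = datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    # strptime tolerates missing zero padding, so compare with the canonical form.
    return parsed.strftime("%Y-%m-%d") == text


print(is_iso_date("2023-01-02"))
print(is_iso_date(raw))
```

Because both interpretations parse without error, the original value cannot be corrected automatically. The intended reading has to be confirmed with the data owner before the value is stored in ISO 8601 form.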

Step 4 - Check relationships

...