...
Why is this step important?
Several of the Metroline steps that follow rely on being familiar with your data. For example, in order to
While performing this step, keep your
...
Which data elements (variables) are you planning to collect? The driving user question may provide some guidance here.
If possible, determine the value range for each data element (e.g. for ‘biological sex at birth’ the values could be ‘male’ and ‘female’, while for ‘age’ the value range might be 0-110).
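One way to record value ranges is in a small machine-readable structure. The sketch below is only an illustration: the `DataElement` class, its field names and the `accepts` method are hypothetical, and the two elements are the examples from the text above, not a complete list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataElement:
    """One planned data element with its documented value range (hypothetical structure)."""
    name: str
    allowed_values: Optional[frozenset] = None  # for categorical elements
    minimum: Optional[float] = None             # for numeric elements
    maximum: Optional[float] = None

    def accepts(self, value) -> bool:
        """Return True if the value falls inside the documented value range."""
        if self.allowed_values is not None:
            return value in self.allowed_values
        if self.minimum is not None and self.maximum is not None:
            return isinstance(value, (int, float)) and self.minimum <= value <= self.maximum
        return True  # no range documented yet, so there is nothing to check against


# Example elements taken from the text; illustrative only.
ELEMENTS = [
    DataElement("biological_sex_at_birth", allowed_values=frozenset({"male", "female"})),
    DataElement("age", minimum=0, maximum=110),
]
```

Writing the ranges down in this form makes it possible to check collected values against them later, for example with `ELEMENTS[1].accepts(42)`.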
Step 2
For existing data: check whether it comes with a codebook. If it does and it explains the data sufficiently, you are done. If not, contact the owner of the data to get the semantics cleared up, so that you do not misinterpret the data. If additional work is still needed to make the data clearer, follow the steps below.
For new data: check whether there is a codebook you can use. If there is, use it; if not, follow the steps described below.
Check if there is an existing data standard or codebook that you can reuse. If there is, use it; otherwise, follow the steps below. If you find a codebook that fits only partially, use it for the elements it covers and follow the steps below for the others. Health-RI, together with domain representatives, aims to develop domain-specific national data standards in the future.
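The partial-fit case can be sketched as a simple comparison between your planned elements and the entries of an existing codebook. The function name, the element names and the codebook contents below are all made up for illustration; a real codebook will usually need matching on meaning rather than on exact names.

```python
def split_by_codebook(planned_elements, codebook):
    """Split planned element names into those the codebook defines and those it does not."""
    covered = [name for name in planned_elements if name in codebook]
    uncovered = [name for name in planned_elements if name not in codebook]
    return covered, uncovered


# Hypothetical inputs for illustration only.
planned = ["biological_sex_at_birth", "age", "smoking_status"]
existing_codebook = {
    "biological_sex_at_birth": "Sex assigned at birth",
    "age": "Age in completed years",
}

covered, uncovered = split_by_codebook(planned, existing_codebook)
print("Reuse the codebook definition for:", covered)
print("Still need a definition for:", uncovered)
```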
Step 3
Check the data semantics. Is the meaning of the data elements clear and unambiguous? For data elements with an ambiguous meaning, try to improve their definition. It might help to examine the values of a variable to find out whether, in addition to the intended value range, other values can be entered as well.
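Examining the values of a variable can be as simple as counting what actually occurs and listing everything outside the intended range. The following is a minimal sketch under assumed inputs: the records, the variable name and the helper `unexpected_values` are hypothetical, and missing entries show up as `None`.

```python
from collections import Counter


def unexpected_values(records, variable, intended_values):
    """Count observed values of `variable` that fall outside `intended_values`."""
    observed = Counter(record.get(variable) for record in records)
    return {value: count for value, count in observed.items()
            if value not in intended_values}


# Hypothetical records; real data would come from your own source.
records = [
    {"biological_sex_at_birth": "male"},
    {"biological_sex_at_birth": "female"},
    {"biological_sex_at_birth": "F"},
    {"biological_sex_at_birth": "unknown"},
    {"biological_sex_at_birth": "F"},
    {},  # record in which the variable is missing
]

report = unexpected_values(records, "biological_sex_at_birth", {"male", "female"})
print(report)
```

Values such as ‘F’ or ‘unknown’ in the report suggest that the definition of the data element, or its permitted values, needs to be made more explicit.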
...
Define or align common data elements (CDEs).
NOTE: we don’t define CDEs in this step, but we do need to include checking for already existing ones in other steps.
Common Data Elements (CDEs) are standardised, precisely defined questions paired with specific allowable responses. CDEs can be used systematically across different sites, studies, or clinical trials to ensure consistent data collection. They give us a way to standardise and share precise and unambiguous definitions of the meaning of data, independent of any data model or data set. [NIH]
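To make the idea of "a question paired with allowable responses" concrete, a CDE could be represented roughly as below. This is a sketch, not an official CDE format: the class, the identifier `example-001`, the question wording and the site data are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonDataElement:
    """A precisely defined question paired with its allowable responses."""
    identifier: str             # hypothetical local identifier, not a registry ID
    question: str
    allowable_responses: tuple

    def is_allowed(self, response) -> bool:
        return response in self.allowable_responses


def nonconforming(cde, responses_by_site):
    """Per site, list the responses that the CDE does not allow."""
    return {
        site: [r for r in responses if not cde.is_allowed(r)]
        for site, responses in responses_by_site.items()
    }


sex_at_birth = CommonDataElement(
    identifier="example-001",
    question="What was the participant's biological sex at birth?",
    allowable_responses=("male", "female"),
)

# Hypothetical responses collected at two sites using the same CDE.
site_responses = {"site_a": ["male", "female"], "site_b": ["female", "M"]}
result = nonconforming(sex_at_birth, site_responses)
print(result)
```

Because both sites are checked against the same definition, deviations such as ‘M’ become visible regardless of how each site stores its data.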
...