...
comes with the CARE-SM model. If you want to use your own custom model, you have to adjust the YARRRML model (via Matey).
Components/steps:
RML: RDF Mapping Language. Reusable templates that support not only CSV to RDF transformations, but also transformations from other formats. RML templates specify individual triple patterns that should be created during a transformation. E.g.: the subject Uniform Resource Identifier (URI), predicate URI, and object URI are represented as strings that may contain variables, where the variables are references to locations within the source document (e.g., the appropriate column header within a CSV file). During a transformation, every variable in an RML template is replaced by the value of that location within a single source record (e.g., a single row of a CSV file), and the source is then iterated over all records to complete the transformation. RML templates themselves are represented in RDF and are therefore not always easily human-readable. With the aim of simplifying the RML syntax, such that our EJP RD FAIRification stewards, or potentially the registry data custodians themselves, could edit the template if required, we identified a second, related technology: YARRRML.
YARRRML: linked data generation rules, written in a human-readable way. YARRRML documents can be converted into RML templates, which are then applied to a CSV to automate the transformation. You specify rules to transform data into linked data (e.g. triples). Rules + CSV → automated mapping of the CSV to RDF according to the rules (specified by RML).
So you need a template-compliant CSV (generated by the data custodian, together with the FAIR data steward in this paper).
transforming a non-RDF data format into RDF with RML
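The templating idea in the RML and YARRRML notes above can be illustrated with a minimal Python sketch. It is not RML, not the CARE-SM template, and not how RMLMapper or SDM-RDFizer are implemented. It only mimics the principle: a triple template whose `$(column)` references (the reference style YARRRML uses) are checked against the CSV header and then filled in once per row. The example.org URIs, the column names and the function names are all assumptions made for illustration, and values are inserted without any IRI encoding.

```python
import csv
import io
import re

# Hypothetical triple template: subject, predicate and object are strings,
# and $(name) refers to a column header in the source CSV.
TEMPLATE = (
    "http://example.org/patient/$(patient_id)",
    "http://example.org/vocab/hasDiagnosis",
    "http://example.org/diagnosis/$(diagnosis_code)",
)

REFERENCE = re.compile(r"\$\(([^)]+)\)")


def referenced_columns(template):
    """Collect every column name the template refers to."""
    return {name for part in template for name in REFERENCE.findall(part)}


def expand(pattern, record):
    """Replace each $(column) reference with the value from one record."""
    return REFERENCE.sub(lambda m: record[m.group(1)], pattern)


def transform(csv_text, template=TEMPLATE):
    """Check that the CSV is template-compliant, then emit one triple per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = referenced_columns(template) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(
            f"CSV is not template-compliant, missing columns: {sorted(missing)}"
        )
    triples = []
    for record in reader:
        s, p, o = (expand(part, record) for part in template)
        triples.append(f"<{s}> <{p}> <{o}> .")  # N-Triples style line
    return triples
```

Calling `transform("patient_id,diagnosis_code\n001,D123\n")` returns a single N-Triples style line for that row. A CSV without the `diagnosis_code` column raises a `ValueError`, which is why the CSV has to follow the template.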
...
: 2 optional tools: SDM-RDFizer (alternative in this paper: RMLMapper) - next Metroline step
Fed into GraphDB (triple store). Metadata is automatically updated when the data is updated.
Exposed to FDP (default, can use templates for controlled input FDP) - covered in next Metroline step
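As a rough illustration of the "fed into GraphDB" step, the sketch below builds (but does not send) an HTTP request that adds N-Triples to a repository through the RDF4J-style REST endpoint that GraphDB exposes. FAIR in a Box handles this loading itself, so this is not its code. The base URL, the repository name `my-repo` and the function name are assumptions; adjust them to your own installation.

```python
from urllib.request import Request


def build_upload_request(ntriples, base_url="http://localhost:7200",
                         repository="my-repo"):
    """Build a POST request that adds the given N-Triples to a repository.

    base_url and repository are placeholders (assumptions), not values
    taken from FAIR in a Box.
    """
    url = f"{base_url}/repositories/{repository}/statements"
    return Request(
        url,
        data=ntriples.encode("utf-8"),
        headers={"Content-Type": "application/n-triples"},
        method="POST",
    )
```

To actually send it, pass the request to `urllib.request.urlopen`; a secured repository will also need credentials, which this sketch leaves out.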
From the FAIR in a Box GitHub:
The EJP-RD CARE-SM Transformation process has three steps:
1. A simple "preCARE" CSV file is created by the data owner (you must do this!)
2. The preCARE.csv is transformed into the final CARE.csv (this is automated) by the caresm toolkit (part of the docker-compose)
3. The final CARE.csv is processed by the YARRRML transformer, and RDF is output into the ./data/triples folder
Sources: https://jbiomedsem.biomedcentral.com/articles/10.1186/s13326-022-00264-6
https://direct.mit.edu/dint/article/5/1/184/113181/The-FAIR-Data-Point-Interfaces-and-Tooling
B. CastorEDC
Build eCRF for data collection (previous step?)
Map fields from eCRF to semantic model (step 7 of de novo) in the 'data transformation application'
eCRF values linked to the ontology concepts used as a machine-readable representation of the value in the rendered RDF
Entered data automatically converted into RDF in real time (next step) = de novo FAIRification
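The idea of linking eCRF values to ontology concepts can be sketched as a lookup from a coded field value to a concept IRI, which is then used as the object of the rendered triple. This is not Castor's API or its data transformation application. The field name, option codes, IRIs and function name below are hypothetical and only illustrate the principle.

```python
# Hypothetical mapping: eCRF field -> coded option -> ontology concept IRI.
FIELD_CONCEPTS = {
    "sex": {
        "1": "http://example.org/onto/Female",
        "2": "http://example.org/onto/Male",
    },
}

# Hypothetical predicate used for each eCRF field.
PREDICATES = {
    "sex": "http://example.org/vocab/hasSex",
}


def render_entry(record_id, field, value):
    """Render one entered eCRF value as an N-Triples style line."""
    concept = FIELD_CONCEPTS[field][value]  # machine-readable representation
    subject = f"http://example.org/record/{record_id}"
    return f"<{subject}> <{PREDICATES[field]}> <{concept}> ."
```

For example, `render_entry("R-001", "sex", "1")` produces a triple whose object is the concept IRI mapped to option `1`, not the raw code.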
C. Ontotext Refine: map structured data to a locally stored RDF schema in GraphDB. Choose the right predicates and types, define datatypes and implement transformations. Integrated in the GraphDB Workbench.
...
Describes the expertise that may be necessary for this step. Should be based on the expertise described in the Metroline: Build the team step.
FAIR expert/data steward: help with the tools.
Practical examples from the community
...