About the project
The project is developing a nationwide, standardised, data-protection-compliant infrastructure for storing and providing COVID-19 research datasets. Planned components include a comprehensive database, data collection tools, use and access procedures, and a trust centre.
The infrastructure will be able to map complex COVID-19 research datasets, including clinical data, image data and data on biosamples, in a multi-centre, patient-related and pseudonymised manner and make them available to researchers.
The care of patients with COVID-19 generates a large amount of research-relevant data, materials and findings within the IT systems and electronic medical record systems of the university hospitals. These must be collected in as standardised a form as possible, recorded promptly, and merged and evaluated centrally. The following objectives are at the forefront:
- generating and disseminating evidence as quickly as possible to support the best possible patient care for COVID-19, taking into account legal and ethical issues,
- preventing future epidemics,
- optimising the handling of epidemics, not only in the individual treatment situation, but also in crisis management and in the design of the healthcare system as a whole, and
- creating the basis for vaccine and drug development.
The DZHK operates a clinical research platform for cardiovascular clinical studies that fulfils the basic requirements of a COVID-19 data platform and is already used by more than 120 clinics throughout Germany. The system can be used immediately and is being temporarily customised for COVID-19 research, which ensures a rapid start to data collection. As a long-term solution for the COVID-19 research data infrastructure, a powerful, comprehensive research data platform will be set up according to the FAIR principles, based on the structures and preliminary work of the Medical Informatics Initiative (MII). Alongside the central data platform, transactional and persistent data storage components will be set up and combined nationwide. The aim is to integrate a high-quality data set from as many German university hospitals as possible and to make it possible to connect citizen apps and clinical apps. Beyond the technical solutions, functional processes for nationwide pseudonymisation, modular consent and dynamic revocation management are also being developed. As a result, different types and sources of data can be brought together in a patient-centred manner, complex research questions can be addressed and healthcare can be supported.
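The interplay of pseudonymisation, modular consent and revocation described above can be illustrated with a minimal sketch. It is not the project's implementation: the keyed-hash pseudonym, the `ConsentRecord` class, the module name `research_use` and the record layout are all assumptions made for illustration, and a real trust centre would manage keys and pseudonym mappings in a separate, protected service.

```python
import hashlib
import hmac
from dataclasses import dataclass, field


def pseudonymise(patient_id: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from a patient identifier with a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


@dataclass
class ConsentRecord:
    """Hypothetical modular consent: one flag per consent module."""
    modules: dict[str, bool] = field(default_factory=dict)

    def grant(self, module: str) -> None:
        self.modules[module] = True

    def revoke(self, module: str) -> None:
        # Revocation takes effect on the next release, hence "dynamic".
        self.modules[module] = False

    def allows(self, module: str) -> bool:
        return self.modules.get(module, False)


def release_for_research(records, consents, secret_key, module="research_use"):
    """Return pseudonymised copies of records whose patient currently consents to `module`."""
    released = []
    for record in records:
        consent = consents.get(record["patient_id"])
        if consent is not None and consent.allows(module):
            # Drop the identifying field and attach the pseudonym instead.
            payload = {k: v for k, v in record.items() if k != "patient_id"}
            payload["pseudonym"] = pseudonymise(record["patient_id"], secret_key)
            released.append(payload)
    return released
```

Because consent is checked at release time, a call to `revoke` removes a patient's data from every later release without touching the stored records.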
The research data platform
- utilises open standards and is built using open specifications and interchangeable components.
- allows cross-site data pooling and integration on the basis of international standards for interoperable data integration solutions with uniformly regulated access and standardised use.
- connects the Data Integration Centres of the MII at all German university medical centres, and also enables other hospitals to participate.
- processes the Germany-wide harmonised COVID-19 data set (GECCO).
- complies with the FAIR principles for research data.
- enables fast data requests and analyses, provided consent has been given.
- allows the development of mobile apps as well as clinical applications that can be made available to all clinics, e.g. the COVID-19 Smart Infection Control System.
- enables the rapid integration of applications to support the care of COVID-19 patients by providing transactional applications in local hospital IT infrastructures.
- supports decentralised, federated data collection and data provision by university hospitals through the use and promotion of the MII's Data Integration Centres, while enabling the implementation and centralised evaluation of distributed scientific analyses.
- uses open information models and programming interfaces, so that new services and applications can be developed quickly and efficiently and integrated into the platform for use by healthcare providers and citizens alike.
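The idea of distributed analyses with centralised evaluation, as named in the list above, can be sketched as follows. This is a simplified illustration under assumed names (`SiteSummary`, `local_summary`, `pooled_mean`), not the platform's actual query mechanism: each site computes only aggregate figures locally, and the central step combines those aggregates without ever seeing patient-level values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteSummary:
    """Aggregate figures a site is assumed to share; no patient-level data."""
    site: str
    n: int
    total: float


def local_summary(site: str, values: list[float]) -> SiteSummary:
    """Runs at the site: reduce local measurements to a count and a sum."""
    return SiteSummary(site=site, n=len(values), total=sum(values))


def pooled_mean(summaries: list[SiteSummary]) -> float:
    """Runs centrally: combine site aggregates into one overall mean."""
    n = sum(s.n for s in summaries)
    if n == 0:
        raise ValueError("no observations reported by any site")
    return sum(s.total for s in summaries) / n


# Usage with made-up values for two hypothetical sites.
summaries = [
    local_summary("site_a", [1.0, 2.0, 3.0]),
    local_summary("site_b", [5.0]),
]
overall = pooled_mean(summaries)
```

Weighting by the per-site count gives the same mean as pooling all values centrally, which is why only the count and the sum need to leave each site.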
The added value of the project can be summarised as follows:
- Structured, high-quality, up-to-date data should be made available to researchers as easily as possible to help answer a wide range of scientific questions, thereby achieving progress and creating benefits for society.
- A comprehensive, standardised database from a wide range of sources, including data from patients and citizens, is to be established as quickly as possible that meets all the requirements of research ethics and the EU GDPR.
- The aim is to create a database that also enables new types of scientific analyses and the development of evidence-based decision support systems.
- The data basis for political decisions is to be sustainably improved by making relevant data from patient care (diagnostics, therapy) available in a timely manner and on a uniform basis nationwide.
- Innovative, high-quality services and applications for citizens (apps for patient-generated data) and for healthcare facilities (foundations for clinical decision support) should make it easier to cope with and manage the pandemic and should improve patient care.
The DZHK's clinical research platform was adapted to the requirements of COVID-19 research in just two months. This allowed the recruitment of patients and infected persons to begin immediately. From October 2020 to November 2021, complex data sets from 3,740 patients from the three NAPKON cohorts were recorded in the platform and the first research projects were carried out.
The DZHK e.V. will hand over the coordination of the clinical research platform to the Network University Medicine (NUM) at the end of 2021. As the data collection has proven successful, the DZHK's infrastructure partners will continue to collect data on NAPKON patients, from then on as direct partners of the NUM. The clinical research platform thus complements the MII research data platform, which focuses on routine data.
Software development in the CODEX project is largely complete and includes, among other things:
- NUM nodes to be installed at the sites for delivering data to the central platform and for mediating federated queries;
- the central research data platform for the central storage and analysis of pandemic data;
- decentralised, transactional components for the standardised connection of value-added services (e.g. the Smart Infection Control App SmICS) at the locations;
- the dashboard component for the visually informative presentation of aggregated pandemic data from the sites.
CODEX uses the GECCO dataset created as part of NAPKON. GECCO is based on the Fast Healthcare Interoperability Resources (FHIR) standard and was developed very quickly and successfully to give the scientific community a common language and working basis.
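To give a sense of what a FHIR-based data element looks like, the sketch below builds a generic FHIR Observation for a body temperature reading and serialises it to JSON. It is only an illustration: it does not reproduce the GECCO profiles, the helper name `build_temperature_observation` and the pseudonym value are hypothetical, and only a handful of standard Observation fields are shown.

```python
import json


def build_temperature_observation(pseudonym: str, celsius: float) -> dict:
    """Build a minimal, generic FHIR Observation (not a GECCO profile)."""
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {
            "coding": [
                {
                    "system": "http://loinc.org",
                    "code": "8310-5",  # LOINC code for body temperature
                    "display": "Body temperature",
                }
            ]
        },
        # The subject is referenced by pseudonym, never by a real identifier.
        "subject": {"reference": f"Patient/{pseudonym}"},
        "valueQuantity": {
            "value": celsius,
            "unit": "Cel",
            "system": "http://unitsofmeasure.org",  # UCUM units
            "code": "Cel",
        },
    }


# Usage with a made-up pseudonym and value.
observation = build_temperature_observation("a1b2c3d4", 38.4)
observation_json = json.dumps(observation, indent=2)
```

Because every site encodes the same measurement with the same codes and units, data from different hospitals can be combined without site-specific mapping, which is the "common language" role described above.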
More than half of the 34 participating sites have already successfully participated in federated analyses based on the NUM node software. More than a third of the sites have successfully sent synthetic data to the central platform as part of the technical test. Further sites will follow suit.
Around half of the sites have already received a positive ethics vote, which allows data to be stored in the central platform if the Medical Informatics Initiative (MII) broad consent has been obtained. This also ensures that data can be provided under other consents that have been approved by the respective ethics committee.
The regulatory framework created in this way, together with the data sets and network structures used, enabled two completed publications and the submission of a further study. With the press taking notice, the care situation in the participating university hospitals could be presented visually and analysed in the dashboard.