Outbreak data analysis platform
The data analysis platform provides a unique combination of linked, curated data from UK sovereign data assets, together with a flexible high-performance compute space. Created for COVID-19 research, the ISARIC4C data analysis platform combines the data safeguards of an NHS trusted research environment with >£100M of exabyte-scale computational capacity on the UK national supercomputer. This creates a unique opportunity to combine clinical, biological, genomics and virology research in a secure, openly accessible framework.
The outbreak analysis platform was developed by ISARIC4C to encourage and facilitate research by collating, linking and curating clinical and research data, enabling deep integrative analyses of multi-omic disease profiling, stratified by viral variant, clinical phenotype and outcome.
This platform now serves as a hub for a coordinated UK national research response to COVID-19. Data are included from:
- ISARIC4C tier 0: (unconsented) prospective clinical data from 207094 cases
- ISARIC4C tiers 1 and 2: serial multiomic assays from research samples of blood, respiratory secretions, urine, and stool from 2309 cases
- COG-UK: (unconsented) summary variant data from the COG-UK viral sequencing study, already included for matched patients
- GenOMICC study complete data: microarray and whole genome sequence data from 12591 cases
- PHOSP complete data: follow-up clinical and biological data generated by the Post-Hospitalisation for COVID-19 follow-up study (1075 cases)
- UK-CIC: deep immunological phenotyping data from across the UK Coronavirus Immunology Consortium, using ISARIC4C samples and local collections.
Research data within the analysis platform is already linked to:
- NHS Scotland primary, secondary care and death records
- NHS Digital health records data
In future, data will be transferred for linkage with:
- ICNARC and SICSAG critical care audit databases
- NIMS National Immunisation Dataset
- Pillar 1 testing
- Pillar 2 testing
The ISARIC Coronavirus Clinical Characterisation Consortium (4C) is the largest observational study of hospitalised patients with COVID-19 anywhere in the world. By generating, integrating and analysing clinical, biological, genetic and virological data on patients with COVID-19 in UK hospitals, ISARIC4C has:
- provided essential weekly updates to SAGE that guide the public health response (isaric4c.net/reports/),
- quantified the role of age, comorbid illness and obesity in disease severity [1],
- identified the substantial effect of nosocomial transmission of COVID-19 within hospitals (SPI-M/SAGE report),
- created the global standard ISARIC4C score for prognostication (isaric4c.net/risk) [2],
- elucidated cytokine patterns underlying disease mechanisms [3],
- identified host genetic mechanisms of disease [4],
- provided key evidence underlying the choice of therapeutic agents for clinical trials [3,4],
- provided data supporting identification of high-risk groups for vaccination (highlighted in No10 briefing),
- provided real-world data on vaccine effectiveness and failure (SAGE 87 Egan et al, Egan et al.).
Because of these and other achievements (see isaric4c.net/outputs), ISARIC4C was used as an exemplar by the National Institute for Health Research for pandemic preparedness research resulting in real patient benefit.
Analysis platform structure
There are two routes of access to the analysis platform (Figure 1):
1. NHS Trusted Research Environment (Safe Haven) for access to personal clinical data and data collected without explicit consent.
2. Rapid-access flexible compute for access to non-disclosive research data collected with explicit consent.
Within both of these environments there is an additional division in the data:
1. Publishable “open access” data, which any user can use and report as they wish, according to data protection and privacy rules.
2. Embargoed active research data, shared by academic investigators and available for linked analysis but not for publication without agreement from all contributors.
This design is intended to build trust in order to encourage immediate contributions of research data from academic collaborators.
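The two access routes and the two publication tiers can be read as a small set of rules. The Python sketch below is illustrative only and is not the platform's actual implementation: the class names, field names, example datasets and contributor names are all hypothetical, and the rules are a simplified reading of the description above.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    SAFE_HAVEN = "NHS Trusted Research Environment (Safe Haven)"
    FLEXIBLE_COMPUTE = "Rapid-access flexible compute"


class Status(Enum):
    OPEN_ACCESS = "publishable open access"
    EMBARGOED = "embargoed active research"


@dataclass(frozen=True)
class Dataset:
    # All fields are hypothetical labels chosen for this sketch.
    name: str
    explicit_consent: bool  # collected with explicit consent?
    disclosive: bool        # contains personal clinical data?
    status: Status
    contributors: frozenset = frozenset()


def access_route(ds: Dataset) -> Route:
    """Personal data, or data collected without explicit consent,
    is reached through the Safe Haven; everything else through
    the flexible compute route."""
    if ds.disclosive or not ds.explicit_consent:
        return Route.SAFE_HAVEN
    return Route.FLEXIBLE_COMPUTE


def may_publish(ds: Dataset, agreed: set) -> bool:
    """Open access data may be reported freely (subject to data
    protection rules, not modelled here); embargoed data needs
    agreement from every contributor."""
    if ds.status is Status.OPEN_ACCESS:
        return True
    return ds.contributors <= agreed


# Hypothetical example datasets, not real platform entries.
clinical = Dataset("example_clinical", explicit_consent=False,
                   disclosive=True, status=Status.OPEN_ACCESS)
assay = Dataset("example_assay", explicit_consent=True,
                disclosive=False, status=Status.EMBARGOED,
                contributors=frozenset({"lab_a", "lab_b"}))

clinical_route = access_route(clinical)
assay_route = access_route(assay)
assay_partial = may_publish(assay, {"lab_a"})
assay_full = may_publish(assay, {"lab_a", "lab_b"})
```

In this sketch the route depends only on consent and disclosure risk, while the publication check depends only on the embargo status, which mirrors the description of the second division applying within both environments.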
Rapid addition of viral sequence data from the COG-UK platform will enable real-time detection of the clinical impact of new viral strains, in-depth biological study of reinfection, and host:pathogen interactions at a genetic and mechanistic level.