Using supercomputing for multi-omic analysis is becoming a powerful approach to understanding disease biology and advancing disease prediction, and in a new paper it is being harnessed to train genetic prediction models.
In a study published in the journal Nature, scientists from the Cambridge Baker Systems Genomics Initiative at the University of Cambridge and the Baker Heart and Diabetes Institute led a large team of global collaborators to develop, validate and apply multi-omic genetic scores, built using machine learning, for more than 17,000 molecular traits. They also developed an online portal (OmicsPred.org) for these genetic scores to accelerate research in this fast-growing area.
The research drew on the INTERVAL study, a large cohort of 50,000 healthy UK blood donors with extensive multi-omic profiling. This enabled genetic prediction of 13,668 RNA transcripts, 2,692 proteins and 867 metabolites.
Multi-omics (such as transcriptomics, proteomics and metabolomics) can provide a comprehensive view of biological systems. Increasing evidence shows that genetic prediction of complex molecular traits, including gene expression, proteins and metabolites, can be an accurate and efficient tool in research and clinical settings for better understanding cardiovascular diseases, diabetes, cancers and other diseases. It can also help with the discovery of novel drug targets and biomarkers.
Computational biology expert Professor Michael Inouye, who led the study, says the research is helping to overcome challenges around time, cost and underrepresented demographics.
Professor Inouye, who is Munz Chair of Cardiovascular Prediction and Prevention at the Baker Institute, says collecting multi-omics data is an extremely expensive and time-consuming process.
"Because of these barriers, large-scale population cohorts typically generate multi-omic data for only a subset of participants, which reduces the statistical power of subsequent analyses and creates inequities for studies that do not have ample resources or are from underrepresented ancestries and other demographics", Professor Inouye says.
He says the study assessed and validated the relative predictive value and robustness of the genetic scores in seven external studies comprising participants of European, East Asian, South Asian and African American ancestries.
Professor Inouye says they also demonstrated the longitudinal stability and utility of these genetic scores and highlighted a series of biological insights regarding genetic mechanisms in metabolism and pathway associations with disease.
"We anticipate the OmicsPred resource will be widely and routinely utilised to investigate multi-omic traits and phenotype associations."