Important new updates to the largest open database for polygenic scores, the Polygenic Score (PGS) Catalog, could help to generate more equitable disease risk predictions for a diverse range of ethnic backgrounds.
These updates, which include the addition of data from multi-ancestry and non-European populations and a new software tool for calculating PGS, are described in a Nature Genetics paper.
This work was undertaken as part of the Cambridge Baker Systems Genomics Initiative, a research partnership between the Baker Heart and Diabetes Institute and Cambridge University. The partnership aims to significantly expand the Baker Institute's access to big data and corresponding expertise in order to develop targeted approaches to disease prediction and personalised medicine. Other collaborators involved in this latest work include EMBL's European Bioinformatics Institute (EMBL-EBI), the GWAS Catalog and colleagues.
Head of the Cambridge Baker Systems Genomics Initiative and Munz Chair of Cardiovascular Prediction and Prevention, Professor Mike Inouye, says the PGS Catalog is the largest open database for polygenic scores with ~27,000 users from over 140 countries in the past year alone. These scores estimate an individual's genetic predisposition to a specific trait or disease by summarising the effect of many different genetic variants across the genome.
"Polygenic scores are particularly useful for predicting complex health conditions such as heart disease, diabetes, and certain cancers, where multiple genetic variants contribute to the overall risk," says Professor Inouye. "Integrating these scores into clinical practice could help scientists and clinicians understand the genetic influences on health, potentially leading to better prevention strategies and tailored treatments."
The PGS Catalog was created to standardise the way these scores are reported and to make them more reliable for clinical applications.
Increasing ancestral diversity
The field of genetics increasingly recognises the importance of diversifying genomic datasets. Since its inception in 2021, the PGS Catalog has grown to host over 4,735 polygenic risk scores, a 721% increase. Much of this growth has also expanded the ancestral diversity of the Catalog's data. Due to a lack of genetic data from populations of non-European ancestry, early releases of the PGS Catalog consisted mostly of scores built on data from individuals of European ancestry. Now, more PGS have been added from studies that use African, Asian, and often multi-ancestry data to develop and evaluate the scores.
The PGS Catalog Calculator
The PGS Catalog Calculator is a new addition to the PGS Catalog. This open-source software tool automates the calculation of PGS, allowing users to apply polygenic scores to new genomic data and simplifying tasks such as genotype data formatting and variant matching.
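As a rough illustration of what such a calculation involves, a polygenic score is commonly computed as a weighted sum of effect-allele counts over the variants that can be matched between a scoring file and a person's genotype data. The Python sketch below is a minimal, hypothetical example and is not the Calculator's own code: the variant identifiers, weights, genotypes, and the function name `polygenic_score` are all invented for illustration.

```python
from typing import Dict, Tuple

# Hypothetical scoring file: variant ID -> (effect allele, effect weight).
# These identifiers and weights are made up for illustration only.
SCORING_FILE: Dict[str, Tuple[str, float]] = {
    "1:1000:A:G": ("G", 0.12),
    "2:2000:C:T": ("T", -0.05),
    "3:3000:G:A": ("A", 0.30),
}

# Hypothetical genotypes for one individual: variant ID -> the two alleles carried.
# The third variant is deliberately absent to show an unmatched variant.
GENOTYPES: Dict[str, Tuple[str, str]] = {
    "1:1000:A:G": ("A", "G"),
    "2:2000:C:T": ("T", "T"),
}


def polygenic_score(
    scoring: Dict[str, Tuple[str, float]],
    genotypes: Dict[str, Tuple[str, str]],
) -> Tuple[float, int]:
    """Return (weighted sum of effect-allele counts, number of matched variants)."""
    total = 0.0
    matched = 0
    for variant_id, (effect_allele, weight) in scoring.items():
        alleles = genotypes.get(variant_id)
        if alleles is None:
            # Variant in the scoring file but not in the genotype data: skip it.
            continue
        # Dosage is the number of copies of the effect allele (0, 1 or 2).
        dosage = sum(1 for allele in alleles if allele == effect_allele)
        total += weight * dosage
        matched += 1
    return total, matched


score, n_matched = polygenic_score(SCORING_FILE, GENOTYPES)
print(f"score={score:.2f}, matched {n_matched} of {len(SCORING_FILE)} variants")
```

Reporting how many variants were matched alongside the score matters in practice, because a score computed from only a fraction of its variants may not be comparable to the published one.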
The Calculator also implements methods for genetic similarity analysis and ancestry adjustment, an important step towards ensuring that calculated polygenic scores are more interpretable across populations. This could help to streamline the use of polygenic scores in research and clinical studies.
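One simple way to think about ancestry adjustment is to express a raw score relative to the distribution of scores in a reference group that is genetically similar to the individual. The Python sketch below standardises a score against such a group. It is an illustrative assumption rather than a description of the methods the Calculator implements, and the group labels, reference scores, and the function name `adjusted_score` are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, List

# Hypothetical raw scores from two reference groups (made-up numbers).
REFERENCE_SCORES: Dict[str, List[float]] = {
    "group_A": [0.10, 0.20, 0.30, 0.40, 0.50],
    "group_B": [0.60, 0.70, 0.80, 0.90, 1.00],
}


def adjusted_score(
    raw_score: float,
    group: str,
    reference: Dict[str, List[float]] = REFERENCE_SCORES,
) -> float:
    """Standardise a raw score against the chosen reference group's distribution."""
    scores = reference[group]
    # Z-score: distance from the group mean, in units of the group's
    # sample standard deviation.
    return (raw_score - mean(scores)) / stdev(scores)


# The same raw score reads very differently depending on the reference group.
z_a = adjusted_score(0.50, "group_A")
z_b = adjusted_score(0.50, "group_B")
print(f"relative to group_A: {z_a:+.2f}, relative to group_B: {z_b:+.2f}")
```

In this toy example the same raw value sits above average in one group and below average in the other, which is why a raw score is hard to interpret without reference to a comparable population.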
Take a look at the PGS Catalog and get access to the PGS Catalog Calculator software on GitHub. The Calculator documentation includes explanations about polygenic scores, ancestry adjustment, and guides on how to install and use the software.