In an article recently published in Nature Communications, a team of researchers from the University of Virginia (including Phil Bourne, dean of the School of Data Science; Cam Mura, a senior scientist with the School; and Eli Draizen, a recent UVA alumnus) offers an AI-driven approach to exploring structural similarities and relationships across the protein universe.
Their study challenges conventional notions about protein structure relationships (that is, patterns of similarities and differences) and, in so doing, identifies many faint relationships that are missed by traditional methods.
Specifically, the authors report a computational framework that can detect and quantify such protein relationships at scale (across myriad proteins) in a novel, flexible, and nuanced manner. The framework combines deep learning-based approaches with a new conceptual model, known as the Urfold, that allows two proteins to exhibit architectural similarity despite having differing topologies, or "folds."
Bourne, Mura, and Draizen collaborated on the project with Stella Veretnik. All of the authors are members of the Bourne & Mura Computational Biosciences Lab, which is part of the School of Data Science and UVA's Department of Biomedical Engineering.
The publication is the culmination of years of work by the Bourne Lab to develop this AI-driven framework, called DeepUrfold, which enables the Urfold theory of structure relationships to be explored systematically and at scale.
Using DeepUrfold, the Bourne Lab team detected faint structural relationships, across the protein universe, between proteins that had otherwise been deemed unrelated, evolutionarily or otherwise.
In capturing and describing these distant relationships, DeepUrfold views protein relationships in terms of "communities" and avoids the conventional approach of classifying proteins into separate, non-overlapping bins. Taken together, these new methodological approaches could push researchers to move beyond thinking of protein similarities in static, geometric terms and toward a more integrated approach.
Bourne, founding dean of the School of Data Science, is world-renowned in the scientific community for his research in structural bioinformatics and, more broadly, computational biology. Earlier in his career, he co-led the development of the RCSB Protein Data Bank, a veritable treasure trove of protein structure information that helped revolutionize the field and paved the way to contemporary AI advances like AlphaFold.
Mura, who holds appointments with the School of Data Science and Department of Biomedical Engineering at UVA, has an extensive background in structural and computational biology, including biochemical and crystallographic studies of RNA-based systems and molecular biophysics of DNA. He views biological systems through the lens of molecular evolution and explores the intersection of these areas with data science.
Draizen received a doctorate in biomedical engineering from UVA under the mentorship of Bourne and currently serves as a postdoctoral scholar in computational biology at the University of California, San Francisco.
Veretnik has been a senior research scientist at UVA, focusing on computational biology and the structure, function, and evolution of protein folds.