Researchers at the IBB-UAB have developed the most comprehensive database available to date to help understand the basis of protein aggregation, a phenomenon associated with ageing and several pathologies. The new resource, A3D-MODB, brings together the proteomes of twelve of the most studied model organisms, which span distant biological clades, and contains over half a million predictions of protein regions with a propensity to form aggregates.
The A3D-MODB was developed by the Protein Folding and Computational Diseases Group at the Institut de Biotecnologia i de Biomedicina of the Universitat Autònoma de Barcelona (IBB-UAB), directed by Biochemistry and Molecular Biology Professor Salvador Ventura, in collaboration with scientists from the University of Warsaw. The work was recently published in the journal Nucleic Acids Research. The database provides pre-calculated aggregation propensity analyses and tools for studying this phenomenon on a proteomic scale, as well as for evolutionary comparison between different species.
The new resource builds on Aggrescan 3D, the method that the same research group designed in 2015, but significantly expands the data available. In total, it contains more than 500,000 structural predictions for more than 160,000 proteins from twelve highly characterised model organisms of great interest and widely used in biology, biotechnology and biomedicine research. It includes the herbaceous plant Arabidopsis thaliana, the nematode worm Caenorhabditis elegans, the zebrafish Danio rerio, the enteric bacterium Escherichia coli, the minimal-genome bacterium Mycoplasma genitalium, the mouse Mus musculus, the budding and fission yeasts Saccharomyces cerevisiae and Schizosaccharomyces pombe, the human Homo sapiens, the rat Rattus norvegicus, the fruit fly Drosophila melanogaster and SARS-CoV-2, the virus that causes COVID-19. The adaptive architecture of A3D-MODB allows for future additions of other organisms relevant to the medical, biological, agricultural and industrial sectors.
In addition, the tool provides results on protein solubility and stability and includes additional information to contextualise the aggregation process. To develop it, researchers used several computational resources, such as the artificial intelligence-based protein structure modelling algorithm AlphaFold and TOPCONS for the prediction of protein interaction with lipid membranes, and linked the database to organism-specific gold-standard reference databases such as the Human Protein Atlas and WormBase.
Researchers from the Protein Folding and Computational Diseases group taking part in the project. From left to right: Javier Garcia-Pardo, Oriol Bàrcenas, Valentín Iglesias, Michał Burdukiewicz, Salvador Ventura and Carlos Pintado-Grima.
Protein aggregation is associated with ageing and underlies several pathologies, such as Parkinson's disease, Alzheimer's disease and amyotrophic lateral sclerosis (ALS). It is also one of the most important barriers in the industrial production of therapeutic molecules, increasing their final price. With the publication of this database, researchers hope to obtain new clues as to why some diseases caused by protein aggregation develop in some species while other organisms are not susceptible to them.
The resource now published by UAB researchers represents the most comprehensive tool available to date for the prediction of aggregation-prone regions. "We anticipate that it will offer solutions to a much wider audience of researchers, not only because of the large collection of structures, but also because of its integration with databases from different biological fields," says Salvador Ventura. "We are confident that it will set a new standard in protein aggregation research and we expect it to become a basic resource in this field," concludes the UAB researcher.
A3D-MODB website: http://biocomp.chem.uw.edu.pl/A3D2/MODB
Article: Badaczewska-Dawid AE, Kuriata A, Pintado-Grima C, Garcia-Pardo J, Burdukiewicz M, Iglesias V, Kmiecik S, Ventura S. A3D Model Organism Database (A3D-MODB): a database for proteome aggregation predictions in model organisms. Nucleic Acids Res 2023, gkad942. DOI: 10.1093/nar/gkad942