Geneticists looking inside the nuclear genome for mutations that contribute to disease have long relied on a principle known as constraint modeling, which allows researchers to assess the degree of selective pressure that leads to the purging of certain gene variants. But while constraint models have been highly effective for identifying disease-causing variants in the nuclear genome, they have not been useful for mutations in the mitochondrial genome, a source of frustration for geneticists and families living with genetic illnesses.
A Yale-led team, however, has developed a breakthrough approach that offers hope. The novel framework - described recently in the journal Nature - provides geneticists with a long-needed tool to determine which mitochondrial DNA (mtDNA) mutations contribute to disease.
The multidisciplinary team was led by Yale geneticists Nicole Lake and Monkol Lek.
"We had few tools to help us identify disease-causing mutations in the mtDNA," said Lake, an assistant professor of genetics at Yale School of Medicine (YSM) and the Yale Center of Genomic Health. "While there are dozens available for the nuclear genome, this tool for the first time is providing a map of which sites in the mitochondrial genome are most important for health and disease."
Mitochondria are cellular structures known as the cells' "power plants" because they are the site of cells' energy production. They contain their own DNA, inherited from one's mother, which is an essential subset of the genome. Mitochondria also determine whether a cell lives or dies via the process of programmed cell death, or apoptosis.
Attempts to build a mitochondrial constraint model have been frustrated by several factors, including the comparatively small size of the mitochondrial genome and the unique features of mtDNA.
To address this, the Yale team developed a constraint model that employs an entirely new methodology. As a first step, they created a mitochondrial mutational model that adapted a version of a "composite likelihood" model to analyze a collection of newly arising genetic mutations, which helped the researchers understand the likelihood of mutations occurring at different locations within the genome. That, in turn, enabled them to create a map that shows which regions were more prone to genetic mutations underlying disease.
The model builds on a previous study by Lake and Lek, who is also an assistant professor of genetics at YSM, for which they generated a mitochondrial data set of 56,434 individuals in collaboration with Lek's former colleagues at the Broad Institute of MIT and Harvard. When the researchers fed that same mitochondrial data into the new model, they were able to quantify constraint - or the absence of change - in mtDNA, confirming the new model's effectiveness in identifying regions with the presence of mutations contributing to mitochondrial disease.
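For readers who want a concrete picture of what "quantifying constraint" can mean, the general idea behind constraint metrics is to compare how many variants are actually seen in a region with how many a mutational model would expect if there were no selection. The Python sketch below illustrates only that general idea. It is not the team's published method: the `Site` record, the `observation_probability` field, and the toy numbers are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Site:
    """One position in a toy genome region (hypothetical structure)."""
    position: int
    # Assumed output of a mutational model: the probability that a variant
    # would be seen at this site in the population if selection were absent.
    observation_probability: float
    # Whether a variant was actually seen at this site in the population data.
    observed: bool


def observed_expected_ratio(sites: Iterable[Site]) -> float:
    """Return observed / expected variant counts for a region.

    A ratio well below 1 means fewer variants were seen than the mutational
    model predicts, which is the signature of constraint.
    """
    sites = list(sites)
    expected = sum(s.observation_probability for s in sites)
    if expected <= 0:
        raise ValueError("expected variant count must be positive")
    observed = sum(1 for s in sites if s.observed)
    return observed / expected


# Toy data: two four-site regions with identical predicted mutability.
tolerant_region = [
    Site(1, 0.5, True), Site(2, 0.5, True),
    Site(3, 0.5, False), Site(4, 0.5, False),
]
constrained_region = [
    Site(5, 0.5, True), Site(6, 0.5, False),
    Site(7, 0.5, False), Site(8, 0.5, False),
]

tolerant_ratio = observed_expected_ratio(tolerant_region)
constrained_ratio = observed_expected_ratio(constrained_region)
print(f"tolerant region:    {tolerant_ratio:.2f}")
print(f"constrained region: {constrained_ratio:.2f}")
```

In this toy example the second region shows half as many variants as predicted, so it would be flagged as more constrained than the first. A real analysis would also need to account for sampling uncertainty and for the unique features of mtDNA that the article mentions.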
To further validate the model, the team augmented their data with that of another large genomic project, the UK Biobank, which contains genetic and health information from half a million participants. When they applied their model to variants from this expanded pool of individuals with mitochondrial disease, their constraint model held up. This, they said, demonstrated the model could be used by other researchers to advance the discovery of mtDNA variation underlying disease.
"It's about building a model that's robust; that can be verified," said Lek. "That's reliable and reliably predictive."
Many variants found in the mtDNA are referred to as "variants of uncertain significance," which is frustrating to geneticists and the families of people with these diseases, Lake said. "I'm hopeful the tool laid out in this paper provides the information we need, the clues, for reducing these uncertainties in analyzing mtDNA," she said.
While the breakthrough is hardly a matter of chance, Lake credits a series of serendipitous conversations with helping the team get to their "eureka" moment. In 2020, about a week before the COVID-19 pandemic triggered shutdowns across the world, Lek mentioned the challenges that the researchers were having to Shamil Sunyaev, a computational genomicist and geneticist at Harvard. Sunyaev, who had been invited to speak at Yale's Department of Genetics, noted a similar problem he and his colleagues at the Broad Institute had encountered researching cancer genetics - a challenge they addressed using a composite likelihood model.
"All credit to Nicole," said Lek. "I said 'what do you think?' and she ran with it."
With the world shut down, Lake and Lek began meeting virtually with Sunyaev, who listened to their ideas and raised questions. As the project progressed, they brought in Dan Arking, a professor of genetic medicine at Johns Hopkins University, and other experts to help analyze their mtDNA data set - now part of gnomAD, a large, open-source international genomic database - and the data from the UK Biobank.
"We built a mutational model that was really predictive and gave us the results we had hoped to see," said Lake. "That's when we really knew we had a way forward."
Added Lek: "We made sure to incorporate as many perspectives as possible. The best work is done in collaboration."
Lake sees their breakthrough model as a first-generation tool and is thrilled that it will be freely accessible to scientists across the globe.
"This is an important advance but there is exciting potential to expand," she said. "This has been a somewhat overlooked part of our genome and it's a part of our genome that really matters. There is a lot more to learn there."
The international team of researchers also included Kaiyue Ma, Kenneth K. Ng, Justin Cohen, and Hongyu Zhao, all from Yale School of Medicine.