Toward Video Generative Models of the Molecular World

Massachusetts Institute of Technology

As the capabilities of generative AI models have grown, you've probably seen how they can transform simple text prompts into hyperrealistic images and even extended video clips.

More recently, generative AI has shown potential in helping chemists and biologists explore static molecules, like proteins and DNA. Models like AlphaFold can predict molecular structures to accelerate drug discovery, and the MIT-assisted "RFdiffusion," for example, can help design new proteins. One challenge, though, is that molecules are constantly moving and jiggling, which is important to model when constructing new proteins and drugs. Simulating these motions on a computer using physics - a technique known as molecular dynamics - can be very expensive, requiring billions of time steps on supercomputers.

As a step toward simulating these behaviors more efficiently, MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) and Department of Mathematics researchers have developed a generative model that learns from prior data. The team's system, called MDGen, can take a frame of a 3D molecule and simulate what will happen next like a video, connect separate stills, and even fill in missing frames. By hitting the "play button" on molecules, the tool could potentially help chemists design new molecules and closely study how well their drug prototypes for cancer and other diseases would interact with the molecular structures they are intended to affect.

Co-lead author Bowen Jing SM '22 says that MDGen is an early proof of concept, but it suggests the beginning of an exciting new research direction. "Early on, generative AI models produced somewhat simple videos, like a person blinking or a dog wagging its tail," says Jing, a PhD student at CSAIL. "Fast forward a few years, and now we have amazing models like Sora or Veo that can be useful in all sorts of interesting ways. We hope to instill a similar vision for the molecular world, where dynamics trajectories are the videos. For example, you can give the model the first and 10th frame, and it'll animate what's in between, or it can remove noise from a molecular video and guess what was hidden."

The researchers say that MDGen represents a paradigm shift from previous comparable works with generative AI in a way that enables much broader use cases. Previous approaches were "autoregressive," meaning they relied on the previous still frame to build the next, starting from the very first frame to create a video sequence. In contrast, MDGen generates the frames in parallel with diffusion. This means MDGen can be used to, for example, connect frames at the endpoints, or "upsample" a low frame-rate trajectory in addition to pressing play on the initial frame.
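One way to picture the difference is as a choice of which frames are handed to the model and which it must fill in. The sketch below is illustrative only and is not MDGen's code: the function name `conditioning_mask`, the task labels, and the `stride` parameter are all assumptions made for this example. It marks the known frames for each of the three uses described above.

```python
from typing import List


def conditioning_mask(num_frames: int, task: str, stride: int = 2) -> List[bool]:
    """Return True for frames that are given and False for frames to generate.

    Hypothetical helper for illustration; the task names are assumptions.
    """
    if num_frames < 2:
        raise ValueError("need at least two frames")
    if task == "forward":
        # "Press play": only the first frame is known.
        known = {0}
    elif task == "interpolate":
        # Connect the endpoints: first and last frames are known.
        known = {0, num_frames - 1}
    elif task == "upsample":
        # Low frame-rate input: every `stride`-th frame is known.
        known = set(range(0, num_frames, stride))
    else:
        raise ValueError(f"unknown task: {task}")
    return [i in known for i in range(num_frames)]


if __name__ == "__main__":
    for task in ("forward", "interpolate", "upsample"):
        mask = conditioning_mask(10, task, stride=3)
        # K marks a known frame, a dot marks a frame the model would generate.
        print(task.ljust(12), "".join("K" if known else "." for known in mask))
```

A model that generates all frames jointly can, in principle, accept any of these masks, whereas a strictly frame-by-frame approach naturally fits only the first.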

The work was presented in a paper at the Conference on Neural Information Processing Systems (NeurIPS) this past December. Last summer, it received an award for its potential commercial impact at the International Conference on Machine Learning's ML4LMS Workshop.

Some small steps forward for molecular dynamics

In experiments, Jing and his colleagues found that MDGen's simulations were similar to running the physical simulations directly, while producing trajectories 10 to 100 times faster.

The team first tested their model's ability to take in a 3D frame of a molecule and generate the next 100 nanoseconds. Their system pieced together successive 10-nanosecond blocks for these generations to reach that duration. The team found that MDGen was able to compete with the accuracy of a baseline model, while completing the video generation process in roughly a minute - a mere fraction of the three hours that it took the baseline model to simulate the same dynamics.
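The block-by-block idea can be sketched in a few lines. This is a toy illustration under stated assumptions, not the researchers' implementation: `toy_generate_block` is a hypothetical random-walk placeholder standing in for a learned model, a frame is just a short list of numbers, and the spacing of one frame per nanosecond is an assumption chosen to keep the arithmetic simple.

```python
import random
from typing import Callable, List

Frame = List[float]  # toy stand-in for a molecular conformation


def toy_generate_block(first: Frame, frames_per_block: int,
                       rng: random.Random) -> List[Frame]:
    """Hypothetical placeholder for a learned model: a random walk from `first`."""
    block = [first]
    for _ in range(frames_per_block - 1):
        block.append([x + rng.gauss(0.0, 0.1) for x in block[-1]])
    return block


def rollout(first: Frame, num_blocks: int, frames_per_block: int,
            generate_block: Callable[[Frame, int, random.Random], List[Frame]],
            rng: random.Random) -> List[Frame]:
    """Chain blocks so each one starts from the last frame of the previous block."""
    trajectory = [first]
    for _ in range(num_blocks):
        block = generate_block(trajectory[-1], frames_per_block, rng)
        # The block's first frame repeats the conditioning frame, so skip it.
        trajectory.extend(block[1:])
    return trajectory


if __name__ == "__main__":
    # Assumed spacing: 11 frames span 10 ns, so 10 blocks cover 100 ns.
    start = [0.0, 0.0, 0.0]
    trajectory = rollout(start, num_blocks=10, frames_per_block=11,
                         generate_block=toy_generate_block,
                         rng=random.Random(0))
    print(len(trajectory), "frames")
```

Under these assumptions, ten blocks yield 101 frames: the starting frame plus ten new frames per block.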

When given the first and last frame of a one-nanosecond sequence, MDGen also modeled the steps in between. The researchers' system demonstrated a degree of realism in over 100,000 different predictions: It simulated more likely molecular trajectories than its baselines on clips shorter than 100 nanoseconds. In these tests, MDGen also indicated an ability to generalize on peptides it hadn't seen before.

MDGen's capabilities also include simulating frames within frames, "upsampling" the steps between each nanosecond to capture faster molecular phenomena more adequately. It can even "inpaint" structures of molecules, restoring information about them that was removed. These features could eventually be used by researchers to design proteins based on a specification of how different parts of the molecule should move.

Toying around with protein dynamics

Jing and co-lead author Hannes Stärk say that MDGen is an early sign of progress toward generating molecular dynamics more efficiently. Still, they lack the data to make these models immediately impactful in designing drugs or molecules that induce the movements chemists will want to see in a target structure.

The researchers aim to scale MDGen from modeling molecules to predicting how proteins will change over time. "Currently, we're using toy systems," says Stärk, also a PhD student at CSAIL. "To enhance MDGen's predictive capabilities to model proteins, we'll need to build on the current architecture and data available. We don't have a YouTube-scale repository for those types of simulations yet, so we're hoping to develop a separate machine-learning method that can speed up the data collection process for our model."

For now, MDGen presents an encouraging path forward in modeling molecular changes invisible to the naked eye. Chemists could also use these simulations to delve deeper into the behavior of medicine prototypes for diseases like cancer or tuberculosis.

"Machine learning methods that learn from physical simulation represent a burgeoning new frontier in AI for science," says Bonnie Berger, MIT Simons Professor of Mathematics, CSAIL principal investigator, and senior author on the paper. "MDGen is a versatile, multipurpose modeling framework that connects these two domains, and we're very excited to share our early models in this direction."

"Sampling realistic transition paths between molecular states is a major challenge," says fellow senior author Tommi Jaakkola, who is the MIT Thomas Siebel Professor of electrical engineering and computer science and the Institute for Data, Systems, and Society, and a CSAIL principal investigator. "This early work shows how we might begin to address such challenges by shifting generative modeling to full simulation runs."

Researchers across the field of bioinformatics have heralded this system for its ability to simulate molecular transformations. "MDGen models molecular dynamics simulations as a joint distribution of structural embeddings, capturing molecular movements between discrete time steps," says Chalmers University of Technology associate professor Simon Olsson, who wasn't involved in the research. "Leveraging a masked learning objective, MDGen enables innovative use cases such as transition path sampling, drawing analogies to inpainting trajectories connecting metastable phases."

The researchers' work on MDGen was supported, in part, by the National Institute of General Medical Sciences, the U.S. Department of Energy, the National Science Foundation, the Machine Learning for Pharmaceutical Discovery and Synthesis Consortium, the Abdul Latif Jameel Clinic for Machine Learning in Health, the Defense Threat Reduction Agency, and the Defense Advanced Research Projects Agency.
