Car design is an iterative and proprietary process. Carmakers can spend several years on the design phase for a car, tweaking 3D forms in simulations before building out the most promising designs for physical testing. The details and specs of these tests, including the aerodynamics of a given car design, are typically not made public. Significant advances in performance, such as in fuel efficiency or electric vehicle range, can therefore be slow and siloed from company to company.
MIT engineers say that the search for better car designs can speed up exponentially with the use of generative artificial intelligence tools that can plow through huge amounts of data in seconds and find connections to generate a novel design. While such AI tools exist, the data they would need to learn from have not been available, at least in any sort of accessible, centralized form.
But now, the engineers have made just such a dataset available to the public for the first time. Dubbed DrivAerNet++, the dataset encompasses more than 8,000 car designs, which the engineers generated based on the most common types of cars in the world today. Each design is represented in 3D form and includes information on the car's aerodynamics - the way air would flow around a given design, based on simulations of fluid dynamics that the group carried out for each design.
Each of the dataset's 8,000 designs is available in several representations, such as mesh, point cloud, or a simple list of the design's parameters and dimensions. As such, the dataset can be used by different AI models that are tuned to process data in a particular modality.
DrivAerNet++ is the largest open-source dataset for car aerodynamics that has been developed to date. The engineers envision it being used as an extensive library of realistic car designs, with detailed aerodynamics data that can be used to quickly train any AI model. These models can then just as quickly generate novel designs that could potentially lead to more fuel-efficient cars and electric vehicles with longer range, in a fraction of the time that it takes the automotive industry today.
"This dataset lays the foundation for the next generation of AI applications in engineering, promoting efficient design processes, cutting R&D costs, and driving advancements toward a more sustainable automotive future," says Mohamed Elrefaie, a mechanical engineering graduate student at MIT.
Elrefaie and his colleagues will present a paper detailing the new dataset, and AI methods that could be applied to it, at the NeurIPS conference in December. His co-authors are Faez Ahmed, assistant professor of mechanical engineering at MIT, along with Angela Dai, associate professor of computer science at the Technical University of Munich, and Florin Marar of BETA CAE Systems.
Filling the data gap
Ahmed leads the Design Computation and Digital Engineering Lab (DeCoDE) at MIT, where his group explores ways in which AI and machine-learning tools can be used to enhance the design of complex engineering systems and products, including car technology.
"Often when designing a car, the forward process is so expensive that manufacturers can only tweak a car a little bit from one version to the next," Ahmed says. "But if you have larger datasets where you know the performance of each design, now you can train machine-learning models to iterate fast so you are more likely to get a better design."
And speed, particularly for advancing car technology, is particularly pressing now.
"This is the best time for accelerating car innovations, as automobiles are one of the largest polluters in the world, and the faster we can shave off that contribution, the more we can help the climate," Elrefaie says.
In looking at the process of new car design, the researchers found that, while there are AI models that could crank through many car designs to generate optimal designs, the car data that is actually available is limited. Some researchers had previously assembled small datasets of simulated car designs, while car manufacturers rarely release the specs of the actual designs they explore, test, and ultimately manufacture.
The team sought to fill the data gap, particularly with respect to a car's aerodynamics, which plays a key role in setting the range of an electric vehicle, and the fuel efficiency of an internal combustion engine. The challenge, they realized, was in assembling a dataset of thousands of car designs, each of which is physically accurate in their function and form, without the benefit of physically testing and measuring their performance.
To build a dataset of car designs with physically accurate representations of their aerodynamics, the researchers started with several baseline 3D models that were provided by Audi and BMW in 2014. These models represent three major categories of passenger cars: fastback (sedans with a sloped back end), notchback (sedans or coupes with a slight dip in their rear profile) and estateback (such as station wagons with more blunt, flat backs). The baseline models are thought to bridge the gap between simple designs and more complicated proprietary designs, and have been used by other groups as a starting point for exploring new car designs.
Library of cars
In their new study, the team applied a morphing operation to each of the baseline car models. This operation systematically made a slight change to each of 26 parameters in a given car design, such as its length, underbody features, windshield slope, and wheel tread, which it then labeled as a distinct car design, which was then added to the growing dataset. Meanwhile, the team ran an optimization algorithm to ensure that each new design was indeed distinct, and not a copy of an already-generated design. They then translated each 3D design into different modalities, such that a given design can be represented as a mesh, a point cloud, or a list of dimensions and specs.
The researchers also ran complex, computational fluid dynamics simulations to calculate how air would flow around each generated car design. In the end, this effort produced more than 8,000 distinct, physically accurate 3D car forms, encompassing the most common types of passenger cars on the road today.
To produce this comprehensive dataset, the researchers spent over 3 million CPU hours using the MIT SuperCloud, and generated 39 terabytes of data. (For comparison, it's estimated that the entire printed collection of the Library of Congress would amount to about 10 terabytes of data.)
The engineers say that researchers can now use the dataset to train a particular AI model. For instance, an AI model could be trained on a part of the dataset to learn car configurations that have certain desirable aerodynamics. Within seconds, the model could then generate a new car design with optimized aerodynamics, based on what it has learned from the dataset's thousands of physically accurate designs.
The researchers say the dataset could also be used for the inverse goal. For instance, after training an AI model on the dataset, designers could feed the model a specific car design and have it quickly estimate the design's aerodynamics, which can then be used to compute the car's potential fuel efficiency or electric range - all without carrying out expensive building and testing of a physical car.
"What this dataset allows you to do is train generative AI models to do things in seconds rather than hours," Ahmed says. "These models can help lower fuel consumption for internal combustion vehicles and increase the range of electric cars - ultimately paving the way for more sustainable, environmentally friendly vehicles."
This work was supported, in part, by the German Academic Exchange Service and the Department of Mechanical Engineering at MIT.