AI models often rely on "spurious correlations," making decisions based on unimportant and potentially misleading information. Researchers have now discovered that these learned spurious correlations can be traced to a very small subset of the training data, and have demonstrated a technique that overcomes the problem.
"This technique is novel in that it can be used even when you have no idea what spurious correlations the AI is relying on," says Jung-Eun Kim, corresponding author of a paper on the work and an assistant professor of computer science at North Carolina State University. "If you already have a good idea of what the spurious features are, our technique is an efficient and effective way to address the problem. However, even if you are simply having performance issues, but don't understand why, you could still use our technique to determine whether a spurious correlation exists and resolve that issue."
Spurious correlations are generally caused by simplicity bias during AI training. Practitioners use data sets to train AI models to perform specific tasks. For example, an AI model could be trained to identify photographs of dogs. The training data set would include pictures of dogs where the AI is told a dog is in the photo. During the training process, the AI will begin identifying specific features that it can use to identify dogs. However, if many of the dogs in the photos are wearing collars, the AI may use collars as a simple way to identify dogs, because a collar is generally a less complex feature than ears or fur. This is how simplicity bias can cause spurious correlations.
"And if the AI uses collars as the factor it uses to identify dogs, the AI may identify cats wearing collars as dogs," Kim says.
Conventional techniques for addressing problems caused by spurious correlations rely on practitioners being able to identify the spurious features that are causing the problem. They can then address this by modifying the data sets used to train the AI model. For example, practitioners might increase the weight given to photos in the data set that include dogs that are not wearing collars.
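To make the conventional approach concrete, here is a minimal Python sketch of reweighting when the spurious feature is already known. The function name `group_balanced_weights` and the `has_collar` annotation are hypothetical, chosen to match the dog example; the sketch assumes binary labels and a binary collar flag, and it illustrates one common reweighting scheme rather than any specific method from the paper.

```python
import numpy as np

def group_balanced_weights(labels, has_collar):
    """Weight each sample inversely to the size of its (label, collar) group.

    Assumes labels and has_collar are 0/1 values (hypothetical annotations).
    """
    labels = np.asarray(labels, dtype=int)
    has_collar = np.asarray(has_collar, dtype=int)
    groups = labels * 2 + has_collar  # one id per (label, collar) pair
    _, inverse, counts = np.unique(groups, return_inverse=True, return_counts=True)
    weights = 1.0 / counts[inverse]   # rare groups get larger weights
    return weights * len(weights) / weights.sum()  # rescale to mean 1

# Toy data: 1 = dog, 0 = cat; most dogs wear collars.
labels     = [1, 1, 1, 1, 0, 0]
has_collar = [1, 1, 1, 0, 0, 1]
weights = group_balanced_weights(labels, has_collar)
print(weights)
```

In this toy data the single dog without a collar receives three times the weight of each collared dog, which is the kind of adjustment that requires knowing about the collar in the first place.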
However, in their new work, the researchers demonstrate that it is not always possible to identify the spurious features that are causing problems, which makes conventional techniques for addressing spurious correlations ineffective.
"Our goal with this work was to develop a technique that allows us to sever spurious correlations even when we know nothing about those spurious features," Kim says.
The new technique relies on removing a small portion of the data used to train the AI model.
"There can be significant variation in the data samples included in training data sets," Kim says. "Some of the samples can be very simple, while others may be very complex. And we can measure how 'difficult' each sample is based on how the model behaved during training.
"Our hypothesis was that the most difficult samples in the data set can be noisy and ambiguous, and are most likely to force a network to rely on irrelevant information that hurt a model's performance," Kim explains. "By eliminating a small sliver of the training data that is difficult to understand, you are also eliminating the hard data samples that contain spurious features. This elimination overcomes the spurious correlations problem, without causing significant adverse effects."
The researchers demonstrated that the new technique achieves state-of-the-art results, improving performance even compared with previous work on models where the spurious features were identifiable.
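The pruning idea Kim describes can be sketched in a few lines of Python. This is an illustration only: the article does not say how difficulty is computed, so the sketch assumes a stand-in score, the average per-sample training loss across epochs. The function name `prune_hardest`, the `loss_history` array and the pruning fraction are all hypothetical and are not taken from the paper.

```python
import numpy as np

def prune_hardest(loss_history, prune_fraction=0.05):
    """Return indices of samples to keep after dropping the hardest ones.

    loss_history: array of shape (epochs, samples) with per-sample losses
    recorded during training (an assumed proxy for sample difficulty).
    """
    loss_history = np.asarray(loss_history, dtype=float)
    difficulty = loss_history.mean(axis=0)          # higher = harder
    n_samples = difficulty.shape[0]
    n_prune = int(prune_fraction * n_samples)       # size of the pruned sliver
    order = np.argsort(difficulty, kind="stable")   # easiest first
    keep = np.sort(order[: n_samples - n_prune])    # drop the hardest tail
    return keep, difficulty

# Toy losses for 10 samples over 2 epochs; samples 2 and 6 stay hard.
loss_history = np.array([
    [0.5, 0.4, 2.0, 0.3, 0.6, 0.2, 1.8, 0.5, 0.4, 0.3],
    [0.2, 0.1, 1.9, 0.1, 0.3, 0.1, 1.7, 0.2, 0.1, 0.1],
])
keep, difficulty = prune_hardest(loss_history, prune_fraction=0.2)
print(keep)
```

The model would then be retrained on only the kept samples. Note that nothing in this step needs a label saying which feature is spurious, which is the property the researchers emphasize.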
The peer-reviewed paper, "Severing Spurious Correlations with Data Pruning," will be presented at the International Conference on Learning Representations (ICLR), which will be held in Singapore April 24-28. The first author of the paper is Varun Mulchandani, a Ph.D. student at NC State.