New Technique Unveiled to Steal AI Models

NC State

Researchers have demonstrated the ability to steal an artificial intelligence (AI) model without hacking into the device where the model was running. The technique is novel in that it works even when the thief has no prior knowledge of the software or architecture that support the AI.

"AI models are valuable, we don't want people to steal them," says Aydin Aysu, co-author of a paper on the work and an associate professor of electrical and computer engineering at North Carolina State University. "Building a model is expensive and requires significant computing sources. But just as importantly, when a model is leaked, or stolen, the model also becomes more vulnerable to attacks - because third parties can study the model and identify any weaknesses."

"As we note in the paper, model stealing attacks on AI and machine learning devices undermine intellectual property rights, compromise the competitive advantage of the model's developers, and can expose sensitive data embedded in the model's behavior," says Ashley Kurian, first author of the paper and a Ph.D. student at NC State.

In this work, the researchers stole the hyperparameters of an AI model that was running on a Google Edge Tensor Processing Unit (TPU).

"In practical terms, that means we were able to determine the architecture and specific characteristics - known as layer details - we would need to make a copy of the AI model," says Kurian.

"Because we stole the architecture and layer details, we were able to recreate the high-level features of the AI," Aysu says. "We then used that information to recreate the functional AI model, or a very close surrogate of that model."

The researchers used the Google Edge TPU for this demonstration because it is a commercially available chip that is widely used to run AI models on edge devices - meaning devices utilized by end users in the field, as opposed to AI systems that are used for database applications.

"This technique could be used to steal AI models running on many different devices," Kurian says. "As long as the attacker knows the device they want to steal from, can access the device while it is running an AI model, and has access to another device with the same specifications, this technique should work."

The technique used in this demonstration relies on monitoring electromagnetic signals. Specifically, the researchers placed an electromagnetic probe on top of a TPU chip. The probe provides real-time data on changes in the electromagnetic field of the TPU during AI processing.

"The electromagnetic data from the sensor essentially gives us a 'signature' of the AI processing behavior," Kurian says. "That's the easy part."

To determine the AI model's architecture and layer details, the researchers compare the electromagnetic signature of the model to a database of other AI model signatures made on an identical device - meaning another Google Edge TPU, in this case.

How can the researchers "steal" an AI model for which they don't already have a signature? That's where things get tricky.

The researchers have a technique that allows them to estimate the number of layers in the targeted AI model. Layers are a series of sequential operations that the AI model performs, with the result of each operation informing the following operation. Most AI models have 50 to 242 layers.

"Rather than trying to recreate a model's entire electromagnetic signature, which would be computationally overwhelming, we break it down by layer," Kurian says. "We already have a collection of 5,000 first-layer signatures from other AI models. So we compare the stolen first layer signature to the first layer signatures in our database to see which one matches most closely.

"Once we've reverse-engineered the first layer, that informs which 5,000 signatures we select to compare with the second layer," Kurian says. "And this process continues until we've reverse-engineered all of the layers and have effectively made a copy of the AI model."

In their demonstration, the researchers showed that this technique was able to recreate a stolen AI model with 99.91% accuracy.

"Now that we've defined and demonstrated this vulnerability, the next step is to develop and implement countermeasures to protect against it," says Aysu.

The paper, "TPUXtract: An Exhaustive Hyperparameter Extraction Framework," is published online by the Conference on Cryptographic Hardware and Embedded Systems. The paper was co-authored by Anuj Dubey, a former Ph.D. student at NC State, and Ferhat Yaman, a former graduate student at NC State. The work was done with support from the National Science Foundation, under grant number 1943245.

The researchers disclosed the vulnerability they identified to Google.

/Public Release. This material from the originating organization/author(s) might be of the point-in-time nature, and edited for clarity, style and length. Mirage.News does not take institutional positions or sides, and all views, positions, and conclusions expressed herein are solely those of the author(s).View in full here.