Model weights are the heart of AI’s intelligence – and its Achilles heel

  • AWS’ VP of Infrastructure Services thinks more people need to focus on AI security
  • He pointed to model weights as a critical area needing to be locked down
  • Research group RAND has identified more than three dozen attack vectors that could impact model weights

When we asked AWS’ VP of Infrastructure Services Prasad Kalyanaraman to name one thing he thinks people in the industry should be talking about more, he barely took a breath before he laid it on us: security. And more specifically, security around artificial intelligence (AI) model weights.

To wildly oversimplify, model weights are the learned parameters that determine the strength of the connections an AI model makes between different things. They essentially function like dials you can turn to emphasize different signals within the model.
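For the code-minded, here’s a toy illustration of that “dials” idea in Python. Everything below is made up for the sake of the example; real models have millions or billions of weights learned during training rather than set by hand.

```python
# A toy "model": the weights are just numbers multiplied against inputs.
# Turning a weight up or down changes how much that input influences the output.
def predict(inputs, weights):
    return sum(x * w for x, w in zip(inputs, weights))

features = [0.9, 0.1, 0.5]   # illustrative input signals
weights = [2.0, -0.5, 0.0]   # the "dials": how much each signal counts
print(predict(features, weights))  # 1.75
```

Training a model amounts to finding weight values that produce useful outputs, which is the expensive part of the whole exercise.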

“One of the core things about responsible AI is making sure the model weights are secure,” he told Fierce this week.

“Once you actually have the model weights, you can recreate the architecture for a training network and you can make the training network do things for you which may not be responsible,” he explained. “So, the model weights are pretty critical IP as well as they allow you to change the training model in such a way that may be undesirable.”
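To see why that’s more than a theoretical worry, consider how little stands between a leaked weight file and a working model. The sketch below assumes PyTorch, a hypothetical “weights.pt” file and made-up layer shapes; it illustrates the general mechanics, not any real system.

```python
# Minimal sketch: anyone holding the weight file can rebuild the model
# and fine-tune it toward new, possibly undesirable, behavior.
import torch
import torch.nn as nn

# Reconstruct an architecture matching the leaked weights (shapes are made up).
model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))
model.load_state_dict(torch.load("weights.pt"))  # the leaked state dict

# From here, ordinary fine-tuning machinery can repurpose the model.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
```

In other words, the weights carry the costly part of the work; everything around them is commodity tooling.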

Kalyanaraman said AWS has implemented multi-layered security to help protect its customers. But he isn’t alone in his concerns.

AI giants OpenAI and Anthropic have both published blogs tackling the topic of model weight security, and the U.S. National Telecommunications and Information Administration (NTIA) also has an eye on the safety challenges associated with model weights. The U.S. Department of Defense even commissioned a report on the subject from research organization RAND. An interim version of RAND’s report was released in October 2023, and an updated edition came out last month.

What’s the big deal?

But why all the fuss about model weights in particular?

Well, as OpenAI’s team explained in a blog: “Unreleased model weights are paramount to protect because they represent core intellectual property and need to be safeguarded from unauthorized release or compromise.”

And when you think about the fact that Amazon’s CEO believes custom models are the key to AI and that Red Hat’s CTO similarly predicted a future that includes thousands of AI models…well, you start to understand why the weights that differentiate a model might be important to protect.

The next question, then: What kinds of threats is the industry up against?

According to RAND’s May 2024 report, researchers identified 38 distinct attack vectors across nine categories. The categories group together attacks that:

  • run unauthorized code
  • compromise existing credentials
  • undermine the access control system itself
  • bypass the primary security system altogether
  • gain unauthorized physical access to systems
  • compromise the supply chain
  • achieve nontrivial access to data or networks
  • exploit human weaknesses (think bribes, etc.)
  • exploit AI-specific avenues

And those are just the attack vectors we currently know about.

“The diversity of attack vectors is large, so defenses need to be varied and comprehensive,” RAND researchers wrote. “Achieving strong security against a specific category of attack does not protect an organization from others.”

RAND also identified at least five different environments in which AI model weights need to be protected: training, research, internal deployment, public API deployment and on-prem deployment.

Best practices

Researchers proposed adopting a five-level security framework to define the security capabilities required to counter attacks of increasing sophistication and to calibrate systems accordingly.

At the highest level, RAND recommended that weight storage, physical security, security during transport and use, monitoring, permitted interfaces and access control all be part of weight security.
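As a rough illustration of what two of those controls (encrypted weight storage and integrity monitoring) can look like, here’s a minimal Python sketch using the widely available cryptography package. The file names are hypothetical, and a real deployment would keep keys in a KMS or HSM rather than generating them next to the data.

```python
import hashlib
from cryptography.fernet import Fernet

key = Fernet.generate_key()  # illustrative only; production keys live in a KMS/HSM
fernet = Fernet(key)

with open("weights.bin", "rb") as f:   # hypothetical raw weight file
    raw = f.read()

digest = hashlib.sha256(raw).hexdigest()  # recorded for later integrity monitoring
with open("weights.enc", "wb") as f:
    f.write(fernet.encrypt(raw))          # encrypted at rest

# At load time: decrypt, then verify the digest before the weights are used.
with open("weights.enc", "rb") as f:
    restored = fernet.decrypt(f.read())
assert hashlib.sha256(restored).hexdigest() == digest
```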

But a key takeaway from the report is that more discussion and research are needed: “Securing model weights against the most capable actors will require significantly more investment over the coming years.”