0%
Still working...

Google Says Treat Prompts Like Code. Here’s Why That’s the Smartest AI Security Advice Right Now

Most organisations are still thinking about AI security the wrong way. They’re locking down models, obsessing over data classification, and building walls around infrastructure. All necessary — but they’re missing the most dangerous gap: the prompt layer.

Google’s Office of the CISO just told us what to do about it. And the advice is deceptively simple: treat prompts like code.

The Prompt Is the New Attack Surface

In January 2026, Google Cloud’s Office of the CISO published a paper on operationalising their Secure AI Framework (SAIF). Among the recommendations, three stood out: data is the new perimeter, prompts should be treated as code, and agentic AI requires identity propagation.

That second one — treat prompts like code — is the one that should be pinned to every security team’s wall.

Here’s why. When I look at how organisations are deploying LLMs today, prompts are still treated like casual inputs. They’re strings of text that users type in. They’re system instructions that developers paste into a config file. Nobody’s reviewing them. Nobody’s version-controlling them. Nobody’s testing them adversarially before they hit production.

That’s the equivalent of writing SQL queries with unsanitised user input in 2003. We know exactly how that story ends.

Prompt Injection Is the New SQL Injection

Google’s CISO team made the comparison explicit: “In terms of ease-of-use for threat actors, prompt injection is the new SQL injection.”

That’s not hyperbole. SAIF identifies prompt injection as one of 15 core AI security risks. The framework distinguishes between direct injection — where a user manipulates the model into bypassing its instructions — and indirect injection, where malicious instructions are hidden inside documents, emails, or websites that the model processes.

The indirect variant is particularly dangerous. A user asking an AI assistant to summarise an email doesn’t expect that email to contain hidden instructions that exfiltrate their data. But researchers have demonstrated exactly this attack, repeatedly and reliably.

In my experience, most enterprise teams aren’t even testing for direct prompt injection yet. Indirect injection isn’t on their radar at all.

What “Treating Prompts Like Code” Actually Means

When Google says treat prompts like code, they’re not speaking metaphorically. They mean apply the same rigour you’d apply to any code that touches production.

That means version control. System prompts should be stored in repositories, reviewed through pull requests, and tracked for changes. If someone modifies the system prompt that governs your customer-facing AI assistant, that change should be auditable.

That means input validation. Every prompt — whether from a user, an automated workflow, or a retrieval-augmented generation (RAG) pipeline — should be inspected before it reaches the model. Google recommends deploying a dedicated AI firewall, like their Model Armor, to scan inputs for malicious intent and outputs for sensitive data leaks.

That means adversarial testing. Just as you’d pen-test an API, you need to red-team your prompts. Google’s SAIF framework explicitly calls for adversarial training and testing as a control against prompt injection and model evasion attacks. If you’re not deliberately trying to break your own prompts, someone else will.

And that means output sanitisation. The model’s response is untrusted output. It should be validated, filtered, and rendered safely — just like any user-generated content in a web application. Unsanitised model output can lead to cross-site scripting, data exfiltration, and a host of downstream exploits that security teams already know how to prevent in traditional applications.

The Agentic Layer Makes This Urgent

Everything I’ve described becomes exponentially more dangerous with AI agents. When a model can not only generate text but also take actions — send emails, call APIs, modify files, execute code — the blast radius of a compromised prompt is no longer theoretical.

Google updated SAIF in 2026 specifically to address agentic risks. The framework now includes three new controls: agent permissions (least-privilege access for every tool an agent can call), agent user control (explicit approval for actions that modify data or act on behalf of users), and agent observability (full audit trails of what agents did and why).

The research is clear. Attackers can hijack communication between agents in multi-agent systems. They can plant dormant triggers in calendar invites that activate later. They can exploit ambiguity in user instructions to cause agents to email the wrong person or share sensitive data with the wrong system.

I’ve been in enterprise IT long enough to recognise this pattern. Every time we give a system more autonomy, the security model needs to evolve with it. We didn’t let web applications make arbitrary database calls without parameterised queries. We shouldn’t let AI agents take arbitrary actions without scoped permissions and validated inputs.

What I’d Do If I Were Running a Security Team Right Now

If I were a CISO looking at this, I’d start with three immediate actions.

First, audit every system prompt across my organisation. Identify where they live, who can modify them, and whether changes are tracked. If the answer to any of those is “I don’t know” — that’s your first problem.

Second, deploy input and output inspection. Whether it’s Google’s Model Armor, a third-party solution, or a custom pipeline, something needs to sit between users and models to catch injection attempts and prevent sensitive data from leaking in responses.

Third, treat AI agent permissions like you’d treat service account permissions. Least privilege. Scoped access. Identity propagation so every backend system knows who is actually requesting the action, not just that “the AI agent” requested it.

Google also recommends going further: contribute to industry collaboration through initiatives like the Coalition for Secure AI (CoSAI), which includes Anthropic, Cisco, IBM, Intel, Nvidia, and PayPal. The reality is that AI security isn’t a problem any single vendor can solve alone. The framework needs to be shared, tested, and refined by the community.

The Bottom Line

Google’s advice to treat prompts like code is the kind of clarity the industry desperately needs right now. It translates a novel, abstract risk into language that every security professional already understands.

We spent two decades learning to never trust user input in web applications. We learned to parameterise queries, validate inputs, sanitise outputs, and audit changes. Those lessons apply directly to AI systems — with the additional complication that the “input” can now be hidden inside a PDF, an email, or a calendar invitation.

The organisations that internalise this now will be the ones that deploy AI safely at scale. The ones that don’t will learn the hard way, just like we did with SQL injection twenty years ago.

Leave A Comment

Recommended Posts