The AI Security Reckoning, Just in Time for RSA

Lock and security concept for AI security

I work at an AI security company. Every week, the threat briefing gets worse. Here's what I'm seeing heading into RSA.

RSA is weeks out. The conference floor will be full of vendors promising to secure your AI. But the threat briefing has already been writing itself since January — and it's more coherent than most keynotes will be.

Three storylines dropped in the span of a few weeks. Read together, they sketch the shape of where enterprise security is actually headed.

Your AI Dev Tools Are the Soft Underbelly

Start with the tools developers trust most.

Security researcher Ari Marzouk spent six months probing AI-powered IDEs and published what he called IDEsaster — over 30 vulnerabilities across Cursor, Windsurf, GitHub Copilot, Roo Code, Zed, and others, with 24 assigned CVEs. The core finding: every AI IDE tested was vulnerable to attack chains that combine prompt injection with the IDE's own legitimate features to achieve data exfiltration or remote code execution. Not exotic exploits — using the tools as designed, against you.

Separately, OX Security weaponized a patched Chromium CVE against the latest releases of Cursor and Windsurf, exposing 1.8 million developers to 94 known vulnerabilities sitting in outdated Electron builds. Cursor's response to responsible disclosure: they considered it "self-DOS" and out of scope. There are 93 more CVEs waiting.

Then there's the extension marketplace problem. Cursor, Windsurf, and Google Antigravity are all VSCode forks, so they can't use Microsoft's official extension marketplace. They inherited its recommendation list anyway — pointing to extensions that don't exist in the OpenVSX registry. Anyone could have claimed those namespaces and uploaded malware. Security researchers at Koi claimed them first as placeholders. Over 1,000 developers installed those placeholder extensions anyway — no icon, explicit description saying they were placeholders — because their IDE told them to.

And it's not just IDEs. GeminiJack, discovered by Noma Security, was a zero-click vulnerability in Google Gemini Enterprise. An attacker embeds hidden prompt injection instructions in a shared Google Doc, a calendar invite, or an email. Later, when any employee runs a standard search — "show me our budgets" — Gemini retrieves the poisoned document, executes the instructions, and exfiltrates the results to an attacker-controlled domain. No clicks. No warnings. No alerts from traditional DLP. Google patched it, but Noma's read is right: this attack class isn't going away.

Microsoft Copilot has its own version. The "Reprompt" attack exploited a URL parameter to inject hidden prompts into Copilot sessions, enabling continuous data extraction even after the session closed. Patched in January 2026, five months after disclosure.

Here's the pattern: AI tools with deep access to your data and systems are being treated as productivity tools, not as infrastructure that needs real security controls. That gap is being actively exploited.

Agents Are a Different Category of Problem

The IDE vulnerabilities are bad. Agentic AI is an order of magnitude worse in potential blast radius.

A Dark Reading poll found that 48% of security professionals now rank agentic AI as the top attack vector for 2026 — ahead of deepfakes, ransomware, and supply chain compromise. The reason isn't hard to see: agents operate with elevated privileges across systems, make decisions autonomously, and chain actions without a human approving each step. Compromise one agent and you don't just get that agent. You get everything it can reach.

OWASP released its Top 10 for Agentic AI Applications in late 2025, developed by 100+ practitioners. Goal hijacking sits at number one — attackers manipulate an agent's objectives through poisoned inputs in emails, documents, RAG pipelines. The agent can't reliably tell instructions from data. Neither can your SIEM tell when the agent executing 10,000 sequential queries is doing legitimate work or carrying out an attacker's instructions.

Here's what makes this different from traditional app sec: the attack surface is less about the model and more about what the model can do. Prompt injection, tool misuse, privilege escalation, memory poisoning, cascading failures through multi-agent networks. In one simulated incident, a single compromised agent poisoned 87% of downstream decision-making within four hours. Traditional incident response doesn't move that fast.

In January 2026, researchers found hundreds of malicious skills on ClawHub — the first major supply chain attack targeting an AI agent ecosystem. Skills and MCP servers, the protocols connecting modern agentic deployments, are shipping without the security scrutiny we've applied to open source packages for decades. We're back to 2005-era open source risk: grab it, use it, trust it, regret it.

This is the Black Duck moment for agents. The open source world had to learn the hard way that importing a package is importing its entire risk profile. We're about to learn the same lesson about AI skills, MCP servers, and agent toolchains — except the blast radius when something goes wrong isn't a library vulnerability. It's an autonomous system with production access acting on someone else's instructions.

The Labs Are Racing to Close the Gap

Both major AI labs shipped something in response.

Anthropic launched Claude Code Security in late February — a limited research preview for Enterprise and Team customers. Rather than pattern-matching against known vulnerability signatures, it reads codebases the way a human security researcher would: tracing data flows, understanding component interactions, catching the context-dependent flaws that rule-based tools miss. In testing with Mozilla, Claude Opus 4.6 found 22 Firefox vulnerabilities in two weeks, including 14 high-severity — roughly a fifth of all high-severity Firefox bugs patched in all of 2025. It also found a heap buffer overflow in the CGIF library by reasoning about the LZW compression algorithm, a class of bug that coverage-guided fuzzing couldn't catch even at 100% code coverage.

The dual-use tension is right there in Anthropic's own disclosure: the same capabilities that help defenders find vulnerabilities help attackers exploit them faster. They're investing in safeguards to detect malicious use. Whether those safeguards keep pace with the capability improvements is a genuinely open question.

OpenAI moved differently. On March 9, they announced the acquisition of Promptfoo — an AI security platform used by 25%+ of Fortune 500 companies for red-teaming and evaluating LLM applications. Promptfoo goes directly into OpenAI Frontier, their enterprise agent platform. The bet: security testing needs to be native to the platform where agents are built and operated, not bolted on after deployment. Automated red-teaming, prompt injection detection, jailbreak testing, out-of-policy behavior monitoring — all as first-class features of the agent runtime.

This is a platform play, not just a product addition. OpenAI is saying: if you're going to run AI coworkers on Frontier, security testing is part of the contract.

A Market Is Forming Around This Problem

The labs are securing their own platforms. That leaves the rest of the stack.

I've been watching this space closely — partly because I work at one of these companies, partly because I think this is the most important category forming in security right now. A few companies are building the tooling that Black Duck built for open source, but for AI agents, skills, and the protocols connecting them.

Here's who I'm paying attention to heading into RSA:

Zenity has been at this the longest, with posture management and governance for agents across Microsoft Copilot Studio, Salesforce Agentforce, and custom-built agents. Their MCP server security report is required reading. They're also covering AI skills specifically — governing what data skills can access, who built them, and how they're embedded into broader agent workflows. That's exactly the ClawHub attack vector.

Noma Security is building end-to-end coverage: discovery across your full AI landscape (every agent, every MCP server, every data source), continuous red-teaming, and runtime protection. They found GeminiJack — and their platform now includes native hooks into Cursor, Windsurf, and major IDEs. They're positioned as the bridge between dev tooling and production agent governance.

DeepKeep just shipped an AI Agent Scanner purpose-built for attack surface discovery — mapping what each agent can access, what tools it calls, and where the risk concentrations are. Runtime protection for agentic frameworks with AI firewall placement is on their roadmap.

Mindgard is doing automated AI red-teaming with adversary emulation — continuously probing models and agents for prompt injection, agentic manipulation, and shadow AI exposure.

And yes, Root — where I work — is covering the cloud and open source layer, where the underlying infrastructure, containers, and OSS components feeding into AI pipelines carry their own unscanned risk. Full disclosure: I'm obviously biased here, but I've tried to be fair in mapping the landscape.

None of these solve the same problem. The market is fragmenting the way application security did: SAST, DAST, SCA, CSPM, CNAPP — each filling a gap the others left. The agent security equivalent of that stack is being built right now.

What RSA Will (and Won't) Tell You

RSA will have no shortage of AI security theater — dashboards, demos, vague references to "agentic risk posture." I'd cut through it with two questions.

Before you walk the floor, here's what to actually look for:

-->Discovery coverage — does the tool see every agent, skill, MCP server, and data source in your environment?
-->Runtime enforcement — testing and red-teaming without runtime controls isn't protection
-->Toolchain visibility — the attack surface is the tools agents call, not just the model
-->Skills and MCP server governance — if the threat model ignores the toolchain, it's not ready for 2026
-->Multi-agent coverage — can it detect cascading compromises across interconnected agent networks?

First: does it cover the full chain? Discovery alone isn't protection. Testing without runtime enforcement isn't protection. Red-teaming without visibility into what agents can actually reach isn't useful. Ask vendors where they start and where they stop.

Second: does it account for skills and MCP servers, not just models? The model is the least interesting part of the attack surface at this point. The tools agents call, the servers they connect to, the skills they import — that's where attackers are going. If a vendor's threat model treats the agent as a black box and ignores the toolchain, they're not ready for 2026.

The Black Duck analogy holds. When open source security became a real discipline, the companies that won weren't the ones who patched the loudest CVEs. They were the ones who gave security teams visibility into what they were actually running. That's the gap in agent security right now. The market is forming to fill it. RSA will surface who's serious.

If you're heading to RSA and want to compare notes, find me there. I'd love to hear what you're seeing.

Beatriz Rodgers is Head of Product Marketing at Root, an AI security company. She also runs Product Marketing Mindset, a solo PMM consultancy, and produces Beyond Features, a brand and community for developer marketing practitioners.