
AI Agent Security: 98% Vulnerable, Says New Framework

The AIRQ framework assessed 100+ AI agents and found 98% ship critically vulnerable. Here is what every engineering team needs to know about AI agent security risk.

Mayur Domadiya · June 8, 2026 · 6 min read

98% of AI agents ship critically vulnerable out of the box. That number comes from the AI Risk Quadrant Report, the first independent security assessment to rank over 100 popular AI agents across 10 categories. At Boundev, we build AI features that ship to production every week, and security reviews are the gate that determines whether a feature goes live. This report confirms what we have seen firsthand: most AI agents combine private data access, untrusted content ingestion, and outbound actions — a combination no security team would approve, yet ships as default behavior. This post covers the findings, what they mean for teams building with AI agents, and where the real risk lives.

The First Independent Benchmark for AI Agent Security

The AIRQ framework is the first open-source methodology for scoring and comparing AI agent security. It was developed by Adversa AI with contributors and reviewers from OWASP, CoSAI, CSA, NIST, Cisco, and CrowdStrike. The methodology quantifies three dimensions: attack surface, blast radius, and defense controls.

What makes this different from existing guidance is that it produces a comparative rating. Earlier frameworks told you what to look for but did not tell you how one agent stacks up against another. AIRQ changes that by ranking agents side by side, giving enterprises a leaderboard and risk benchmark for the agentic AI era.

The framework builds on established industry standards, including OWASP AIVSS, the MAESTRO threat-modeling methodology, and the Lethal Trifecta classification. It aligns with the NIST AI Risk Management Framework and CSA guidelines. That matters because the methodology does not invent new categories; it quantifies existing ones.

98% of Agents Ship Vulnerable — Here Is the Data

The headline number is 98%. Only 2% of assessed agents meet a baseline security threshold out of the box. Of the 100+ agents evaluated, only 11% qualify as Fortified Leaders — agents that are both capable and well-defended.

Tool execution alone explains 76% of an agent's blast radius. That means the primary driver of security risk is not the model or the prompt — it is the tools the agent can call. Every function call, API integration, and file operation expands the blast radius, and most agents ship with broad tool access by default.
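One way to shrink that default is a deny-by-default tool dispatcher. The sketch below is a minimal illustration, not part of the AIRQ framework: `ToolRegistry`, `ToolNotAllowed`, and the two example tools are hypothetical names invented for this post. A tool can be registered in code, but it only runs if it also appears on an explicit allowlist.

```python
from typing import Any, Callable, Dict, FrozenSet


class ToolNotAllowed(Exception):
    """Raised when an agent requests a tool outside its allowlist."""


class ToolRegistry:
    """Deny-by-default dispatcher: only explicitly allowed tools can run."""

    def __init__(self, allowed: FrozenSet[str]) -> None:
        self._allowed = allowed
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        # Registering a tool does not grant permission to call it.
        self._tools[name] = fn

    def call(self, name: str, **kwargs: Any) -> Any:
        # Check the allowlist first, so a blocked tool is never executed.
        if name not in self._allowed:
            raise ToolNotAllowed(f"tool {name!r} is not on the allowlist")
        if name not in self._tools:
            raise KeyError(f"tool {name!r} is not registered")
        return self._tools[name](**kwargs)


# Hypothetical tools, for illustration only.
def search_docs(query: str) -> str:
    return f"results for {query}"


def delete_file(path: str) -> str:
    return f"deleted {path}"


# Only the read-only tool is allowed; the destructive one is registered
# but stays unreachable until someone adds it to the allowlist on purpose.
registry = ToolRegistry(allowed=frozenset({"search_docs"}))
registry.register("search_docs", search_docs)
registry.register("delete_file", delete_file)
```

Calling `registry.call("search_docs", query="airq")` runs normally, while `registry.call("delete_file", path="/tmp/x")` raises `ToolNotAllowed`. The design choice is that widening the blast radius requires an explicit change to the allowlist and cannot happen as a side effect of wiring in a new integration.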

The report also found that 83% of vendors' security claims cannot be independently verified. Most vendors publish security statements without reproducible evidence. That gap between what is claimed and what can be verified is the central problem the AIRQ framework is designed to solve.

The Lethal Trifecta — Why Default Security Fails

The root cause behind the 98% number is what the framework calls the Lethal Trifecta: private data access, untrusted content, and outbound actions. Most AI agents combine all three by design.

An agent that reads your customer database, processes external web content, and sends email based on that processing is doing all three at once. Individually, each capability is reasonable. Together, they create a chain where a single malicious input can trigger data exposure at scale.

The key insight for engineering teams is that adding an aligned model does not break this chain. Alignment reduces the likelihood that the model itself produces harmful output, but it does not prevent an attacker from exploiting the agent's tool access through indirect prompt injection or crafted content. The defense must live in the tool execution layer, not the model layer.
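As a rough sketch of what a defense in the tool execution layer could look like, the guard below tracks which of the three capability classes a session has already used and refuses the call that would complete the trifecta. Everything here is an assumption made for illustration: `Capability`, `Session`, `guard`, and `TrifectaBlocked` are hypothetical names, and blocking outright is only one possible policy (a real system might pause for human approval instead).

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Set


class Capability(Enum):
    PRIVATE_DATA = auto()       # reads data the user would not publish
    UNTRUSTED_CONTENT = auto()  # ingests content an attacker could control
    OUTBOUND_ACTION = auto()    # sends data or triggers effects externally


class TrifectaBlocked(Exception):
    """Raised when a tool call would combine all three capabilities."""


@dataclass
class Session:
    """Tracks which capability classes have been exercised in one agent run."""
    used: Set[Capability] = field(default_factory=set)
    log: List[str] = field(default_factory=list)


def guard(session: Session, tool_name: str, capability: Capability) -> None:
    """Call before executing a tool; raises if it would complete the trifecta."""
    would_use = session.used | {capability}
    if would_use == set(Capability):
        session.log.append(f"blocked {tool_name}")
        raise TrifectaBlocked(
            f"{tool_name} would combine private data, untrusted content, "
            "and an outbound action in one session"
        )
    session.used.add(capability)
    session.log.append(f"allowed {tool_name}")


if __name__ == "__main__":
    session = Session()
    guard(session, "crm_lookup", Capability.PRIVATE_DATA)
    guard(session, "fetch_url", Capability.UNTRUSTED_CONTENT)
    try:
        guard(session, "send_email", Capability.OUTBOUND_ACTION)
    except TrifectaBlocked as exc:
        print(exc)
    print(session.log)
```

The check runs outside the model, so it holds even if injected content persuades the model to request the outbound call. Any two capabilities remain usable together; only the third is refused.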

What the Framework Actually Measures

The AIRQ methodology scores agents across three axes. Attack surface covers the channels through which an agent can be reached — direct user input, tool outputs, MCP servers, and upstream data sources. Blast radius covers what an agent can access and affect — files, databases, APIs, email, and deployment infrastructure. Defense controls covers authentication, authorization, monitoring, and incident response.

These three scores plot onto a quadrant that sorts agents into four segments: Fortified Leaders (capable and well-defended), High Performers (capable with moderate defense), Unsecured Achievers (capable but exposed), and at-risk agents that are both limited and poorly defended.

For teams evaluating AI agents or building AI features, the most useful output is not the ranking itself — it is the methodology for assessing your own agents. The framework is open source, which means any team can apply the same scoring criteria to their own deployments without relying on vendor self-reporting.

What This Means

The AIRQ report confirms what security teams have suspected but could not prove: most AI agents are not production-ready from a security standpoint. The 98% number is not a failure of individual vendors — it is a market-wide signal that agent security defaults are misaligned with enterprise requirements.

The fix starts with visibility. Teams need to know what tools their agents can call, what data those tools can access, and whether untrusted content can reach that access path. The AIRQ framework provides the scoring rubric, but the implementation work belongs to the teams shipping agents into production.
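A first pass at that visibility can be a plain inventory script. The sketch below is not the AIRQ scoring rubric. It assumes each team hand-labels its own tools with three flags, and the `ToolInfo` type, the `audit` function, and the example agents are all hypothetical. It answers the three questions above for each agent and flags any agent whose tool set contains all three capabilities.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ToolInfo:
    """Hand-maintained description of one tool an agent can call."""
    name: str
    reads_private_data: bool = False
    ingests_untrusted_content: bool = False
    performs_outbound_action: bool = False


def audit(agents: Dict[str, List[ToolInfo]]) -> Dict[str, dict]:
    """Summarize each agent's tools by capability and flag full overlap."""
    report: Dict[str, dict] = {}
    for agent, tools in agents.items():
        private = sorted(t.name for t in tools if t.reads_private_data)
        untrusted = sorted(t.name for t in tools if t.ingests_untrusted_content)
        outbound = sorted(t.name for t in tools if t.performs_outbound_action)
        report[agent] = {
            "private_data": private,
            "untrusted_content": untrusted,
            "outbound_actions": outbound,
            # True only when every capability list is non-empty.
            "all_three_present": bool(private and untrusted and outbound),
        }
    return report


# Hypothetical inventory, for illustration only.
INVENTORY = {
    "support_bot": [
        ToolInfo("crm_lookup", reads_private_data=True),
        ToolInfo("web_fetch", ingests_untrusted_content=True),
        ToolInfo("send_email", performs_outbound_action=True),
    ],
    "docs_bot": [
        ToolInfo("web_fetch", ingests_untrusted_content=True),
        ToolInfo("search_index"),
    ],
}

if __name__ == "__main__":
    for agent, summary in audit(INVENTORY).items():
        print(agent, summary)
```

In this made-up inventory, `support_bot` is flagged because its tools cover all three capabilities, while `docs_bot` is not. The labels are only as good as the people maintaining them, so the output is a starting point for a security review and cannot replace one.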

Here is the open question worth sitting with. If 98% of agents ship vulnerable and alignment does not fix the tool execution layer, is your team's AI security review looking in the right place?

Mayur Domadiya

Founder & CEO, Boundev AI

Mayur builds Boundev AI, the AI engineering subscription for US SaaS companies.