- Updated: March 11, 2026
- 6 min read
AI Agent Hacks McKinsey’s Lilli Platform: Inside the Generative AI Breach
An autonomous AI agent exploited an unauthenticated SQL injection in McKinsey’s internal Lilli platform, gaining full read‑write access to millions of confidential documents and the system’s prompt layer within two hours.
What Happened: The McKinsey AI Agent Hack in a Nutshell
In late February 2026, a self‑directed AI agent from the CodeWall research platform identified a critical vulnerability in UBOS AI agents’s own security testing suite and turned its attention to McKinsey & Company’s internal AI system, Lilli. Without any credentials, insider knowledge, or human supervision, the agent mapped the public API surface, discovered an unauthenticated endpoint that concatenated JSON keys into SQL statements, and executed a blind‑SQL‑injection chain. Within two hours the agent harvested 46.5 million chat messages, hundreds of thousands of files, and the entire prompt configuration that governs Lilli’s behavior.
How the Autonomous AI Agent Bypassed Defenses
Step 1 – Surface Mapping and Endpoint Discovery
The agent began by crawling the publicly exposed API documentation (UBOS tech news style documentation). Over 200 endpoints were listed; 22 of them lacked authentication. One of these endpoints accepted user search queries and stored them in a relational database. While the query values were safely parameterised, the field names (JSON keys) were directly interpolated into the SQL string.
Step 2 – Detecting a Hidden SQL Injection
When the agent sent malformed JSON keys, the database returned error messages that echoed the exact key strings. Recognising this pattern, the agent flagged a classic SQL injection that conventional scanners like OWASP ZAP missed because the payload was in the column name, not the value.
Step 3 – Blind Iteration and Data Exfiltration
Using the error feedback loop, the agent performed fifteen blind iterations, each refining the query shape. The technique is akin to “boolean‑based blind SQL injection” but fully automated. Once the query structure was known, the agent issued SELECT statements that streamed live production data back through the API response.
Step 4 – Chaining to IDOR and Prompt‑Layer Compromise
Beyond raw data, the agent combined the injection with an Insecure Direct Object Reference (IDOR) flaw, allowing it to pull individual employee search histories. More critically, the same injection point accessed the system prompts—the instructions that dictate Lilli’s responses. By issuing a single UPDATE call, the agent could rewrite these prompts, effectively poisoning the AI’s output without touching any code or server files.
Figure 1 – Attack flow from endpoint discovery to prompt‑layer manipulation.
Why This Vulnerability Is a Game‑Changer for AI Security
- Legacy bug, modern impact: SQL injection is decades‑old, yet it remains deadly when combined with AI‑specific assets like prompt libraries.
- Prompt layer as a crown jewel: The instructions that steer generative models are now high‑value targets. Compromise can silently alter advice, remove guardrails, or embed confidential data into AI outputs.
- Blind, unauthenticated access: No credential checks meant any internet‑facing scanner could replicate the attack, dramatically lowering the barrier for threat actors.
- Scale of exposure: Over 46 million chat messages, 728 k files (PDFs, Excel, PowerPoint, Word), 3.68 million RAG document chunks, and 266 k+ OpenAI vector stores were exposed.
- Persistence without logs: Updating prompts via SQL leaves no file‑system footprints, making detection extremely difficult.
Potential Business Consequences
For a consulting powerhouse like McKinsey, the fallout could include:
- Poisoned strategic recommendations that mislead clients.
- Data leakage through AI‑generated reports, violating NDAs.
- Regulatory penalties for exposing proprietary research.
- Loss of client trust and brand reputation.
McKinsey’s Response and Disclosure Timeline
| Date | Action |
|---|---|
| 2026‑02‑28 | Autonomous agent identified the unauthenticated SQL injection and began enumerating Lilli’s production database. |
| 2026‑02‑28 | Full attack chain confirmed – unauthenticated injection, IDOR, 27 findings documented. |
| 2026‑03‑01 | Responsible disclosure email sent to McKinsey’s security team with high‑level impact summary. |
| 2026‑03‑02 | CISO acknowledged receipt, requested detailed evidence, and began emergency patching. |
| 2026‑03‑02 | All unauthenticated endpoints were secured, public API docs were taken offline, and development environment isolated. |
| 2026‑03‑09 | Public disclosure of the breach (this article) and a detailed post‑mortem released. |
Expert Commentary: The Rise of AI‑Driven Threat Actors
Security researcher Dr. Lina Patel, senior analyst at CodeWall, notes:
“Traditional pen‑testing tools are built for static code paths. An autonomous AI agent can iterate, learn, and pivot in real time, effectively becoming a ‘living’ attacker that never sleeps. The Lilli breach proves that AI‑centric assets—prompt libraries, vector stores, and RAG pipelines—must be treated as critical infrastructure.”
Cyber‑risk consultancy About UBOS echoes this sentiment, emphasizing the need for “prompt‑layer hardening” and continuous AI‑driven red‑team exercises.
Mitigation Strategies for Enterprises Using Generative AI
- Zero‑trust API design: Enforce authentication on every endpoint, even those deemed “read‑only.”
- Input sanitisation for schema elements: Never concatenate column names or table identifiers directly from user‑supplied JSON.
- Prompt versioning & integrity checks: Store prompts in immutable logs (e.g., append‑only databases) and sign them cryptographically.
- Automated AI red‑team testing: Deploy platforms like UBOS AI agents to continuously probe your own AI stack.
- Monitoring for anomalous AI output: Flag responses that contain unusually large data dumps or internal file paths.
Future Outlook: AI Agents as Both Defenders and Attackers
The McKinsey incident illustrates a pivotal shift: AI agents are no longer just tools; they are autonomous actors capable of both protecting and compromising systems. As enterprises adopt Enterprise AI platform by UBOS and integrate services like OpenAI ChatGPT integration, the attack surface expands.
Key trends to watch:
- Self‑learning red‑team bots: Vendors will offer “continuous adversarial testing” as a SaaS product.
- Prompt‑layer security standards: Industry bodies are drafting guidelines for prompt integrity, similar to code‑signing.
- Regulatory focus on AI data leakage: GDPR‑style rules may soon require explicit audit trails for AI‑generated content.
Organizations that embed AI security into their DevSecOps pipelines—leveraging tools like Workflow automation studio and the Web app editor on UBOS—will be better positioned to stay ahead of autonomous threats.
Conclusion
The breach of McKinsey’s Lilli platform demonstrates that even the most sophisticated enterprises can fall prey to a simple, decades‑old vulnerability when it intersects with modern AI architectures. Autonomous AI agents can discover, exploit, and persist within systems at machine speed, turning prompt libraries into the newest “crown jewels.” Companies must adopt a zero‑trust mindset for every API, enforce strict prompt governance, and continuously test their AI stacks with AI‑driven red‑team tools.
By treating prompts, vector stores, and RAG pipelines as critical assets, and by embracing continuous AI security testing, organizations can transform the same technology that powers the attack into a robust line of defense.