Why AI Can’t Verify Its Own Code and What That Means for Enterprise AppSec

Asaf Saar, June 16, 2026

AI models that generate code are also the best at exploiting it. Here’s why independent verification, not the model itself, is the only trustworthy answer.

This month, the US government ordered Anthropic to suspend access to its most capable models, Mythos 5 and the newly released Fable 5, for all foreign nationals, citing national security. The trigger was a single reported jailbreak that let one of those models slip past its own guardrails on cybersecurity tasks. Anthropic complied, then disputed the order in public, warning that the same standard applied across the industry would freeze frontier model deployment everywhere.

A month earlier, OpenAI had taken the opposite path. Its Daybreak program made frontier cyber capability broadly available through tiered access, identity verification, and a roster of security partners that already includes much of the industry.

Two of the most capable labs in the world. Two opposite philosophies on the same problem. One locks the capability down and still gets recalled. One opens it up and absorbs the risk that comes with it. Both are circling the same uncomfortable fact, and it is the fact that should shape how every enterprise thinks about AI-generated software.

The same model writes the code and breaks it

The model that is best at generating code is also the model that is best at finding the flaws in it. These are not separate capabilities that happen to live in the same system. They are the same capability pointed in two directions. A model that can reason about a codebase well enough to extend it can reason about it well enough to exploit it.

That is the dual-use problem at the center of this entire moment. It is why a security-tuned model is simultaneously a defender’s tool and an attacker’s tool, and why a government, upon seeing one jailbreak, reached for export controls. The capability that makes these models valuable is the same capability that makes them dangerous, and no amount of guardrail tuning can fully separate the two.

For an enterprise shipping AI-generated code at scale, this lands as a simple question. If the model that wrote your code is also the model most capable of finding its vulnerabilities, who do you trust to tell you the code is safe?

Independence is not a preference, it is a structure

The answer cannot be the model that wrote the code. A system grading its own output is not verification, it is self-attestation. And it cannot be the lab that ships the model either, because the lab’s incentive is to demonstrate that its model produces safe code, not to find every reason it does not.

This is the part that the Mythos recall and the Daybreak rollout both make concrete. When the capability is this powerful and this contested, trust has to come from a layer the generator does not own and the lab does not control. Independence stops being a nice-to-have and becomes the only structure that holds.

CISOs have understood this instinctively for years. You do not let the team that built the system run its only security review. The frontier era does not soften that principle. It scales it up to the level of the model itself, and now to the level of the lab behind the model.

Why AI floods security teams with findings and why that’s the wrong problem

Here is the part the frontier conversation keeps skipping. These models do not just write more code. They surface more findings, faster and in greater volume than any scanner before them, and they do it continuously. In other words, AI has commoditized vulnerability discovery. Finding the flaw is no longer the hard part. It just got cheaper and louder. The premium has moved to verification: knowing which findings are real, which are reachable, and which are actually closed.

So the enterprise ends up in one place: more code than ever, more findings than ever, and less time than ever to deal with them. The pile grows every sprint. The question stops being “Can we find the vulnerabilities?” and becomes “Now what?”

Which of these hundreds of findings is reachable, exploitable, and worth a developer’s afternoon? Which has a fix that can ship today? Which is noise? Answering that, at the speed the code is generated, is the actual work. Triage, prioritization, verification, and remediation, done fast enough to keep pace with generation rather than falling further behind with every release.
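To make those triage questions concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of how Mend.io or any other product works: the `Finding` fields, the bucket names, and the rule that unreachable findings are deprioritized are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record; the field names are assumptions for this sketch."""
    id: str
    reachable: bool      # can an attacker actually reach the vulnerable code path?
    exploitable: bool    # is there a plausible exploit once it is reached?
    fix_available: bool  # is there a fix that could ship today?


def triage(findings):
    """Sort finding ids into three illustrative buckets."""
    buckets = {"fix_now": [], "investigate": [], "deprioritize": []}
    for f in findings:
        if not f.reachable:
            # Assumption: unreachable findings are treated as low priority.
            buckets["deprioritize"].append(f.id)
        elif f.exploitable and f.fix_available:
            buckets["fix_now"].append(f.id)
        else:
            # Reachable, but either not clearly exploitable or not yet fixable.
            buckets["investigate"].append(f.id)
    return buckets


findings = [
    Finding("F-1", reachable=True, exploitable=True, fix_available=True),
    Finding("F-2", reachable=True, exploitable=True, fix_available=False),
    Finding("F-3", reachable=False, exploitable=True, fix_available=True),
]
result = triage(findings)
print(result)
```

The sorting rule itself is trivial. The hard part in practice is establishing whether each flag is true, which is exactly the verification work described above.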

That is the bottleneck of the agentic era. That’s the problem Mend.io was built for.

The labs see it too. Both Mythos and Daybreak now lead with remediation, with patch generation and fix validation, not raw discovery. But this lands straight back on the independence problem. When the generator remediates its own output, you are again trusting the system that wrote the code to certify the fix. Remediation at speed is the right answer. Remediation at speed from a layer that did not write the code is the trustworthy one.

What this means for defenders

The Mythos suspension and the Daybreak rollout add up to the strongest mandate yet for independent security, and it comes from the two companies best positioned to make the case for it.

An independent verification and remediation layer, neutral across whatever generated the code, fed by signals across thousands of enterprise codebases rather than the output of any single model, is the durable answer to a problem the frontier models cannot solve for themselves. Finding the flaw is the cheap half. Closing it, fast and across every generator in the stack, is the half that compounds. The labs build the most powerful generators and the most powerful exploit finders the world has seen. What they cannot build is the thing that sits outside both and is trusted precisely because it is outside.

Two more properties follow from the same logic. The layer has to cover the whole estate, not just last week’s AI-written commits. Most enterprise risk lives in years of accumulated architecture, transitive dependencies, and legacy code that no single model wrote and no single team still owns. And it has to be economical enough to run continuously, because a layer you can only afford to point at a fraction of the code on occasion is not verification, it is sampling. Neither breadth across the estate nor the economics of always-on scanning is a problem the frontier labs are trying to solve. Both are the independent layer’s to own.

That independence is structural. It does not depend on out-engineering the labs, which is a losing game, and it does not erode as the models get better. It gets more valuable.

The bottom line

The agentic era does not need less trust. It needs trust that scales at the speed of generation. The events of the last month are a preview of how unstable that trust becomes when the generator, the auditor, and the regulator are all arguing over the same model.

The way through is not a better model. It is a layer that verifies and remediates code the model does not control and the lab does not own, and that closes findings as quickly as they appear. That layer has a name. It is dedicated, independent application security. It always was. The frontier models just made it impossible to ignore.



About the author

Asaf Saar

EVP Product

Asaf Saar is EVP and Chief Product Officer at Mend.io, where he leads product strategy for the company’s application security and software composition analysis platform, including its work securing AI-generated code and AI components. He joined Mend.io after more than five years as VP of Product Management at Tricentis. Asaf has spent his career building and leading product organizations at scale, with a focus on developer tooling, quality, and turning technical depth into commercial momentum.
