Separating explore from verify to kill false positives
Why the agent that finds a weakness should never be the one that decides it's real, and what that buys you.
By The Rift team
False positives are the tax every security team pays for automated tooling. Our core design decision for keeping that tax near zero is simple to state and surprisingly load-bearing: the agent that discovers a weakness is never the agent that confirms it.
Why split the roles
A model that's incentivised to find things will, given any ambiguity, lean toward "found something." That's useful for coverage and terrible for precision. So we separate the two jobs:
- Explorers cast wide. They're rewarded for surfacing candidate weaknesses, including speculative ones.
- Validators are adversarial. Their job is to refute a candidate: to reproduce it from a clean session or throw it out.
A finding only reaches a human after an independent validator reproduces it end-to-end. No reproduction, no finding.
What it costs, what it buys
Running a second adversarial pass is more compute per candidate. In exchange you get findings a human can act on without re-checking, which is the entire point of the product. A backlog of unverified "maybes" isn't an asset; it's work you've shifted onto the customer.
It mirrors how good teams already work
The split isn't novel. It's how a careful pentester operates internally ("is this actually exploitable, or am I fooling myself?") and how peer review works in research. We just made it the architecture instead of a habit.