A new round of cybersecurity tests shows OpenAI's GPT-5.5 performing on par with Anthropic's Mythos Preview, undermining claims that the latter represented a step-change in AI threat detection.
Researchers put both models through a series of red-team exercises designed to test their ability to identify vulnerabilities, craft exploit payloads, and respond to simulated attacks. The results were near-identical across nearly every category tested. Mythos Preview, which had been promoted as a major leap forward in AI cybersecurity capability, showed no statistically significant edge over GPT-5.5.
This matters because Mythos had built a reputation on unsubstantiated claims of superior security performance. If two leading models perform equivalently on these tasks, the "breakthrough" narrative falls apart. It also suggests the industry may be approaching a plateau in raw capability — the differentiators going forward will be deployment safety, latency, and cost, not "which model is smarter.
The hype around Mythos Preview serves as a reminder that model marketing rarely matches benchmark reality. Until independent testing confirms a meaningful gap, treat "revolutionary" security claims with skepticism.