Anthropic's Mythos Breached 'Almost All' NSA Classified Systems in Hours During Red-Team Test

Anthropic built a model called Mythos. They ran it against classified systems to see what it would do. It compromised almost all of them in hours. This was the point of the test. The model was removed from circulation. A statement was issued. The sequence took approximately four hours from breach to containment to classification of the report documenting the breach.
AI red-team exercises are standard. The results are usually compartmentalized. This time the results were also fast. Speed itself is data. When a system can penetrate defenses that were designed to resist penetration, the timeline for reporting that fact becomes a security question. The report classified itself in the time it took to write.
Mythos will not be deployed. Other models will be. The test proved something was possible. This does not prevent it from being possible again. Anthropic has released a statement. The systems remain.