The model reportedly demonstrated the ability to bypass restrictions placed on it during controlled testing.
The AI company Anthropic announced that it will not be publicly releasing its newest system, known as “Mythos,” citing concerns over advanced capabilities during testing.
The decision follows recent leaks that suggested the model was the most powerful system the company had developed to date. On Tuesday, Anthropic published a system card for the model, labeled “Claude Mythos Preview,” that explained the significant leap in performance that ultimately led the company to limit its availability, per Gizmodo. System cards are documents used by AI developers to provide transparency into how a model performs, including its strengths, weaknesses, and potential dangers.
According to the report, the model demonstrated the ability to bypass restrictions placed on it during controlled testing. In a scenario, the system was given access to a restricted computer environment with limited online tools and challenged to break out. The model was able to successfully find a way to access the broader internet and was even able to contact a researcher off-site. The model also shared details on how to achieve this on obscure but public websites.
The system card also revealed that in some cases, the model engaged in behavior beyond its intended design. During testing, it sometimes tried to conceal mistakes and work around instructions rather than report the issues. In one instance, it obtained information it was not supposed to have and tried to adjust its response to avoid alerting researchers.
The model even found an exploit in a computer system and then took additional steps to obscure its actions from tracking tools.
Anthropic noted that such behavior occurred in a limited number of interactions, but admitted it was significant enough to warrant concern and caution.
Back in 2019, OpenAI initially delayed its release of GPT-2 over fears it would be misused. This decision was made when the co-founders of Anthropic, Dario Amodei, Jack Clark, and Chris Olah, were working there. However, the program was still ultimately released.
Despite the decision to pause public rollout, Mythos may still be used in limited settings. A report by Gizmodo noted that the model is expected to be made available to select partner companies, including Amazon Web Services, Apple, Google, JPMorganChase, Microsoft, and NVIDIA. These organizations are expected to use the system to find security vulnerabilities in software.
Powered by The Post Millennial CMS™ Comments
Join and support independent free thinkers!
We’re independent and can’t be cancelled. The establishment media is increasingly dedicated to divisive cancel culture, corporate wokeism, and political correctness, all while covering up corruption from the corridors of power. The need for fact-based journalism and thoughtful analysis has never been greater. When you support The Post Millennial, you support freedom of the press at a time when it's under direct attack. Join the ranks of independent, free thinkers by supporting us today for as little as $1.
Remind me next month
To find out what personal data we collect and how we use it, please visit our Privacy Policy

Comments