The exploits of Anthropic’s powerful new AI model Claude Mythos Preview sound like a movie plot: a super-clever computer system locked in a cyber “cage” manages to break out and connect to the internet. Mythos did not do this spontaneously, to be clear, but because its creators challenged it as a test. Yet not only did Mythos breeze through the challenge, it emailed an Anthropic researcher to inform him and then, unprompted, posted details online to brag. After it also showed superhuman abilities to find, and exploit, security flaws in software, Anthropic judged Mythos too risky to release to the public. For now, it is restricting access to selected tech, cyber security and financial firms.
Some suggest Anthropic is engaged in clever marketing or PR. Rival OpenAI also said this week it would release its own new cyber security-focused model only to vetted users. Yet the dangers the episode has exposed — and their implications — should not be dismissed.
Anthropic insists Mythos scores highly on its standard safety benchmarks. In the escape from its test environment, though, and in solving other complex tasks, the company found Mythos had sometimes taken “reckless excessive measures”, then covered its tracks.