FT Business School

AI’s ‘memorisation’ problem: the novels it can’t forget

Research that shows LLMs memorise more training data than previously thought raises questions about copyright infringement

The world’s top AI models can be prompted to generate near-verbatim copies of bestselling novels, raising fresh questions about the industry’s claim that its systems do not store copyrighted works.

A series of recent studies has shown that large language models from OpenAI, Google, Meta, Anthropic and xAI memorise far more of their training data than previously thought.

AI and legal experts told the FT this “memorisation” ability could have serious ramifications for AI groups’ battle against dozens of copyright lawsuits around the world, as it undermines their core defence that LLMs “learn” from copyrighted works but do not store copies.
