
Why computer-made data is being used to train AI models

Microsoft, OpenAI and Cohere experiment with “synthetic data” as they reach the limits of information created by humans

Artificial intelligence companies are exploring a new avenue to obtain the massive amounts of data needed to develop powerful generative models: creating the information from scratch.

Microsoft, OpenAI and Cohere are among the groups testing so-called “synthetic data”, computer-generated information used to train their AI systems, known as large language models (LLMs), as they reach the limits of the human-made data that can further improve the cutting-edge technology.

The launch of Microsoft-backed OpenAI’s ChatGPT last November has led to a flood of products released publicly this year by companies including Google and Anthropic, all of which can produce plausible text, images or code in response to simple prompts.
