The Open Secret of the AI Industry: Everyone's Copying Each Other

Hei Bai Wed, Apr 24 2024 08:11 PM EST

On April 17th, national media reported that many developers and founders have disclosed a common practice among startups building AI chatbots: they likely use data from OpenAI and other companies.

These bots can rival GPT-4 in specific domains, but only a fraction of the fee goes to OpenAI.

The approach these startups take is straightforward: they subscribe to GPT-4, pose a series of questions, then use the resulting questions and answers to train their own models. And they're not alone in adopting this strategy.

Furthermore, these companies don't publicly disclose their use of OpenAI's technology, even though OpenAI's CEO has previously said that smaller enterprises can reasonably make use of its technology.

However, this practice fundamentally undermines OpenAI's growth, and CEO Sam Altman may reconsider his stance at any moment.

Daniel Han, co-founder of Unsloth AI, estimates that about half of his clients acquire data from GPT-4 or Anthropic's Claude model to enhance their own models.

Moreover, the practice of startups using OpenAI data to train models mirrors the actions of tech giants like OpenAI.

Media reports indicate that Google transcribes YouTube videos, Meta hires contractors to summarize copyrighted books, and Adobe uses Midjourney's AI to generate photos, all to train their respective AI models.