
$250 Million! OpenAI Secures 5-Year Rights Deal with News Corp for Large Model Training and Question-Answering

QinCheng Mon, May 27 2024 10:54 AM EST

ChatGPT is expanding once again, this time incorporating content from more than a dozen media outlets.

On May 22 local time, OpenAI announced a multi-year agreement with News Corp that gives it access to current and archived content from more than a dozen of News Corp's major news and information publications, including The Wall Street Journal, Barron's, the New York Post, The Times, and The Sun.

Under the agreement, OpenAI will be able to display News Corp's media content in ChatGPT when answering user queries. Additionally, News Corp will share journalistic expertise to help ensure that OpenAI's products meet the highest journalism standards.

Sources familiar with the matter said that the deal spans five years and is valued at more than $250 million, including cash and credits for the use of OpenAI's technology.

The collaboration does not include access to content from News Corp's other businesses. OpenAI stated that the ultimate goal is to enable people to make informed choices based on reliable information and news sources.

OpenAI CEO Sam Altman remarked, "Our partnership with News Corp marks a proud moment for the news industry and the tech sector. We deeply value News Corp's history as a global leader in breaking news reporting and are thrilled to enhance user access to their high-quality reporting. Together, we aim to lay the groundwork for a future where artificial intelligence deeply respects, enhances, and upholds the standards of world-class journalism."

Previously, OpenAI announced a partnership with the American social platform Reddit, gaining access to real-time content through Reddit's API and integrating it into products such as ChatGPT. It has also reached agreements with the Financial Times, The Associated Press, Le Monde, and several other media outlets to use their content for training AI models.

The terms of these deals vary from outlet to outlet, however. For instance, the partnership with The Associated Press is worth only a few million dollars annually and focuses primarily on using text archives for training. The partnership with the Financial Times is worth between $5 million and $10 million per year and includes the display of news content.

Nevertheless, OpenAI's journey in copyright partnerships has not been entirely smooth. Dozens of media outlets, including The New York Times, The Intercept, and The New York Daily News, have filed copyright infringement lawsuits, alleging that OpenAI unlawfully used their news content to train AI models.

OpenAI has stated that training AI models on publicly available internet materials is fair use, a position it says is supported by long-standing and widely accepted precedents, and that this principle is fair to creators and essential for innovators. Nonetheless, the company offers a straightforward opt-out process that lets publishers, such as The New York Times, prevent its tools from accessing their websites.

OpenAI emphasized that because large models learn from a vast collection of human knowledge, any single source, including The New York Times, is just a small part of the overall training data, and no single data source is crucial to the model's intended learning.

The Wall Street Journal, which is owned by News Corp, noted that AI companies are eager for publishers' content to refine their models and create new products such as AI-driven search. Publishers, for their part, are seeking to ensure substantial returns for the use of their intellectual property, sparking complex and sometimes intense negotiations across the industry.

According to reports, the agreement between News Corp and OpenAI ensures that news content will not be available on ChatGPT immediately after publication. This delay addresses a concern among publishers: if AI can provide complete answers based on news content, users may no longer need to pay to access news sites, which could reduce publishers' traffic and ad revenue.

Sources said that OpenAI is looking to provide relevant links beneath answer summaries, showing users which publishing partners the content came from.