
AI "Source Deity" Activated! How Big an Impact?

Mon, Apr 22 2024 08:18 AM EST

Authors: Feng Liange, Wang Jun; Intern: Kong Yaxuan

Editor: Wang Jun

Open or closed? That is the question of the era for large-scale models.

Earlier this year, Elon Musk criticized OpenAI and its CEO Sam Altman for gradually withholding details of their model research. "To this day, the OpenAI website still claims its mission is to ensure that general artificial intelligence benefits all humanity. However, in reality, OpenAI has transformed into a closed-source subsidiary of the tech giant Microsoft," said the former co-founder of OpenAI.

While OpenAI falls short of being "Open," Meta has "opened" its latest open-source artificial intelligence model.

On April 18, Meta released Llama 3, the latest version of its open-source large model, causing a stir in the open-source AI community. Coincidentally, the day of Llama 3's release was also the birthday of the top AI scholar and open-source advocate Andrew Ng. "(Llama 3 is) the best gift so far. Thank you, Meta!" he exclaimed.

As 2024 progresses, the debate between open and closed source is intensifying. The closed-source camp is represented by OpenAI, currently the strongest contender, while the open-source camp, which includes Meta's Llama, Mistral, and Google's open models, continues to iterate. The closed-source camp holds to its belief in the Scaling Law, betting on ever-stronger general models; the open-source camp keeps improving model capabilities while emphasizing more vertical performance and flexible configuration to drive the commercialization of large models.

The discussion on whether to choose open or closed models continues unabated.

For insiders, this choice determines not only how they will advance the AI "technology tree" but also their commercial route. In other words, in this fiercely competitive market it is likely a matter of survival.

Meta's latest release, Llama 3, comes in pre-trained and instruction-fine-tuned versions at both 8B and 70B parameters. According to Meta's official website, Llama 3 pushes data and scale to new heights: it was trained on over 15T tokens of data on two custom-built 24K-GPU clusters, more than seven times the training data used for Llama 2, and it supports an 8K context length, twice that of Llama 2.

In addition to Llama 3, Meta has also introduced new trust and security tools, including Llama Guard 2, Code Shield, and CyberSec Eval 2.

Reportedly, Llama 3 is set to launch on major cloud providers and model API platforms such as AWS (Amazon Web Services), Databricks, Google Cloud, Hugging Face, Kaggle, IBM WatsonX, Microsoft Azure, and others. Llama also benefits from hardware support from AMD, AWS, Dell, Intel, NVIDIA, and Qualcomm.

On its official website, Meta provides comparisons between the two versions of Llama 3 and competitors such as Google's Gemma and Gemini, Mistral, and Anthropic's Claude 3. According to Meta, Llama 3 performs well across five evaluation sets: MMLU (massive multitask language understanding), GPQA (graduate-level question answering), HumanEval (coding ability), GSM-8K (grade-school math), and MATH (harder mathematical problems).

Notably, Llama 3 boasts exceptional coding capability. In a user test shared by Kazek, administrator of the AI-themed WeChat account "Digital Life Kazek," Llama3-8B was able to produce code for the classic N-Queens chess problem, surpassing its predecessor Llama 2, which required specialized code models to achieve similar results.
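The N-Queens task mentioned above is a standard backtracking exercise. As a rough illustration of what such a test prompt asks for (this is an independent sketch, not Llama 3's actual output), a minimal solver that counts valid placements:

```python
def count_n_queens(n: int) -> int:
    """Count placements of n queens on an n x n board so that no two attack each other."""
    solutions = 0

    def place(row: int, cols: frozenset, diag1: frozenset, diag2: frozenset) -> None:
        nonlocal solutions
        if row == n:  # every row has a queen: one complete solution
            solutions += 1
            return
        for col in range(n):
            # A queen at (row, col) is safe if its column and both diagonals are free.
            if col in cols or (row - col) in diag1 or (row + col) in diag2:
                continue
            place(row + 1, cols | {col}, diag1 | {row - col}, diag2 | {row + col})

    place(0, frozenset(), frozenset(), frozenset())
    return solutions

print(count_n_queens(8))  # the classic 8-queens board has 92 solutions
```

Correctly tracking both diagonal families (row − col and row + col) is the usual stumbling block for weaker code models, which is why the task serves as a quick sanity check of coding ability.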

The market reacted swiftly. On the 18th, Meta's stock price defied the trend, closing up by 1.54%. The following day, Baidu Intelligent Cloud opened invitations for testing on its Qianfan Large Model Platform, offering training and inference solutions tailored for Llama 3, assisting developers in training their own large models.

The 8B and 70B models merely mark the beginning of the Llama 3 series. Yann LeCun, Chief AI Scientist at Meta, hinted on social media that more versions will be released in the coming months.

Jim Fan, a senior research scientist at NVIDIA, speculates that future versions, such as the anticipated Llama 3-400B, could be a "turning point," giving the open-source community access to models comparable to GPT-4.

Staying at the table, each showing their prowess

In the last round of competition, over the metaverse, Meta was widely thought to have dug itself into a hole; with the Llama series, it has played a strong hand in artificial intelligence. To understand the industry shake-up Llama 3 brings, we first need to understand what open source means in the realm of large models.

Open source in the large model domain typically entails the architecture, training code, and pre-trained weights of a model being made publicly available, allowing researchers and developers to access and use them freely.

However, the degree of openness varies depending on the model. "Some may only provide limited access or partial code," noted Guo Tao. Criteria for determining whether a large model is truly open source may include the accessibility of code and data, the leniency of usage licenses, the activity of community support, and the openness to improvement and new applications.

In the AI world, "top players" are distributed along both paths. The closed-source front needs little introduction: overseas there is OpenAI's ChatGPT, while domestically there are Baidu's Ernie Bot (Wenxin Yiyan) and the rising star Kimi from Moonshot AI.

As for open source, besides the Llama series, other widely used open-source large models include OpenFlamingo from the nonprofit LAION, Dolly from Databricks, and MPT from MosaicML. Domestically, there are Alibaba's Tongyi Qianwen, Zhipu AI's ChatGLM, Baichuan Intelligence's Baichuan-7B Chinese-English model, the Wudao 3.0 series from the Beijing Academy of Artificial Intelligence, and the CPM-Bee 10B Chinese base model from Mianbi Intelligence (ModelBest).

This differentiation is often influenced by multiple factors such as technological progress and iterative business models.

Angel investor and veteran AI expert Guo Tao believes that from a technical perspective, open source can promote research and innovation in academia, while closed source can help maintain a technological edge for a certain period.

From a business perspective, open source can attract contributions from developer communities and promote rapid iteration and wide dissemination of the technology, but it may undercut a company's profit model. Closed source protects intellectual property and creates direct revenue streams, but may limit the spread of the technology and the growth of its ecosystem.

In fact, before the release of Llama3, the Chinese internet had just gone through a round of debate on open versus closed source.

According to media reports, Baidu CEO Robin Li recently stated that open sourcing large models is not very meaningful, as the performance of closed source models will continue to improve. "With Wenxin large model 4.0, we can tailor smaller models suitable for various scenarios based on considerations such as effectiveness, response speed, and inference costs, and support fine-tuning and post-pretraining. Models derived from dimensionality reduction and pruning perform better than directly using open source models of the same size, with significantly lower costs for the same level of effectiveness."

Robin Li has always been a staunch advocate of the closed-source route, for reasons including, but not limited to, his view that a closed-source business model can better concentrate manpower and financial resources.

His opponent on this question, Zhou Hongyi, founder of 360, puts it succinctly: "In a nutshell, without open source there would be no Linux today, and without Linux there would be no internet."

The "Source Deity" activated: how big an impact?

"The release of Llama 3 is poised to shake up the market landscape," said Guo Tao in an interview with 21st Century Business Herald. He pointed out that its outstanding performance might attract more users and investors, thereby increasing its market share.

According to the official website, Llama 3 is open for commercial use with conditions (products exceeding 700 million monthly active users must apply for a separate license). "But this is essentially equivalent to completely free commercial use," said Kazek.

Earlier, investor Zhu Xiaohu discussed the artificial intelligence market in an interview with Tencent News. Asked about the key milestone in large-model development in 2023, he named the launch of Llama: in his view, it laid the foundation for China's innovation at the application level and lowered the threshold for commercialization.

Of course, the commercialization Zhu Xiaohu refers to concerns users within the open-source ecosystem. Whether, and when, the publishers of open-source models themselves can profit often has no definite answer.

Closed-source large models typically profit through licensed use, subscription services, or direct product sales. The prime example in the AI field is the leader OpenAI: whatever its early open-source efforts, it charges other companies API fees for access to its core product, ChatGPT. Under this arrangement, client companies never see the model's internals or source code; they interact with it only through API calls.
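That API boundary can be sketched concretely: the client constructs a JSON request and reads a JSON response, while the model itself stays on the provider's servers. A minimal sketch below builds such a request; the endpoint URL and payload fields follow OpenAI's published chat-completions API, the API key is a placeholder, and the helper function name is our own illustration, not part of any library.

```python
import json

# OpenAI's public chat-completions endpoint; the model runs server-side,
# and the caller sees only this JSON interface, never weights or source code.
API_URL = "https://api.openai.com/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "gpt-4") -> dict:
    """Assemble (without sending) the HTTP request a closed-source API client would make."""
    return {
        "url": API_URL,
        "headers": {
            "Authorization": "Bearer YOUR_API_KEY",  # placeholder credential
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }

req = build_chat_request("Summarize open vs closed source models in one sentence.")
print(req["url"])
```

Everything proprietary, including training data, weights, and architecture details, sits behind that URL, which is precisely what makes the closed-source business model enforceable.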

According to survey data from GoDaddy, a platform for entrepreneurial services, ChatGPT has become the most widely used generative AI product among small businesses in the United States, with a utilization rate of 70%. This indicates that OpenAI's chosen path of commercialization through closed-source models has to some extent been successful.

How can open-source models seek opportunities for survival and development?

First, by attracting users through an open ecosystem. A Guosheng Securities research report points out that open-source large models achieve low training cost and high performance through larger token training datasets and techniques such as DeepSpeed and RLHF, breaking down the barriers to entry for models below the super-large scale.

"After acquiring users, open-source large models typically profit by providing value-added services, customized development, technical support, etc.," said Guo Tao. Companies can offer professional training services or customized application solutions based on open-source models.

For Meta and many other open-source advocates, the ambition goes beyond short-term commercialization: they aim to lead in rule-setting and ecosystem building. Some industry analysts note that moats are not so easily erased by open-sourcing, especially those built on high-quality annotated training datasets and specialized models.

Zhu Lingfeng, director of data compliance at Xingji Meizu Group, noted that some open-source AI today is led by top companies. "The more people use them, the stronger the network effects, but it's not truly open: subsequent use requires supporting tools and services, and top companies may also use regulatory exemptions to gain room for rent-seeking." In other words, giants dangling open source as bait may further entrench the monopoly position of large companies, which may not be good for industry competition.

Reportedly, in a conference call with analysts in April last year, Zuckerberg said that if the industry standardized on the basic tools Meta uses, Meta could benefit from improvements made by others. In May of the same year, an internal Google document titled "We Have No Moat, and Neither Does OpenAI" leaked and spread via the SemiAnalysis website. Among its views: Google needs the open-source community more than the community needs Google. The author argued that the ecosystems forming around open-source models will always be potential competitors to OpenAI, and that competing head-on with open-source AI is a losing battle.

With Meta's move, the game has changed. Will the battle between large models see a true victor?

"It's unlikely that there will be an absolute winner between open-source and closed-source large models, as they each suit different applications and scenarios," Guo Tao believes. Open-source large models are more suitable for projects that require rapid innovation and large-scale collaboration, while closed-source large models may be more suitable for commercial applications with high performance and security requirements.

SFC

Editor of this issue: Jiang Peipei, Intern: Li Jie
