In the field of artificial intelligence, an open-source revolution is underway, led by APUS and xDAN Intelligence. On April 2, the hundred-billion-parameter MoE (Mixture of Experts) model jointly trained and developed by APUS and its strategic partner xDAN Intelligence was officially open-sourced on GitHub. By adapting to lower-end computing chips for industry-wide accessibility and by raising the quality and efficiency bar as China's first open-sourced hundred-billion-parameter MoE model, the APUS-xDAN Large Model 4.0 (MoE) is positioned to have a broad practical impact. Compared with other major domestic AI companies, xDAN Intelligence, APUS's partner in this open-source release, may seem relatively young, but its founding team is a formidable lineup: elites from Tsinghua, Berkeley, Tencent, Meta, and other top academic and engineering circles, including globally known developers from leading open-source AI communities and experienced architects from Tencent Cloud. In early March this year, the company closed a multimillion-dollar angel round jointly funded by APUS and veteran AI industry investor Zhou Hongyang.
This powerful collaboration leverages the strengths of both parties, enabling the APUS-xDAN Large Model 4.0 (MoE) to reach roughly 90% of GPT-4's overall performance while running on lower-end computing hardware such as the NVIDIA RTX 4090. By tackling the computing-power bottleneck through algorithm optimization, this breakthrough offers greater value to Chinese enterprises adopting large-model technology.
Adapting to lower-end computing chips opens a new era of technology inclusiveness. Recently, the U.S. Department of Commerce announced amendments to the semiconductor export control regulations it released on October 17, 2023. Beyond the earlier restrictions barring companies like Nvidia from exporting advanced AI chips to China, the new rules extend the restrictions to cover lower-end chips used in laptops, signaling that China will face even greater difficulty accessing American AI chips and chip-manufacturing tools. Facing limited domestic computing resources and international technology blockades, APUS Chairman and CEO Li Tao said, "For China to break free from the 'computing power trap' orchestrated by the US, on one hand we need to evolve our algorithms so that low-end computing power can support high-end models; on the other hand, we must continue to push for the evolution of the application ecosystem. Only by persisting in this dual evolution can we possibly break through."
The newly open-sourced APUS-xDAN Large Model 4.0 (MoE) is a hundred-billion-parameter model that, through optimizations in data and engineering, runs smoothly on 4090-class chips, helping the Chinese AI industry work around U.S. semiconductor export controls and put large models into widespread use. In practical testing, the APUS-xDAN Large Model 4.0 (MoE) posts impressive technical metrics: a GSM8K score of 79 for mathematical ability, an MMLU score of 73 for comprehension, and a BBH score of 66 for reasoning. Its overall performance surpasses GPT-3.5 and approaches GPT-4; notably, on mathematical ability it outperforms Elon Musk's open-source Grok.
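For context, scores on benchmarks such as GSM8K, MMLU, and BBH are commonly reproduced with EleutherAI's lm-evaluation-harness. The sketch below shows the general pattern using its v0.4+ Python API; the model identifier is a placeholder, not a confirmed repository name, and this is not necessarily the setup behind the numbers reported above.

```python
# Illustrative only: a typical lm-evaluation-harness run for these benchmarks.
# Assumes lm-eval >= 0.4; exact task names can vary between harness versions.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    model_args="pretrained=org/apus-xdan-4.0-moe,dtype=bfloat16",  # placeholder model id
    tasks=["gsm8k", "mmlu", "bbh"],
    batch_size=8,
)

# Print the aggregate metric for each task (e.g., accuracy / exact match).
for task, metrics in results["results"].items():
    print(task, metrics)
```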
For governments, the emergence of the APUS-xDAN Large Model 4.0 (MoE) signifies that despite limited domestic computational resources, China can still autonomously develop and operate top-tier large language models, significantly enhancing technological self-sufficiency and strategic security at the national level.
For businesses and individual developers, especially cash-constrained small entrepreneurs, there is no need to invest heavily in high-end GPUs such as the A100 and H100: with just a relatively economical 4090, they can harness this robust AI tool, greatly lowering the barriers to innovation and facilitating the widespread adoption of AI technology.
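As a concrete illustration of what "running on a 4090" can look like in practice, below is a minimal sketch of loading a large open-source model with 4-bit quantization via Hugging Face transformers and bitsandbytes. The model id is a placeholder, not a confirmed repository name; note also that a 136B-parameter model needs roughly 70 GB of weights even at 4 bits, so in practice several 24 GB cards or CPU offloading would be required, which device_map="auto" can arrange.

```python
# A minimal sketch, assuming a hypothetical Hugging Face repo id and a machine
# with one or more RTX 4090s (plus CPU RAM for offload if the model is large).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "org/apus-xdan-4.0-moe"  # placeholder id, not the real repository

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # quantize weights to 4 bits on load
    bnb_4bit_quant_type="nf4",             # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.float16,  # run matmuls in fp16
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=quant_config,
    device_map="auto",  # spread layers across available GPU(s) and CPU
)

prompt = "Explain mixture-of-experts models in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```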
The open-sourcing of the domestically developed MoE model, the APUS-xDAN Large Model 4.0, is reshaping the boundaries of efficiency. Notably, this release pioneers the Mixture of Experts (MoE) architecture in China, making it the country's first open-source hundred-billion-parameter model built on MoE. In contrast to models that claim an MoE architecture without open-source validation, the APUS-xDAN Large Model 4.0 (MoE) verifiably applies MoE at large-model scale.
It's worth mentioning that the APUS-xDAN Large Model 4.0 (MoE) adopts an MoE architecture similar to GPT-4's, combining multiple expert models and activating only two expert sub-modules per token. This yields roughly twice the running efficiency of a traditional dense model of the same size and cuts inference costs to about a quarter. Through further high-precision fine-tuning and quantization, the model's footprint is compressed to roughly one-fifth of its original size, making it the first domestically developed hundred-billion-parameter Chinese-English MoE model that can run on consumer-grade graphics cards. These properties give the APUS-xDAN Large Model 4.0 (MoE) exceptional learning efficiency and model capacity on complex tasks, injecting strong momentum into the expansion of AI frontiers and carving out a new path for the Chinese AI industry as a pioneer in domestic large-model innovation.

At 136 billion parameters, the APUS-xDAN Large Model 4.0 (MoE) surpasses the previously largest open-source model in China, Alibaba's Qwen-72B (72 billion parameters), achieving top-tier performance among mainstream large models. This breakthrough marks a significant step in China's development of ultra-large-scale pre-trained models and showcases the country's international standing in AI research and technological innovation.
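To make the "activate only two experts" idea concrete, here is a minimal, illustrative top-2 MoE layer in PyTorch. It is a sketch of the general technique, not the APUS-xDAN implementation; all names and sizes are invented for the example.

```python
# A minimal sketch of top-2 Mixture-of-Experts routing: many experts exist,
# but each token is processed by only the two it is routed to.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Expert(nn.Module):
    """A small feed-forward expert network."""
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )

    def forward(self, x):
        return self.net(x)

class Top2MoELayer(nn.Module):
    """Routes each token to its 2 highest-scoring experts; the rest stay idle,
    so per-token compute is ~2/num_experts of an equally sized dense layer."""
    def __init__(self, d_model: int, d_ff: int, num_experts: int = 8):
        super().__init__()
        self.experts = nn.ModuleList(Expert(d_model, d_ff) for _ in range(num_experts))
        self.gate = nn.Linear(d_model, num_experts)  # the router

    def forward(self, x):  # x: (batch, seq, d_model)
        scores = self.gate(x)                # (B, S, num_experts)
        top2, idx = scores.topk(2, dim=-1)   # pick 2 experts per token
        weights = F.softmax(top2, dim=-1)    # normalize their gate scores
        out = torch.zeros_like(x)
        for k in range(2):                   # the two active slots
            for e, expert in enumerate(self.experts):
                mask = idx[..., k] == e      # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[..., k][mask].unsqueeze(-1) * expert(x[mask])
        return out

layer = Top2MoELayer(d_model=64, d_ff=256, num_experts=8)
y = layer(torch.randn(2, 10, 64))
print(y.shape)  # torch.Size([2, 10, 64])
```

With 8 experts and 2 active per token, each token touches roughly a quarter of the layer's parameters per forward pass, which is the efficiency lever the article describes: total capacity scales with the number of experts while per-token compute stays nearly flat.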
Building an AI Industrial Application Ecosystem, Empowering Various Sectors
"The development and application of AI are inseparable from the support of large models. The emergence of open-source large models enables more enterprises and developers to utilize these models to create a myriad of AI+ applications," said Li Tao. The open-sourcing of the APUS-xDAN Large Model 4.0 (MoE) somewhat fills the gap in China's open-source models with hundreds of billions of parameters, bringing more possibilities to the development and application of AI technologies.
Since its establishment in 2014, APUS has launched over 200 application products covering tools, content, and services across various scenarios. After pivoting to artificial intelligence, APUS restructured its native product matrix, empowering its entire product line across three application layers, tools+AI, content+AI, and services+AI, while exploring and incubating super applications for end users.
In industry application scenarios, APUS's models span government, business, and consumer use: the "APUS Zhixin Large Model" handles intelligent rumor refutation in government-facing internet information services; the "APUS Qihuang Large Model" enhances intelligent diagnosis and treatment in hospitals; the "APUS Shaobo Large Model" powers intelligent e-commerce marketing; and on the consumer side, the "APUS Moran Large Model" enables intelligent painting, the "APUS Yunmeng Large Model" brings AI writing to the Chinese creative field, and the "APUS Moshi Large Model" provides text-to-video capabilities. APUS is accelerating the democratization of large models, making AI applications simpler. At the APUS AI Open Lab, developers can rapidly deploy their innovative applications through API interfaces. The Open Lab has gathered top AI talent from around the globe and, backed by APUS's computing power, collaboratively open-sources and advances the latest large AI models, promoting the healthy development of the AI industry.
It's reported that the newly open-sourced APUS-xDAN Large Model 4.0 (MoE) marks another significant milestone in APUS's deployment of general-purpose large AI models, following the joint open-sourcing of the APUS Large Model 3.0 (Lingli) with the National Engineering Laboratory of Shenzhen University.
In the future, facing the development and challenges of the artificial intelligence industry, APUS will continue to explore solutions through ongoing technological research and community building, continuously improving the stability and generalization ability of its open-source large models to keep them ahead in complex application scenarios and to further empower a wide range of industries and sectors.