
China's First Music SOTA Model "Tiangong Music Mega Model" Enters Public Beta Today

cici Wed, Apr 24 2024 08:04 PM EST

On April 17, 2024, on the first anniversary of the "Tiangong" mega model, Kunlun Wanwei announced a major event: the official launch of the "Tiangong 3.0" base mega model and the "Tiangong SkyMusic" music mega model for public beta testing! Exactly one year ago today, the first version of the Tiangong mega model was officially released online. Over the past year, we have continuously iterated on the model and its applications, steadily improving both. This progress is our way of expressing gratitude to our vast user base for their unwavering support.

"Tiangong 3.0" boasts an impressive 400 billion parameters, surpassing Grok-1's 314 billion parameters, making it the world's largest open-source MoE mega model. "Tiangong 3.0" exhibits groundbreaking performance improvements in areas such as semantic understanding, logical reasoning, generality, generalization, uncertainty knowledge, and learning capabilities, with enhancements exceeding 30% in mathematical/reasoning/code/creative writing abilities. dd9d3295-34da-4119-90fa-331e31fcbab5.png The powerful model technology endows "Tiangong 3.0" with outstanding performance. In authoritative multimodal benchmark evaluations like MMBench, "Tiangong 3.0" surpasses GPT-4V, leading globally. cb2ef0df-2b37-4e06-8717-73852e5abd0f.png (Tian Gong 3.0's multimodal capabilities surpass GPT-4V, leading globally)

Today, the music large model "Tiangong SkyMusic" also enters public beta for the general public. "Tiangong SkyMusic" is China's first SOTA (State-of-the-Art) music model and marks the first time that China's self-developed large model technology has taken the lead in the global AIGC (Artificial Intelligence Generated Content) field. (Tiangong SkyMusic: China's First Music AIGC SOTA Model)

Previously, large models have achieved breakthroughs in multiple technical fields such as text and images, bringing about comprehensive industry transformation. However, in the field of AI music generation, the world has been eagerly awaiting a product to kickstart the "Music ChatGPT Moment".

This is because, for a long time, a significant amount of research in the AI music industry has focused on symbolic music generation techniques, and most systems can only generate background music (BGM) without vocals. The quality, effects, and aesthetics of the music fall far short of usable standards, and the industry has been slow to take off. ("Tiangong SkyMusic" Self-developed AI Music Large Model Technology Architecture)

Different from the mainstream path in the industry, "Tiangong SkyMusic" adopts a self-developed large model music audio generation technology route. This route directly achieves integrated end-to-end music generation of instruments, vocals, melodies, volume, and notes through large model technology, with extremely high technical difficulty. Globally, only a few top players, including Kunlun Wanwei, are involved.

In the cross-evaluation with the overseas top AI music large model Suno V3, "Tiangong SkyMusic" significantly outperforms its competitors in areas such as vocal & BGM sound quality, vocal naturalness, and pronunciation clarity. With a comprehensive score of 6.65, it surpasses Suno V3, becoming the global AI music SOTA model.

Furthermore, "Tiangong SkyMusic" also possesses the original capabilities of reference music generation and dialect song generation.

Reference music generation: Users can upload their own reference music or choose existing reference music from the "Tiangong SkyMusic" database to generate songs with a similar musical and singing style. This further lowers the threshold for using large music models, so that users unfamiliar with music theory can get started easily.

Dialect song generation: The music generated by "Tiangong SkyMusic" not only excels in vocal naturalness and pronunciation clarity but also supports many dialects such as Cantonese, Chengdu dialect, and Beijing dialect, allowing users to freely express music and spread dialectal culture.

"Tiangong SkyMusic" is China's first publicly available AI music generation model, and it marks the first time that China's self-developed large model technology leads the world in the field of AIGC.

Currently, in the field of text large models, OpenAI has attracted global attention; however, in subfields such as AI search and AI music generation, Chinese players are bravely advancing, continuously achieving top SOTA performances in subfields through self-developed technology, jointly building China's large model industry, and creating an independent and controllable large model industry ecosystem.

Tiangong 3.0: 400 billion parameters, the world's largest open-source MoE large model

Building on the leading edge of the previous generation "Tiangong 2.0" MoE large model, "Tiangong 3.0" achieves comprehensive performance upgrades. It adopts a Mixture-of-Experts (MoE) architecture at the 400-billion-parameter level, and is currently the world's largest and most powerful open-source MoE model.
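The announcement does not describe how "Tiangong 3.0" routes inputs between its experts. As a rough illustration of what a Mixture-of-Experts layer generally does, the sketch below uses generic top-k gating: only the few experts with the highest gate scores run for a given input, and their outputs are mixed by softmax weight. Every name here (`moe_forward`, the toy experts, the hand-written gate scores) is hypothetical; in a real model the gate scores come from a learned router network.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(
    x: float,
    gate_scores: Sequence[float],
    experts: Sequence[Callable[[float], float]],
    top_k: int = 2,
) -> Tuple[float, List[int]]:
    """Run only the top_k highest-scoring experts and mix their outputs.

    gate_scores is passed in by hand here (an assumption for illustration);
    a real router would compute it from x.
    """
    # Pick the indices of the top_k largest gate scores.
    ranked = sorted(range(len(experts)), key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:top_k]
    # Renormalize the gate scores over the chosen experts only.
    weights = softmax([gate_scores[i] for i in chosen])
    output = sum(w * experts[i](x) for w, i in zip(weights, chosen))
    return output, chosen


# Four toy "experts", each just scaling its input by a different factor.
toy_experts = [lambda v, k=k: v * k for k in (1.0, 2.0, 3.0, 4.0)]
result, used = moe_forward(2.0, [0.0, 5.0, 5.0, -1.0], toy_experts, top_k=2)
print(result, used)
```

Because only `top_k` experts run per input, the total parameter count can grow with the number of experts while the compute per input stays roughly fixed, which is the usual motivation for MoE designs.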

"Tiangong 3.0" features comprehensive upgrades in logical reasoning ability, semantic understanding ability, ability to handle complex requirements, and content creation ability. It also introduces many AI capabilities such as multi-round search and comprehensive tool invocation, chart drawing, research mode, enhanced mode, and image manipulation and expansion, bringing users a brand new AI experience.

Multi-round search and comprehensive tool invocation: "Tiangong 3.0" has been specially trained in its ability to independently plan, invoke, and combine external tools and integrate information, enabling it to independently generate and call code to complete various complex user demands including industry research, product cross-evaluation, information analysis, image generation, chart drawing, and more.

Moreover, "Tiangong 3.0" can use its semantic understanding ability to break user tasks down into steps, determine whether an internet connection or tool invocation is required, and then carry out single or multiple rounds of internet search and tool invocation. This covers complex user demands such as multi-round searches, analysis of trending information, and image generation.

| Film title        | Cumulative box office (100 million RMB) |
|-------------------|-----------------------------------------|
| 夺冠              | 13.72                                   |
| 长津湖            | 10.02                                   |
| 刺杀小说家        | 9.38                                    |
| 看见了什么        | 7.95                                    |
| 我和我的祖国      | 5.68                                    |

"Tiangong 3.0" is a comprehensive tool with distinctive capabilities such as multi-round search, integrated tool invocation, and chart drawing. It connects underlying AI capabilities such as AI search, AI dialogue, AI code generation, AI image recognition, and AI image generation, providing users with a more convenient and efficient AI experience. It serves as a true AI productivity tool.
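The plan-then-invoke behavior described above can be pictured with a small sketch. This is not Kunlun Wanwei's implementation, which the announcement does not detail. It assumes a toy keyword-based planner standing in for the model's semantic understanding, and two stub tools (`stub_search`, `stub_chart`) that are purely hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    description: str
    tool: str  # "search", "chart", or "none" (hypothetical tool names)


def plan(request: str) -> List[Step]:
    """Toy planner: keyword rules stand in for the model's own judgment."""
    text = request.lower()
    steps: List[Step] = []
    if "latest" in text or "box office" in text:
        steps.append(Step("look up current information online", "search"))
    if "chart" in text or "table" in text:
        steps.append(Step("present the findings visually", "chart"))
    if not steps:
        steps.append(Step("answer directly from model knowledge", "none"))
    return steps


def run(request: str, tools: Dict[str, Callable[[str], str]]) -> List[str]:
    """Execute each planned step, feeding one tool's output into the next."""
    results: List[str] = []
    context = request
    for step in plan(request):
        if step.tool == "none":
            results.append(f"direct answer to: {request}")
        else:
            context = tools[step.tool](context)
            results.append(context)
    return results


# Stub tools (assumptions for illustration; real tools would call external services).
def stub_search(query: str) -> str:
    return f"search results for [{query}]"


def stub_chart(data: str) -> str:
    return f"chart of [{data}]"


TOOLS = {"search": stub_search, "chart": stub_chart}

for line in run("Chart the latest box office figures", TOOLS):
    print(line)
```

The design point is the separation between deciding which steps need the network or a tool (`plan`) and chaining their outputs (`run`), so a request that needs neither is answered directly.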

In addition, "Tiangong 3.0" also includes several AI capabilities such as research mode, enhanced mode, and image editing and expansion.

Research mode: In research mode, "Tiangong 3.0" can extend related questions around a user's simple command, automatically generating research outlines, diagrams, practice summaries, and mind maps to help users quickly and clearly grasp core content and fulfill complex research needs.

Enhanced mode: In enhanced mode, "Tiangong 3.0" is capable of dissecting, refining, and following up on complex user queries, comprehending and completing information, thus exhibiting enhanced performance in natural language understanding. It excels in handling uncertain knowledge, allowing for more precise and efficient fulfillment of user needs. (Spring Festival Box Office Films of 2024: Understanding and Further Inquiry in "Tiangong 3.0")

Image editing and expansion:

"Tiangong 3.0" achieves comprehensive breakthroughs in multimodal capabilities, surpassing GPT-4V and ranking first globally. Supported by a robust technological foundation, "Tiangong 3.0" introduces new features to its AI drawing capabilities, including image size expansion, directional adjustments, padding-to-image generation, padding-to-image evolution, and padding-to-image enlargement. 92195546-7206-4ac8-a24e-5c7a460c6ae4.png Since the official launch of the "Tiangong" large model in April 17th last year, Kunlun Wanwei has built an AI business matrix around the "Tiangong" series of large models, including AI image editing, AI search, AI music, AI video, AI social networking, and AI gaming. It is one of the domestic artificial intelligence technology companies with the strongest model technology and engineering capabilities, and the most comprehensive layout.

In the past year, in addition to continuously upgrading and iterating the "Tiangong" series base models, Kunlun Wanwei has also launched the first domestic AI search engine, "Tiangong AI Search"; open-sourced the 13-billion-parameter large language model "Tiangong Skywork-13B"; and introduced a series of cutting-edge large model products such as the leading domestic AI Agent development platform "Tiangong SkyAgents".

Currently, the "Tiangong" series of large models integrates multiple capabilities including AI music, AI search, AI writing, AI long-text reading, AI drawing, AI voice synthesis, AI comic creation, AI image recognition, AI code writing, and AI table generation. In the future, AI video capabilities will be added, with the aim of matching "super applications" and becoming the "super model" of the artificial intelligence era.

Driven by the company's mission of "achieving artificial general intelligence, allowing everyone to better shape and express themselves", Kunlun Wanwei will remain committed to the innovation and development of AI technology and products, continuously improving the user experience of AI products, and working hand in hand with users, researchers, and developers to create the future of domestic large models.