
Sino-US AI Music Showdown: Who Reigns Supreme?

cici Wed, May 01 2024 08:04 PM EST

When it comes to the hottest field in AI lately, AI music generation takes center stage.

Abroad, there are "Suno," "Udio," and "Stable Audio 2.0"; domestically, there is currently only "TianGong SkyMusic" by Kunlun Wanwei. All of these AI music generation applications are already available to try hands-on.

These are the most representative products in AI music generation, and each promises high-quality audio content within minutes. So how do Chinese and foreign AI music generation models differ, and which product capabilities do users value more?

Let's compare the overseas representative player "Suno" with the domestic representative player "TianGong SkyMusic" to find out. Let the showdown begin!

First, let's open both "Suno" and "TianGong SkyMusic" simultaneously.

Currently, Suno is only accessible through a web-based platform, while "TianGong SkyMusic" is integrated into a mobile app, which better suits the habits of Chinese users.

Starting with "Suno," its interface includes Home, Create, Library, Explore, and other secondary pages. Home acts as a showcase for Suno user creations, featuring popular works and collections in genres such as blues, rap, and classical. Clicking Create takes you to the composition page.

"TianGong SkyMusic" is integrated into the "TianGong APP," which also showcases user creations and offers inspiration. Compared with Suno, the interface of "TianGong SkyMusic" is simpler, with a more direct path to creating music.

In terms of operation, although both are AI music generation applications, the two products differ significantly in their generation logic.

"TianGong SkyMusic" allows users to choose a song from a library or upload their own song as a reference. "Suno," on the other hand, requires users to describe the desired music style accurately, which takes fairly precise knowledge of music theory.

Compared with "TianGong SkyMusic," "Suno" requires users to enter accurate prompts in natural language during the creative process. These prompts include descriptions of music styles such as Pop, Folk, and Acoustic, as well as emotional atmospheres such as Uplifting, Hopeful, and Joyful. This raises the bar for users from the outset. For instance, how would one accurately describe music in the style of "In the Name of the Father"?

"TianGong SkyMusic" is more user-friendly for ordinary people without professional music training. Most people have limited knowledge of music genres and may not grasp musical styles accurately, and vague descriptions rarely produce the composition they have in mind. A similar piece of music used as a reference expresses the creator's needs better than a verbal description. "Suno," in contrast, may be more suitable for geeks interested in music or for professional musicians.
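To make the text-description workflow concrete, here is a minimal Python sketch of how the style and mood descriptors mentioned above could be combined into a single description string. The helper `build_style_prompt` is hypothetical and written only for illustration; it is not part of any Suno interface, and the exact wording Suno expects is not specified in this article.

```python
def build_style_prompt(genres, moods):
    """Join genre and mood descriptors into one comma-separated style prompt.

    Hypothetical helper for illustration only; it is not part of any Suno API.
    """
    # Drop empty entries and stray whitespace so the prompt stays clean.
    parts = [p.strip() for p in list(genres) + list(moods) if p.strip()]
    return ", ".join(parts)


prompt = build_style_prompt(
    genres=["Pop", "Folk", "Acoustic"],
    moods=["Uplifting", "Hopeful", "Joyful"],
)
print(prompt)  # Pop, Folk, Acoustic, Uplifting, Hopeful, Joyful
```

The user still has to know which genre and mood words fit the song they have in mind, which is exactly the knowledge a reference track lets them skip.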

Next, let's evaluate the "AI lyric writing" feature of the two AI music apps.

Let's take the theme "The Story of Yue Fei" and see what "Suno" and "TianGong SkyMusic" come up with.

In general, the lyrics generated by "TianGong SkyMusic" better capture the classical Chinese style, while those by "Suno" seem a bit stiff, lacking the fluidity and charm expected in lyrics. However, making a foreign app "speak" Chinese is already quite challenging, so this round counts as a bonus question rather than a scoring factor.

Voice synthesis is the dimension that best demonstrates the quality of AI music generation. The AI voice synthesis of "TianGong SkyMusic" produces highly proficient Chinese vocals with clear pronunciation, outstanding audio quality, and lifelike singing, already reaching the state-of-the-art (SOTA) level in the industry!

In this regard, there is a noticeable gap in Chinese singing proficiency between the two. Suno's Chinese vocals are notably weaker and sound like a foreigner speaking Chinese. The following piece generated by Suno indeed lacks both clarity and accuracy in Chinese pronunciation.

[Video: song created by a user with Suno]

It's worth mentioning that for Chinese users, "TianGong SkyMusic" also offers a super exciting feature: creating songs in dialects! China is a country with a rich dialect culture, and each dialect has its own charm and expression. Given lyrics in a dialect and a reference song with dialect features, "TianGong SkyMusic" can generate music with rich local character.

Take these two user-created examples, a Sichuan rap and a Cantonese love song. They not only showcase the creative potential of dialect songs but also demonstrate the clear advantage of "TianGong SkyMusic" in Chinese-language music. The energy and rhythm of Sichuan rap and the elegance and tenderness of Cantonese love songs are faithfully reproduced by "TianGong SkyMusic"!

[Video: All for Love (Sichuan Version)] [Video: Chopped Pepper Fish Head]

After hands-on experience, it's evident that in the showdown between domestic and foreign AI music generation models, there's a clear winner.

In fact, according to official data, the overall performance of "TianGong SkyMusic" surpasses that of "Suno V3" across several indicators, including vocal and BGM audio quality, naturalness of vocals, and intelligibility of pronunciation. This makes it the latest SOTA model in music AI and marks the first time that China's self-developed large model technology leads the world in an AIGC field.

The King of Domestic Music AIGC with 400 Billion Parameters

The all-around capabilities of "SkyMusic" can easily outperform foreign large music models. Where does the technical prowess behind it come from?

It all goes back to the AI strategy Kunlun Wanwei laid out years ago. In 2016, the company acquired StarMaker, taking over research and development as well as operation of the entire product, and thus began its research and investment in AI music.

In February of this year, Kunlun Wanwei released SkyMind 2.0, which far surpassed industry standards. SkyMind 3.0, released on April 17, has advanced at lightning speed: its technical knowledge capabilities have increased by over 20%, and its mathematical, reasoning, coding, cultural, and creative capabilities by over 30%, making it the equivalent of a "jack-of-all-trades" PhD!

Before the launch of SkyMind 3.0, the most formidable large model on the market, Grok-1, had 314 billion parameters, while SkyMind 3.0 reaches an astonishing 400 billion! It is as if the "brain" of this "PhD" stores a vast amount of information and can process it in seconds.

Upon its release, SkyMind 3.0 became one of the largest open-source MoE models in the world. In authoritative multimodal evaluations such as MMBench, "SkyMind 3.0" has already surpassed GPT-4V and leads globally, bringing people a new and disruptive artificial intelligence experience.

"SkyMusic" is built on the open-source large model foundation of "SkyMind 3.0." It has not only achieved global leadership but has also become the "world's first publicly available technical architecture" in the field of AI music generation.

"SkyMusic" adopts a Sora-style model architecture for music audio. To achieve more realistic and professional music, it bypasses the symbolic music generation route that is currently mainstream on the market and takes a large model route that demands more advanced technology and greater resource investment.

After extensive research, experimentation, and algorithmic investment, "SkyMusic" overcame these difficulties and arrived at an Encoder-DiT-Decoder design as its best solution.
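The article gives no implementation details for the Encoder-DiT-Decoder design, so the following is only a toy Python sketch of the general data flow such a three-stage pipeline implies: compress audio into latents, refine noisy latents over several steps, then expand latents back into samples. Every function name here is hypothetical, and the refinement step is a simple stand-in rather than a real diffusion transformer; nothing below reflects Kunlun Wanwei's actual code.

```python
import random


def encode(waveform, frame=4):
    """Toy encoder: average each block of `frame` samples into one latent value."""
    return [sum(waveform[i:i + frame]) / frame
            for i in range(0, len(waveform) - frame + 1, frame)]


def denoise_step(latents, target, strength=0.5):
    """Stand-in for one denoising step: pull noisy latents toward a target.

    A real DiT would instead predict noise with a transformer conditioned on
    inputs such as lyrics or reference audio. This is an assumption-laden toy.
    """
    return [z + strength * (t - z) for z, t in zip(latents, target)]


def decode(latents, frame=4):
    """Toy decoder: expand each latent value back into `frame` samples."""
    return [z for z in latents for _ in range(frame)]


def generate(reference, steps=8, frame=4, seed=0):
    """Run the three stages: encode the reference, refine noise, decode."""
    rng = random.Random(seed)
    target = encode(reference, frame)                 # conditioning latents
    latents = [rng.gauss(0.0, 1.0) for _ in target]   # start from pure noise
    for _ in range(steps):
        latents = denoise_step(latents, target)
    return decode(latents, frame)


reference = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
audio = generate(reference)
print(len(audio))  # 8 samples, the same length as the reference
```

The point of the sketch is only the shape of the pipeline: generation happens in a compressed latent space, and the decoder turns the refined latents back into audio.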

Breakthroughs come only from years of deep cultivation in the AI field. The release of "SkyMusic" has energized the music industry: it relieves professionals of worry about high music production costs, lets ordinary music lovers experience the joy of creating music, and shows the world the strength of China's self-developed large models in vertical fields.

There is no longer any need to worry about questions like "Will AI replace musicians?" In the future, "SkyMusic" will become one of the important creative tools in China's music industry, helping musicians create more excellent works and improve efficiency. Kunlun Wanwei will also continue to optimize and iterate, promoting the vigorous development of China's AI and music industries.