Transforming the music industry with Suno? It's too early.

Thu, Mar 28 2024 08:03 AM EST

DingjiaoOne original. Author: Wang Lu. Editor: Wei Jia.

"Feeling down? Let Suno write you a song."

Recently, the AI music generation tool Suno released its latest version. Netizens have called it the ChatGPT of the music world for its low barrier to entry, fast generation, and highly polished songs. Some songwriters have even started to worry about losing their jobs.

Suno is a text-to-music tool developed by an overseas AI startup. The current version is V3: users enter a few lyrics or a song title, select a music style, and get two songs within a minute, each about two minutes long. Suno supports multiple languages, including Chinese, and offers free access to users.

Over the past two years, breakout applications such as Midjourney and Sora have emerged in text-to-image and text-to-video generation, respectively. AI music generation has many similar tools. In China, there are NetEase's NetEase Tianyin, Tencent's TME Studio, and Lingdong Yinkexue's BGM Cat (founded by a team from Tsinghua University). Internationally, notable examples include Google's Magenta Studio, as well as Soundful, which claims to generate unique songs without copying any existing ones. However, none of these attracted widespread attention until Suno emerged.

Right now, users outside the music industry are using Suno to create all kinds of humorous songs and scratch their musical itch. Industry insiders are also testing AI composition on their own lyrics, only to be amazed by how polished the results are. Many feel that Suno's impact is comparable to the buzz Sora generated earlier in the film and television industry.

As Suno's popularity has grown, the team behind it has also come into public view. The company, based in Cambridge, Massachusetts, currently has 12 employees. Four of them are former colleagues from the same company and are all machine learning experts, and two of those four are also music enthusiasts.

Compared to many other AI music generation tools on the market, what makes Suno stand out? Will it disrupt the music industry?

"Idiot-proof" operation, generating two songs in one minute

Amateur music enthusiast Keke stumbled upon posts about Suno on social media. The promise that a novice could finish a song in one minute made it impossible for him to resist trying it.

After entering a random, meaningless string of numbers and letters, Keke received a melody in less than half a minute. "Such a twisted set of lyrics fit the melody surprisingly smoothly," he said, clearly surprised.

The low threshold for one-click song creation has excited netizens. Some asked Suno to sing recipes with a straight face, while others poured their bitter work experiences into it, generating songs and sharing them online. Many netizens said the songs Suno created made them laugh until their stomachs hurt.

Image source: Suno official website

Why is Suno so popular? What exactly makes it stand out?

First, it benefits from its low threshold, which lets ordinary users find joy in songwriting. Many netizens say they were drawn to Suno's convenience. Creating a song with Suno takes only four simple steps: log in to the official website, click on AI creation, enter text, and click the create button. Users then get two songs with the same lyrics but different melodies.

From opening the software to generating a song, the entire process takes less than a minute. Users who are not good at writing lyrics, or who would rather not write them, have an alternative: they only need to choose a musical style, and Suno generates the lyrics automatically.

Suno also supports further refinement of songs. If users like a generated song but want to improve on it, they only need to click the "generate similar" option, similar to the "I want to look more like myself" button on Meitu's camera app. In a few seconds, they get another version.

Previously, Suno allowed users five free uses a day, which meant generating ten songs, but it did not allow commercial use. According to the official website, however, users now receive 20 points at initial registration, and no further points are granted on subsequent days. Each generated song consumes 5 points; since each generation produces two songs, the free allowance covers only two generations. Users who want to generate more or use the songs commercially need to pay.

The membership recharge button on the interface offers four packages, ranging from 68 CNY to 498 CNY. The higher the price, the more songs users can generate. For example, 68 CNY corresponds to 136 songs, and 498 CNY corresponds to 1328 songs. The website indicates that these songs can be used commercially.
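The pricing figures above can be checked with a few lines of arithmetic. The following is a minimal sketch, not anything Suno publishes: it uses only the two packages the article quotes (the other two are not given), and it assumes each generation yields two songs, as described earlier in the article. The names `packages` and `free_generations` are illustrative.

```python
# Figures quoted in the article: price in CNY -> number of songs.
# Only the cheapest and the most expensive of the four packages are given.
packages = {68: 136, 498: 1328}

# Unit price per song for each quoted package.
per_song = {price: price / songs for price, songs in packages.items()}

for price, unit in per_song.items():
    print(f"{price} CNY package: {unit:.3f} CNY per song")

# Free allowance at registration, per the article.
FREE_POINTS = 20
POINTS_PER_SONG = 5
SONGS_PER_GENERATION = 2  # assumption: every generation returns two songs

# Number of complete free generations the registration points cover.
free_generations = FREE_POINTS // (POINTS_PER_SONG * SONGS_PER_GENERATION)
print(f"Free generations at registration: {free_generations}")
```

On these numbers, the largest package is cheaper per song than the smallest, which matches the article's point that paying more buys more songs.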

Ordinary users are amazed by Suno's one-click generation speed and ease of use, while some insiders are more concerned with how professional its output is, and even feel threatened.

Music producer Fan Yubu used "very" several times over to describe his surprise. He told Dingjiao (定焦, "Fixed Focus") that if previous AI music tools were at the level of elementary school students, then Suno is at least in junior high school.

Before Suno, he tried many AI tools for writing songs, but the results were mediocre. In his words, after he spent half a day training the AI, what came out was mostly accompaniment, or the tools covered only one part of the job, such as assisting with songwriting, automatic arrangement, generating vocals, or making album covers. Overall, using AI took him about as much time and effort as not using it.

This time, however, Suno can generate lyrics, select a musical style, and sing, all automatically, "developing to the point of being able to write solos automatically." He also found that the songs it created even had harmony, "which many junior music producers cannot achieve, surpassing those junior arrangers who are priced at 800 CNY per song on Taobao."

In conversations with Dingjiao, many professionals said they believed the songs generated by Suno could reach commercial standards, that is, they could be sold directly without copyright risks.

Lei Ming, CEO of AIMeng Technology, put the lyrics of a new-generation singer into Suno and generated a song with one click. After listening to it, he felt that the AI-created song came even closer to industry standards. "The quality of the produced songs is very high, and some can even reach the level of a commercial record."

Music practitioner Xiao Jie also told Dingjiao that he and his musician friends were all amazed by Suno's power. "The music industry, which was already not making enough money, has encountered another setback."

Suno is good, but it doesn't need to be deified

Although there are many AI music generation tools, some insiders feel that earlier tools had obvious problems with song quality and were not easy to operate. Warnings about AI music therefore felt like the boy who cried wolf, until Suno V3 appeared and sharply changed their perception.

Zoro, who has ten years of programming experience at major companies and closely follows AI, told Dingjiao that he believes Suno's impact may be similar to Sora's: each is a major breakthrough, in music and in film creation respectively. Moreover, Suno is already freely available to the public.

It is understood that Suno is mainly supported by two self-developed large models: the transformer-based Bark voice model and the Chirp music model. The former is mainly used to generate vocals, while the latter provides melodies and sound effects. Together, these two models make the melodies Suno generates more intelligent and complex, and they are the company's core technology.

However, Suno is not perfect. It has problems with song duration, language understanding, and multi-track separation.

Firstly, the songs generated by Suno are all less than two minutes long. Both ordinary users and professional music producers clearly feel that many works often end abruptly, which directly affects the auditory experience.

Secondly, although Suno supports the creation of songs in multiple languages, it is still most proficient in English. There are obvious deviations in understanding Chinese lyrics and musical styles.

After using it dozens of times, Fan Yubu found that Suno handles electronic, R&B, and rock styles well but struggles with Chinese pop styles. Dingjiao tried creating with Suno by selecting the "folk" style, but the two songs it produced sounded more like anime music.

He also encountered cases where the chorus and verse were not distinguished. He entered his own Chinese lyrics with the chorus and verse clearly marked, but Suno still sang the last line of the chorus as if it belonged to the verse.

In addition, the works created by Suno do not support track adjustments, so professionals cannot adjust the generated songs, which is currently the biggest obstacle to Suno's commercialization.

Lei Ming said that although the music generated by Suno can easily reach the standard of advertising music and film scoring, if a client wants to modify some details after listening to it, Suno cannot do that and can only randomly generate another song. Fan Yubu feels that, in this respect, NetEase's AI music tool, NetEase Tianyin, is relatively user-friendly because it supports exporting tracks. Compared with Suno, however, the compositions NetEase Tianyin generates are still somewhat rudimentary, and the tool requires musician certification.

Image source: NetEase Tianyin official website

AI researcher Cyrus argues that AI-generated music currently lacks high-fidelity audio quality, which makes it difficult to rework. Even if Suno can export the audio, the music loses significant detail in both the high and low frequencies and often comes with noise and distortion. For now, therefore, Suno is used primarily for entertainment.

According to Cyrus, while the essence of generating images and videos lies in simulating the physical world perfectly, the most crucial aspect of music is how well it can evoke human emotions. However, Suno's melodies are relatively simplistic, the variety of instruments used is limited, and the duration of generated music is insufficient, all of which hinder emotional expression.

Addressing these issues will require data, time, and certain technological breakthroughs.

From a technical standpoint, AI researcher and Ph.D. candidate Bull Xiao Bo explains that Suno's underlying technology still relies on the foundational architectures of large models, such as diffusion and transformers, but that it has made breakthroughs in multimodal capabilities spanning text (lyrics), sound (vocals, music), and images (it can generate covers, albeit simple ones). However, the generated songs run under two minutes, likely because of insufficient computational power.

An industry insider also told Dingjiao that a significant technical challenge in AI-generated music is that large models struggle to learn complex music theory while also understanding and simulating the emotional expression of lyrics and melodies. Replicating specific styles or the expression of particular artists would complicate matters further, requiring vast amounts of data and computing power.

Cyrus believes that while Suno has made progress, it is not a revolutionary innovation at the technological level. "Suno's AI-generated music, in terms of controllability and complexity, is far behind DALL-E, and there is still a long way to go," Bull Xiao Bo added.

Who will Suno compete with?

Currently, Suno and AI-generated music tools are being discussed fervently, but their rate of adoption seems slower than expected.

"定焦" ("Fixed Focus") consulted several domestic music labels and well-known artist teams, most of whom stated that such tools would not affect their daily creative process. In fact, some haven't even heard of Suno.

This is primarily due to two reasons.

First, musicians and companies use AI primarily to increase efficiency. However, practitioners need to spend a considerable amount of time fine-tuning the AI to get the results they want, which is comparable to the time it takes to create the piece themselves. Moreover, when musicians have a burst of inspiration, they often work faster on their own.

The more significant reason is that AI-generated music currently does not meet the personalized preferences of users and customers. Even though Suno is quite powerful, its lack of creative flair is evident.

Ultimately, the music produced by AI remains an industrial product.

Bull Xiao Bo notes that the biggest challenge for such tools at present is acquiring high-quality data and the relevant copyrights. Resolving copyright issues depends mainly on the platform's efforts, while obtaining quality data is a test of its operational capabilities. If more excellent musicians consistently produce high-quality music data on the Suno platform, AI may be able to generate more high-quality, soulful songs.

However, Suno's rapid pace of updates and iterations has caused panic among many practitioners: it released Bark for text-to-audio conversion in April last year, added vocal music in July, launched a web version in December, and has now released Suno V3. Comparing the three versions of Suno, users have noticed significant improvements as bugs were fixed. For instance, given the same lyrics, V1 would turn a classical theme into pop, V2 leans towards classical music, incorporating sounds like the Guzheng at the beginning, and V3 seamlessly integrates vocals and melody. If Suno continues to iterate and train on large amounts of data, it is not impossible that it will replace certain jobs in the future.

Image source: Suno official website

In the short term, it will affect two groups.

Firstly, ordinary users. Suno co-founder Shulman has stated that their aim is to lower the barrier for users to create music, enabling every ordinary user to become a creator. Suno does not aim to replace artists. From the current user feedback, this tool has already allowed many music novices to experience the joy of writing a song without understanding music theory.

The other category is companies that mass-produce songs. In discussions, it is widely acknowledged that Suno cannot replace forms of music that require teamwork, such as bands, nor will it affect the livelihoods of well-known musicians. "However, it will replace over 95% of less skilled practitioners," said Lei Ming. In his view, companies that mass-produce songs without pursuing uniqueness, along with fields that treat music as accompaniment, will quickly be replaced by AI. These include advertising music, ambient music, film and television scoring, and even some homogeneous internet singers and viral short-video songs.

Cyrus also believes that Suno will hit hardest where demands on arrangement and mixing are low, dealing an especially big blow to the cheap songs that sell for only a few hundred yuan on the market. For professionals such as composers, lyricists, arrangers, and mixing engineers, however, it may bring benefits, as these tasks can now be completed by one person.

It is understood that overseas companies have already received requests for mass-produced AI compositions, and professionals have also applied AI to film and television music such as "Barbie" and "Aubonne." Currently, there is no large-scale commercial use of AI music in China, but practitioners have already begun to experiment with it.

It should be noted that, besides technology, the copyright issues raised by AI are also a barrier to the future development of Suno and AI composition.

"Resistance to AI composition in the music industry," "Some singers suspected of using AI composition causing dissatisfaction among netizens," "Music companies require authorization when using AI with their own singer's voice" ... Since AI composition emerged, these voices have not ceased.

From another perspective, Zoro analyzes that Suno will also bring some positive impacts to the industry. "There will be fewer copyright disputes for BGM in things like movies, TV shows, and short videos, as every user can use AI to generate music that fits specific scenes."

Suno has now announced that Suno V4 is in development and will introduce some exciting new features. When it arrives, AI and humans will embark on a new round of competition.

*Image source: Unsplash.