
The Truth about "Long Text": Battle Royale among Domestic Large Models

Thu, Mar 28 2024 08:14 AM EST

How hot is Kimi? Single-handedly disrupting the A-share market and the large model arena.

Kimi concept stocks have ignited the capital market for several consecutive days, with multiple stocks hitting their daily limit up. In a bullish trend, everyone wants a piece of the action. According to incomplete statistics from LightCone Intelligence, at least ten listed companies, including Duokewenhua, Zhangyue Technology, and Wanxing Technology, have announced that they are exploring or integrating with the Kimi intelligent assistant.

Seeing Kimi's popularity soar, major players scrambled overnight to join the "long text" battle among large models.

Benchmarking against Kimi's 2 million-character context length, Baidu's Wenxin Yiyan (ERNIE Bot) will open up long-text processing of 2 to 5 million characters next month, more than a hundredfold increase over its previous maximum document-processing capacity of 28,000 characters. Alibaba's Tongyi Qianwen announced an upgrade opening long-text processing of up to 10 million characters. 360 Zhinao is internally testing 5 million characters, and after the official upgrade the capability will be integrated into the 360 AI Browser.

These four Chinese large model companies are pushing long-text capabilities to new heights. For reference, OpenAI's GPT-4 Turbo, the current front-runner, can process roughly 100,000 Chinese characters with its 128k context window, while Claude 3, which specializes in long text, handles around 160,000 Chinese characters with its 200K context window.

But even though they are all "long," some are the real Sun Wukong, while others are the Six-eared Macaque, a convincing impostor.

A source in the large model industry told LightCone Intelligence, "Indeed, some companies are using RAG (Retrieval Augmented Generation) to confuse the situation. Lossless long text and RAG each have their own advantages and points of convergence, but fundamentally they are different technologies... It's easy to confuse the situation with the term 'long text.'"

"Baidu, Alibaba, 360, most likely all use RAG solutions," the industry insider added.

Whether it's RAG or native long text, being "long" alone isn't everything. Just as in the previous round of competition, when model makers raced to scale up parameter counts, a larger number doesn't necessarily mean better performance. Beyond context length, memory capacity, reasoning ability, and computing power are all decisive factors.
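To make the insider's distinction concrete, here is a minimal toy sketch in Python. All function names are illustrative, not any vendor's actual API: "lossless long text" feeds the entire document into the context window, while RAG splits it into chunks and retrieves only those judged relevant to the question (here by crude word overlap; real systems use vector embeddings).

```python
def long_context_prompt(document: str, question: str) -> str:
    # "Lossless long text": the entire document rides along in the context window.
    return f"Document:\n{document}\n\nQuestion: {question}"

def rag_prompt(document: str, question: str, chunk_size: int = 200, top_k: int = 2) -> str:
    # RAG: split the document into chunks, score each chunk by how many
    # question words it shares, and prompt with only the top-k chunks.
    chunks = [document[i:i + chunk_size] for i in range(0, len(document), chunk_size)]
    q_words = set(question.lower().split())
    scored = sorted(chunks, key=lambda c: len(q_words & set(c.lower().split())), reverse=True)
    context = "\n---\n".join(scored[:top_k])
    return f"Retrieved excerpts:\n{context}\n\nQuestion: {question}"
```

The trade-off follows directly: the RAG prompt stays short and cheap no matter how long the document grows, but anything the retriever misses never reaches the model, whereas the lossless approach sees everything at the cost of a far larger context.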

As 2024 opens what many call the first year of real-world deployment for domestic large models, applications abound. Why has long text been the first to make waves? And given its characteristics, what practical problems in AI applications can it solve?

Is longer really better when it comes to long text?

Since the birth of ChatGPT, new AI applications have kept emerging overseas, drawing traffic while demonstrating commercial potential.

According to the recent "GenAI Consumer Applications Top 100" report from venture capital firm a16z, ChatGPT-style efficiency assistants dominate the top ten applications by monthly visits. ChatGPT's monthly web traffic approaches 2 billion visits, while second-ranked Gemini draws around 400 million.

The AI scene in China hasn't seen the same success and vibrancy. Before Kimi, the intelligent assistant from Moonshot AI (literally "Dark Side of the Moon"), took off, there were only two notable domestic applications: Baidu's Wenxin Yiyan app and ByteDance's Doubao.

According to data through September 2023, Baidu's Wenxin Yiyan app peaked at 7.1 million monthly active users. By December of the same year, ByteDance's Doubao had reached 2 million monthly active users, doubling to 4 million by January 2024.

Wenxin Yiyan, leveraging Baidu's advantages in large models and search traffic, once became China's largest AI application by traffic. Doubao, for its part, relied on traffic funneled from Douyin; though launched slightly later, it came from behind to overtake its rivals.

Against this backdrop, Kimi's skyrocketing success appears particularly remarkable. In a sense, Kimi became the first AI application in China to break into the mainstream solely based on product capabilities and organic user growth.

Yang Zhilin, founder of Moonshot AI, once told LightCone Intelligence that his team found many large model applications failing to land because of input-length limits, which is why Moonshot AI bet on long-text technology.

From a user's perspective, usability is the most critical metric for evaluating AI application products, and this relies on the long-text technology behind Kimi.

Breaking down the capabilities of long text, it mainly includes length, memory, understanding, and reasoning. Increasing the length of text further can enhance the usability and professionalism of AI applications.

For ordinary users, short chats with AI assistants may be entertaining but fall short for real problem-solving, especially in specialized fields such as law, medicine, and finance, where specific data and knowledge must be "fed" to the large model before it can output accurate answers. Enterprises need something closer to an "expert" assistant: large volumes of company and industry data must be imported in advance, with lossless input and output, so that the final analysis can be relied on. Claude is a typical example, winning a large base of enterprise users in vertical B2B industries by leveraging its long-text advantage and taking a different path from ChatGPT.

Multi-round dialogue and memory capabilities can already be applied directly in most scenarios. Take NPCs in games: character settings are supplied through long-text input, and every player dialogue is recorded, generating a personalized game profile so the character need not be re-established every time a player logs back in. In Agent task execution, stronger memory helps an Agent form clear action steps and avoid working at cross purposes with other Agents.
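The NPC scenario above can be sketched in a few lines of Python. This is purely illustrative (the class and names are invented for this sketch, not any game engine's API): the character's persona and the full dialogue history ride along in every prompt, which is exactly what a large context window makes affordable.

```python
class NPC:
    """A game character whose memory lives entirely in the prompt."""

    def __init__(self, persona: str):
        self.persona = persona          # character setting, given once as long text
        self.history: list[str] = []    # every player exchange is recorded

    def build_prompt(self, player_line: str) -> str:
        # With a large context window, the whole history fits in the prompt;
        # with a small one, old lines would have to be dropped or summarized,
        # and the character would "forget" between sessions.
        self.history.append(f"Player: {player_line}")
        return "\n".join([self.persona, *self.history, "NPC:"])

blacksmith = NPC("You are Torun, a gruff blacksmith who remembers every customer.")
blacksmith.build_prompt("Can you forge me a sword?")
prompt = blacksmith.build_prompt("Is my sword ready yet?")
```

Because the first request is still present in the second prompt, the model can answer "Is my sword ready yet?" in character; truncating the history to fit a short context is precisely what breaks this continuity.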

The understanding and reasoning abilities of long text show up in two kinds of applications: imaginative and logical. In AI novel-writing, for instance, long text's value lies in understanding a user's prompt and extending it imaginatively; in fields such as programming and medical Q&A, it demands the logical reasoning to sensibly continue code or infer symptoms from a user's description.

Xu Xinran, Vice President of Moonshot AI, once said that an order-of-magnitude increase in models' lossless context length will further open up AI application scenarios, including analysis and understanding of entire code repositories, intelligent Agents that independently complete multi-step complex tasks, lifelong assistants that never forget crucial information, and truly unified multimodal model architectures.

Long text, then, has always been a composite capability rather than a case of "the longer, the better." On the contrary, chasing length for its own sake can strain already scarce computing resources.

Large model companies seize the traffic, at a daily customer-acquisition cost of 200,000 CNY

With skyrocketing traffic, five capacity expansions after an outage, daily active users in the millions, and month-on-month growth of 107.6%, Moonshot AI has turned in an impressive report card. But this is just the beginning.

Several industry insiders told LightCone Intelligence earlier this year that after 2023's rapid technological iteration, large models have entered the second half: industrial implementation and commercialization. Last year, companies rolled out commercialization initiatives one after another; Zhipu AI, Baichuan, and Mianbi Intelligence have all commercialized to varying degrees. Moonshot AI moved slightly later and has not yet announced a commercialization plan, but it has clearly begun to accelerate, with Kimi assistant ads appearing on social platforms such as Bilibili and Douyin.

Although no one has completely ruled out 2C monetization, following the playbook of the AI 1.0 era that began around 2016, most companies still treat 2B as the primary breakthrough. It seems only natural to have the technology first, then explore industry applications and landing solutions.

Moonshot AI, by contrast, took a different approach. Since its debut in October last year, it has targeted the 2C application market. Yang Zhilin once said that long text is Moonshot AI's root technology, which can be split into different scenarios and fields for 2C applications.

Even before the Kimi effect exploded, many ordinary and corporate users were already giving feedback like, "Kimi is hands-down the best AI assistant in China." Having started from product effectiveness and user experience, Kimi's breakout now seems almost inevitable.

Under commercial pressure, large model companies will likely pursue 2B and 2C in parallel. Compared with its peers, Moonshot AI offers another path to commercialization: while other players start with 2B and use it to drive 2C, Moonshot AI takes the opposite route, starting with 2C and using 2C products to drive 2B deals.

After all, aside from foreign products like ChatGPT, China had seen hardly any cases of strong 2C product growth. With nearly half a year of accumulation, Kimi has single-handedly opened a path in 2C, perhaps prompting major players to see more possibility there and rush to prove their own long-text capabilities in the market.

However, returning to the essence of commercialization and making money, one still needs to consider how to convert temporary traffic into tangible conversion rates.

LightCone Intelligence observes that most large model companies still follow the internet-era traffic-acquisition playbook when promoting products: old wine in new bottles, with ads on Douyin, Bilibili, and Xiaohongshu, and offline placements in office buildings, airports, and subways.

The actual conversion rate from all this spending is still unknown, but the money to acquire customers is real. According to Sina Technology, an investor revealed that Kimi's user-acquisition cost currently runs 12 to 13 CNY per user. Based on estimated downloads, Kimi averaged 17,805 downloads per day over the past month. By that calculation, Kimi's daily user-acquisition cost is at least 200,000 CNY.
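A quick back-of-the-envelope check confirms the figure cited above (using only the numbers reported in the article: 12 to 13 CNY per user and about 17,805 average daily downloads):

```python
# Figures as reported via Sina Technology; treating each download as one acquired user.
cost_per_user_low, cost_per_user_high = 12, 13   # CNY per user
daily_downloads = 17_805                          # estimated daily average, past month

daily_cost_low = cost_per_user_low * daily_downloads    # 12 * 17,805 = 213,660 CNY
daily_cost_high = cost_per_user_high * daily_downloads  # 13 * 17,805 = 231,465 CNY
print(f"Estimated daily acquisition spend: {daily_cost_low:,} - {daily_cost_high:,} CNY")
```

Even at the low end, the spend clears 213,000 CNY per day, so "at least 200,000 CNY" is, if anything, conservative.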

Most AI assistants on the market are currently free to download and use, so resource consumption grows in step with the user base. The recent Kimi outage is a prime example: the sudden surge in users strained computing power and servers while burning through token costs.

For big model companies, the triple tug-of-war between scalability, conversion rates, and costs cannot be resolved in the short term. Even ChatGPT, which outperforms other applications in terms of traffic, faces the dilemma of balancing profits and losses.

According to Data.ai, in its first month online (as of June 19, 2023), ChatGPT's iOS app had a daily-active paying rate of about 4.36%. OpenAI's projections for cost-compressed versions of GPT-3.5 and GPT-4 suggested that a monthly paying-rate increase of 0.25 percentage points per month might not be sustainable, while an increase of 0.5 percentage points per month could turn the tide.

A steadily rising monthly paying rate may sound enticing, but in reality, growth may stall and the product may "age before it matures," long before any explosive take-off arrives.

For large model makers, especially startups, there aren't many chances to experiment. They cannot climb out of technical pitfalls only to fall into the pit of paid traffic. Chasing the long-text trend won't solve everything; a breakthrough in commercialization models is the key.