
DouBao Large Model Price List Released, Supporting Highest Concurrency Standards in China

Wed, May 22 2024 07:51 AM EST

Source: Cover News

Volcano Engine recently updated the pricing details for the DouBao large model on its official website, listing prices for the different versions and specifications of the DouBao general model. Alongside inference pricing well below prevailing industry rates, the DouBao general model's TPM (tokens per minute) and RPM (requests per minute) limits both reach the highest level in China. Taking the flagship DouBao pro-32k model as an example, the price is 99% lower than the industry average, while the TPM limit is 2.7 to 8 times higher than that of models with the same specifications.

[Image: "Post-paid" pricing for the DouBao series models]

According to the official information, under the "post-paid" pricing model, which bills by actual token usage, the DouBao general model pro and lite versions with context windows of 32k and below have a rate limit of 10K RPM and 800K TPM (whichever limit is reached first). The TPM limits of other mainstream models in China mostly range from 100K to 300K, while RPM is typically between 60 and 120. RPM limits for lightweight models are somewhat higher, but still only in the range of 300 to 500.

At the 10K RPM limit, enterprise customers can make an average of about 167 calls per second (10,000 / 60) to the DouBao general model, which covers the large model needs of most production systems across business scenarios. This level already matches the RPM limit OpenAI offers its high-tier customers (Tier 4 and Tier 5).
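The per-second figure and the "whichever limit is reached first" rule can be checked with a few lines of Python. This is only a sketch: the function names and the 1,000-token average request size are illustrative assumptions, not values from Volcano Engine's documentation.

```python
def requests_per_second(rpm_limit: int) -> float:
    """Convert a requests-per-minute limit into an average per-second rate."""
    return rpm_limit / 60


def binding_limit(rpm_limit: int, tpm_limit: int, avg_tokens_per_request: float):
    """Return (max sustainable requests per minute, name of the limit hit first).

    avg_tokens_per_request is a hypothetical workload parameter.
    """
    # Requests per minute that the token budget alone would allow.
    requests_allowed_by_tokens = tpm_limit / avg_tokens_per_request
    if requests_allowed_by_tokens < rpm_limit:
        return requests_allowed_by_tokens, "TPM"
    return float(rpm_limit), "RPM"


# Post-paid limits quoted for the 32k-and-below models: 10K RPM, 800K TPM.
print(round(requests_per_second(10_000)))     # 167
print(binding_limit(10_000, 800_000, 1_000))  # (800.0, 'TPM')
```

Under these assumptions, a workload averaging 1,000 tokens per request would hit the TPM limit long before the RPM limit, so the 167 calls per second figure is an upper bound that applies to short requests.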

For long-text models, which are more computationally demanding, the 128k versions of the DouBao general model pro and lite have a rate limit of 1K RPM and 400K TPM, significantly higher than other domestic 128k long-text models.

In addition, the DouBao large model has published the latest pricing for "pre-paid" model units. With "pre-paid," enterprises purchase a fixed TPM quota for calling a particular model and pay nothing further based on token consumption, which lets them plan computing resources in advance for foreseeable traffic fluctuations.

[Image: "Pre-paid" model unit price list for the DouBao series models]

Taking the DouBao general model pro-32k as an example:

Under the "pre-paid" model unit price list, 10K TPM costs 2,000 yuan per month. Over a 30-day month, 10K × 60 × 24 × 30 = 432,000K tokens. That is, 432,000K tokens cost 2,000 yuan, for an average price of about 0.0046 yuan per thousand tokens.

Under the "post-paid" model: in model inference, input tokens usually account for the vast majority of usage, and the industry generally assumes input is five times output. With DouBao general model pro-32k inference input priced at 0.0008 yuan per thousand tokens and inference output at 0.002 yuan per thousand tokens, the blended price is (5 × 0.0008 + 0.002) / 6 = 0.001 yuan per thousand tokens.
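Both per-thousand-token figures for pro-32k can be reproduced as follows. This is a rough sketch using only the numbers quoted in this article; the function names are illustrative, and the pre-paid figure assumes the purchased quota is fully used every minute of a 30-day month.

```python
MINUTES_PER_MONTH = 60 * 24 * 30  # 43,200 minutes in a 30-day month


def prepaid_price_per_1k(monthly_price_yuan: float, tpm_quota: int) -> float:
    """Effective price per thousand tokens if the TPM quota is fully used."""
    thousand_tokens_per_month = tpm_quota * MINUTES_PER_MONTH / 1000
    return monthly_price_yuan / thousand_tokens_per_month


def blended_postpaid_price_per_1k(input_price: float,
                                  output_price: float,
                                  input_to_output_ratio: float) -> float:
    """Token-weighted average of input and output prices per thousand tokens."""
    weighted_sum = input_price * input_to_output_ratio + output_price
    return weighted_sum / (input_to_output_ratio + 1)


prepaid = prepaid_price_per_1k(2000, 10_000)
postpaid = blended_postpaid_price_per_1k(0.0008, 0.002, 5)
print(f"pre-paid:  {prepaid:.4f} yuan per thousand tokens")   # about 0.0046
print(f"post-paid: {postpaid:.4f} yuan per thousand tokens")  # about 0.0010
```

On these assumptions the post-paid blended price works out lower per token than a fully used pre-paid quota, while pre-paid buys guaranteed capacity rather than a lower unit price.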

Volcano Engine stated that the DouBao large model offers customers flexible and economical payment options. "Post-paid" can meet the business needs of the vast majority of enterprise customers, helping enterprises use large models at extremely low cost and accelerating the adoption of large model applications.

"The ultra-low pricing of the DouBao model comes from our confidence in reducing costs through technology, not from subsidies or a price war to grab market share," said Tan Dai, President of Volcano Engine. "The saying 'you get what you pay for' does not hold in the enterprise market. Only technology-driven, ultimate cost-effectiveness can truly create value." Volcano Engine is working closely with ByteDance's DouBao large model team to keep optimizing model performance and inference costs, providing enterprises and developers with better models, lower costs, and a platform that is easier to deploy on.

Attached: Model service price document on the Volcano Engine official website https://www.volcengine.com/docs/82379/1099320

Cover News Reporter Huang Jingru