On Monday, March 11th, Musk tweeted that Grok would be open-sourced within a week. After a week of eager anticipation from the developer community, Grok's code was finally released on Sunday.
Grok-1 is an open-source, transformer-based autoregressive model with 314 billion parameters, making it one of the largest open-source models available today. Like other prominent open-source large language models, Grok-1 is free to use and commercialize.
Prior to its open-source release, Grok was primarily known for its ability to directly query data from X in real-time and its "humor."
However, the open-source version of Grok comes with caveats. For developers, Grok's ability to query X's data is unavailable, and the model's training data extends only to around its November 2023 launch, so more recent information is not included.
In addition to these limitations, Grok's performance falls short of other models. At launch, it scored notably lower than PaLM 2, Claude 2, and GPT-4. Another significant limitation is its context length of only 8,192 tokens.
Grok's significant attention (39k stars on GitHub within three days of launch) appears to be largely driven by Musk's personal influence. While Musk has positioned Grok's open-source release as part of his mission to "make AI accessible to the entire world," some believe it could serve a purpose in Musk's legal battle against OpenAI—effectively making the open-source launch a form of "AI grandstanding."
Is Grok Open-Source for Show?
Open source is typically a collaborative effort between companies and developers; Grok, however, presents a different scenario.
Before Grok-1, many open-source LLMs had parameter counts of around 7 billion, with LLaMA-2 among the largest at 70 billion parameters.
Musk's release of Grok-1, with its massive 314 billion parameters, is unprecedented. The model requires an estimated 628 GB of GPU memory to run, making it impractical for most individual developers to use on their local machines. Even for cloud users, running Grok-1 would require at least eight A100 or H100 GPUs with 80 GB of memory each.
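The arithmetic behind those figures is simple: weight memory scales with parameter count times bytes per parameter. Here is a minimal back-of-the-envelope sketch in Python, assuming weights dominate memory use and ignoring activation and KV-cache overhead:

```python
# Back-of-the-envelope GPU memory estimate for serving Grok-1.
# Assumes the weights dominate; activations and KV cache would add more.

PARAMS = 314e9  # Grok-1 parameter count

def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights, in GB."""
    return num_params * bytes_per_param / 1e9

for label, bytes_pp in [("fp16/bf16", 2), ("int8", 1), ("int4", 0.5)]:
    gb = weight_memory_gb(PARAMS, bytes_pp)
    gpus = -(-gb // 80)  # ceiling division: how many 80 GB A100/H100s
    print(f"{label:>9}: {gb:6.0f} GB  (~{int(gpus)} x 80 GB GPUs)")
```

Under these assumptions, 16-bit weights alone come to roughly 628 GB, which is why eight 80 GB cards is the commonly cited floor.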
As one user commented on Grok's GitHub discussion board: "GCP instance with 4 A100 80GB; it cost around $20 per hour; this is not for me LOL."
Upon Grok's open-sourcing, I joined a discussion group dedicated to it. However, within a day, the focus had shifted to the 40-billion-parameter Qwen-1.5.
Although Grok-1's released weights support 8-bit quantization, developers speculate that its "playability" would be significantly enhanced by a further quantized version that shrinks its footprint to around 160 GB. Developers with limited computational resources may consider waiting for an official or community-developed quantized version.
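For reference, here is one hedged way such a quantized version might be loaded, assuming a transformers-compatible community conversion of the weights exists. The model ID below is a placeholder, and 4-bit loading via bitsandbytes is an illustrative assumption rather than an officially supported path:

```python
# Sketch: loading a (hypothetical) transformers-compatible Grok-1 conversion
# with 4-bit quantization via bitsandbytes. Requires: transformers, bitsandbytes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "some-org/grok-1-hf"  # placeholder: substitute a real community conversion

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # ~0.5 bytes/param for the weights
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for stability
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=quant_config,
    device_map="auto",       # shard layers across the available GPUs
    trust_remote_code=True,  # community conversions often ship custom model code
)

inputs = tokenizer("Grok-1 is", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))
```

At roughly 160 GB of weights, a setup like this would still need two to three 80 GB cards, but that is a far lower bar than the eight required at 16-bit precision.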
At least in the short term, open-sourcing Grok is not particularly user-friendly for the average developer. So, how does open-sourcing benefit Grok itself? From a traditional open-source perspective, it's difficult to say.
Some argue that open-source models can harness the collective power of developers to optimize them. However, modern AI open-sourcing differs significantly from traditional open-source software. Its impact on AI models is not as evident as it has been on software.
In traditional software development, open-sourcing a system, tool, or software allows developers to directly fix bugs and optimize code. In contrast, most modern AI models are complex black boxes. Identifying issues and refining underlying algorithms is not straightforward.
"Training AI models is a highly 'centralized' task," asserts Shengshu Tech CEO Tang Jiayu. "It's challenging to gather developer contributions through 'distributed' open-sourcing for AI models. Closed-sourcing enables centralized resource allocation, including intellectual and computational power, for continuous iteration."
Some developers believe the primary advantage for creators of open-source AI models is gaining recognition. Given Grok's current capabilities, establishing its reputation in the AI model landscape requires attracting developers, companies, and institutions to use and develop it.
Has Closed-Sourcing Limited Grok's Potential?
From an industry perspective, Grok has not garnered much attention. Its benchmark scores fall short of other recently released AI models, and few comparative evaluations have used Grok as a baseline.
Commercially, Grok's performance on platform X has also been lackluster.
Upon its launch on X, Grok adopted a subscription model similar to ChatGPT Plus. However, while ChatGPT's GPT-3.5 tier is freely accessible, Grok is available only to X Premium+ subscribers, who pay $16 per month or $168 per year.
This paywall has prevented Grok from capitalizing on X's massive user base.
According to SimilarWeb, in February 2024, x.com received 104 million total visits, with an average visit duration of just 24 seconds. In comparison, other popular closed-source AIs had significantly higher traffic: chat.openai.com had 1.55 billion visits with an average duration of 7 minutes 33 seconds; gemini.google.com had 316.1 million visits with an average duration of 6 minutes 22 seconds; and the relatively niche claude.ai had 20.86 million visits with an average duration of 5 minutes 48 seconds.
While numerous factors influence website traffic, and x.com's audience and nature differ from those of the other platforms, the vast disparity in visit duration suggests that X users have likely not engaged extensively with the paid Grok.
Musk's initial strategic positioning of Grok may have been driven by the desire to boost X's Premium membership sales and supplement its advertising revenue. However, Grok's impact on X has likely fallen short of Musk's expectations. Rather than leaving it stagnant within X Premium, open-sourcing Grok could create new opportunities for Musk and xAI.
Open-Sourcing's Ripple Effects
Open-source AI models have proven successful for companies looking to establish industry recognition. Examples include Mistral AI, Zhipu AI, Alibaba Cloud's Tongyi Qianwen, and others.
Meta, once bogged down in its metaverse endeavors, experienced a resurgence with the open-sourcing of its LLaMA model. As large AI models reshape the global market, Meta's major pivot has been its open-source strategy.
LLaMA's open-sourcing has showcased Meta's technical prowess in large language models (LLMs) and its commitment to open innovation. This strategic move has somewhat assuaged concerns about Meta's metaverse strategy. Over the past year, Meta's market capitalization has risen from $315.5 billion to $1.2 trillion, a gain that exceeds the entire market value of companies such as JD.com.
LLaMA's open-sourcing, particularly its cost-effectiveness, holds strategic importance for Meta. Compared to competing models from Google and Microsoft, LLaMA's compact size and high performance enable Meta to deploy efficient AI models at a lower cost. This not only increases AI accessibility but also opens up possibilities for broad deployment across various applications and use cases. Some analysts predict that generative AI, an area in which Meta has a strong presence with products ranging from chatbots to games to future productivity software, will drive a market worth over $500 billion.
While LLaMA's initial open-source release was widely believed in the industry to have been an inadvertent "leak," the eventual outcome has cemented Meta's technological and market leadership in the AI large model space.
The strategic logic of "open source" is not unfamiliar to Musk.
In 2014, Musk open-sourced over 350 of Tesla's electric vehicle patents. At the time, Musk said in an interview that "Tesla's primary goal is to accelerate the world's transition to sustainable energy." In the long run, this ostensibly altruistic "open source" move has made him a major beneficiary.
Tesla's open-patent strategy shook up the global automotive market. Many new electric vehicle companies got their start with the help of Tesla's patents, which directly energized the EV market as a whole. Tesla, meanwhile, has held onto its leading position on the strength of its long-accumulated reputation and engineering capability.
While open-sourcing Grok is unlikely to make as big a splash as Tesla's open-patent move or reshape the entire AI industry landscape, it should still have some positive effects for xAI in the near term.