On April 16, the Create 2024 Baidu AI Developer Conference was held in Shenzhen. A reporter from Nandu Finance Society learned at the event that Robin Li, Baidu’s founder, chairman, and CEO, delivered a keynote speech titled “Everyone Can Be a Developer,” envisioning a world in which creation is no longer limited to people with coding skills and anyone can build things using natural language.
As competition over large AI models heats up among companies at home and abroad, that day seems increasingly within reach. Over the past year, Baidu’s Ernie large model has been upgraded from version 3.0 to 3.5 and then to 4.0, bringing marked improvements to its intelligent assistants, such as Ernie-AI, in code generation, code explanation, code optimization, and other general capabilities.
During the conference, Robin Li shared the progress of the Ernie 4.0 model with more than five thousand tech enthusiasts and practitioners in attendance. He noted that the algorithm training efficiency of the Ernie large model had increased 5.1-fold compared with a year earlier, with a weekly training efficiency of 98.8%, while inference performance had improved 105-fold, cutting inference cost to just 1% of the original. In other words, for what 10,000 model calls used to cost, customers can now make 1 million calls.
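As a rough illustration of that arithmetic, the short Python sketch below works out how many calls the same budget buys once the per-call cost falls to 1% of the original. The two input figures come from the speech as reported here; the variable names are hypothetical and nothing in it reflects Baidu's actual pricing.

```python
# Illustrative arithmetic only: the variable names are made up for this sketch
# and no real price list is implied.

baseline_calls = 10_000   # calls the budget used to cover (figure from the article)
cost_ratio = 0.01         # new per-call cost is 1% of the original (from the article)

# With a fixed budget, the number of affordable calls scales by 1 / cost_ratio.
new_calls = round(baseline_calls / cost_ratio)

print(f"{baseline_calls:,} calls before, {new_calls:,} calls now")
```

The same budget therefore stretches 100 times further, which matches the jump from 10,000 to 1 million calls cited in the keynote.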
Pan Helin, a member of the Ministry of Industry and Information Technology’s Information and Communication Economic Experts Committee, told the Nandu Finance Society reporter that the Ernie large model has the most comprehensive ecosystem among domestic models. Through continuous development, it has come to excel in specialized applications such as AI programming, which both expands and locks in Ernie’s potential user base and makes users more willing to pay, since programming capability is essentially a form of productivity.
“We see that the key to competition among domestic large models lies in capturing key applications. For example, some AIs target mapping, some editing, and others virtual humans, copywriting, or PPT, while the Ernie large model targets programmers. These vertical application advantages will become the foundation on which large models grow and strengthen in the future,” Pan Helin added.
Robin Li Prefers Smaller Models: More Cost-Effective
Discussing how to develop AI-native applications on top of large models, Robin Li said the approach came from “having encountered countless pitfalls and paid a steep ‘tuition’.” Even so, there is no denying that Baidu has blazed a trail of its own in large AI models.
At the conference, Li shared some specific ideas for developing AI-native applications based on large models, including MoE (Mixture of Experts), smaller models, and agents.
Li emphasized that, first, future large-scale AI-native applications would largely adopt an MoE architecture, which does not rely on a single model to solve every problem. Second, smaller models are more cost-effective. “They have lower inference costs, faster response times, and in some specific scenarios, after supervised fine-tuning (SFT), their performance can rival that of large models,” he said. Third, agents substantially lower the barrier to development. As agents become more capable, they will spur a wave of new applications, allowing machines to think and act like humans, complete complex tasks autonomously, and iterate and evolve on their own.
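The idea of not relying on a single model can be pictured with a minimal Python sketch in which a router sends each request to one of several models. Everything here is an assumption made for illustration: the two stub functions stand in for real model endpoints, and the word-count rule is a toy substitute for the learned gating that a real MoE system would typically use. It is not a description of Baidu's implementation.

```python
from typing import Callable, Dict

# Hypothetical stand-ins for real model endpoints; each just tags its output.
def small_model(prompt: str) -> str:
    return f"[small] {prompt}"

def large_model(prompt: str) -> str:
    return f"[large] {prompt}"

# Registry of available "experts", keyed by a made-up name.
EXPERTS: Dict[str, Callable[[str], str]] = {
    "small": small_model,
    "large": large_model,
}

def route(prompt: str, max_small_words: int = 20) -> str:
    """Choose an expert with a toy rule: short prompts go to the small model."""
    return "small" if len(prompt.split()) <= max_small_words else "large"

def answer(prompt: str) -> str:
    """Dispatch the prompt to whichever expert the router selects."""
    return EXPERTS[route(prompt)](prompt)

print(answer("Summarize this sentence."))
```

In this sketch, cheap requests never reach the expensive model, which is the cost argument Li makes for combining models of different sizes.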
In line with this philosophy, the current Ernie model series includes not only flagships such as ERNIE 3.5 and ERNIE 4.0 but also lighter versions such as ERNIE Speed, Lite, and Tiny. “It’s not that large models are unaffordable; small models are simply more cost-effective,” a view that also reflects Baidu’s consideration of its users’ needs.
The debate over large versus small models has now emerged in the hotly contested AI field, and large models are becoming “increasingly smaller.” In December last year, Google launched three versions of Gemini: Ultra, Pro, and Nano. Nano, the smallest, can run directly on mobile devices and comes in versions with 1.8 billion and 3.25 billion parameters. Later that month, Microsoft launched the Phi-2 model with only 2.7 billion parameters, which not only outperforms Mistral-7B but is also not far behind the 70-billion-parameter version of Llama 2.
Large models, however, have long followed an unwritten rule: the more parameters, the better the performance. The most representative examples are OpenAI’s GPT-3.5 and GPT-4 series and Google’s Gemini series. In certain scenarios, the powerful computing capabilities of large models enable them to handle more detailed and complex tasks, and they also show higher accuracy in prediction and classification tasks, a hurdle that many small models still struggle to clear.
AI Applications Developed from Large Models Offer Greater Value
Robin Li also reiterated his earlier view: “Large models do not directly create value; it is the AI applications developed from large models that can meet real market demand.”
Accordingly, Baidu has prepared “ready-to-use” tools for the three directions mentioned earlier: MoE, smaller models, and agents. During his keynote, Li released three major development tools: AgentBuilder, an agent development tool; AppBuilder, an AI-native application development tool; and ModelBuilder, a tool for customizing models of various sizes.
Li pointed out that more than 30,000 agents have been created to date, and over 50,000 developers and thousands of enterprises have joined the platform. “Today, every business and every customer can have their own exclusive agent on Baidu. The entire process requires no coding; by entering prompt-style information and making a few simple adjustments, a smart agent can be generated quickly to serve as a gold-medal salesperson working 24/7.”
Take the Sofia Merchant Agent as an example: after its launch, data showed a 30% reduction in the cost of obtaining valid leads. In other words, a valid lead that used to cost 100 yuan to acquire now costs only 70 yuan.
As for AppBuilder, Li described it as an AI-native application development tool: “In just three steps, developers can use natural language to develop an AI-native application, and conveniently publish and integrate it into various business environments.”
The Nandu Finance Society reporter observed that an AI-native application can be created simply by setting a name, entering role instructions, and adding components. Li demonstrated the creation process through three examples: a “Theme Park Queue Assistant,” the “HDU AI Assistant” by North China Electric Power University, and Baidu Wenku’s intelligent comic generation.
He also pointed out that AppBuilder has two major advantages: it is powerful, with both question-answering accuracy and response friendliness above 95%; and it is easy to use, allowing fast application creation and one-click distribution.
Finally, for professional developers, Baidu introduced ModelBuilder. It lets developers customize models of any size according to their needs and then fine-tune them further for specific scenarios through SFT to achieve better results.
Report by Nandu Finance Society Reporter Yan Zhaoxin, Intern Chai Jia