Baidu AI DevCon 2024: Robin Li Unveils 3 New AI Tools

“In the future, natural language will become the new universal programming language. You just need to speak, and you can become a developer, using your creativity to change the world.”

On April 16th, 2024, the Create 2024 Baidu AI Developer Conference was held in Shenzhen. Baidu’s founder, chairman, and CEO Robin Li delivered a keynote speech titled “Everyone is a Developer.” He believes that large-scale models and generative AI will fundamentally change the developer community.

“AI is sparking a creativity revolution. In the future, developing applications will be as simple as making short videos. Everyone is a developer, everyone is a creator.”

Baidu has prepared three “ready-to-use” tools for developers: AgentBuilder, a tool for building intelligent agents; AppBuilder, a tool for building AI-native applications; and ModelBuilder, a tool for customizing models of various sizes. “These three tools represent advanced productivity.”

Notably, Robin Li shared Baidu’s specific approaches to developing AI-native applications on stage, stating: “These are lessons we learned over the past year through extensive trial and error, and we paid high tuition fees for them.” The three approaches are MoE, small models, and intelligent agents.

During the conference, Robin Li officially released the tool version of the Wenxin Large Model 4.0. He also revealed that the user base of Wenxin Yiyan (ERNIE Bot) has exceeded 200 million. “The Wenxin Large Model has become a leading and widely used AI foundation model in China.”

Thanks to the powerful Wenxin Large Model, developers can prune smaller models out of Wenxin 4.0. At the same size, these models perform better than ones built directly on open-source models; at the same effectiveness, they cost significantly less. “So open-source models will become increasingly outdated.”

The following are Robin Li’s main points:

Natural language will become the new universal programming language

You can become a developer just by speaking

“Large-scale models and generative AI will fundamentally change the developer community. In the past, developers used code to change the world; in the future, natural language will become the new universal programming language. You just need to speak, and you can become a developer, using your creativity to change the world.”

“After more than a year, Comate has been adopted by over 10,000 companies, including Ximalaya, Mitsubishi Elevator, and iSoftStone, with a code adoption rate of 46%. Of the new code Baidu adds every day, 27% is generated by Comate.”

“Today, you don’t have to write code to create an AI application; you don’t have to program to create an intelligent agent. AI is sparking a creativity revolution. In the future, developing applications will be as simple as making short videos. Everyone is a developer, everyone is a creator.”

AgentBuilder, AppBuilder, ModelBuilder

All represent advanced productivity

“As a technology company, Baidu’s role is to provide the necessary development tools for everyone as much as possible, constantly improving the creativity of the entire society.”

“Specifically, we provide a powerful series of foundation models, namely the Wenxin Large Model series, which includes flagship versions such as ERNIE 3.5 and ERNIE 4.0, as well as lightweight versions such as ERNIE Speed, Lite, and Tiny.”

“We also provide tools for developing various applications based on large models: AgentBuilder for building intelligent agents, AppBuilder for building AI-native applications, and ModelBuilder for customizing models of various sizes. These three tools represent advanced productivity.”

Wenxin Yiyan user base exceeds 200 million

Wenxin Large Model 4.0 tool version officially released

“Wenxin Yiyan was released on March 16th last year, one year and one month ago today. Our user base has exceeded 200 million, with over 200 million API calls per day, 85,000 customers served, and over 190,000 AI-native applications developed on the Qianfan platform.”

“In recent months, the Wenxin Large Model has made further significant improvements in general capabilities such as code generation, code interpretation, and code optimization, reaching an internationally leading level. Today, we officially release the tool version of the Wenxin Large Model 4.0.”

“The Wenxin Large Model has become a leading and widely used AI foundation model in China.”

“Compared to a year ago, the algorithm training efficiency of the Wenxin Large Model has increased 5.1 times, and the weekly average effective training rate has reached 98.8%. Inference performance has increased 105 times, and inference cost has fallen to 1% of the original. In other words, customers who used to make 10,000 calls a day can now make 1 million calls a day at the same cost.”
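The closing arithmetic in that quote can be checked in a few lines. The sketch below is illustrative only: the function name and the assumption of a fixed daily budget with a uniform per-call price are ours, not Baidu's pricing model.

```python
def calls_at_same_budget(old_calls_per_day: int, new_cost_percent: int) -> int:
    """Daily calls affordable on an unchanged budget after the per-call
    cost falls to `new_cost_percent` percent of its original value.

    Assumes a uniform per-call price (an illustrative simplification).
    """
    if new_cost_percent <= 0:
        raise ValueError("new_cost_percent must be positive")
    # budget = old_calls * old_price; new_price = old_price * percent / 100
    return old_calls_per_day * 100 // new_cost_percent


# Figures from the keynote: cost reduced to 1% of the original.
print(calls_at_same_budget(10_000, 1))
```

With the cost at 1% of the original, the same budget covers 100 times as many calls, which matches the 10,000 to 1 million figure quoted above.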

Specific approaches to developing AI-native applications

Lessons from a year of trial and error, paid for with high tuition fees

“Large models themselves do not directly create value; only AI applications built on large models can meet real market demands.”

“Today, I want to share with you some specific approaches and tools for developing AI-native applications based on large models. These are lessons we learned over the past year through extensive trial and error, and we paid high tuition fees for them.”

“The first is MoE. In the future, large-scale AI-native applications will be based on the MoE architecture. Here, MoE is not a general academic concept, but a mix of large and small models, not relying on a single model to solve all problems.”

“The second is small models. Small models have low inference costs and fast response times. In some specific scenarios, after supervised fine-tuning (SFT), small models can rival large models in effectiveness. This is why we released three lightweight models: Speed, Lite, and Tiny. We start with a large model, distill it into a base model, and then further train it with data. This approach yields much better results than training a small model from scratch, and it outperforms models trained based on open-source frameworks in terms of speed and cost-effectiveness.”
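The keynote does not say how Baidu performs distillation. As a rough illustration of the general idea, the sketch below computes the textbook soft-label distillation loss: the KL divergence between temperature-softened teacher and student output distributions. The function names, logits, and temperature are illustrative assumptions.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits: List[float],
                      student_logits: List[float],
                      temperature: float = 2.0) -> float:
    """KL(teacher || student) over softened distributions.

    Zero when the student reproduces the teacher's distribution exactly;
    positive otherwise.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# A student that disagrees with the teacher incurs a positive loss.
print(distillation_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))
```

In practice this term is usually combined with an ordinary supervised loss on labeled data, which loosely corresponds to the "further train it with data" step described in the quote.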

“The third is intelligent agents. Intelligent agents are currently a hot topic. As their capabilities improve, they will continue to give rise to numerous new applications. The mechanisms of intelligent agents, including understanding, planning, reflection, and evolution, enable machines to think and act like humans. They can autonomously perform complex tasks, continuously learn in their environment, and iterate and evolve on their own. In some complex systems, we can also have different intelligent agents interact and cooperate with each other to complete tasks with higher quality.”

Intelligent agents may be the form of large-model use closest to everyone

And the most mainstream one

“Intelligent agents may become the way of using large models that is closest to everyone and most mainstream. Built on powerful foundation models, intelligent agents can be mass-produced and applied in a wide range of scenarios.”

“Baidu has just upgraded the Wenxin Intelligent Agent Platform. So far, more than 30,000 intelligent agents have been created, and over 50,000 developers and tens of thousands of companies have joined. Our goal is to make everyone and every organization a developer of intelligent agents, creating the most complete intelligent agent ecosystem in China. How do we achieve this goal? By providing everyone with AgentBuilder, a zero-threshold tool for developing intelligent agents.”

“Today, every merchant and customer can have their own exclusive intelligent agent on Baidu. The entire process requires no programming at all. By entering prompt-like information and performing a few simple operations, one can quickly generate an intelligent agent that serves as a premium salesperson, online 24/7.”

On stage, Robin Li demonstrated three intelligent agent cases: Singapore Tourism Board, Qide Education, and Sophia. He personally showed developers how to create an intelligent agent in five minutes, with zero threshold, using natural language.

“The Qide Education intelligent agent is very popular. In the first week after launch, it was distributed 1.55 million times and interacted with users 58,000 times. Lead conversion rates rose sharply and the cost of converting effective leads fell significantly, greatly improving operational efficiency.”

“Since the launch of the Sophia Merchant Intelligent Agent, the cost of effective leads has decreased by 30%. In other words, if the cost of acquiring an effective customer used to be 100 yuan, now it only costs 70 yuan.”

AppBuilder: The Best AI-Native Application Development Tool

Develop an application in three steps using natural language

“AppBuilder is currently the most user-friendly AI-native application development tool. On AppBuilder, we have pre-packaged and pre-configured the components and frameworks required for developing AI-native applications, greatly lowering the development threshold.”

In just three steps, developers can build an AI-native application using natural language and easily deploy it into various business environments.

On stage, Robin Li demonstrated the process of creating an AI-native application through three cases: “Amusement Park Queue Assistant,” North China Electric Power University’s “Huadian AI Assistant,” and “Baidu Wenku Intelligent Comic Generator.” An AI-native application can be created in just three simple steps: setting the name, filling in role instructions, and inserting components.
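To make the three steps concrete, here is a minimal sketch that models an application definition as plain data. It is not the AppBuilder SDK: the `AppDefinition` class, its fields, the role text, and the `queue_time_lookup` component name are all hypothetical, chosen only to mirror the name, role instructions, and components described above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AppDefinition:
    """Hypothetical stand-in for an AI-native app definition."""
    name: str                                             # step 1: set the name
    role_instructions: str                                # step 2: fill in role instructions
    components: List[str] = field(default_factory=list)   # step 3: insert components

    def add_component(self, component: str) -> "AppDefinition":
        """Attach a component by name and return self for chaining."""
        self.components.append(component)
        return self

    def is_complete(self) -> bool:
        """True once all three steps have been filled in."""
        return bool(self.name.strip()
                    and self.role_instructions.strip()
                    and self.components)


app = AppDefinition(
    name="Amusement Park Queue Assistant",
    role_instructions="You help visitors pick rides with the shortest queues.",
)
app.add_component("queue_time_lookup")  # hypothetical component name
print(app.is_complete())
```

The point of the sketch is only that the application is specified declaratively, in natural language plus a list of components, rather than in program logic.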

He also pointed out that AppBuilder has two major advantages:

“One is its powerful functionality. Thanks to Wenxin 4.0’s ability to understand and follow instructions, AppBuilder delivers decent performance from a cold start, so developers do not need to spend a long time tuning to fix poor results. Using retrieval-augmented generation (RAG), in typical scenarios such as knowledge Q&A, both our answer accuracy and our friendly-response rate exceed 95%, far surpassing similar products.”

“AppBuilder also provides a rich set of components, including AI capability components built on Baidu’s years of technical accumulation, large-model capability components, 55 exclusive business components opened up by Baidu, and mainstream third-party APIs for tasks such as flight and paper lookups. We also support custom components, allowing customers to directly integrate their own proprietary tools and data. Together, these components support efficient development of AI-native applications.”

“The second is its simplicity and ease of use. With AppBuilder, you can quickly create and distribute applications in just three steps. We also provide an open-source SDK to make further custom development easy.”
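The retrieval-augmented generation (RAG) technique mentioned above can be illustrated with a toy example: retrieve the documents most relevant to a question, then place them in the prompt sent to the model. The sketch below is generic and does not reflect AppBuilder's implementation; the word-overlap scoring, the sample documents, and the prompt wording are all illustrative assumptions, and the final model call is omitted.

```python
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query: str, doc: str) -> int:
    """Count tokens shared by query and document (a crude relevance score)."""
    q, d = Counter(tokenize(query)), Counter(tokenize(doc))
    return sum(min(q[t], d[t]) for t in q)


def retrieve(query: str, docs: List[str], k: int = 2) -> List[str]:
    """Return the k documents with the highest overlap score."""
    return sorted(docs, key=lambda doc: overlap_score(query, doc), reverse=True)[:k]


def build_prompt(query: str, docs: List[str], k: int = 2) -> str:
    """Assemble a prompt that grounds the answer in the retrieved context."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, docs, k))
    return (f"Answer using only the context below.\n"
            f"Context:\n{context}\nQuestion: {query}")


DOCS = [
    "The roller coaster queue is usually shortest before 10 am.",
    "The park closes at 9 pm on weekends.",
    "Parking costs 30 yuan per day.",
]

print(build_prompt("When does the park close?", DOCS, k=1))
```

Production systems typically replace the word-overlap score with embedding-based vector search, but the retrieve-then-prompt structure is the same.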

ModelBuilder: Model Customization Tool for Various Sizes

Efficient and low-cost model production

“The tool better suited to professional developers is ModelBuilder, which can customize models of any size according to developers’ needs and further fine-tune them for specific scenarios to achieve better results.”

On stage, Robin Li demonstrated an essay-correction case from the education industry. After data processing and fine-tuning, the “Essay Correction Assistant” not only gives more professional, teacher-like comments on ideas and format compliance but also produces scores closer to those of real teachers than the unrefined model does.

He also interacted with Xiaodu live on stage, demonstrating how Xiaodu combines multiple models through MoE for different tasks. For example, the small model ERNIE Tiny handles model routing, while the high-performance Wenxin 4.0 handles complex tasks such as scheduling. Compared with using only the flagship Wenxin model, Xiaodu’s response speed can be doubled and its cost reduced by 99%.

Robin Li said, “These examples of ModelBuilder demonstrate Baidu’s ability to efficiently and cost-effectively produce models.”
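The routing pattern described here, where a small model decides which model should handle each request, can be sketched as follows. This is a toy illustration, not Xiaodu's implementation: the keyword classifier stands in for the lightweight routing model, and `small_model` and `large_model` are placeholder functions rather than real model calls.

```python
from typing import Callable, Dict

Model = Callable[[str], str]


def small_model(prompt: str) -> str:
    """Placeholder for a cheap, fast lightweight model."""
    return f"[small] {prompt}"


def large_model(prompt: str) -> str:
    """Placeholder for a slower, more capable flagship model."""
    return f"[large] {prompt}"


def classify(prompt: str) -> str:
    """Stand-in for the routing model: a keyword heuristic (illustrative only)."""
    complex_markers = ("schedule", "plan", "compare")
    is_complex = any(marker in prompt.lower() for marker in complex_markers)
    return "complex" if is_complex else "simple"


ROUTES: Dict[str, Model] = {"simple": small_model, "complex": large_model}


def answer(prompt: str) -> str:
    """Route the prompt to the cheapest model judged able to handle it."""
    return ROUTES[classify(prompt)](prompt)


print(answer("What time is it?"))
print(answer("Schedule my meetings for next week"))
```

Because most everyday requests take the cheap path, average latency and cost fall, while hard requests still reach the stronger model.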

“To help developers get started quickly, ModelBuilder comes with a comprehensive and rich set of large models pre-installed.”

Baidu’s Latest AI Models and Future Developments

Baidu offers a range of AI models catering to diverse needs, including the flagship ERNIE 3.5 and ERNIE 4.0, designed for complex and general scenarios with powerful capabilities. There are also three lightweight large models: ERNIE Speed, Lite, and Tiny. In addition, two models target specific verticals: ERNIE Character, suited to role-playing, and ERNIE Functions, suited to dialogue or question-and-answer scenarios that involve external tool use and business function calls. ModelBuilder also supports mainstream third-party models from both China and abroad, for a total of 77 models, making it the platform with the largest number of large models in China.

Open Source Models: Falling Behind

With robust foundation models like Wenxin 4.0, models can now be tailored to each scenario according to requirements such as performance, response speed, and inference cost. Smaller models pruned from the large model in this way perform better than open-source models of the same size used directly, and they cost less at equivalent performance.

Previously, open-source models were perceived as cheap. In large-model scenarios, however, they turn out to be the most expensive option. Open-source models are therefore destined to fall further behind.

Multimodal Large Models: The Gateway to AGI

Maximizing Visual Models for Autonomous Driving

Looking ahead, multimodal large models, incorporating text, images, speech, video, etc., are pivotal for the long-term development of foundational models, serving as the pathway to Artificial General Intelligence (AGI). Baidu has made long-term investments in these areas and will promptly update the latest advancements in large models.

“I have a rather unusual observation: the most significant application scenario for visual large models is autonomous driving. Baidu leads in this direction and is the global frontrunner in autonomous driving. We train AI not only to generate videos but also to understand what is happening in the real world and to predict what will happen next.”

Based on over 100 million kilometers of data from complex urban road tests in China, Baidu has developed the Apollo Visual Perception large model. It encompasses four fundamental capabilities: detection, tracking, understanding, and mapping. This enables Baidu to offer more intelligent, adaptable, and safer autonomous driving solutions.

Empowering Everyone as Developers: The Future

A Future Forged Together by Developers

China today has 1 billion internet users, robust foundation large models, abundant AI application scenarios, and the world’s most comprehensive industrial ecosystem, and the nation vigorously encourages and supports the “Artificial Intelligence Plus” initiative. Every individual and enterprise needs only to make use of these tools to unleash boundless creativity and productivity.

![Baidu AI Models](baidu-ai-models.jpg)

Baidu AI Models: Source Baidu
