This week, the AI industry's hottest new star was born: AutoGPT.
What's more, the debut of this software has pushed AI into new territory: autonomous artificial intelligence. As the name suggests, its defining ability is autonomy: it works without human intervention. For example, one netizen asked AutoGPT to develop a website, and the AI built it with React and TailwindCSS in less than 3 minutes. It can browse the Internet, use third-party tools, think, and operate your computer all by itself. In effect, it closes the loop of “action -> observe results -> think -> decide on the next action” and keeps cycling through it. Even former Tesla AI director Andrej Karpathy commented that AutoGPT is the next frontier of prompt engineering.
And it is not just AutoGPT: autonomous AI tools such as AgentGPT and BabyAGI have been emerging one after another. Netizens are amazed at the pace of AI development: “New things come out every day, it’s exhausting.” So what is the difference between the newly popular AutoGPT and ChatGPT, which earlier caused a stir in the AI industry? Weibo blogger “Mu Yao” put it to the test.
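The “action -> observe results -> think -> decide on the next action” cycle described above can be sketched in a few lines of Python. This is only an illustration of the loop, not AutoGPT's actual code: the `decide` function stands in for the language model, and the `search` and `summarize` tools are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AgentState:
    goal: str
    history: List[str] = field(default_factory=list)  # observations so far


def decide(state: AgentState) -> str:
    """Hypothetical stand-in for the model's 'think and decide' step."""
    if not state.history:
        return "search"
    if state.history[-1].startswith("search"):
        return "summarize"
    return "finish"


# Hypothetical tools; a real agent would browse the web, write files, etc.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda goal: f"search results for '{goal}'",
    "summarize": lambda goal: f"summary of findings on '{goal}'",
}


def run_agent(goal: str, max_steps: int = 10) -> List[str]:
    state = AgentState(goal)
    for _ in range(max_steps):                    # hard cap so the loop always ends
        action = decide(state)                    # think, then decide
        if action == "finish":
            break
        observation = TOOLS[action](state.goal)   # act
        state.history.append(observation)         # observe the result
    return state.history
```

The `max_steps` cap is a design choice for the sketch: without it, a loop driven by model output has no guaranteed stopping point.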
(Image source: Weibo blogger “Mu Yao”)
Mu Yao tested it with the prompt “Explain to me how to use LangChain” (LangChain is an application framework for large language models). Given to a plain GPT model, this question would go unanswered, because LangChain is too new a concept for the model to know. Given to a “GPT + plugin” combination, the result would be similar to what Bing or Bard produces. But AutoGPT is different. Let’s watch its execution process. First, it tried to crawl information from the official LangChain website, but the site rejected the request, possibly because it blocks crawlers. After failing, it thought for a moment and said: why not Google it? Google returned many results. After reading them for a while, it said: “Now we have collected a lot of useful information, let’s install it and play with it.” Then it started installing LangChain on the computer… And Mu Yao just sat in front of the screen, watching all of this happen automatically. The blogger couldn’t help but exclaim: “If I had this kind of initiative, I would have finished three PhDs.”
Another AI with similar autonomy is the recently popular AgentGPT. Just give it a task and let it handle the rest, for example: create a new startup with $100 in funding.
After receiving the task, AgentGPT began its performance:
Thinking…
Adding tasks: research low-cost business models that require minimal funding; identify potential target markets and their needs; develop a lean MVP (minimum viable product) and test it in the target market to validate demand.
Thinking…
Then AgentGPT output the results of the first task, “research low-cost business models that require minimal funding”:
According to my research, some low-cost business models that require minimal funding include:
1. Dropshipping: with dropshipping, you can sell products online without holding inventory. You only need to fulfill orders through third-party suppliers, who ship the products directly to your customers. You can get started for as little as $50, which covers website hosting and domain registration.
2. Social media marketing: if you have a talent for social media marketing, you can start your own agency and help businesses reach their target audience. All you need is a computer, an internet connection, and some experience and skill in social media marketing, and your startup cost can be $50 or less.
3. Event planning: if you have a passion for organizing events, you can start your own event planning business. You can begin by planning small events such as birthday parties and weddings. With $100 you can create a simple website, print business cards, and market yourself through social media.
4. Freelance writing: if you have a talent for writing, you can start your own freelance writing career. You can find clients through online platforms such as Upwork, Fiverr, and other job websites. For just $50, you can create a portfolio website and market yourself through social media.
These are just some of the low-cost business models you can consider with only $100 in capital. It is important to choose a business model that matches your skills and interests.
But it doesn’t stop there. AgentGPT continues its work: thinking, then adding tasks (brainstorm potential product or service ideas for each identified low-cost business model; evaluate the potential profitability and scalability of each idea; select the most promising idea and develop a lean MVP to test in the target market). The process then continues with more thinking and execution.
BabyAGI works in a similar way: it only needs an objective and a number of task iterations, and the AI promptly gets to work. Some bloggers are already using AutoGPT as a productivity tool, for example to make videos! Many netizens, after seeing such impressive feats from these AIs, are in awe and wondering how to try these powerful tools themselves. Fortunately, getting started is not difficult at all.
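Before moving on to setup, the task-list pattern shown above (execute one task, let the result spawn new tasks, repeat for a fixed number of iterations) can be sketched as a simple queue. This is a rough illustration of the idea, not BabyAGI's or AgentGPT's real implementation; `execute_task` and `create_follow_ups` are hypothetical stand-ins for language model calls.

```python
from collections import deque
from typing import Deque, List, Tuple


def execute_task(objective: str, task: str) -> str:
    """Hypothetical stand-in for a model call that carries out one task."""
    return f"result of '{task}'"


def create_follow_ups(task: str, depth: int) -> List[str]:
    """Hypothetical stand-in for a model call that proposes new tasks."""
    return [f"refine: {task}"] if depth < 1 else []


def run_task_loop(objective: str, first_task: str,
                  max_iterations: int) -> List[Tuple[str, str]]:
    queue: Deque[Tuple[str, int]] = deque([(first_task, 0)])
    completed: List[Tuple[str, str]] = []
    for _ in range(max_iterations):       # the user-supplied iteration budget
        if not queue:
            break                         # nothing left to do
        task, depth = queue.popleft()
        completed.append((task, execute_task(objective, task)))
        for new_task in create_follow_ups(task, depth):
            queue.append((new_task, depth + 1))
    return completed
```

For example, `run_task_loop("start a company with $100", "research low-cost business models", 5)` processes the first task, then the follow-up it generated, and stops once the queue is empty.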
First, prepare API keys for OpenAI and Pinecone. Pinecone is a database designed for AI long-term memory; using the OpenAI API consumes the balance in your OpenAI account, and the free version comes with $18 in credit. Python 3.8 or higher is required. If you need speech output, you can also prepare an ElevenLabs API key. P.S. The addresses for obtaining the keys are listed at the end of the article.
Next, open your CLI tool and download the AutoGPT project:
    git clone https://github.com/Torantulino/Auto-GPT.git
Then:
    cd 'Auto-GPT'
    pip install -r requirements.txt
Then, in the root folder of AutoGPT, rename “.env.template” to “.env” and open it. Replace the placeholder keys with the ones you prepared.
Finally, run in the terminal:
    python scripts/main.py
If you want to use speech mode, run:
    python scripts/main.py --speak
If you cannot access GPT-4, turn on “gpt3only” mode:
    python scripts/main.py --gpt3only
Now you can start your own AutoGPT project.
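Before the first run, it can help to confirm that the prerequisites listed above are in place. The following is a minimal sketch of such a check, not part of AutoGPT itself. The variable names `OPENAI_API_KEY` and `PINECONE_API_KEY` are assumptions; check your own “.env.template” for the exact names.

```python
import sys
from typing import Dict, List

# Assumed variable names; verify them against your .env.template.
REQUIRED_KEYS = ("OPENAI_API_KEY", "PINECONE_API_KEY")


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple NAME=value lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        values[name.strip()] = value.strip()
    return values


def missing_keys(env: Dict[str, str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


def python_ok(version=sys.version_info) -> bool:
    """The article states Python 3.8 or higher is required."""
    return tuple(version[:2]) >= (3, 8)
```

In practice you would read the text of your “.env” file, pass it to `parse_env`, and fix anything `missing_keys` reports before launching the program.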
Remember to enter “NEXTCOMMAND” after each operation to authorize AutoGPT to continue. The project also provides a more hands-off “continuous mode”, which can be enabled by launching the program with “python scripts/main.py --continuous”, but the author strongly advises against it! This mode may cause your AI to run indefinitely or to perform operations you never intended to authorize. If you insist on trying it, the risk is yours… Additionally, AutoGPT uses DALL-E to generate images by default; if you want to switch to Stable Diffusion, you will need a Hugging Face API token.
In essence, AutoGPT can be described as a “nesting doll”: it wraps GPT models in additional capabilities such as file operations, web browsing, and data retrieval. This sets it apart from the AIs we have seen before, and it has earned over 36,000 GitHub stars. As Lior, a former AI researcher at Mila, put it: AutoGPT essentially gives GPT-based models a memory and a body. Specifically, the architecture of AutoGPT is built on GPT-4 and GPT-3.5.
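The difference between per-step authorization and continuous mode, described above, can be illustrated with a small approval gate. This is a hedged sketch rather than AutoGPT's code: the function name `run_with_approval` and the `max_steps` cap are hypothetical, and the approval token simply follows the article's wording.

```python
from typing import Callable, Iterable, List

APPROVE_TOKEN = "NEXTCOMMAND"  # approval text as given in the article


def run_with_approval(proposed: Iterable[str],
                      ask: Callable[[str], str] = input,
                      continuous: bool = False,
                      max_steps: int = 5) -> List[str]:
    """Run proposed commands, pausing for approval unless continuous=True."""
    executed: List[str] = []
    for step, command in enumerate(proposed):
        if step >= max_steps:             # safety cap, even in continuous mode
            break
        if not continuous:
            answer = ask(f"Authorize '{command}'? ").strip()
            if answer != APPROVE_TOKEN:
                break                     # anything else stops the agent
        executed.append(command)          # a real agent would act here
    return executed
```

With `continuous=True` nothing asks the user, which is exactly why the sketch keeps a step cap: it is the only thing left that bounds what the agent does.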
AutoGPT connects to these models through the API and can iterate independently: it improves its output through self-critical review, builds on previous work, and incorporates prompt history for more accurate results. AutoGPT also has memory management and integrates with the Pinecone database, which provides long-term memory storage, context retention, and better decisions based on that context.
To Twitter blogger JayHack, the emergence of AutoGPT and similar tools shows that the essence of intelligence is “nesting”. Both AutoGPT and BabyAGI use an LLM to call itself recursively. This has been a major recent trend in the AI field. More formally, it is called model stacking: models call other models, all the way down, to decompose and solve tasks. Simply put, it’s “nesting”. Apart from AutoGPT and BabyAGI, new tools such as ViperGPT, SayCan, and ToolKit, as well as earlier releases such as VisualGPT and HuggingGPT by Microsoft, follow this idea. Going further back, the original DALL·E was actually CLIP nested in a VAE. Interestingly, JayHack pointed out that Marvin Minsky, known as the “godfather of AI”, had already described human intelligence in 1986 as an organization of many interacting subsystems.
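The long-term memory idea mentioned above (store past observations, then retrieve the most relevant ones for the current step) can be illustrated with a toy in-memory store. This sketch does not use Pinecone or any real embedding model: the bag-of-words `embed` function and the `Memory` class are simplified, hypothetical stand-ins for a vector database.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class Memory:
    """Stores past observations and recalls the most similar ones."""

    def __init__(self) -> None:
        self._items: List[Tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        self._items.append((text, embed(text)))

    def recall(self, query: str, k: int = 1) -> List[str]:
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]
```

An agent built this way would call `add` after each step and `recall` before the next one, so that earlier results can inform later decisions even when they no longer fit in the prompt.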
Finally, JayHack also said that it is precisely this “nesting” approach that now lets AI take on more complex tasks, bringing us one step closer to artificial general intelligence. Indeed, many netizens agree that AutoGPT may be the next big trend in the field of AI. Faced with such powerful “nesting” tools, some people are starting to lament.
Reference links:
[1] https://twitter.com/AlphaSignalAI/status/1645847165066006529
[2] https://twitter.com/karpathy/status/1642598890573819905
[3] https://weibo.com/1644684112/MBK3WCt8o
[4] https://twitter.com/DataChaz/status/1645152577258962944
API key acquisition addresses:
https://platform.openai.com/account/api-keys
https://www.pinecone.io/
https://elevenlabs.