We recently reported on two Chinese entrepreneurs in Silicon Valley who founded a company called Cognition AI. They launched an artificial intelligence software engineer named Devin, which they claim can autonomously edit entire software programs. They also assert that Devin can complete tasks posted on the outsourcing website Upwork.
Recently, however, a blogger known as Internet of Bugs accused them of exaggeration. Going through the company's promotional videos, he argued that Cognition AI has overstated Devin's capabilities: in his view, AI is not yet able to perform tasks and solve problems as reliably as a human software engineer.
The main criticisms highlighted are:
1. Devin did not genuinely resolve the user's issues; its answers were often irrelevant and failed to address the questions actually being asked.
2. Devin was shown fixing bugs in the software, but the blogger pointed out that the bugs it fixed are not ones a human programmer would typically write in the first place. He argued that a human could likely have solved the same problem in a simple two-step fix, whereas the AI engineer took many more, and more convoluted, steps. The demonstration was more elaborate than necessary and not particularly impressive.
Despite these criticisms, I believe opinion may yet swing back. Here is why: although the AI software engineer failed to meet the user's needs accurately this time, it does solve problems in many cases, which is enough to justify launching such a product. Claiming that it can handle every demand on an outsourcing website, however, is premature.
The situation is somewhat like claiming that autonomous driving has reached so-called Level 5, able to handle every scenario, which would be an exaggeration. Nonetheless, AI engineers are likely to meet user needs increasingly well; in other words, using AI to assist with programming is a probable direction for the future.
The fact that AI creates bugs a human would not, or carries out operations more cumbersome than a human's, may not actually be a problem. This is not a defense of AI but a recognition that AI's approach to programming may inherently differ from a human's. AI may make mistakes humans would not make, but it also avoids common human errors and can improve over time by correcting its own mistakes. We should not judge AI's behavior by human approaches.
For instance, AlphaGo's famous victory over Lee Sedol was notable not just because the AI won, but because its way of thinking differed markedly from that of human Go players, a difference that could likewise inspire human programmers and engineers.
As for future directions, despite the current setbacks and exposed overstatements, AI software engineering remains a promising path. AI is beginning to tackle genuinely complex programming problems, not just because models are getting larger, but because multiple AI agents can work together. These agents can specialize in different aspects of the job, such as design, solving specific sub-tasks, and integration, much like the members of a human software team (a rough sketch of this idea follows below).
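To make that division of labor concrete, here is a minimal, purely illustrative sketch in Python. The agent roles, class names, and the trivial string-based "work" they exchange are assumptions made for this example; it is not Cognition AI's architecture, only the general pattern of specialized agents handing artifacts to one another.

```python
# Hypothetical sketch: specialized "agents" pass a work item along a pipeline,
# mirroring the roles on a human software team. Roles and names are assumed
# for illustration, not taken from any real product.

from dataclasses import dataclass, field
from typing import List


@dataclass
class WorkItem:
    """The artifact handed from one agent to the next."""
    requirement: str
    design: str = ""
    code: str = ""
    review_notes: List[str] = field(default_factory=list)


class DesignAgent:
    """Turns a requirement into a high-level plan."""
    def run(self, item: WorkItem) -> WorkItem:
        item.design = f"Plan: split '{item.requirement}' into small functions"
        return item


class CodingAgent:
    """Produces code for the planned sub-tasks."""
    def run(self, item: WorkItem) -> WorkItem:
        item.code = f"# implementation following: {item.design}"
        return item


class ReviewAgent:
    """Checks the output and records issues for another iteration."""
    def run(self, item: WorkItem) -> WorkItem:
        if "test" not in item.code:
            item.review_notes.append("Missing tests; send back to CodingAgent")
        return item


def run_pipeline(requirement: str) -> WorkItem:
    """Run the agents in sequence, like a small software team."""
    item = WorkItem(requirement=requirement)
    for agent in (DesignAgent(), CodingAgent(), ReviewAgent()):
        item = agent.run(item)
    return item


if __name__ == "__main__":
    result = run_pipeline("fetch issues from a bug tracker and summarize them")
    print(result.design)
    print(result.code)
    print(result.review_notes)
```

In a real system, each `run` step would presumably call a model and the review stage would loop back for revisions; the point here is only the team-like split of design, coding, and review across separate agents.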
As these agents take on increasingly complex tasks, their collective intelligence could come to rival human intelligence, not as individual entities but as a coordinated group, much as humanity itself has relied on collective rather than individual intelligence.
In this sense, the trajectory of AI development continues to echo human intelligence, and a future in which AI agents collaboratively assist humans with tasks such as software development is probably not far off. Cognition AI's product may be imperfect, but the direction it is exploring is undoubtedly worth pursuing.