
The world's first AI programmer accused of fraud? Industry insiders: Doubts are reasonable, but programmers can't do without AI

YuYan,MoHui Thu, Apr 18 2024 10:52 AM EST

Devin, touted as the world's first AI software engineer, has recently come under scrutiny and accusations of fraud by online bloggers. Industry experts believe that the blogger's doubts are reasonable and substantiated, as Devin's "astonishing performance" indeed raises suspicions of commercial hype.

On April 9th, a blogger named Karl, who says he has 35 years of software engineering experience, went through Devin's demonstration video frame by frame and raised four doubts. Among them, he called Devin's programming abilities deceptive, stating that "the tasks it handles are not random but deliberately chosen for demonstration"; furthermore, many of the issues Devin appeared to fix during the run were actually created by Devin itself.

Image: Devin, the "world's first AI software engineer," has been accused of fabrication by an online blogger.

The skeptical video sparked discussion among numerous tech enthusiasts after its release. Wang Yihao, head of the Shanghai Artificial Intelligence Industry Association and the Large Model Special Class, recently stated in an interview with The Paper that Karl's doubts are reasonable. Devin's seemingly "amazing effects" indeed raise suspicions of commercial hype and packaging. However, it cannot be denied that AI has become one of the indispensable tools for programmers.

The first AI programmer is accused of repeatedly "building and repairing its own work" and exaggerating its actual performance

Devin, released by Cognition Labs on March 12th this year, is touted as the "world's first AI engineer." In a 1 minute and 50-second demonstration video posted on the Cognition website, Devin can handle the entire development project end-to-end with just one command. Additionally, the video shows its ability to autonomously learn new technologies, build and deploy applications end-to-end, autonomously search for and fix code issues, and simultaneously execute multi-step workflows according to user needs. Programmers can observe its progress in real-time and correct errors by simply issuing commands when they are detected.

Karl raised his doubts after going through the above video frame by frame. At 2.936 seconds into the demonstration video, the text "They've searched for this task" appears in the top left corner of the screen, which he takes to mean that the task Devin handles in the demo was not random but chosen by the presenter. Karl suspects that Devin may therefore not be outstanding in most of its work, and may even perform worse than the demonstration video shows.

Image: At 2.936 seconds into the demo video, the phrase "They've searched for this task" appears in the top left corner of the screen.

While running, Devin repeatedly engaged in "self-building and self-repairing": it appeared to be fixing code, but was actually generating erroneous code unrelated to what it found on the internet or what the client requested. Karl questions Devin's actual efficiency, suggesting that many of its operations may be meaningless.

Furthermore, the early part of the demo video shows a timestamp of March 9th at 3:25 PM, while the later part shows the 9th at 9:41 PM, indicating that Devin's work spanned roughly 6 hours and 16 minutes. Karl himself needed only 35 minutes and 55 seconds to replicate Devin's work.
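The figures in this comparison can be checked with a short calculation. The sketch below is only an illustration: it assumes the two timestamps and Karl's replication time exactly as quoted above, and the year is taken from the article's dateline (it does not affect the result).

```python
from datetime import datetime

# Timestamps as quoted in the article; the year is an assumption
# taken from the dateline and does not change the arithmetic.
start = datetime(2024, 3, 9, 15, 25)   # March 9th, 3:25 PM
end = datetime(2024, 3, 9, 21, 41)     # March 9th, 9:41 PM

# Karl's replication time: 35 minutes and 55 seconds.
karl_seconds = 35 * 60 + 55

# Elapsed time between the two timestamps shown in the demo video.
gap_seconds = int((end - start).total_seconds())
hours, remainder = divmod(gap_seconds, 3600)
minutes = remainder // 60

# How many times longer the demo run took than Karl's replication.
ratio = gap_seconds / karl_seconds

print(f"Gap in the demo video: {hours} h {minutes} min")
print(f"Demo run vs. Karl's replication: {ratio:.1f}x")
```

By these timestamps, the gap comes out to 6 hours and 16 minutes, or a little over ten times Karl's replication time.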

Karl says he is not against AI per se, but he condemns AI hype of the kind practiced by the Devin team. He advocates a cautious and skeptical attitude towards any information on the internet, especially AI-related information.

Industry insiders consider the blogger's skepticism to be well-founded.

It's noted that Devin is not yet available for open use and can only be applied for via email. Public knowledge about Devin mainly comes from the official demo video and a few evaluations from third-party developers and product personnel.

According to media reports, the team behind Cognition AI consists of 10 members, with a core team of 3: Scott Wu, Steven Hao, and Walden Yan. The team is relatively young, and its members together hold 10 gold medals from the International Olympiad in Informatics (IOI). Many members also took part in international informatics competitions as teenagers. After the announcement of "Devin, the world's first AI programmer," Cognition AI gained significant attention. Public records show that Cognition AI previously raised $21 million in Series A funding led by Peter Thiel's Founders Fund.

Is AI programming currently being exaggerated or hyped? What can AI programmers do compared with human programmers?

Wang Yihao, head of the Shanghai Artificial Intelligence Industry Association and the Large Model Special Class, told The Paper that AI programmers can indeed help people complete simple development tasks independently, even without the help of a real programmer. They can take over a large amount of repetitive labor that does not require innovation, such as batch modification of code naming conventions and code dependencies. However, Karl's skepticism towards Devin seems justified based on the current evidence. Judging from the demonstration, Devin took more than ten times as long on the task as an experienced programmer.

Wang Yihao believes that, despite some exaggeration in Devin's capabilities, the trend of AI programming development cannot be denied. Programmers are adept at embracing the convenience brought by large models and trying out various code assistance tools. Programmers rely on these assistants, much like designers rely on Stable Diffusion.

(Original Title: Is the world's first AI programmer Devin a hoax? Insiders: Skepticism is justified, but programmers can't do without AI)