Anthropic’s New AI Model Takes Control of Your Computer

Anthropic says it is teaching its Claude AI model to control desktop computers based on prompts. In demonstration videos, the model is shown controlling a computer to conduct research for an outing on the town, searching the web for places to visit near the user’s home and even adding an itinerary to their desktop calendar. 

The functionality, simply called “computer use,” is only available to developers today, and it’s unclear what pricing looks like or how well the tech actually works. Anthropic says in a tweet about computer use that during testing, Claude got sidetracked from a coding assignment and started searching Google for images of Yellowstone National Park. So, yeah… there are still kinks to work out.

From a technical perspective, Anthropic says Claude controls the computer by repeatedly taking screenshots and studying what’s on screen, including calculating how far the cursor needs to travel to reach the next button it has to click. It then sends commands back to the computer to keep the task moving.
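In practice, that screenshot-and-act cycle is an agent loop a developer runs themselves. The sketch below is a minimal, unofficial illustration written against Anthropic’s computer use beta as documented at launch (tool type computer_20241022, beta flag computer-use-2024-10-22); the take_screenshot and execute_action helpers are hypothetical placeholders for real OS automation code, and parameter details may have changed since.

```python
# Minimal sketch of the screenshot-and-act loop described above.
# take_screenshot() and execute_action() are hypothetical stand-ins
# for actual OS automation (pyautogui, xdotool, etc.).
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def take_screenshot() -> str:
    """Hypothetical helper: capture the screen, return base64-encoded PNG."""
    ...

def execute_action(action: dict) -> None:
    """Hypothetical helper: perform the click/keystroke the model requested."""
    ...

messages = [{"role": "user",
             "content": "Find three parks near me and add one to my calendar."}]

while True:
    response = client.beta.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        betas=["computer-use-2024-10-22"],  # opt in to the public beta
        tools=[{
            "type": "computer_20241022",
            "name": "computer",
            "display_width_px": 1280,
            "display_height_px": 800,
        }],
        messages=messages,
    )
    if response.stop_reason != "tool_use":
        break  # the model considers the task finished

    # Carry out each requested action, then return a fresh screenshot as the
    # tool result so the model can see the effect of its own clicks.
    messages.append({"role": "assistant", "content": response.content})
    for block in response.content:
        if block.type == "tool_use":
            execute_action(block.input)
            messages.append({
                "role": "user",
                "content": [{
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": [{"type": "image", "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": take_screenshot(),
                    }}],
                }],
            })
```

The notable design point is that the model never touches the machine directly: it only ever sees screenshots and emits structured actions, which is also what lets a developer sandbox the whole loop or veto individual steps.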

Anthropic, which is backed by the likes of Amazon and Google, says Claude is the “first frontier AI model to offer computer use in public beta.” 

Computer use is general purpose, and it’s unclear what it might ultimately prove useful for in practice. Thus far, AI models perform best — and produce fewer hallucinations — when they’re given limited, well-defined datasets to work with. Anthropic suggests computer use could handle repetitive tasks or open-ended research.

If anyone figures out how to use this new functionality, the /r/overemployed community on Reddit will likely be the first. At the very least, it could perhaps be the new mouse jiggler for Wells Fargo employees. Or maybe you could use it to go through your social media accounts and delete all your old posts without hunting down a third-party tool. In other words, tasks that aren’t mission critical and don’t demand the factual accuracy AI still struggles with: LLMs like Claude are still fundamentally just predicting which word should come next in a sentence.

Although there has been a lot of hype in the AI space, and companies have spent billions of dollars developing AI chatbots, most revenue in the space is still generated by companies like Nvidia that supply GPUs to those AI companies. Anthropic has raised more than $7 billion in the past year alone.

The latest buzzword tech companies are pumping to sell the technology is “agents,” or autonomous bots that purportedly can complete tasks on their own. Microsoft on Monday announced the ability to create autonomous agents with Copilot that could do “everything from accelerating lead generation and processing sales orders to automating your supply chain.”

Salesforce CEO Marc Benioff has dismissively called Microsoft’s product “Clippy 2.0” for being inaccurate, though of course he said so while promoting Salesforce’s own competing AI products. Salesforce wants to let its customers create custom agents for jobs like answering customer support emails or prospecting for new clients. Enterprise AI applications are more promising than general-purpose chatbots like ChatGPT because companies typically restrict the models to their own internal data, which makes it harder for them to make up facts and figures.

Overall, white-collar workers still don’t seem to be adopting chatbots like ChatGPT or Claude. Reception to Microsoft’s Copilot assistant has been lukewarm, with only a tiny fraction of Microsoft 365 customers paying the $30 a month for access to AI tools. But Microsoft has reoriented its entire company around this AI boom, and it needs to show investors a return on that investment. So agents are the latest attempt at selling a use case.

The biggest problem, as always, is that AI chatbots produce a lot of output that’s factually inaccurate, poor in quality, or reads like it obviously wasn’t written by a human. The time it takes to correct and clean up a bot’s output can nearly negate whatever efficiency it produced in the first place. That’s fine for going down rabbit holes in your spare time, but in the workplace it’s not acceptable to produce error-riddled work. Because Claude’s computer use feature is open-ended, you should be cautious about letting it go wild through your email, only for it to send people gibberish in response, or botch some other task that you then have to go back and fix. The fact that OpenAI itself admits most of its active users are students sort of says it all.

But again, a bot that can control your computer could ultimately be good at completing repetitive or boring tasks that don’t require much hand-holding.

Anthropic itself admits, in a tweet about the new functionality, that computer use should be tested with “low-risk tasks.”

Original source: Gizmodo
