AI agent for dirty work? The Claude model can change the face of office work

When you look at the development of artificial intelligence, you might get the impression that this is a technology that is going to turn our lives upside down. Many people see it as a real threat to their jobs, which may simply become automated over time. Of course, this is not a completely irrational way of thinking.

Some time ago, NVIDIA CEO Jensen Huang said that his company's goal is to introduce up to 100 million AI-based assistants. The idea is that work will become more efficient because people will no longer have to focus on routine and often unproductive activities. It sounds interesting, but he’s not the only one with such ambitions.

Anthropic recently came up with a very similar idea and presented something equally interesting. This time we are talking about assistants that would run on our computers and do all the “dirty work” for us. How exactly would this work?

The Claude model from Anthropic as a tool to control our computer

The new version of Claude can do something that many other models will not be able to achieve for a long time.

A few days ago, Anthropic officially announced that it had achieved something truly special. As presented, the latest version of their AI model called Claude has been adapted to perform a number of tasks on a computer. At first glance, it’s nothing extraordinary, but it has a lot of potential.

We’re talking specifically about opening browsers, searching for things on the Internet, or entering data using a mouse and keyboard. In fact, these are very basic things, but at this stage they are sufficient for this model to significantly increase its usefulness.

Claude’s new capabilities were shown during a special demonstration prepared for Wired’s editorial staff. For example, the model was asked to plan a trip to watch the sunset at the Golden Gate Bridge in San Francisco.

A robot helping humans with everyday office work? A vision of the future that may be within our reach.

The result? Claude used Chrome to find the right information (even the perfect place to watch the sunset), then opened the Calendar app, scheduled the event, and shared it with the person who was invited on the trip.

The second challenge was to build a simple website that would also serve to promote the model itself. Claude coped with this task as well, using Microsoft’s Visual Studio Code. One drawback, however, was that the website initially looked too much like something from the 1990s.

Interestingly, when the model was asked to correct the errors that appeared, it returned to the code editor, found the relevant bugs, and removed them. In this way, the AI created a website for itself without much programming intervention from humans.

Claude: the great hope of office automation?

Jared Kaplan, Anthropic’s chief science officer and an associate professor at Johns Hopkins University, said this is how we enter a new phase of AI development. Thanks to the company’s innovation, models can use the digital tools available to humans and thus make people’s work significantly easier.

Another company representative, Mike Krieger, revealed that the entire company hopes that AI-based agents will automate certain routine tasks over time. This is, of course, meant to translate into greater productivity for people, who can then focus on something more demanding. Krieger himself admitted that if an AI model did the “dirty work” for him, he would go and play the guitar.

According to information provided by the company, several other companies are already testing the model to explore automation. Canva (design and editing assistance) and Replit (coding tasks) are mentioned specifically.

The group of early adopters is larger than that and will probably only grow over time. This shows that there is considerable interest in such technologies among companies, which see them as a path to greater efficiency and effectiveness.

There is still a long way to go

The idea is ambitious, but the execution still requires some refinement. Despite everything, Claude appears to be an extremely interesting tool.

Although Claude’s abilities are already impressive, there is still a long way to go before it can be fully deployed as a full-fledged AI assistant. After all, we are talking about something that has to be reliable so as not to cause costly losses for companies. For now, that level of reliability is out of reach.

Even though Anthropic claims that its model currently outperforms models like GPT-4 and Gemini, it is still “highly imperfect.” Tests on the OSWorld benchmark have shown that Claude does perform better than the competition (14.9% of tasks completed correctly versus 7.7% for GPT-4), but it is still far from human precision.
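
To put those two percentages in perspective, here is a minimal sketch of the arithmetic. It uses only the two success rates quoted above; the variable and function names are made up for illustration, and nothing here comes from Anthropic or the OSWorld authors.

```python
# Success rates quoted in the article: percent of OSWorld tasks completed.
claude_rate = 14.9
other_rate = 7.7  # the competing model's result cited above


def compare(a, b):
    """Return (gap in percentage points, ratio a / b), both rounded."""
    return round(a - b, 1), round(a / b, 2)


gap_points, ratio = compare(claude_rate, other_rate)
failed_share = round(100 - claude_rate, 1)  # tasks Claude did not complete

print(f"Gap: {gap_points} percentage points")
print(f"Ratio: {ratio}x")
print(f"Tasks not completed by Claude: {failed_share}%")
```

In other words, Claude completes roughly twice as many tasks as its rival, yet it still fails on about 85% of them, which is why both "better than the competition" and "highly imperfect" can be true at once.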

However, the potential is considerable, and the direction of further development will be dictated by the companies interested in such tools. It is quite possible that, in time, AI computer assistants of this kind will become standard in many typical office jobs. For now, however, the road there seems long and extremely bumpy.