Monday, December 23, 2024

Anthropic’s new AI can use computers like a human, redefining automation for enterprises

Anthropic, the AI research and safety company, has announced a new suite of capabilities—including an upgraded version of its flagship AI model, Claude 3.5 Sonnet, and a new model, Claude 3.5 Haiku—that could transform how businesses automate complex workflows. But the most striking development in this release is a new feature: Claude can now use a computer like a human, navigating screens, clicking buttons, and typing text.

This new feature, called “Computer Use,” could have far-reaching implications for industries that rely on repetitive tasks involving multiple applications and tabs. From data entry to research to customer service, the range of applications is broad and could reshape entire industries.

AI moves from text to screen interaction

Since its founding, Anthropic has focused on creating AI models that are safe, reliable, and capable of complex reasoning. With Claude 3.5 Sonnet and Haiku, the company is expanding its models’ capabilities even further. The new “Computer Use” feature allows the AI to perform tasks that were previously handled only by human workers, such as opening applications, interacting with interfaces, and filling out forms.

“Computer use capabilities have the potential to change how tasks that require navigation across multiple applications are performed,” said Mike Krieger, Chief Product Officer at Anthropic, in an exclusive interview with VentureBeat. “This could lead to more innovative product experiences and streamlined back-office processes.” Krieger emphasized that the new capability is still in beta.

“We anticipate it being particularly useful for tasks like conducting online research, performing repetitive processes like testing new software, and automating complex multi-step tasks,” he said. “As the technology matures, it could enhance data analysis, visualization, and user interface interactions, potentially improving accessibility… We’re excited to see how developers will leverage this capability to create new tools and workflows that enhance productivity and user experiences across various sectors.”

Claude 3.5 Sonnet, Anthropic’s newest AI model, autonomously completes a vendor request form by retrieving required information from a CRM system, showcasing its ability to perform multi-step tasks across different software platforms. (Credit: Anthropic)

Early adopters see potential

Anthropic’s early partners, including GitLab, Canva, and Replit, are already benefiting from Claude 3.5 Sonnet’s new features. GitLab, which specializes in software development and security, has been testing the model for automating tasks in its development pipeline. According to the company, Claude has improved reasoning capabilities by up to 10% without slowing down performance, making it well-suited for complex, multi-step processes like software testing and deployment.

Replit, a coding platform, has gone a step further. Michele Catasta, President of Replit, said the model “opens the door to creating a powerful autonomous verifier that can evaluate apps while they’re being built.” This could ease bottlenecks in software development, where testing often delays project timelines.

Meanwhile, Canva, the graphic design platform, is exploring how Claude’s computer use skills could speed up design creation and editing. Danny Wu, Head of AI Products at Canva, said in a statement, “We’re discovering time-savings within our team that could be game-changing for users.”

What does “Computer Use” actually mean?

What sets this new capability apart from traditional automation tools is that Claude isn’t confined to specific workflows or software programs. Instead, it can “see” a screen using screenshots, interact with various applications, and adapt to different tasks as they come up. This flexibility makes it more versatile than current robotic process automation (RPA) technologies.

For example, in a demo shared by Anthropic, Claude helps complete a vendor request form for Ant Equipment Co. In the video, Claude starts by taking a screenshot of the computer screen, identifies that some necessary information is missing from a spreadsheet, then navigates to a CRM system, locates the required data, and fills out the form—all without human intervention.
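The screenshot, decide, act cycle described in the demo can be sketched as a small loop. This is a minimal simulation, not Anthropic’s implementation: `FakeDesktop`, `Action`, `decide`, and `run_loop` are hypothetical names, the “screenshot” is just a dictionary of form fields, the contact value is a placeholder, and a hard-coded rule stands in for the model.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str          # "type" or "done"
    target: str = ""   # form field to fill
    text: str = ""     # text to enter


@dataclass
class FakeDesktop:
    """Hypothetical stand-in for a real screen: a form plus a CRM record."""
    form: dict = field(default_factory=lambda: {"vendor": "", "contact": ""})
    crm: dict = field(default_factory=lambda: {
        "vendor": "Ant Equipment Co.",
        "contact": "placeholder-contact",  # invented placeholder value
    })

    def screenshot(self) -> dict:
        # A real system would return pixels; here it is the form state.
        return dict(self.form)

    def execute(self, action: Action) -> None:
        if action.kind == "type":
            self.form[action.target] = action.text


def decide(observation: dict, crm: dict) -> Action:
    """Hard-coded rule standing in for the model: fill the first empty field."""
    for name, value in observation.items():
        if not value:
            return Action("type", target=name, text=crm[name])
    return Action("done")


def run_loop(desktop: FakeDesktop, max_steps: int = 10) -> int:
    """Observe, decide, act until the form is complete; return steps taken."""
    for step in range(max_steps):
        observation = desktop.screenshot()
        action = decide(observation, desktop.crm)
        if action.kind == "done":
            return step
        desktop.execute(action)
    return max_steps


desktop = FakeDesktop()
steps = run_loop(desktop)
```

The step cap reflects a common design choice for agent loops: the run stops even if the policy never reports that it is done.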

This level of automation could have major implications for industries like finance, legal services, and customer support, where tasks often involve switching between multiple systems and applications. “Claude could open spreadsheets, run analyses, and create visualizations. For customer service, it could navigate CRM systems to quickly find and update customer information,” Krieger told VentureBeat.

Security and privacy concerns

However, the ability of AI to control a computer raises serious security and privacy concerns. Anthropic has built several safeguards into the system to address these risks. The company made it clear that Claude cannot access a computer without a developer providing the necessary tools.

“Claude cannot ‘just use your computer.’ The computer use feature requires developers to provide tools like a screenshot tool and an action-execution layer, which allows Claude to perform mouse movements and keystrokes,” Krieger explained.
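Krieger’s point, that the model can act only through tools a developer supplies, can be illustrated with a small registry. This is a sketch under assumptions: `ToolRegistry` and the tool names (`screenshot`, `mouse_move`, `type`) are hypothetical and are not Anthropic’s API, and the tool functions only record events instead of touching a real screen.

```python
from typing import Any, Callable, Dict, List, Tuple


class ToolRegistry:
    """Hypothetical action-execution layer: only registered tools can run."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self._tools[name] = fn

    def call(self, name: str, **kwargs: Any) -> Any:
        if name not in self._tools:
            # Anything the developer did not provide is simply unavailable.
            raise PermissionError(f"tool {name!r} was not provided")
        return self._tools[name](**kwargs)


events: List[Tuple] = []  # records what the simulated tools were asked to do


def take_screenshot() -> str:
    return "<screenshot placeholder>"


def move_mouse(x: int, y: int) -> str:
    events.append(("move", x, y))
    return "ok"


def press_keys(text: str) -> str:
    events.append(("keys", text))
    return "ok"


registry = ToolRegistry()
registry.register("screenshot", take_screenshot)
registry.register("mouse_move", move_mouse)
registry.register("type", press_keys)

result = registry.call("mouse_move", x=10, y=20)
```

A request for any tool outside the registry, such as a hypothetical `delete_files`, raises `PermissionError` rather than running.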

Anthropic is also taking a cautious approach by releasing the feature in a limited public beta, available only through an API. This allows developers to test it in controlled environments before it becomes more widely available. The company has also developed classifiers to detect misuse and prevent the AI from interacting with sensitive websites, such as government portals. “Our methods to scan for prohibited activity are designed to safeguard customer data privacy and confidentiality,” Krieger said.
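Anthropic has not published how these classifiers work, so the following is only a rule-based stand-in for the idea of refusing to interact with sensitive sites. The function `is_allowed` and the `BLOCKED_SUFFIXES` list are assumptions made for illustration; a production system would likely rely on far more than a domain suffix.

```python
from urllib.parse import urlparse

# Illustrative only: treat hosts under .gov as sensitive.
BLOCKED_SUFFIXES = (".gov",)


def is_allowed(url: str) -> bool:
    """Return False for URLs with no host or with a blocked host suffix."""
    host = urlparse(url).hostname or ""
    if not host:
        return False  # refuse anything that cannot be identified
    return not any(host.endswith(suffix) for suffix in BLOCKED_SUFFIXES)


checks = {
    "https://example.com/form": is_allowed("https://example.com/form"),
    "https://portal.example.gov/login": is_allowed("https://portal.example.gov/login"),
}
```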

A new era for office automation?

In the near term, businesses could see immediate productivity gains in areas like data entry, customer service, and IT support. But as the technology matures, the potential applications could extend far beyond these initial use cases.

Imagine a world where AI handles complex legal processes, from reviewing contracts to completing compliance forms. Or envision AI assisting doctors in navigating electronic health records and diagnosing patients by cross-referencing medical databases.

Claude’s new “Computer Use” feature brings us closer to a future where AI can perform a wide range of tasks that span different software applications and systems. This gives it a level of flexibility that was previously unimaginable for AI technologies, which were often confined to specific, narrow tasks.

Proceeding with caution

Still, it’s important to remember that this capability is in its early stages. Claude’s ability to use computers is not yet perfect, and Anthropic acknowledges that it struggles with tasks that humans find trivial, like scrolling or zooming. “Since it’s still in beta and can occasionally miss short-lived actions, we recommend human oversight for high-stakes tasks,” Krieger said.
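One way a developer might apply the recommendation of human oversight is an approval gate in front of risky actions. The sketch below is an assumption-laden illustration: the `HIGH_STAKES` set, the action names, and `execute_with_oversight` are all hypothetical, and the `approve` callback stands in for a real human review step.

```python
from typing import Callable

# Hypothetical list of actions a team might consider high-stakes.
HIGH_STAKES = {"submit_payment", "delete_record", "send_email"}


def execute_with_oversight(
    action: str,
    run: Callable[[], str],
    approve: Callable[[str], bool],
) -> str:
    """Run low-risk actions directly; require approval for high-stakes ones."""
    if action in HIGH_STAKES and not approve(action):
        return "skipped: awaiting human approval"
    return run()


# Usage: a reviewer who rejects everything still lets routine actions through.
reject_all = lambda action: False
routine = execute_with_oversight("scroll", lambda: "done", reject_all)
blocked = execute_with_oversight("submit_payment", lambda: "done", reject_all)
```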

That said, Anthropic is committed to refining the technology. “We’ve developed new classifiers and prompt analysis tools to identify potential misuse of computer use features,” Krieger added, indicating the company is serious about addressing the risks associated with this powerful technology.

What’s next?

As AI continues to evolve, the way we work may change dramatically. For enterprise decision-makers, the benefits of automating multi-step workflows could be substantial. But this also raises questions about the future of jobs that rely on these very tasks.

For now, Anthropic is focused on the immediate benefits of Claude 3.5 Sonnet and Haiku while ensuring the technology is deployed responsibly. Krieger reiterated that the company is eager to see what tools and workflows developers build with the capability.

With companies like GitLab, Canva, and Replit already exploring its potential, it’s clear that AI is poised to play an even bigger role in the future of work—perhaps sooner than we think.
