Anthropic, a leading artificial intelligence company, has announced significant upgrades to its Claude family of language models: an upgraded Claude 3.5 Sonnet and a new model, Claude 3.5 Haiku. The upgraded Claude 3.5 Sonnet delivers across-the-board improvements over its predecessor, particularly in coding, an area where it already led the field. Meanwhile, Claude 3.5 Haiku matches the performance of Claude 3 Opus, Anthropic’s prior largest model, on many evaluations, at the same cost and similar speed as the previous generation of Haiku.
In a groundbreaking move, Anthropic also introduces computer use capability in public beta, allowing developers to direct Claude to use computers like humans do – by looking at screens, moving cursors, clicking buttons, and typing text. This new capability has already been explored by companies such as Asana, Canva, Cognition, DoorDash, Replit, and The Browser Company, which are using it to automate tasks that require dozens or even hundreds of steps to complete.
The upgraded Claude 3.5 Sonnet is now available for all users, while the new Claude 3.5 Haiku will be released later this month.
Introducing Upgraded AI Models and Computer Use Capability
The field of artificial intelligence has taken a significant leap forward with the introduction of upgraded AI models, including Claude 3.5 Sonnet and Claude 3.5 Haiku, as well as a groundbreaking new capability in public beta: computer use. This innovative feature allows AI systems to interact with computers like humans do, enabling them to automate repetitive processes, build and test software, and conduct open-ended tasks like research.
Upgraded AI Models: Claude 3.5 Sonnet and Claude 3.5 Haiku
The new Claude 3.5 Sonnet model represents a significant improvement in AI-powered coding, with stronger reasoning capabilities and no added latency. This makes it an ideal choice for powering multi-step software development processes. The model has been tested by external experts, including GitLab, Cognition, and The Browser Company, who have reported substantial improvements in coding, planning, and problem-solving compared to the previous version.
Claude 3.5 Haiku, on the other hand, is a faster and more affordable model that surpasses even Claude 3 Opus, the largest model in the previous generation, on many intelligence benchmarks. It is particularly strong on coding tasks, scoring 40.6% on SWE-bench Verified, outperforming many agents using publicly available state-of-the-art models.
Computer Use Capability: Teaching AI Systems to Navigate Computers
The computer use capability is a fundamentally new approach that teaches AI systems general computer skills, allowing them to use a wide range of standard tools and software programs designed for humans. This is achieved through an API that enables Claude to perceive and interact with computer interfaces, translating instructions into computer commands.
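The perceive-and-act cycle described above can be pictured as a simple loop: observe the screen, choose an action, execute it, repeat. The sketch below is a toy illustration only and does not call Anthropic's API; `FakeDesktop`, `Action`, `scripted_policy`, and `run_agent` are hypothetical names invented for this example, and a scripted function stands in for the model.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Action:
    """One step the agent wants performed (hypothetical format)."""
    kind: str          # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


@dataclass
class FakeDesktop:
    """Stand-in for a real screen; it only records what was done to it."""
    focused: Optional[str] = None
    typed: str = ""
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> dict:
        # A real system would return pixels; here the "screenshot" is a dict.
        return {"focused": self.focused, "typed": self.typed}

    def execute(self, action: Action) -> None:
        if action.kind == "click":
            # Assumption for the toy: any click lands on the search box.
            self.focused = "search_box"
            self.log.append(f"click({action.x},{action.y})")
        elif action.kind == "type":
            self.typed += action.text
            self.log.append(f"type({action.text!r})")


def scripted_policy(observation: dict) -> Action:
    """Hypothetical stand-in for the model: picks the next action."""
    if observation["focused"] is None:
        return Action("click", x=200, y=40)
    if not observation["typed"]:
        return Action("type", text="hello")
    return Action("done")


def run_agent(desktop: FakeDesktop,
              policy: Callable[[dict], Action],
              max_steps: int = 10) -> int:
    """Observe, decide, act; returns the number of actions executed."""
    for step in range(max_steps):
        action = policy(desktop.screenshot())
        if action.kind == "done":
            return step
        desktop.execute(action)
    return max_steps
```

Calling `run_agent(FakeDesktop(), scripted_policy)` clicks once, types once, and then stops. The `max_steps` cap reflects the point that such tasks can take many steps, so a loop of this kind needs an explicit budget.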
While Claude’s current ability to use computers is imperfect, it has shown promising results on OSWorld, a benchmark that evaluates AI models’ ability to use computers like people do. Claude 3.5 Sonnet scored 14.9% in the screenshot-only category, notably better than the next-best AI system’s score of 7.8%. When afforded more steps to complete the task, Claude scored 22.0%.
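To put the OSWorld figures in perspective, it helps to separate the gap in percentage points from the relative ratio. The short sketch below does that arithmetic using only the scores quoted above; the helper name `compare` is made up for this example.

```python
def compare(score: float, baseline: float) -> tuple:
    """Return (gap in percentage points, ratio of score to baseline)."""
    gap_points = round(score - baseline, 1)
    ratio = round(score / baseline, 2)
    return gap_points, ratio


# Scores quoted in the article (percent of OSWorld tasks completed).
claude_screenshot_only = 14.9
next_best = 7.8
claude_more_steps = 22.0

print(compare(claude_screenshot_only, next_best))          # (7.1, 1.91)
print(compare(claude_more_steps, claude_screenshot_only))  # (7.1, 1.48)
```

So the screenshot-only score is roughly 1.9 times the next-best system's, while the absolute gap is about 7 percentage points.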
Safety Measures and Responsible Deployment
As with any new technology, there are potential risks associated with computer use, such as spam, misinformation, or fraud. To mitigate these risks, Anthropic has developed new classifiers that can identify when computer use is being used and whether harm is occurring. The company has also outlined a Responsible Scaling Policy to ensure the safe deployment of this technology.
Looking Ahead: Exploring New Possibilities
The introduction of these upgraded AI models and the computer use capability marks an exciting milestone in the development of artificial intelligence. As developers begin to explore the potential of this technology, it will be crucial to learn from initial deployments and better understand both the possibilities and implications of increasingly capable AI systems.
Together, the upgraded Claude 3.5 Sonnet, the new Claude 3.5 Haiku, and the computer use capability extend what developers can build with Claude. Prioritizing safety measures and responsible deployment will help ensure that these innovations benefit society as a whole.
