Anthropic has introduced the Claude 3 model family, which it claims outperforms other industry models such as GPT-4. The family consists of three models, Haiku, Sonnet, and Opus, in ascending order of capability, each offering a different balance of intelligence, speed, and cost.
Anthropic published comparative data showing Opus ahead of OpenAI's GPT-4 in every category evaluated. For instance, on graduate-level expert reasoning (GPQA), Opus scores 50.4% against GPT-4's 35.7%, a lead of 14.7 percentage points. Similarly, Opus achieves 95% on basic math tests against GPT-4's 92%, and scores 86.8% on MMLU knowledge, slightly above GPT-4's 86.4%.
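The gap between two benchmark scores can be read either as a difference in percentage points or as a relative improvement, and the two numbers differ considerably. The short sketch below works through both for the three scores quoted above; the function name `gaps` and the dictionary layout are illustrative choices, not anything published by Anthropic.

```python
# Scores quoted in the article, in percent: (Opus, GPT-4).
scores = {
    "GPQA": (50.4, 35.7),
    "Basic math": (95.0, 92.0),
    "MMLU": (86.8, 86.4),
}

def gaps(scores):
    """Return (percentage-point gap, relative gap in %) for each benchmark."""
    result = {}
    for name, (opus, gpt4) in scores.items():
        points = round(opus - gpt4, 1)                    # absolute difference
        relative = round(100 * (opus - gpt4) / gpt4, 1)   # relative to GPT-4's score
        result[name] = (points, relative)
    return result

for name, (points, relative) in gaps(scores).items():
    print(f"{name}: +{points} points ({relative}% relative)")
```

On GPQA, for example, the 14.7-point lead corresponds to a relative improvement of roughly 41% over GPT-4's score, which is why the unit matters when quoting such figures.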
The Claude 3 models also boast multimodal capabilities, enabling them to process a wide range of visual formats, including photographs, tables, graphs, and technical diagrams.
Additionally, Claude 3 demonstrates improved contextual understanding, making it less likely to refuse innocuous user requests. Anthropic also emphasizes maintaining high accuracy at scale, which it evaluates using complex factual questions that target known model weaknesses.
Responses are classified as correct, incorrect, or admissions of uncertainty, where the model says it does not know the answer instead of providing incorrect information. By this measure, Anthropic reports that Opus is twice as accurate as Claude 2.1. Regarding responsible AI development, Anthropic underscores that the Claude 3 model family has been designed to be as reliable as possible, with dedicated teams identifying and mitigating risks such as misinformation and autonomous replication.
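The three-way grading of responses described above (correct, incorrect, or an admission of uncertainty) can be tallied in a few lines. This is only a minimal sketch: the label strings, the `summarize` helper, and the sample grades are all hypothetical and do not reproduce Anthropic's evaluation or its data.

```python
from collections import Counter

# Hypothetical label names for the three response categories.
LABELS = ("correct", "incorrect", "unsure")

def summarize(grades):
    """Return the share of each label in a list of graded responses."""
    if not grades:
        raise ValueError("no graded responses")
    counts = Counter(grades)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    return {label: counts[label] / len(grades) for label in LABELS}

# Made-up grades for illustration only; not Anthropic's data.
sample = ["correct", "unsure", "correct", "incorrect", "unsure"]
print(summarize(sample))
```

Tracking the "unsure" share separately is what makes it possible to see whether a model lowers its incorrect rate by answering better or simply by declining more often.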
Opus is the most capable model in the Claude 3 family and has earned praise for its "near-human" comprehension and fluency on complex tasks. However, Clement Delangue, co-founder and CEO of Hugging Face, commented on his X account:
AI/Claude is cool of course, but it is not human (despite its name). It's a bunch of lines of code and probabilistic algorithms trained on datasets behind an API to take an input and generate an output. Let's please all remember this for healthier debates and less fear-mongering!
Anthropic claims to have improved the security, transparency, and privacy of the models while reducing biases and promoting neutrality. Despite their advances over previous versions, the Claude 3 models remain at AI Safety Level 2 (ASL-2) under Anthropic's Responsible Scaling Policy.
The Opus and Sonnet models are available through the Claude API for developers, with Sonnet also accessible on Amazon Bedrock and in private preview on Google Cloud's Vertex AI Model Garden.