The AI Authority Leaderboard
Aggregated rankings from the world's most trusted AI evaluation labs.
AI Mastery Index: our weighted composite metric combining intelligence, coding capability, and cost-efficiency (a sketch of the weighting follows the table).
| Rank | Model | Provider | Score (pts) |
|---|---|---|---|
| 1 | GPT-5.5 Pro | OpenAI | 94.2 |
| 2 | Claude 4.7 Opus | Anthropic | 93.8 |
| 3 | Gemini 3.1 Ultra | Google | 92.5 |
| 4 | DeepSeek-V4 | DeepSeek | 91.8 |
| 5 | Llama 4 (405B) | Meta | 90.5 |
| 6 | Mistral Large 3 | Mistral | 89.2 |
| 7 | o1-preview | OpenAI | 88.5 |
| 8 | Claude 3.5 Sonnet | Anthropic | 88.0 |
| 9 | GPT-4o | OpenAI | 87.2 |
| 10 | Gemini 1.5 Pro | Google | 86.4 |
| 11 | Llama 3.1 405B | Meta | 85.0 |
| 12 | Grok-2 | xAI | 84.5 |
| 13 | Qwen 2 72B | Alibaba | 83.2 |
| 14 | Llama 3.1 70B | Meta | 82.5 |
| 15 | Command R+ | Cohere | 81.8 |
| 16 | GPT-4o-mini | OpenAI | 81.0 |
| 17 | Gemini 1.5 Flash | Google | 80.2 |
| 18 | Claude 3 Haiku | Anthropic | 78.5 |
| 19 | Mistral NeMo 12B | Mistral | 76.5 |
| 20 | Llama 3.1 8B | Meta | 75.0 |
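As a rough illustration of how a composite like the AI Mastery Index can be computed, here is a minimal Python sketch. The weights (0.4/0.4/0.2) and the example sub-scores are hypothetical placeholders, not the values behind the table above; only the weighted-average structure is what the metric describes.

```python
# Sketch of a weighted composite score in the spirit of the AI Mastery Index.
# The weights and sub-scores below are illustrative placeholders, not the
# actual values used for the leaderboard.

WEIGHTS = {
    "intelligence": 0.4,     # hypothetical weight
    "coding": 0.4,           # hypothetical weight
    "cost_efficiency": 0.2,  # hypothetical weight
}

def mastery_index(sub_scores: dict[str, float]) -> float:
    """Weighted average of sub-scores, each on a 0-100 scale."""
    assert set(sub_scores) == set(WEIGHTS), "missing or extra sub-scores"
    return sum(WEIGHTS[k] * sub_scores[k] for k in WEIGHTS)

# Example with made-up sub-scores:
print(mastery_index({"intelligence": 92.0, "coding": 88.0, "cost_efficiency": 85.0}))
# -> ~89.0 (0.4*92 + 0.4*88 + 0.2*85)
```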
Coding Performance
Measured via **HumanEval++** and **LiveCodeBench**. Reflects a model's ability to handle complex system-level refactoring and library integration.
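HumanEval-style benchmarks conventionally report pass@k: the probability that at least one of k sampled completions passes all unit tests. Below is a minimal sketch of the standard unbiased estimator from the original HumanEval paper; whether HumanEval++ uses exactly this estimator is an assumption on our part.

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: n samples drawn, c of them correct.

    Computes 1 - C(n - c, k) / C(n, k), i.e. the probability that a
    random size-k subset of the n samples contains at least one
    correct completion.
    """
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct sample
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)

# Example: 200 samples per problem, 30 of them correct, reporting pass@10.
print(round(pass_at_k(200, 30, 10), 4))
```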
Agentic Reasoning
Measured by our proprietary **AMSE-2026** benchmark, which scores success rates in 10-turn planning loops with self-correction and tool use.
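To make the shape of such an evaluation concrete, here is a minimal sketch of a 10-turn plan-act-reflect loop. Everything here (the `Agent` protocol, the `run_tool` stub, the `is_solved` check) is hypothetical scaffolding invented for illustration; AMSE-2026's actual harness is proprietary and not public.

```python
from typing import Callable, Protocol

class Agent(Protocol):
    """Hypothetical agent interface; the real AMSE-2026 harness is not public."""
    def propose(self, goal: str, history: list[str]) -> str: ...
    def reflect(self, history: list[str]) -> str: ...

def run_tool(action: str) -> str:
    """Hypothetical tool executor; stubbed to echo the action for this sketch."""
    return f"observation for: {action}"

def run_episode(agent: Agent, goal: str,
                is_solved: Callable[[str], bool], max_turns: int = 10) -> bool:
    """One episode: up to max_turns of plan -> act -> self-correct."""
    history: list[str] = []
    for _ in range(max_turns):
        action = agent.propose(goal, history)   # plan the next step
        observation = run_tool(action)          # tool use
        history.extend([action, observation])
        if is_solved(observation):              # task-specific success check
            return True
        history.append(agent.reflect(history))  # self-correction note
    return False  # success rate = fraction of episodes returning True
```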
Intelligence ROI
Calculated as Average Score / log10(Cost per 1M tokens). Higher values indicate better value for money.
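As a worked example under this formula (the prices here are made up for illustration): a model averaging 90 points at $10 per 1M tokens scores 90 / log10(10) = 90, while one averaging 85 points at $100 per 1M tokens scores 85 / log10(100) = 42.5. A minimal Python sketch:

```python
import math

def intelligence_roi(avg_score: float, cost_per_1m_tokens: float) -> float:
    """Intelligence ROI = average score / log10(cost per 1M tokens).

    Note: the denominator is only well-behaved for costs above $1 per
    1M tokens (log10(1) = 0, and cheaper models give a negative log).
    """
    return avg_score / math.log10(cost_per_1m_tokens)

# Illustrative prices, not real pricing data:
print(intelligence_roi(90.0, 10.0))   # -> 90.0
print(intelligence_roi(85.0, 100.0))  # -> 42.5
```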