LLM leaderboard

Compare large language models for performance, price and more, to find the best match for your needs.

Leaderboard
Largest context
  1. Gemini 2.5 Pro
  2. Gemini 2.0 Flash
  3. Gemini 2.0 Flash-Lite
Highest output tokens
  1. o1-pro
  2. o1
  3. o3-mini
Least expensive
  1. R1 Distill Llama 8B
  2. Ministral 3B
  3. Gemini 1.5 Flash-8B
Model comparison
| Model | Input price / 1M tokens | Output price / 1M tokens | Context window (tokens) | Output token limit (tokens) | Reasoning model | Open source |
| --- | --- | --- | --- | --- | --- | --- |
| Gemini 1.5 Flash-8B | $0.04 | $0.15 | 1000000 | 8192 | | |
| Ministral 3B | $0.04 | $0.04 | 128000 | 4096 | | |
| R1 Distill Llama 8B | $0.04 | $0.04 | 128000 | 8000 | | |
| Qwen Turbo | $0.05 | $0.20 | 1000000 | 8192 | | |
| Coder V2 Lite | $0.06 | $0.18 | 128000 | 8000 | | |
| Gemini 1.5 Flash | $0.07 | $0.30 | 1000000 | 8192 | | |
| Gemini 2.0 Flash | $0.10 | $0.40 | 1000000 | 8192 | | |
| Llama 3.1 8B | $0.10 | $0.10 | 128000 | 2048 | | |
| Ministral 8B | $0.10 | $0.10 | 128000 | 4096 | | |
| Gemma 2 9B | $0.12 | $0.15 | 8000 | 8192 | | |
| Coder V2 | $0.14 | $0.28 | 128000 | 8000 | | |
| GPT-4o mini | $0.15 | $0.60 | 128000 | 16384 | | |
| GPT-4o mini Audio | $0.15 | $0.60 | 128000 | 16384 | | |
| Gemma 2 27B | $0.17 | $0.51 | 8000 | 8192 | | |
| Mistral Saba | $0.20 | $0.60 | 32000 | 4096 | | |
| Claude 3 Haiku | $0.25 | $1.25 | 200000 | 4096 | | |
| V3 | $0.27 | $1.10 | 128000 | 8000 | | |
| Codestral | $0.30 | $0.90 | 128000 | 4096 | | |
| R1 Distill Qwen 32B | $0.30 | $0.30 | 128000 | 8000 | | |
| GPT-3.5 Turbo | $0.50 | $1.50 | 16385 | 4096 | | |
| Llama 2 Chat | $0.50 | $0.25 | 4096 | 2048 | | |
| QwQ 32B | $0.55 | $0.75 | 131000 | 8192 | | |
| Llama 3.3 70B | $0.59 | $0.70 | 128000 | 2048 | | |
| GPT-4o mini Realtime | $0.60 | $2.40 | 128000 | 4096 | | |
| Llama 3.2 | $0.60 | $0.60 | 128000 | 2048 | | |
| Gemini 2.0 Flash-Lite | $0.70 | $0.30 | 1000000 | 8192 | | |
| R1 Distill Llama 70B | $0.72 | $0.99 | 128000 | 8000 | | |
| Claude 3.5 Haiku | $0.80 | $4.00 | 200000 | 8192 | | |
| Qwen 2.5 Coder 32B | $0.80 | $0.80 | 131000 | 8192 | | |
| R1 Distill Qwen 14B | $0.88 | $0.88 | 128000 | 8000 | | |
| Sonar Reasoning | $1.00 | $5.00 | 127000 | N/A | | |
| Sonar | $1.00 | $1.00 | 127000 | N/A | | |
| o3-mini | $1.10 | $4.40 | 200000 | 100000 | | |
| o1-mini | $1.10 | $4.40 | 128000 | 65536 | | |
| Qwen 2.5 Max | $1.60 | $6.40 | 32000 | 8192 | | |
| Mistral Large | $2.00 | $6.00 | 128000 | 4096 | | |
| Pixtral Large | $2.00 | $6.00 | 128000 | 4096 | | |
| Sonar Reasoning Pro | $2.00 | $8.00 | 128000 | N/A | | |
| Sonar Deep Research | $2.00 | $8.00 | 200000 | N/A | | |
| GPT-4o | $2.50 | $10.00 | 128000 | 16384 | | |
| GPT-4o Audio | $2.50 | $10.00 | 128000 | 16384 | | |
| Claude 3.7 Sonnet | $3.00 | $15.00 | 200000 | 8192 | | |
| Claude 3.5 Sonnet | $3.00 | $15.00 | 200000 | 8192 | | |
| Sonar Pro | $3.00 | $15.00 | 200000 | N/A | | |
| Llama 3.1 405B | $3.50 | $3.50 | 128000 | 2048 | | |
| GPT-4o Realtime | $5.00 | $20.00 | 128000 | 4096 | | |
| GPT-4 Turbo | $10.00 | $30.00 | 128000 | 4096 | | |
| o1 | $15.00 | $60.00 | 200000 | 100000 | | |
| Claude 3 Opus | $15.00 | $75.00 | 200000 | 4096 | | |
| GPT-4 | $30.00 | $60.00 | 8192 | 8192 | | |
| GPT-4.5 Preview | $75.00 | $150.00 | 128000 | 16384 | | |
| o1-pro | $150.00 | $600.00 | 200000 | 100000 | | |
| Gemini 2.5 Pro | N/A | N/A | 1000000 | 64000 | | |
| Gemma 3 1B | N/A | N/A | 32000 | 8192 | | |
| Gemma 3 27B | N/A | N/A | 128000 | 8192 | | |
| Qwen 2.5 72B | N/A | N/A | 131000 | 8192 | | |
Key definitions
Price: Input price per token is the cost of processing each token in the prompt sent to an LLM, while output price per token is the cost of each token the model generates in response. The price shown in the leaderboard section is a blended price, using a typical ratio of 3:1 of input to output usage. Some models show a price of 0, which can happen during a limited free trial.
Context window: The maximum amount of text (in tokens) the model can process at once, including both input and generated output. It determines how much prior conversation or document history the model can "remember" within a single interaction.
Output token limit: The maximum number of tokens an LLM can generate in a single response. This limit is set by the model's context window and by provider policies, and it caps the length of the output.
Reasoning model: A reasoning LLM is a model designed to go beyond pattern recognition and perform logical inference and problem-solving. This covers tasks such as complex mathematics, planning, and generating "chain of thought" explanations. In short, it aims to work through problems rather than just reproduce text.
Open source: Some LLMs are published under an open-source license, which lets developers access and modify the code and host the models themselves, on premises or in the cloud. Others, such as Mistral, are available for self-hosting under license.
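
The Price and Context window definitions can be made concrete with a small Python sketch. It rests on two assumptions: the page does not give its blending formula, so `blended_price` treats the 3:1 ratio as a simple weighted average, and `max_output_tokens` treats the context window as a budget shared by prompt and response. Both function names are hypothetical, and the sample figures are the GPT-4o mini row from the comparison table.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted-average price per 1M tokens for an input:output usage ratio.

    Assumption: the leaderboard's 3:1 blend is a plain weighted average.
    """
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


def max_output_tokens(context_window: int, output_limit: int,
                      prompt_tokens: int) -> int:
    """Tokens the model can still generate once the prompt is counted.

    Assumption: prompt and response share the context window, and the
    response is additionally capped by the output token limit.
    """
    remaining = context_window - prompt_tokens
    return max(0, min(output_limit, remaining))


# GPT-4o mini: $0.15 input and $0.60 output per 1M tokens
print(round(blended_price(0.15, 0.60), 4))       # 0.2625

# GPT-4o mini: 128000-token context, 16384-token output limit,
# with a 120000-token prompt already in the window
print(max_output_tokens(128000, 16384, 120000))  # 8000
```

In the second call the prompt leaves only 8000 tokens of context, so the effective cap falls below the advertised output token limit of 16384.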

LLM Leaderboard FAQ