Question 1

How is blended cost calculated?

Accepted Answer

Blended cost combines input and output token pricing into a single $/M figure, weighted by a typical mix of input and output tokens. The calculator above lets you override the default assumption (10K input / 2K output) with your actual workload to see model-specific costs.
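
The page does not spell out the exact weighting, so the following is only a minimal sketch that assumes a plain token-weighted average of the two rates. The function name, its parameters, and the $1/M and $5/M prices are hypothetical and chosen for illustration; they are not taken from the dataset.

```python
def blended_cost_per_million(input_price, output_price,
                             input_tokens=10_000, output_tokens=2_000):
    """Token-weighted average price in $ per 1M tokens.

    Assumption: the blend is a simple weighted average of the input and
    output rates, weighted by the token counts of one request.
    Prices are given in $ per 1M tokens.
    """
    total_tokens = input_tokens + output_tokens
    if total_tokens <= 0:
        raise ValueError("workload must contain at least one token")
    weighted = input_tokens * input_price + output_tokens * output_price
    return weighted / total_tokens


# Hypothetical model priced at $1/M input and $5/M output,
# using the default 10K input / 2K output workload.
print(round(blended_cost_per_million(1.0, 5.0), 2))  # 1.67
```

Changing `input_tokens` and `output_tokens` plays the same role as overriding the default workload in the calculator: an output-heavy workload pulls the blended figure toward the output rate.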

Question 2

What's the cheapest LLM in 2026?

Accepted Answer

Several open-weight models from vendors including Google, Alibaba, DeepSeek, and OpenAI cost under $0.10/M tokens. Gemma 4 31B, DeepSeek V4 Flash, and gpt-oss-20B all land under that threshold while still scoring above 24 on the Artificial Analysis Intelligence Index. For frontier-tier capability under $1/M tokens, DeepSeek V4 Pro and Kimi K2.6 are typically the price-performance leaders.

Question 3

How is the Intelligence score calculated?

Accepted Answer

The Intelligence score uses the Artificial Analysis Intelligence Index methodology: a composite of GPQA Diamond, AIME 2025, SWE-bench, MMLU-Pro, and other public benchmarks, normalized to a 0-100 scale. Scores in the high 50s currently mark the frontier (Claude Opus 4.7, GPT-5.5, Gemini 3.1 Pro Preview), with most production models sitting in the 30-50 range.

Question 4

How much does Claude Opus 4.7 cost?

Accepted Answer

Claude Opus 4.7 lists at $15/M input tokens and $75/M output tokens, blending to approximately $4.10/M at a typical 10K/2K input-output ratio. At 1,000 requests/month with this workload, expect ~$50/month; at 100,000 requests, ~$5,000/month. Use the calculator above to model your specific workload.

Question 5

Which LLM has the longest context window?

Accepted Answer

Gemini 3.1 Pro Preview supports up to 10M tokens of context — the longest currently available. Claude Opus 4.7 supports 1M, GPT-5.5 supports 922K, and several open-weight models including Qwen 3.7 Max and DeepSeek V4 Pro support 1M. For most production workloads, 200K-256K is sufficient.

Question 6

How often is this data updated?

Accepted Answer

The dataset is verified weekly against vendor pricing pages and the Artificial Analysis leaderboard. New model releases are typically added within 7 days of public availability. The "Last update" badge in the stats band shows the most recent verification date.

Question 7

What's the difference between input and output token pricing?

Accepted Answer

Most LLM providers charge separately for input tokens (what you send) and output tokens (what the model generates). Output tokens are typically 3-5× more expensive than input tokens. The "blended" cost is a single normalized figure that approximates the per-token cost at a typical usage ratio. It is useful for high-level comparison, but production budgets should be built from the actual two-rate pricing.
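
As a rough illustration of budgeting from the two separate rates, the sketch below multiplies each token count by its own price. The function names and the $1/M input and $4/M output prices are hypothetical; real provider pricing may also include caching discounts or tiered rates, which this sketch ignores.

```python
def cost_per_request(input_price, output_price, input_tokens, output_tokens):
    """Dollar cost of one request; prices are in $ per 1M tokens."""
    input_cost = input_tokens * input_price
    output_cost = output_tokens * output_price
    return (input_cost + output_cost) / 1_000_000


def monthly_cost(input_price, output_price, input_tokens, output_tokens,
                 requests_per_month):
    """Dollar cost per month for a fixed per-request workload."""
    per_request = cost_per_request(
        input_price, output_price, input_tokens, output_tokens
    )
    return requests_per_month * per_request


# Hypothetical model: $1/M input, $4/M output (output is 4x input).
# Workload: 10K input + 2K output tokens, 1,000 requests per month.
print(f"${monthly_cost(1.0, 4.0, 10_000, 2_000, 1_000):.2f}")  # $18.00
```

Because each rate is applied to its own token count, the estimate stays correct when the workload shifts toward longer outputs, which a fixed blended figure would not capture.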

Question 8

What's the difference between Frontier, Advanced, Mainstream, and Efficient tiers?

Accepted Answer

Tiers reflect Intelligence Index ranges. Frontier (50+) is the current capability ceiling — best for complex reasoning, agentic work, and high-stakes outputs. Advanced (40-49) handles most production work well. Mainstream (30-39) suits routine tasks where cost matters more than peak capability. Efficient (under 30) is for high-volume, low-stakes work where cost dominates.
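
The tier boundaries above can be sketched as a small lookup. The function name is hypothetical, and the handling of fractional scores (for example, treating 49.5 as Advanced) is an assumption, since the ranges are stated only as whole numbers.

```python
def tier_for_score(score):
    """Map an Intelligence Index score (0-100) to a tier name.

    Assumption: each lower bound is inclusive, so 50 is Frontier,
    40 is Advanced, and 30 is Mainstream.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 50:
        return "Frontier"
    if score >= 40:
        return "Advanced"
    if score >= 30:
        return "Mainstream"
    return "Efficient"


for score in (57, 44, 30, 24):
    print(score, tier_for_score(score))
# 57 Frontier
# 44 Advanced
# 30 Mainstream
# 24 Efficient
```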

Question 9

Is there an API for accessing this data programmatically?

Accepted Answer

The REST API and MCP server are launching Q3 2026. Join the waitlist above to be notified when they go live. Both will be free for non-commercial use with rate limits, and provide the same dataset surfaced on this page plus historical pricing once the trends archive is built.

Question 10

How do I choose the right model for my use case?

Accepted Answer

Start with capability requirements: frontier reasoning needs frontier-tier models, while routine classification often runs fine on the Efficient tier. Then look at cost at your expected volume, because the gap between $0.20/M and $4/M compounds quickly at scale. Finally, consider latency and context window for your specific UX. The calculator above ranks by total cost for your scenario; sort the full table by Intelligence, Speed, or Latency to optimize for other dimensions.
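
The first two steps (filter by capability, then compare cost for your workload) can be sketched as follows. The `Model` class, the `rank_by_cost` function, and every model name, score, and price in `CATALOG` are made up for illustration; they are not entries from the dataset, and the calculator's actual ranking logic may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    intelligence: float   # Intelligence Index score, 0-100
    input_price: float    # $ per 1M input tokens
    output_price: float   # $ per 1M output tokens


def request_cost(model, input_tokens, output_tokens):
    """Dollar cost of one request for the given model."""
    return (input_tokens * model.input_price
            + output_tokens * model.output_price) / 1_000_000


def rank_by_cost(models, min_intelligence, input_tokens, output_tokens):
    """Keep models at or above the capability floor, cheapest first."""
    eligible = [m for m in models if m.intelligence >= min_intelligence]
    return sorted(
        eligible,
        key=lambda m: request_cost(m, input_tokens, output_tokens),
    )


# Hypothetical catalog; names, scores, and prices are invented.
CATALOG = [
    Model("large-example", 57, 5.00, 25.00),
    Model("mid-pricey-example", 45, 1.00, 4.00),
    Model("mid-example", 42, 0.50, 2.00),
    Model("small-example", 25, 0.05, 0.20),
]

# Require at least Advanced-tier capability for a 10K/2K workload.
for model in rank_by_cost(CATALOG, 40, 10_000, 2_000):
    print(model.name)
# mid-example
# mid-pricey-example
# large-example
```

Latency and context window could be added as further filters in the same way before sorting.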
