GPT-4o mini

Cheap, fast small model for high-volume tasks.

Illustrative specs. Context window, modalities, output cost, and EU availability for GPT-4o mini are representative, last verified 1 Jun 2026. Verify against the provider before committing.

Hosting Foundry (Azure)

Context window 128,000 tokens

Modalities text and vision input

Output cost $0.60 / 1M tokens

Best for

Classification and routing
First-pass summarization
Cost-sensitive RAG

The selection wizard ranks GPT-4o mini against every other model for your latency, context, modality, cost, and residency needs — and shows where it wins and where something else fits better.

Open the LLM Selection Wizard

What is GPT-4o mini best for? It is a cheap, fast small model for high-volume tasks. It fits classification and routing, first-pass summarization, and cost-sensitive RAG.

What context window and modalities does GPT-4o mini support? GPT-4o mini handles up to 128,000 tokens of context and supports text and vision input. It runs on Foundry (Azure).

How much does GPT-4o mini cost? Around $0.60 per 1M output tokens (illustrative, verified 1 Jun 2026). Output tokens usually dominate the bill, but this figure covers output only, so verify input and cached pricing against the provider before budgeting.
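For a rough sense of scale, the output-cost figure can be turned into a monthly estimate. The sketch below is a minimal illustration, not a billing tool: the function name `estimate_output_cost` and the example workload (1,000,000 requests averaging 200 output tokens each) are assumptions made up for this example, and the rate is the illustrative figure from this page. Input and cached-token charges are deliberately left out because this page does not state them.

```python
# Illustrative output rate from this page (USD per 1M output tokens).
# Verify against the provider before budgeting.
OUTPUT_USD_PER_MILLION = 0.60


def estimate_output_cost(
    requests: int,
    avg_output_tokens: int,
    usd_per_million: float = OUTPUT_USD_PER_MILLION,
) -> float:
    """Return the estimated output-token cost in USD.

    Covers output tokens only; input and cached-token pricing
    are not included and must be added separately.
    """
    total_output_tokens = requests * avg_output_tokens
    return total_output_tokens / 1_000_000 * usd_per_million


# Hypothetical workload: 1,000,000 requests, 200 output tokens each.
monthly_cost = estimate_output_cost(requests=1_000_000, avg_output_tokens=200)
print(f"Estimated output cost: ${monthly_cost:.2f}")
```

With these assumed numbers the workload produces 200 million output tokens, which at the illustrative rate comes to about $120. Swapping in your own request volume and average response length gives a first-pass figure to compare against other models.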
