LLM model

GPT-4o mini

Cheap, fast small model for high-volume tasks.

Illustrative specs. Context window, modalities, output cost, and EU availability for GPT-4o mini are representative, last verified 1 Jun 2026. Verify against the provider before committing.

GPT-4o mini at a glance

Strong default for anything where a small model is good enough.

Hosting: Foundry (Azure)
Context window: 128,000 tokens
Modalities: text and vision input
Output cost: $0.60 / 1M tokens
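
As a rough illustration of how the context window figure above might be used in practice, here is a minimal Python sketch that checks whether a request fits. It assumes the 128,000-token window is shared between the prompt and the generated output, and that token counts come from whatever tokenizer you already use. The function name `fits_context` and its parameters are hypothetical, not part of any provider SDK.

```python
# Illustrative context window from the spec list; verify with the provider.
CONTEXT_WINDOW_TOKENS = 128_000


def fits_context(prompt_tokens: int, max_output_tokens: int,
                 window: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """Return True if the prompt plus the reserved output budget fits.

    Assumption: prompt and output share one window. Token counts are
    supplied by the caller; this sketch does not tokenize text itself.
    """
    if prompt_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_output_tokens <= window


if __name__ == "__main__":
    print(fits_context(120_000, 4_000))  # 124,000 tokens: fits
    print(fits_context(126_000, 4_000))  # 130,000 tokens: does not fit
```

Reserving the output budget up front, rather than checking only the prompt, avoids requests that are accepted but cut off before the answer completes.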

Best for

  • Classification and routing
  • First-pass summarization
  • Cost-sensitive RAG

Is it the right model?

Match it against your requirements.

The selection wizard ranks GPT-4o mini against every other model for your latency, context, modality, cost, and residency needs, and shows where it wins and where something else fits better.

Open the LLM Selection Wizard

Frequently asked questions

GPT-4o mini specs and cost.

What is GPT-4o mini best for? GPT-4o mini is a cheap, fast small model for high-volume tasks. It fits classification and routing, first-pass summarization, and cost-sensitive RAG.
What context window and modalities does GPT-4o mini support? GPT-4o mini handles up to 128,000 tokens of context and supports text and vision input. It runs on Foundry (Azure).
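How much does GPT-4o mini cost? Around $0.60 per 1M output tokens (illustrative, verified 1 Jun 2026). Output tokens often dominate the bill for generation-heavy workloads, but input-heavy workloads such as RAG can shift the balance toward input tokens. Verify input and cached pricing against the provider before budgeting.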
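
To make the output-cost figure concrete, the following Python sketch estimates daily output spend from the illustrative price above. The workload numbers (50,000 requests per day, 200 output tokens each) are invented for illustration, and the function name `output_cost_usd` is hypothetical. Input and cached-token pricing are deliberately left out because this page does not state them.

```python
# Illustrative output price in USD per 1M tokens; verify with the provider.
OUTPUT_PRICE_PER_1M_USD = 0.6


def output_cost_usd(output_tokens: int,
                    price_per_1m: float = OUTPUT_PRICE_PER_1M_USD) -> float:
    """Estimate the output-token cost only (input cost is not included)."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return output_tokens / 1_000_000 * price_per_1m


if __name__ == "__main__":
    # Hypothetical workload: 50,000 requests/day, 200 output tokens each.
    daily_output_tokens = 50_000 * 200  # 10,000,000 tokens
    daily_cost = output_cost_usd(daily_output_tokens)
    print(f"Estimated daily output cost: ${daily_cost:.2f}")
```

Under these assumed numbers the workload produces 10M output tokens per day, so the estimate is 10 × $0.60. A real budget would add the input side once that price is confirmed.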