GPT-4o

Multimodal workhorse with text, vision, and audio.

Illustrative specs. Context window, modalities, output cost, and EU availability for GPT-4o are representative, last verified 1 Jun 2026. Verify against the provider before committing.

Hosting Foundry (Azure)

Context window 128,000 tokens

Modalities text, vision, audio inputs

Output cost $10 / 1M tokens

Best for

Audio-in/audio-out workloads
Multimodal chat
Mixed-content RAG

The selection wizard ranks GPT-4o against every other model for your latency, context, modality, cost, and residency needs — and shows where it wins and where something else fits better.


What is GPT-4o best for? GPT-4o is a multimodal workhorse covering text, vision, and audio. It fits audio-in/audio-out workloads, multimodal chat, and mixed-content RAG.

What context window and modalities does GPT-4o support? GPT-4o handles up to 128,000 tokens of context and supports text, vision, and audio input. It runs on Foundry (Azure).

How much does GPT-4o cost? Around $10 per 1M output tokens (illustrative, verified 1 Jun 2026). Output tokens usually dominate the bill — verify input and cached pricing against the provider before budgeting.
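As a rough illustration of the arithmetic, the sketch below estimates output-token spend from the illustrative $10 per 1M rate quoted above. The function name, the constant, and the sample workload are hypothetical, and input and cached-token pricing are deliberately left out because this page does not list them.

```python
# Illustrative output rate from the spec card; verify against the provider.
OUTPUT_USD_PER_MILLION_TOKENS = 10.0


def estimate_output_cost_usd(
    output_tokens: int,
    usd_per_million: float = OUTPUT_USD_PER_MILLION_TOKENS,
) -> float:
    """Estimate output-token cost in USD (input and cached tokens excluded)."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return output_tokens / 1_000_000 * usd_per_million


# Hypothetical workload: 2,000 requests per day at 500 output tokens each.
daily_output_tokens = 2_000 * 500
print(f"${estimate_output_cost_usd(daily_output_tokens):.2f} per day")
```

Swapping in the provider's current rate only requires passing a different `usd_per_million` value.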

