Gemini 2.5 Flash

Fast, large-context, multimodal model that balances cost and quality.

Illustrative specs. Context window, modalities, output cost, and EU availability for Gemini 2.5 Flash are representative, last verified 1 Jun 2026. Verify against the provider before committing.

Hosting Vertex (GCP)

Context window 1,000,000 tokens

Modalities text, vision, audio inputs

Output cost $2.5 / 1M tokens

Best for

High-volume multimodal
Large-context at low latency
Balanced cost/quality

The selection wizard ranks Gemini 2.5 Flash against every other model for your latency, context, modality, cost, and residency needs — and shows where it wins and where something else fits better.

Open the LLM Selection Wizard

What is Gemini 2.5 Flash best for? It is a fast, large-context, multimodal model that balances cost and quality. It fits high-volume multimodal workloads, large-context tasks that need low latency, and use cases where cost and quality must be balanced.

What context window and modalities does Gemini 2.5 Flash support? Gemini 2.5 Flash handles up to 1,000,000 tokens of context and accepts text, vision, and audio input. It runs on Vertex (GCP).
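
As a rough illustration of the context figure above, the following minimal sketch checks whether a prompt of a known token count fits within the 1,000,000-token window. The function name `fits_context` and the `safety_margin_tokens` parameter are hypothetical, and the window size is the illustrative figure from this page rather than a confirmed provider limit. Token counts are assumed to come from whatever tokenizer or counting endpoint you already use.

```python
# Illustrative context window from this page; verify against the provider.
CONTEXT_WINDOW_TOKENS = 1_000_000


def fits_context(
    prompt_tokens: int,
    safety_margin_tokens: int = 0,
    window: int = CONTEXT_WINDOW_TOKENS,
) -> bool:
    """Return True if the prompt plus an optional safety margin fits the window."""
    if prompt_tokens < 0 or safety_margin_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + safety_margin_tokens <= window


# Hypothetical prompt sizes, with a 50,000-token margin held back.
print(fits_context(900_000, safety_margin_tokens=50_000))
print(fits_context(980_000, safety_margin_tokens=50_000))
```

A margin is worth keeping because token counts estimated on the client can differ from what the provider measures.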

How much does Gemini 2.5 Flash cost? Around $2.5 per 1M output tokens (illustrative, verified 1 Jun 2026). Output tokens usually dominate the bill — verify input and cached pricing against the provider before budgeting.
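
To make the output-cost figure concrete, here is a minimal budgeting sketch. It assumes the illustrative price of $2.5 per 1M output tokens quoted on this page and ignores input and cached-token pricing, which this page does not list. The function name `estimate_output_cost` and the workload numbers are hypothetical.

```python
# Illustrative output price from this page (USD per 1M output tokens);
# verify against the provider before budgeting.
OUTPUT_PRICE_PER_MILLION_USD = 2.5


def estimate_output_cost(
    output_tokens: int,
    price_per_million: float = OUTPUT_PRICE_PER_MILLION_USD,
) -> float:
    """Return the estimated output-token cost in USD (input cost not included)."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return output_tokens / 1_000_000 * price_per_million


# Hypothetical workload: 20,000 requests per day, 400 output tokens each, 30 days.
monthly_tokens = 20_000 * 400 * 30
print(f"${estimate_output_cost(monthly_tokens):,.2f}")  # prints "$600.00"
```

Swapping in the provider's current price and adding a matching term for input tokens would give a fuller estimate.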

