Gemini 2.0 Flash

Cheapest fast model with a huge context window.

Illustrative specs. Context window, modalities, output cost, and EU availability for Gemini 2.0 Flash are representative, last verified 1 Jun 2026. Verify against the provider before committing.

Hosting Vertex (GCP)

Context window 1,000,000 tokens

Modalities text and vision input

Output cost $0.40 / 1M output tokens

Best for

High-volume cheap inference
Large-context simple tasks

The selection wizard ranks Gemini 2.0 Flash against every other model for your latency, context, modality, cost, and residency needs — and shows where it wins and where something else fits better.


What is Gemini 2.0 Flash best for? It is the cheapest fast model with a huge context window, which makes it a fit for high-volume cheap inference and simple large-context tasks.

What context window and modalities does Gemini 2.0 Flash support? Gemini 2.0 Flash handles up to 1,000,000 tokens of context and supports text and vision input. It runs on Vertex (GCP).

How much does Gemini 2.0 Flash cost? Around $0.40 per 1M output tokens (illustrative, verified 1 Jun 2026). Output tokens usually dominate the bill, but verify input and cached-token pricing against the provider before budgeting.
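
As a rough budgeting aid, the output-cost figure can be turned into a daily estimate. This is a minimal sketch: the price constant is the illustrative figure quoted on this page, while the request volume, the average output length, and the function name `output_cost_usd` are hypothetical values chosen for the example. Input and cached-token charges are not included.

```python
# Illustrative output price from this page; verify against the provider.
OUTPUT_PRICE_PER_MILLION_USD = 0.40


def output_cost_usd(output_tokens: int,
                    price_per_million: float = OUTPUT_PRICE_PER_MILLION_USD) -> float:
    """Return the output-token cost in USD for a given token count."""
    return output_tokens / 1_000_000 * price_per_million


# Hypothetical workload, used only to show the arithmetic.
requests_per_day = 50_000
avg_output_tokens = 300

daily_tokens = requests_per_day * avg_output_tokens
daily_cost = output_cost_usd(daily_tokens)
monthly_cost = daily_cost * 30  # assumes a 30-day month

print(f"Daily output tokens: {daily_tokens:,}")
print(f"Daily output cost:   ${daily_cost:.2f}")
print(f"Monthly output cost: ${monthly_cost:.2f}")
```

Swapping in your own traffic numbers and the provider's current price gives a first-order figure; a full budget would add the input and cached-token terms in the same way.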

