LLM model

Gemini 2.0 Flash

Cheapest fast model with a huge context window.

Illustrative specs. The context window, modalities, output cost, and EU availability given for Gemini 2.0 Flash are representative values, last verified 1 Jun 2026. Verify them against the provider before committing.

Gemini 2.0 Flash at a glance

Lowest output cost in this catalog.

Hosting Vertex (GCP)
Context window 1,000,000 tokens
Modalities text and vision input
Output cost $0.40 / 1M output tokens

Best for

  • High-volume cheap inference
  • Large-context simple tasks

Is it the right model?

Match it against your requirements.

The selection wizard ranks Gemini 2.0 Flash against every other model in the catalog on your latency, context, modality, cost, and residency needs. It shows where Gemini 2.0 Flash wins and where another model fits better.

Open the LLM Selection Wizard

Frequently asked questions

Gemini 2.0 Flash specs and cost.

What is Gemini 2.0 Flash best for? It is a fast model with the lowest output cost in this catalog and a very large context window. It fits high-volume, low-cost inference and simple tasks over large contexts.
What context window and modalities does Gemini 2.0 Flash support? Gemini 2.0 Flash handles up to 1,000,000 tokens of context and accepts text and vision input. It is hosted on Vertex (GCP).
How much does Gemini 2.0 Flash cost? Around $0.40 per 1M output tokens (illustrative, last verified 1 Jun 2026). Output tokens usually dominate the bill, but verify input and cached pricing against the provider before budgeting.
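
To make the output-cost figure concrete, here is a minimal Python sketch of the arithmetic. It assumes the illustrative rate of $0.40 per 1M output tokens quoted on this page. The function name `estimate_output_cost` and the example workload are hypothetical, and input and cached-token charges are deliberately left out because this page does not list them.

```python
# Illustrative rate from this page; confirm against the provider before budgeting.
OUTPUT_USD_PER_MILLION_TOKENS = 0.40


def estimate_output_cost(
    requests: int,
    output_tokens_per_request: int,
    usd_per_million: float = OUTPUT_USD_PER_MILLION_TOKENS,
) -> float:
    """Estimate output-token spend in USD (input and cached tokens excluded)."""
    if requests < 0 or output_tokens_per_request < 0:
        raise ValueError("counts must be non-negative")
    total_output_tokens = requests * output_tokens_per_request
    return total_output_tokens * usd_per_million / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2,000,000 requests per month, 300 output tokens each.
    monthly = estimate_output_cost(2_000_000, 300)
    print(f"Estimated monthly output cost: ${monthly:,.2f}")
```

With that hypothetical workload, 600,000,000 output tokens come to roughly $240 per month at the illustrative rate. Swap in the provider's current price through `usd_per_million` before relying on the result.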