Azure OpenAI Token & PTU Cost Calculator

Estimate Azure OpenAI monthly cost from input/output token counts and request volume, compare every model side by side, and see the pay-as-you-go vs provisioned throughput (PTU) break-even. Runs entirely in your browser; nothing is sent to the server.

Pricing note: Token and PTU rates are illustrative values for East US public Azure, last verified 1 Jun 2026. Verify against the Azure Pricing Calculator before committing budgets. PTU break-even compares pay-as-you-go token spend against the cost of the minimum provisioned deployment — it does not model throughput sizing (PTUs → tokens/min).

Quick presets

Model

Input tokens per request

Prompt + system + retrieved context.

Output tokens per request

Completion length. Output is usually the dominant cost.

Requests per month

Cost per request: $0 (input + output tokens)

Pay-as-you-go / mo: $0 (token spend)

PTU floor / mo: $0 (minimum provisioned deployment)

Break-even: 0 requests / month
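The first two figures above follow from simple per-token arithmetic. The sketch below is one way to compute them, not the calculator's actual source. The `Rates` class, the function names, and the example rates are hypothetical placeholders, and the rates are assumed to be quoted in USD per 1 million tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical pay-as-you-go rates in USD per 1 million tokens."""
    input_per_million: float
    output_per_million: float


def cost_per_request(rates: Rates, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request: input and output tokens are billed at separate rates."""
    return (
        input_tokens * rates.input_per_million
        + output_tokens * rates.output_per_million
    ) / 1_000_000


def payg_monthly(rates: Rates, input_tokens: int, output_tokens: int,
                 requests_per_month: int) -> float:
    """Monthly token spend: per-request cost scaled linearly by request volume."""
    return cost_per_request(rates, input_tokens, output_tokens) * requests_per_month


# Placeholder rates for illustration only; they are NOT real Azure prices.
EXAMPLE_RATES = Rates(input_per_million=2.50, output_per_million=10.00)

per_request = cost_per_request(EXAMPLE_RATES, input_tokens=1_000, output_tokens=500)
monthly = payg_monthly(EXAMPLE_RATES, 1_000, 500, requests_per_month=100_000)
print(f"per request: ${per_request:.4f}, per month: ${monthly:.2f}")
```

With these placeholder rates, the 500 output tokens cost twice as much as the 1,000 input tokens, which illustrates why output is usually the dominant cost.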

Model	Per request	PaYG / mo	PTU floor / mo	Cheaper option

Pay-as-you-go: You pay per token, so cost scales linearly with volume. It is a good fit until traffic becomes sustained and predictable enough to justify a commitment.

Provisioned Throughput (PTU): You reserve dedicated capacity at a fixed monthly price. If your pay-as-you-go spend would be less than the cost of the minimum deployment, PTU is poor value; above the break-even it is cheaper and gives predictable latency.

Break-even: The monthly request count at which PaYG token spend equals the PTU floor. Above it, the tool recommends PTU. It is a cost comparison only; real PTU sizing depends on your tokens-per-minute throughput.
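The break-even described above can be sketched as a single division. This is an illustration under stated assumptions, not the tool's implementation. The function names and example figures are hypothetical, the result is rounded up to a whole request, and a tie is assumed to favor pay-as-you-go because the text recommends PTU only above the break-even.

```python
import math
from typing import Optional


def break_even_requests(cost_per_request: float,
                        ptu_floor_monthly: float) -> Optional[int]:
    """Smallest whole monthly request count at which PaYG spend reaches the PTU floor.

    Returns None when the per-request cost is not positive, because no
    request volume can then reach the floor (an assumption of this sketch).
    """
    if cost_per_request <= 0:
        return None
    return math.ceil(ptu_floor_monthly / cost_per_request)


def cheaper_option(cost_per_request: float, requests_per_month: int,
                   ptu_floor_monthly: float) -> str:
    """Recommend PTU only when PaYG spend is strictly above the PTU floor."""
    payg = cost_per_request * requests_per_month
    return "PTU" if payg > ptu_floor_monthly else "Pay-as-you-go"


# Hypothetical figures: $0.0075 per request and a $10,000/month PTU floor.
print(break_even_requests(0.0075, 10_000.0))
print(cheaper_option(0.0075, 100_000, 10_000.0))
print(cheaper_option(0.0075, 2_000_000, 10_000.0))
```

This comparison says nothing about whether the minimum deployment can serve the traffic, so a workload above the break-even still needs separate throughput sizing.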

1. Retry & overhead tokens: Failed calls, re-prompts, and function-calling round-trips add tokens that a back-of-the-envelope estimate misses.

2. Fine-tuning hosting: A fine-tuned model carries a separate hourly hosting charge on top of token cost, and it is often the biggest surprise on the bill.

3. Prompt caching & batch discounts: These reduce cost. This calculator deliberately ignores them, so it errs on the conservative (higher) side.

4. Pinning output too low: Output tokens dominate. Underestimating completion length is the most common reason an estimate undershoots reality.
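Items 1 and 2 above can be folded into an estimate by hand. The sketch below shows one rough way to do so. The function name, the 10% retry overhead, the hourly hosting rate, and the 730-hour month are all assumptions chosen for illustration, not values taken from the calculator or from Azure pricing.

```python
def adjusted_monthly_cost(payg_monthly: float,
                          retry_overhead: float = 0.10,
                          hosting_per_hour: float = 0.0,
                          hours_per_month: float = 730.0) -> float:
    """Add two costs the calculator omits to a pay-as-you-go estimate.

    retry_overhead   -- assumed fraction of extra tokens from retries,
                        re-prompts, and function-calling round-trips
    hosting_per_hour -- assumed hourly hosting charge for a fine-tuned
                        model (0.0 when no fine-tuned model is deployed)
    hours_per_month  -- assumed average number of hours in a month
    """
    token_spend = payg_monthly * (1.0 + retry_overhead)
    hosting = hosting_per_hour * hours_per_month
    return token_spend + hosting


# Hypothetical inputs: $750/month token spend, 10% overhead, $1.50/hour hosting.
print(f"${adjusted_monthly_cost(750.0, 0.10, 1.50):.2f}")
```

In this example the hosting charge exceeds the token spend itself, which matches the note that hosting is often the largest unexpected cost.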

Workload inputs

Estimate

All models at this workload

How to read this

Costs this does not include yet