Node Count Planner
Estimate Kubernetes worker node count from schedulable capacity targets, allocatable node capacity, utilization, growth buffer, resilience goals, and zone distribution.
Result
Recommended total worker nodes, zone distribution, limiting factor, survivability, and resilience-aware planning breakdown.
Things to check
- The final recommendation (7 nodes) is higher than the pure capacity baseline (6 nodes) because the resilience target adds one node. The zone floor of 3 nodes is not the binding constraint here.
- The zone distribution (3 / 2 / 2) is slightly uneven because 7 nodes cannot be split equally across 3 zones. That is normal for some totals, but review it if you need strictly symmetric zones.
Failure-state capacity view
| State | Available nodes | Available vCPU | Available memory | CPU target | Memory target |
|---|---|---|---|---|---|
| Normal state | 7 | 39.20 vCPU | 156.80 GiB | Pass | Pass |
| After 1 node loss | 6 | 33.60 vCPU | 134.40 GiB | Pass | Pass |
| After 2 node losses | 5 | 28.00 vCPU | 112.00 GiB | Fail | Fail |
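The Pass/Fail columns can be reproduced with a short check. The Python sketch below is illustrative and is not the planner's actual code. It assumes each row compares usable capacity (5.60 vCPU and 22.40 GiB per node, taken from the planning breakdown) against the growth-adjusted targets of 28.80 vCPU and 115.20 GiB. The function name `failure_states` and its parameters are hypothetical.

```python
def failure_states(nodes, usable_cpu, usable_mem, cpu_target, mem_target, max_losses=2):
    """Return one row per failure state: (lost, remaining, cpu, mem, cpu_ok, mem_ok)."""
    rows = []
    for lost in range(max_losses + 1):
        remaining = nodes - lost
        # Usable capacity of the surviving nodes, rounded for display.
        cpu = round(remaining * usable_cpu, 2)
        mem = round(remaining * usable_mem, 2)
        # Assumption: a state passes when surviving capacity covers the target.
        rows.append((lost, remaining, cpu, mem, cpu >= cpu_target, mem >= mem_target))
    return rows


rows = failure_states(7, 5.60, 22.40, 28.80, 115.20)
for lost, remaining, cpu, mem, cpu_ok, mem_ok in rows:
    print(
        f"{lost} lost: {remaining} nodes, {cpu:.2f} vCPU, {mem:.2f} GiB, "
        f"CPU {'Pass' if cpu_ok else 'Fail'}, memory {'Pass' if mem_ok else 'Fail'}"
    )
```

With 5 surviving nodes the usable capacity (28.00 vCPU, 112.00 GiB) falls just short of the targets, which is why the last row fails on both dimensions.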
What this means
After applying a 20% growth buffer and a 70% utilization target, the planner estimates a baseline of 6 worker nodes. CPU and memory are balanced: each requires 6 nodes, so neither is the sole limiting factor.
The resilience target requires the cluster to continue meeting the modeled demand after losing 1 node. With 7 recommended nodes, 6 nodes remain available after the modeled failure.
The recommendation is distributed across 3 zones as 3 / 2 / 2. This distribution is slightly uneven because 7 is not divisible by 3.
This looks like a production-oriented topology. The final recommendation of 7 nodes supports three-zone distribution (3 / 2 / 2) rather than sizing only for nominal aggregate capacity.
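One common way to obtain a split such as 3 / 2 / 2 is to give every zone the integer quotient and hand the remainder out one node at a time. The sketch below assumes that approach; the planner's own balancing rule is not documented here, and the function name `distribute_across_zones` is hypothetical.

```python
def distribute_across_zones(nodes, zones):
    """Split nodes across zones as evenly as possible, largest zones first."""
    if zones <= 0:
        raise ValueError("zones must be a positive integer")
    base, extra = divmod(nodes, zones)
    # The first `extra` zones receive one additional node each.
    return [base + 1 if i < extra else base for i in range(zones)]


distribution = distribute_across_zones(7, 3)
print(" / ".join(str(n) for n in distribution))
```

Under this rule the largest and smallest zones never differ by more than one node, so a perfectly even split is only possible when the total is a multiple of the zone count.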
Planning breakdown
- Start with required schedulable capacity: 24.00 vCPU and 96.00 GiB.
- Apply growth buffer of 20% to get growth-adjusted targets: 28.80 vCPU and 115.20 GiB.
- Apply target utilization of 70% to allocatable node capacity: usable capacity per node becomes 5.60 vCPU and 22.40 GiB (which implies 8.00 vCPU and 32.00 GiB allocatable per node).
- Calculate CPU-based required nodes: ceiling(28.80 / 5.60) = 6.
- Calculate memory-based required nodes: ceiling(115.20 / 22.40) = 6.
- Take the larger result as the capacity-driven baseline: 6 nodes.
- Apply resilience target "Tolerate 1 node failure" by requiring the cluster to keep enough surviving nodes after 1 modeled failure: 6 + 1 = 7 nodes.
- Apply the zone floor of 3 nodes for 3 zones.
- Final recommendation is the maximum of capacity-driven, resilience-driven, and zone-floor requirements: 7 nodes.
- Balanced zone distribution for the recommendation is 3 / 2 / 2.
- After the modeled failure of 1 node, 6 nodes remain with 33.60 vCPU and 134.40 GiB of usable capacity, leaving 4.80 vCPU and 19.20 GiB of post-failure headroom over the growth-adjusted targets.
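The steps above can be sketched in a few lines of Python. This is a minimal illustration of the arithmetic and not the planner's implementation. The function `plan_nodes` and its parameter names are hypothetical, the 8 vCPU / 32 GiB node size is inferred from the usable figures (5.60 / 0.70 and 22.40 / 0.70), and the zone floor is assumed to mean one node per zone.

```python
import math


def plan_nodes(cpu_required, mem_required, node_cpu, node_mem,
               growth=0.20, utilization=0.70, tolerated_failures=1, zones=3):
    """Follow the planning breakdown step by step and return the intermediate values."""
    # Growth buffer raises the demand side.
    cpu_target = cpu_required * (1 + growth)
    mem_target = mem_required * (1 + growth)
    # Target utilization lowers the usable capacity of each node.
    usable_cpu = node_cpu * utilization
    usable_mem = node_mem * utilization
    # round() guards against floating-point noise before taking the ceiling.
    cpu_nodes = math.ceil(round(cpu_target / usable_cpu, 9))
    mem_nodes = math.ceil(round(mem_target / usable_mem, 9))
    baseline = max(cpu_nodes, mem_nodes)
    resilient = baseline + tolerated_failures
    zone_floor = zones  # assumption: at least one node per zone
    recommended = max(baseline, resilient, zone_floor)
    surviving = recommended - tolerated_failures
    return {
        "cpu_nodes": cpu_nodes,
        "mem_nodes": mem_nodes,
        "baseline": baseline,
        "recommended": recommended,
        "surviving": surviving,
        "cpu_headroom": round(surviving * usable_cpu - cpu_target, 2),
        "mem_headroom": round(surviving * usable_mem - mem_target, 2),
    }


plan = plan_nodes(cpu_required=24.0, mem_required=96.0, node_cpu=8.0, node_mem=32.0)
print(plan)
```

Returning the intermediate values rather than only the final count makes it easy to see which requirement (capacity, resilience, or zone floor) determined the result.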
Node planning quick guide
Practical notes for resilience-aware Kubernetes worker node planning.
Allocatable capacity
Use allocatable worker-node CPU and memory, not raw VM size, so the planner works with realistic schedulable capacity.
Growth and utilization
Growth buffer increases required capacity, while target utilization reduces usable capacity per node.
Resilience and zones
The final node count is shaped not only by capacity but also by failure tolerance and balanced multi-zone distribution.
Planning assumptions
These assumptions define exactly what this version of the planner models.
- Inputs represent allocatable node capacity, not raw instance size.
- Growth buffer is applied before calculating required nodes.
- Target utilization reduces usable planning capacity per node.
- Resilience is modeled as surviving node loss while still meeting the target.
- Zone distribution is balanced as evenly as possible.
- This tool does not simulate autoscaler behavior, daemonset pressure, taints, or scheduler-level placement rules.
Common use cases
Typical architecture and capacity planning scenarios.
- Estimate worker node count for a new Kubernetes cluster before selecting node pools or environments.
- Compare the impact of growth buffer, utilization targets, and node failure tolerance on final cluster size.
- Build a more realistic baseline for multi-zone production planning instead of sizing only for nominal capacity.
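To compare scenarios side by side, the same arithmetic can be run over a small grid of inputs. The sketch below is a hypothetical helper, not part of the planner. It reuses the example workload from this page (24 vCPU and 96 GiB required, nodes with 8 vCPU and 32 GiB allocatable, 3 zones) as defaults and assumes the zone floor is one node per zone.

```python
import math
from itertools import product


def recommended_nodes(growth, utilization, tolerated_failures,
                      cpu_required=24.0, mem_required=96.0,
                      node_cpu=8.0, node_mem=32.0, zones=3):
    """Return the recommended node count for one combination of planning inputs."""
    cpu_target = cpu_required * (1 + growth)
    mem_target = mem_required * (1 + growth)
    # round() guards against floating-point noise before taking the ceiling.
    cpu_nodes = math.ceil(round(cpu_target / (node_cpu * utilization), 9))
    mem_nodes = math.ceil(round(mem_target / (node_mem * utilization), 9))
    baseline = max(cpu_nodes, mem_nodes)
    # Assumption: zone floor is one node per zone.
    return max(baseline + tolerated_failures, zones)


# Sweep a few growth, utilization, and failure-tolerance values.
for growth, utilization, failures in product((0.2, 0.5), (0.7, 0.8), (0, 1)):
    nodes = recommended_nodes(growth, utilization, failures)
    print(f"growth={growth:.0%} utilization={utilization:.0%} "
          f"failures={failures} -> {nodes} nodes")
```

A sweep like this shows how the three inputs pull in different directions: a higher growth buffer or failure tolerance raises the node count, while a higher utilization target lowers it.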