Understanding Idle Costs
Idle cost is the portion of your infrastructure spend that isn’t doing useful work. In Kubernetes environments, idle waste comes in two distinct forms with different causes and different solutions.
Type 1 — Overprovisioned resources
What it is: Capacity that a pod has requested but is not using.
When a pod sets requests.cpu: 1000m but only ever uses 100m, 900m of CPU is reserved on the node for that pod but sits idle. You’re paying for 1000m but only getting value from 100m.
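To make the numbers concrete, here is a sketch of a deployment with that shape (the workload name and image are hypothetical):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server              # hypothetical workload
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
        - name: api
          image: example.com/api:1.0   # placeholder image
          resources:
            requests:
              cpu: 1000m        # reserved on the node, whether used or not
              memory: 512Mi
```

Each replica reserves 1000m but uses roughly 100m, so with three replicas the cluster holds 3000m for this workload while only ~300m does useful work: 2.7 cores of paid-for idle CPU.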
This is the most common form of idle waste, typically caused by:
- Cargo-cult defaults — Engineers copying resource requests from other deployments without tuning them
- Fear of OOMKills — Setting very high memory requests “just in case”
- Stale capacity planning — Applications that have shrunk, but whose resource requests were never updated to match
- Batch workload peaks — Jobs that need resources occasionally but hold them continuously
How to identify it
In CostPilot:
- Open Cost Explorer → select the Workload dimension
- Sort by Efficiency ascending
- The lowest-efficiency workloads have the largest gap between requests and actual usage
CostPilot’s Insights will also flag specific workloads with high overprovisioning, along with the estimated monthly savings from right-sizing.
How to reduce it
- Right-size resource requests — Set CPU and memory requests to approximately P95 of actual usage, plus a small buffer (10–20%)
- Use Vertical Pod Autoscaler (VPA) — VPA in recommendation mode automatically suggests right-sized requests based on observed usage
- Set resource limits — Prevents single workloads from consuming unbounded resources, improving overall cluster density
- Schedule request reviews — Make resource request tuning a recurring (e.g. quarterly) engineering task
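VPA's recommendation mode can be enabled with a manifest like the following sketch; with `updateMode: "Off"` it never evicts or resizes pods, it only publishes suggested requests in its status (the target deployment name is hypothetical):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-server-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server        # hypothetical workload to observe
  updatePolicy:
    updateMode: "Off"       # recommend only; never evict or resize pods
```

Once the VPA has observed enough usage, its suggestions appear under `status.recommendation.containerRecommendations` and can be applied manually during a request review.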
A common target is CPU utilisation of 60–70% of requests: enough headroom for traffic spikes without large amounts of waste.
Type 2 — Unallocated node capacity
What it is: Node capacity that no pod has claimed.
Even if every pod on a node uses exactly what it requests, nodes are never 100% schedulable: Kubernetes reserves part of each node (the gap between capacity and allocatable) for OS processes, the kubelet, and system pods. Beyond that, if your node pool has more total capacity than your workloads need, the excess is permanently idle.
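The system reservation is set on the kubelet. A sketch of the relevant `KubeletConfiguration` fields (the values are illustrative, not recommendations):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
systemReserved:          # capacity held back for OS daemons
  cpu: 100m
  memory: 256Mi
kubeReserved:            # capacity held back for the kubelet and container runtime
  cpu: 100m
  memory: 256Mi
evictionHard:            # memory kept free before the kubelet starts evicting
  memory.available: "100Mi"
```

Allocatable capacity is node capacity minus these reservations and eviction thresholds; only the allocatable portion is available to your pods.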
Common causes:
- Static node pools — A fixed number of nodes regardless of actual load
- Overprovisioned node types — Using large instance types when smaller ones would suffice
- Cluster autoscaler lag — Nodes provisioned for a peak that has since passed, but not yet scaled down
- Bin-packing inefficiency — Pod scheduling leaves gaps that are too small for any queued pod
How to identify it
In CostPilot:
- Go to Dashboard → look at the Idle cost breakdown
- Hover over the idle cost card to see the split between overprovisioned (Type 1) and unallocated (Type 2) waste
Alternatively, the Node Health insight category specifically reports on nodes with chronically low utilisation.
How to reduce it
- Enable Cluster Autoscaler — Automatically adds nodes when pods are pending and removes them when utilisation is low
- Right-size node pools — Use multiple node pools with different instance types to better match workload profiles (CPU-optimised, memory-optimised)
- Use spot instances — For workloads that can tolerate interruption, spot nodes reduce the cost of idle capacity significantly
- Tune bin-packing — Set Kubernetes scheduler policies to pack pods more tightly onto fewer nodes
- Set pod disruption budgets — Well-scoped PDBs tell the autoscaler which pods it may safely evict, letting it scale down nodes more aggressively
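For the last point, a minimal PodDisruptionBudget sketch: it declares how many replicas must stay up, so the Cluster Autoscaler can evict the rest during scale-down (the app label is hypothetical):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-server-pdb
spec:
  minAvailable: 2           # keep at least 2 replicas running during voluntary evictions
  selector:
    matchLabels:
      app: api-server       # hypothetical app label
```

Without any PDB, some autoscalers treat the workload conservatively; with one, scale-down proceeds as long as the budget is respected.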
Comparing the two types
| | Overprovisioned | Unallocated |
|---|---|---|
| Cause | Pod requests > actual usage | Node capacity > pod requests |
| Fix | Right-size resource requests | Right-size or autoscale node pool |
| Tooling | VPA recommendations | Cluster Autoscaler |
| Urgency | High (large cumulative waste) | High (pure unused cost) |
A note on headroom
Some idle cost is intentional and healthy. A cluster with zero idle capacity has no room to absorb traffic spikes or schedule new pods — you’ll hit resource exhaustion under load. A reasonable target is:
- 10–15% CPU headroom above current pod requests
- 15–20% memory headroom above current pod requests (memory is less elastic than CPU)
CostPilot’s efficiency scoring accounts for this — an efficiency of 80–85% is excellent, not a problem.
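One common way to make headroom explicit is the low-priority "placeholder" pattern from the cluster-autoscaler FAQ: a deployment of pause pods reserves spare capacity, and the scheduler preempts them the moment real workloads need the room (names and request sizes below are illustrative):

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: overprovisioning
value: -10                  # lower than any real workload, so these pods are preempted first
globalDefault: false
description: "Placeholder pods that hold headroom capacity"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: headroom-placeholder
spec:
  replicas: 2
  selector:
    matchLabels:
      app: headroom-placeholder
  template:
    metadata:
      labels:
        app: headroom-placeholder
    spec:
      priorityClassName: overprovisioning
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9   # does nothing; just holds the reservation
          resources:
            requests:
              cpu: 200m
              memory: 256Mi
```

When real pods preempt the placeholders, the pending placeholders in turn trigger the Cluster Autoscaler to add a node, keeping the headroom roughly constant.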
CostPilot calculates idle cost using a 2-hour rolling average of pod requests vs. node capacity, smoothing out short-term scheduling noise.