Nodes & Infrastructure

The Nodes page gives you a direct view into your infrastructure layer — the individual virtual machines that make up your Kubernetes cluster. Where the Cost Explorer answers “what are my teams spending?”, the Nodes view answers “what am I actually paying for underneath?”.

What the Nodes Page Shows

Each row in the Nodes table represents a single node in your cluster. For every node, CostPilot surfaces:

  • Node name — the Kubernetes node name, typically matching the cloud provider’s instance identifier
  • Instance type — the underlying VM type (e.g. m5.xlarge, c5.2xlarge)
  • Pricing type — whether the node is running as an on-demand, reserved, or spot instance
  • CPU utilisation — average CPU usage as a percentage of the node’s total allocatable CPU
  • Memory utilisation — average memory usage as a percentage of allocatable memory
  • Node cost — the estimated monthly cost for the node, derived from cloud pricing data for the instance type and pricing model
  • Workloads — the pods and deployments currently scheduled onto that node

This combination lets you evaluate each node on two axes simultaneously: what it costs, and how hard it is actually working.

Reading Utilisation vs Cost

A node’s cost is largely fixed — you pay for the instance whether it is busy or idle. Utilisation tells you how much of that fixed cost you are actually extracting value from.

The key ratios to watch:

| Utilisation | Cost | Interpretation |
| ----------- | ---- | -------------- |
| High | High | Efficient — this node is earning its place |
| High | Low | Excellent — likely a spot or smaller instance doing a lot |
| Low | Low | Acceptable — small nodes often run supporting workloads |
| Low | High | Problematic — expensive capacity sitting mostly idle |
Tip: Sort the table by cost descending, then scan down the utilisation columns. Your most expensive underutilised nodes will surface immediately.
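The quadrant logic above can be sketched in a few lines of Python. This is purely illustrative: the `Node` shape and the 50% / $500-per-month thresholds are assumptions for the example, not CostPilot's data model or defaults.

```python
# Hypothetical sketch of the utilisation-vs-cost quadrants described above.
# Thresholds and the Node shape are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    monthly_cost: float  # estimated monthly cost in dollars
    cpu_util: float      # average CPU utilisation, 0.0-1.0

def classify(node: Node, cost_threshold: float = 500.0,
             util_threshold: float = 0.5) -> str:
    """Map a node onto one of the four quadrants in the table."""
    high_cost = node.monthly_cost >= cost_threshold
    high_util = node.cpu_util >= util_threshold
    if high_util:
        return "efficient" if high_cost else "excellent"
    return "problematic" if high_cost else "acceptable"

nodes = [
    Node("ip-10-0-1-12", monthly_cost=560.0, cpu_util=0.12),
    Node("ip-10-0-2-41", monthly_cost=140.0, cpu_util=0.78),
]
# Sort by cost descending, as the tip suggests, so expensive
# underutilised nodes surface first.
for n in sorted(nodes, key=lambda n: n.monthly_cost, reverse=True):
    print(n.name, classify(n))
```

The same two-threshold split works for memory; running it over both axes flags nodes that are expensive on either dimension.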

Spotting Underutilised Nodes

A node is underutilised when its CPU or memory usage is consistently well below its capacity. Common causes:

  • Overly large instance types chosen for peak demand that rarely materialises
  • Pods with generous resource requests that reserve capacity but do not use it
  • Namespaces with strict pod-to-node affinity rules that leave other nodes sparse
  • Nodes kept for workload isolation (e.g. a dedicated monitoring node) that handle light traffic

Look for nodes where both CPU and memory utilisation sit consistently below 30%. A single such node might be acceptable, but a pattern across your pool indicates that your bin-packing efficiency needs attention.
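The "consistently below 30%" check can be made concrete with a small sketch, assuming you can export a node's utilisation samples as fractions from your metrics store; the sample format here is an assumption for illustration.

```python
# Sketch of the "consistently below 30%" underutilisation check.
# Samples are assumed to be fractions (0.0-1.0) over an observation window.
def is_underutilised(cpu_samples, mem_samples, threshold=0.30):
    """True when every CPU sample AND every memory sample is below the threshold."""
    if not cpu_samples or not mem_samples:
        return False  # missing data is not evidence of underutilisation
    return max(cpu_samples) < threshold and max(mem_samples) < threshold

# A node idling all week on both axes:
is_underutilised([0.12, 0.18, 0.09], [0.21, 0.25, 0.19])  # -> True
# A node with one CPU spike is not "consistently" underutilised:
is_underutilised([0.12, 0.55, 0.09], [0.21, 0.25, 0.19])  # -> False
```

Using `max` rather than an average is deliberate: a node that spikes to 70% daily is not a safe candidate for removal even if its mean usage is low.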

Note: CostPilot calculates idle cost at the cluster level and attributes it proportionally to namespaces and labels. The Nodes view lets you trace where that idle capacity physically lives.

Node Pool Composition

Most clusters run a mix of instance types — general-purpose nodes for standard workloads, compute-optimised nodes for CPU-intensive jobs, and spot instances to reduce costs on fault-tolerant work.

The Nodes page makes your pool composition immediately visible. Look for:

  • Instance type variety — are you running the right shapes for the work being scheduled?
  • Spot vs on-demand ratio — more spot coverage generally means lower spend, in exchange for interruption risk
  • Node count per type — a single large node is often cheaper than several small ones for the same total capacity, but reduces scheduling flexibility

Warning: If all your nodes are the same large instance type, you may be over-provisioning. A heterogeneous pool with autoscaling typically produces better cost efficiency.
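To make the spot-versus-on-demand trade-off concrete, here is a toy comparison of a ten-node pool at two spot ratios. The hourly price and the ~65% spot discount are invented numbers for illustration, not real cloud rates.

```python
# Toy blended-cost comparison for a mixed node pool.
# Prices and the 65% spot discount are illustrative assumptions.
def pool_hourly_cost(on_demand_nodes, spot_nodes, on_demand_price, spot_price):
    """Blended hourly cost of a pool mixing on-demand and spot nodes."""
    return on_demand_nodes * on_demand_price + spot_nodes * spot_price

on_demand_price = 0.192               # hypothetical hourly rate
spot_price = on_demand_price * 0.35   # assuming a ~65% spot discount

all_on_demand = pool_hourly_cost(10, 0, on_demand_price, spot_price)
mixed = pool_hourly_cost(4, 6, on_demand_price, spot_price)
# Shifting 6 of 10 nodes to spot cuts the hourly bill substantially,
# but those 6 nodes can now be interrupted at short notice.
print(f"all on-demand: {all_on_demand:.3f}/h, mixed: {mixed:.3f}/h")
```

The arithmetic is trivial, but running it against your own instance types and real spot discounts is a quick sanity check before changing pool composition.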

Relationship to Idle Costs

Idle cost is the portion of your infrastructure spend that no workload is actively consuming. It accumulates when:

  • Nodes are running but pods are not filling their capacity
  • Resource requests are far higher than actual usage
  • Cluster autoscaler has not yet scaled down after a traffic drop

The Nodes view is the best place to understand where idle cost is physically sitting. A node with 15% CPU utilisation is running at 85% idle — that 85% of its cost is being wasted if nothing meaningful is using the reserved headroom.
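The arithmetic in the example above can be sketched per node. Note this is a simplification for illustration: as the note earlier explains, CostPilot attributes idle cost at the cluster level, whereas this helper reasons about a single node.

```python
# Per-node idle-cost illustration (a simplification; CostPilot's actual
# attribution happens at the cluster level).
def idle_fraction(cpu_util, mem_util):
    """Treat the busier of the two dimensions as 'used', so a memory-bound
    node is not counted as idle just because its CPU is quiet."""
    return 1.0 - max(cpu_util, mem_util)

def idle_cost(monthly_cost, cpu_util, mem_util):
    return monthly_cost * idle_fraction(cpu_util, mem_util)

# The 15%-utilisation example above, on a hypothetical $400/month node:
# roughly $340 of the $400 is paying for idle headroom.
idle_cost(400.0, 0.15, 0.15)
```

Taking the max of CPU and memory is one reasonable convention; a stricter accounting might weight the two dimensions by what the instance type is priced for.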

Making Node Pool Decisions

Use the Nodes view to inform right-sizing and pool configuration decisions:

Scenario: Most nodes are under 40% CPU but memory is saturated
Your workloads are memory-bound. Consider switching to memory-optimised instance types (e.g. the r5 family on AWS). You can likely serve the same workloads with fewer, better-suited nodes.

Scenario: You have several identical large nodes, all at low utilisation
Your cluster may benefit from smaller instance types and tighter autoscaling bounds. Fewer nodes at higher utilisation is almost always more cost-efficient.

Scenario: A single node is consistently at 90%+ utilisation
That node is a potential bottleneck. If it is on-demand, evaluate whether a slightly larger instance type would give more headroom at marginal extra cost. If it is spot, ensure your workloads can tolerate interruption.

Scenario: Spot nodes are running critical stateful workloads
This is a reliability risk rather than a cost concern. Move stateful workloads to on-demand nodes and keep spot for stateless, fault-tolerant jobs.
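The scenarios above can be summarised as a toy triage helper. The thresholds and messages here are illustrative assumptions for the sketch, not CostPilot behaviour or output.

```python
# The four scenarios above encoded as a toy triage function.
# Thresholds and messages are illustrative assumptions.
def triage(cpu_util: float, mem_util: float, pricing: str,
           stateful: bool = False) -> str:
    if pricing == "spot" and stateful:
        return "move stateful workloads to on-demand"
    if mem_util >= 0.85 and cpu_util < 0.40:
        return "memory-bound: consider memory-optimised instances"
    if cpu_util >= 0.90 or mem_util >= 0.90:
        if pricing == "spot":
            return "ensure workloads tolerate interruption"
        return "bottleneck: consider a larger instance type"
    if cpu_util < 0.30 and mem_util < 0.30:
        return "underutilised: smaller instances / tighter autoscaling"
    return "no action"
```

A rule table like this is useful mainly as a checklist; the real decision still depends on workload criticality and how stable the utilisation pattern is over time.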

Tip: Changes to your node pool take effect in CostPilot within the next metrics collection cycle. After resizing, return to this view to confirm utilisation patterns have shifted as expected.