Agent Installation

This guide covers the complete Helm installation of the CostPilot agent, including prerequisites, all values.yaml options, and verification steps.

How the agent works

The CostPilot agent consists of two components deployed into your cluster:

  • Operator — A single-replica Kubernetes controller that manages agent lifecycle, configuration, and rolling updates
  • Agent ReplicaSet — Three replicas with leader election. The elected leader collects pod resource metrics every 15 seconds and ships them to CostPilot. Followers are on standby for immediate failover.

Both components run from the same container image with different entrypoints.

Prerequisites

Requirement       Minimum
Kubernetes        1.24+
Helm              3.x
kubectl access    Cluster admin
Cluster API key   Generated in Nodes → Clusters
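Before installing, it can help to confirm that the client tools are available. The following is a minimal sketch, not part of the CostPilot tooling: check_tools is a hypothetical helper name, and it only checks that kubectl and helm are on the PATH.

```shell
# Hypothetical helper: report whether the tools used in this guide are installed.
check_tools() {
  missing=0
  for tool in kubectl helm; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: found"
    else
      echo "$tool: MISSING"
      missing=1
    fi
  done
  return "$missing"
}

check_tools || echo "Install the missing tools before continuing."
```

With both tools present, kubectl version and helm version --short print the versions to compare against the table above, and kubectl auth can-i '*' '*' --all-namespaces should answer yes for a cluster-admin user.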

Installation

1. Add the Helm repository

helm repo add costpilot https://charts.cost-pilot.com
helm repo update

2. Create the namespace

kubectl create namespace costpilot

3. Create the API key secret

The agent authenticates with your CostPilot account using a Cluster API Key. Store it as a Kubernetes Secret:

kubectl create secret generic cp-agent-secret \
  --namespace costpilot \
  --from-literal=cluster-api-key=YOUR_API_KEY_HERE

Warning

The secret name defaults to cp-agent-secret. If you use a different name, set secrets.agent.clusterApiKeySecretName in your values accordingly.
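Typing the key directly into the command leaves it in your shell history. As a sketch of one alternative, the helper below reads the key from an environment variable and refuses to run when it is empty. The variable name COSTPILOT_API_KEY and the function name create_api_key_secret are assumptions made for this example; the Secret name, namespace, and key name are the ones used above.

```shell
# Hypothetical helper: create or update the Secret from $COSTPILOT_API_KEY.
create_api_key_secret() {
  if [ -z "${COSTPILOT_API_KEY:-}" ]; then
    echo "COSTPILOT_API_KEY is empty; export it first" >&2
    return 1
  fi
  # --dry-run=client renders the manifest locally; piping it to apply
  # makes the command safe to re-run when the Secret already exists.
  kubectl create secret generic cp-agent-secret \
    --namespace costpilot \
    --from-literal=cluster-api-key="$COSTPILOT_API_KEY" \
    --dry-run=client -o yaml | kubectl apply -f -
}
```

Export the variable in your session, then call create_api_key_secret. Re-running it after rotating the key updates the existing Secret in place.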

4. Install the chart

Minimal installation (defaults to EU region):

helm upgrade --install costpilot costpilot/agent \
  --namespace costpilot

With explicit region:

helm upgrade --install costpilot costpilot/agent \
  --namespace costpilot \
  --set backend.ingesterEndpoint=eu

Configuration reference

The following tables cover all available values.yaml options.

Backend

Key                        Default   Description
backend.ingesterEndpoint   eu        Ingester region: eu or us

Agent

Key                            Default   Description
agent.enabled                  true      Deploy the agent ReplicaSet
agent.replicaCount             3         Number of agent replicas (3 recommended for HA)
agent.config.intervalSeconds   15        Metric collection interval in seconds
agent.config.peerPort          18443     Port used for leader/follower communication

Operator

Key                                  Default   Description
operator.enabled                     true      Deploy the operator
operator.replicaCount                1         Operator replicas (keep at 1)
operator.resources.requests.cpu      50m       CPU request
operator.resources.requests.memory   64Mi      Memory request
operator.resources.limits.cpu        100m      CPU limit
operator.resources.limits.memory     128Mi     Memory limit

Image

Key                    Default               Description
global.imageRegistry   ghcr.io/smrt-devops   Container registry
image.repository       cost-pilot/agent      Image repository
image.tag              v0.0.1-nightly        Image tag
image.pullPolicy       Always                Pull policy

Secrets

Key                                     Default           Description
secrets.agent.clusterApiKeySecretName   cp-agent-secret   Name of the Secret containing cluster-api-key

Full values.yaml example

global:
  imageRegistry: "ghcr.io/smrt-devops"

image:
  repository: cost-pilot/agent
  tag: "v0.0.1-nightly"
  pullPolicy: Always

backend:
  ingesterEndpoint: "eu"

agent:
  enabled: true
  replicaCount: 3
  config:
    intervalSeconds: 15
    peerPort: "18443"

operator:
  enabled: true
  replicaCount: 1
  resources:
    requests:
      cpu: 50m
      memory: 64Mi
    limits:
      cpu: 100m
      memory: 128Mi

secrets:
  agent:
    clusterApiKeySecretName: "cp-agent-secret"
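The example above repeats every default. Because Helm merges a values file over the chart defaults, an override file only needs the keys you want to change. The sketch below writes such a file; the file name custom-values.yaml and the Secret name my-costpilot-key are hypothetical choices for illustration.

```shell
# Write an override file containing only the non-default settings.
# Keys that are not listed keep the chart defaults.
cat > custom-values.yaml <<'EOF'
backend:
  ingesterEndpoint: "us"   # "eu" (default) or "us"

secrets:
  agent:
    clusterApiKeySecretName: "my-costpilot-key"   # hypothetical Secret name
EOF

echo "Wrote custom-values.yaml"
```

Pass the file to the install command from step 4 with Helm's -f flag: helm upgrade --install costpilot costpilot/agent --namespace costpilot -f custom-values.yaml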

Verifying the installation

Check pod status:

kubectl get pods -n costpilot

Expected output:

NAME                                READY   STATUS    RESTARTS   AGE
cost-pilot-operator-xxx-yyy         1/1     Running   0          60s
costpilot-agent-xxx-aaa             1/1     Running   0          45s
costpilot-agent-xxx-bbb             1/1     Running   0          45s
costpilot-agent-xxx-ccc             1/1     Running   0          45s

Check agent logs for metric shipping:

kubectl logs -n costpilot -l app=costpilot-agent -f

You should see log lines similar to:

level=info msg="shipping metrics" pods=47 interval=15s
level=info msg="metrics shipped successfully" tenant=t_xxx cluster=c_yyy
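For a scripted check instead of reading the log stream by eye, a small filter can look for the success message. This is a sketch that assumes the log format matches the sample lines above; shipped_ok is a hypothetical helper name.

```shell
# Hypothetical helper: reads log lines on stdin and succeeds if any line
# reports a successful shipment.
shipped_ok() {
  grep -q 'msg="metrics shipped successfully"'
}

# Demonstration using the sample line from this guide.
printf '%s\n' 'level=info msg="metrics shipped successfully" tenant=t_xxx cluster=c_yyy' \
  | shipped_ok && echo "agent is shipping metrics"
```

Against a live cluster, pipe the real logs into it: kubectl logs -n costpilot -l app=costpilot-agent --tail=200 | shipped_ok. The explicit --tail matters because kubectl shows only the last 10 lines per pod when a label selector is used.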

Resource requirements

The agent is designed to be lightweight:

Component            CPU Request   Memory Request
Operator             50m           64Mi
Each agent replica   ~10m          ~32Mi

Total footprint for the default 3-replica setup: ~80m CPU, ~160Mi memory.
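The total follows from the table: one operator plus the per-replica figure times the replica count. As a rough sketch for estimating other values of agent.replicaCount, assuming the approximate per-replica numbers above hold (footprint is a hypothetical helper name):

```shell
# Hypothetical helper: estimate total resource requests for N agent replicas.
# Operator: 50m CPU / 64Mi memory. Each agent replica: ~10m CPU / ~32Mi memory.
footprint() {
  replicas=$1
  cpu=$((50 + replicas * 10))
  mem=$((64 + replicas * 32))
  echo "~${cpu}m CPU, ~${mem}Mi memory"
}

footprint 3   # default setup: ~80m CPU, ~160Mi memory
```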

Upgrading

helm repo update
helm upgrade costpilot costpilot/agent --namespace costpilot

The operator handles rolling updates of agent replicas automatically.

Uninstalling

helm uninstall costpilot --namespace costpilot
kubectl delete namespace costpilot

Warning

Uninstalling the agent stops metric collection for that cluster. Historical data in CostPilot is retained. Deleting the namespace also removes the API key Secret, so recreate it (step 3) before reinstalling. Once the agent is reinstalled, data collection resumes.