Agent Installation
This guide covers the complete Helm installation of the CostPilot agent, including prerequisites, all values.yaml options, and verification steps.
How the agent works
The CostPilot agent consists of two components deployed into your cluster:
- Operator — A single-replica Kubernetes controller that manages agent lifecycle, configuration, and rolling updates
- Agent ReplicaSet — Three replicas with leader election. The elected leader collects pod resource metrics every 15 seconds and ships them to CostPilot. Followers are on standby for immediate failover.
Both components run from the same container image with different entrypoints.
Prerequisites
| Requirement | Minimum |
|---|---|
| Kubernetes | 1.24+ |
| Helm | 3.x |
| kubectl access | Cluster admin |
| Cluster API key | Generated in Nodes → Clusters |
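The Kubernetes minimum in the table can be checked before installing. The following is a minimal sketch of a version comparison in POSIX shell. `version_at_least` is a hypothetical helper name, not part of CostPilot, and it compares only the major and minor components.

```shell
# version_at_least ACTUAL MINIMUM
# Hypothetical helper: succeeds when ACTUAL >= MINIMUM, comparing major.minor only.
# Accepts forms such as "v1.27.3", "1.24", or "1.24+".
version_at_least() {
  va_actual="${1#v}"
  va_minimum="${2#v}"
  va_a_major="${va_actual%%.*}"
  va_m_major="${va_minimum%%.*}"
  va_a_rest="${va_actual#*.}"
  va_m_rest="${va_minimum#*.}"
  # Keep only the leading digits of the minor component ("27.3" -> "27", "24+" -> "24").
  va_a_minor="${va_a_rest%%[!0-9]*}"
  va_m_minor="${va_m_rest%%[!0-9]*}"
  if [ "$va_a_major" -gt "$va_m_major" ]; then
    return 0
  fi
  [ "$va_a_major" -eq "$va_m_major" ] && [ "$va_a_minor" -ge "$va_m_minor" ]
}

# Example: compare a server version string against the 1.24 minimum.
if version_at_least "v1.27.3" "1.24"; then
  echo "Kubernetes version OK"
else
  echo "Kubernetes 1.24 or newer is required" >&2
fi
```

In practice the first argument would be the server version reported by `kubectl version` for your cluster.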
Installation
1. Add the Helm repository
helm repo add costpilot https://charts.cost-pilot.com
helm repo update
2. Create the namespace
kubectl create namespace costpilot
3. Create the API key secret
The agent authenticates with your CostPilot account using a Cluster API Key. Store it as a Kubernetes Secret:
kubectl create secret generic cp-agent-secret \
  --namespace costpilot \
  --from-literal=cluster-api-key=YOUR_API_KEY_HERE
The secret name defaults to cp-agent-secret. If you use a different name, set secrets.agent.clusterApiKeySecretName in your values accordingly.
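As a sketch of the non-default case, the snippet below writes a small values override that points the chart at a differently named Secret. The name `my-costpilot-key` and the file name `custom-values.yaml` are placeholders chosen for illustration. The kubectl and helm commands appear as comments because they act on a live cluster.

```shell
# Placeholder Secret name, chosen for illustration only.
SECRET_NAME="my-costpilot-key"

# Write a values override containing only the key that differs from the default.
cat > custom-values.yaml <<EOF
secrets:
  agent:
    clusterApiKeySecretName: "${SECRET_NAME}"
EOF

# Then create the Secret under that name and pass the file to Helm:
#   kubectl create secret generic "$SECRET_NAME" \
#     --namespace costpilot \
#     --from-literal=cluster-api-key=YOUR_API_KEY_HERE
#   helm upgrade --install costpilot costpilot/agent \
#     --namespace costpilot \
#     -f custom-values.yaml
```

The key inside the Secret stays `cluster-api-key`; only the Secret's name changes.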
4. Install the chart
Minimal installation (defaults to EU region):
helm upgrade --install costpilot costpilot/agent \
  --namespace costpilot
With explicit region:
helm upgrade --install costpilot costpilot/agent \
  --namespace costpilot \
  --set backend.ingesterEndpoint=eu
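If the region comes from a variable or CI parameter, it may help to validate it before calling Helm. The sketch below assumes the two region values listed in this guide (eu and us). `region_flag` is a hypothetical helper name, not part of the chart.

```shell
# region_flag REGION
# Hypothetical helper: prints the --set argument for a supported region,
# or fails with a message on stderr for anything else.
region_flag() {
  case "$1" in
    eu|us)
      printf '%s\n' "backend.ingesterEndpoint=$1"
      ;;
    *)
      echo "unsupported region: $1 (expected eu or us)" >&2
      return 1
      ;;
  esac
}

# Usage (runs Helm against your current kubectl context):
#   helm upgrade --install costpilot costpilot/agent \
#     --namespace costpilot \
#     --set "$(region_flag us)"
```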
Configuration reference
The following tables cover all available values.yaml options.
Backend
| Key | Default | Description |
|---|---|---|
| backend.ingesterEndpoint | eu | Ingester region. eu or us |
Agent
| Key | Default | Description |
|---|---|---|
| agent.enabled | true | Deploy the agent ReplicaSet |
| agent.replicaCount | 3 | Number of agent replicas (3 recommended for HA) |
| agent.config.intervalSeconds | 15 | Metric collection interval in seconds |
| agent.config.peerPort | 18443 | Port used for leader/follower communication |
Operator
| Key | Default | Description |
|---|---|---|
| operator.enabled | true | Deploy the operator |
| operator.replicaCount | 1 | Operator replicas (keep at 1) |
| operator.resources.requests.cpu | 50m | CPU request |
| operator.resources.requests.memory | 64Mi | Memory request |
| operator.resources.limits.cpu | 100m | CPU limit |
| operator.resources.limits.memory | 128Mi | Memory limit |
Image
| Key | Default | Description |
|---|---|---|
| global.imageRegistry | ghcr.io/smrt-devops | Container registry |
| image.repository | cost-pilot/agent | Image repository |
| image.tag | v0.0.1-nightly | Image tag |
| image.pullPolicy | Always | Pull policy |
Secrets
| Key | Default | Description |
|---|---|---|
| secrets.agent.clusterApiKeySecretName | cp-agent-secret | Name of the Secret containing cluster-api-key |
Full values.yaml example
global:
  imageRegistry: "ghcr.io/smrt-devops"
image:
  repository: cost-pilot/agent
  tag: "v0.0.1-nightly"
  pullPolicy: Always
backend:
  ingesterEndpoint: "eu"
agent:
  enabled: true
  replicaCount: 3
  config:
    intervalSeconds: 15
    peerPort: "18443"
operator:
  enabled: true
  replicaCount: 1
  resources:
    requests:
      cpu: 50m
      memory: 64Mi
    limits:
      cpu: 100m
      memory: 128Mi
secrets:
  agent:
    clusterApiKeySecretName: "cp-agent-secret"
Verifying the installation
Check pod status:
kubectl get pods -n costpilot
Expected output:
NAME                          READY   STATUS    RESTARTS   AGE
cost-pilot-operator-xxx-yyy   1/1     Running   0          60s
costpilot-agent-xxx-aaa       1/1     Running   0          45s
costpilot-agent-xxx-bbb       1/1     Running   0          45s
costpilot-agent-xxx-ccc       1/1     Running   0          45s
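For scripted checks, the pod listing can be reduced to a single number. The sketch below counts rows whose STATUS is Running and whose READY column shows all containers ready. `count_ready` is a hypothetical helper name, and the demo input is the expected output shown above rather than output from a real cluster.

```shell
# count_ready: reads `kubectl get pods` output on stdin and prints the number
# of pods that are Running with every container ready (for example 1/1).
count_ready() {
  awk 'NR > 1 && $3 == "Running" {
         split($2, ready, "/")
         if (ready[1] == ready[2]) n++
       }
       END { print n + 0 }'
}

# Demo with the expected output from this guide. On a cluster, pipe in:
#   kubectl get pods -n costpilot | count_ready
count_ready <<'EOF'
NAME READY STATUS RESTARTS AGE
cost-pilot-operator-xxx-yyy 1/1 Running 0 60s
costpilot-agent-xxx-aaa 1/1 Running 0 45s
costpilot-agent-xxx-bbb 1/1 Running 0 45s
costpilot-agent-xxx-ccc 1/1 Running 0 45s
EOF
# prints 4
```

With the default values, a healthy installation should report 4: one operator pod plus three agent replicas.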
Check agent logs for metric shipping:
kubectl logs -n costpilot -l app=costpilot-agent -f
You should see log lines similar to:
level=info msg="shipping metrics" pods=47 interval=15s
level=info msg="metrics shipped successfully" tenant=t_xxx cluster=c_yyy
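To check for successful shipments without following the log stream, the lines can be counted instead. This is a sketch that assumes the log format matches the sample lines above. `shipped_count` is a hypothetical helper name.

```shell
# shipped_count: reads agent log lines on stdin and prints how many of them
# report a successful shipment. Prints 0 when there are none.
shipped_count() {
  grep -c 'msg="metrics shipped successfully"' || true
}

# Demo with the sample lines from this guide. On a cluster, pipe in:
#   kubectl logs -n costpilot -l app=costpilot-agent --tail=200 | shipped_count
shipped_count <<'EOF'
level=info msg="shipping metrics" pods=47 interval=15s
level=info msg="metrics shipped successfully" tenant=t_xxx cluster=c_yyy
EOF
# prints 1
```

When a label selector is used, `kubectl logs` returns only the last 10 lines per pod by default, so `--tail` is set explicitly here. Since only the elected leader ships metrics, a count of 0 from follower pods alone is expected.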
Resource requirements
The agent is designed to be lightweight:
| Component | CPU Request | Memory Request |
|---|---|---|
| Operator | 50m | 64Mi |
| Each agent replica | ~10m | ~32Mi |
Total footprint for the default 3-replica setup: ~80m CPU, ~160Mi memory.
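The total follows from one operator plus N agent replicas. As a rough sketch for other replica counts, using the approximate per-replica figures from the table (`footprint` is a hypothetical helper name):

```shell
# footprint REPLICAS: prints "<cpu>m <memory>Mi" for one operator plus
# REPLICAS agent replicas, using the approximate requests from the table.
footprint() {
  fp_replicas="$1"
  fp_cpu=$((50 + 10 * fp_replicas))   # operator 50m + ~10m per replica
  fp_mem=$((64 + 32 * fp_replicas))   # operator 64Mi + ~32Mi per replica
  echo "${fp_cpu}m ${fp_mem}Mi"
}

footprint 3   # prints "80m 160Mi"
footprint 5   # prints "100m 224Mi"
```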
Upgrading
helm repo update
helm upgrade costpilot costpilot/agent --namespace costpilot
The operator handles rolling updates of agent replicas automatically.
Uninstalling
helm uninstall costpilot --namespace costpilot
kubectl delete namespace costpilot
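Uninstalling the agent stops metric collection for that cluster. Historical data in CostPilot is retained. You can reinstall at any time and data collection will resume. Deleting the namespace also deletes the cp-agent-secret Secret stored in it, so recreate the namespace and the Secret (steps 2 and 3) before reinstalling.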