Upgrade the Agent
This guide covers how to upgrade the CostPilot agent to a new version using either Helm or Kustomize. The Operator handles rolling restarts of agent pods automatically, so you do not need to manage agent pods directly.
Before you upgrade
- Check the release notes for breaking changes.
- Verify the current version:
kubectl get pods -n costpilot -l app=cost-pilot-operator \
-o jsonpath='{.items[0].spec.containers[0].image}'
Upgrading with Helm
1. Update the chart repository
helm repo update
2. Check the latest chart version
helm search repo costpilot/agent --versions
3. Upgrade
helm upgrade costpilot costpilot/agent \
--namespace costpilot \
--reuse-values
--reuse-values preserves your existing values (API key secret name, region, etc.) without you needing to re-specify them.
To pin to a specific chart version:
helm upgrade costpilot costpilot/agent \
--namespace costpilot \
--version 0.2.0 \
--reuse-values
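Because --reuse-values carries over whatever is stored on the release, it can help to look at those values and render the upgrade before running it for real. The following is a minimal sketch, not part of the official procedure: the function name preview_helm_upgrade is hypothetical, while the release, chart, and namespace names are the ones used in this guide.

```shell
# Sketch: show the values --reuse-values will keep, then render the upgrade
# with --dry-run so nothing in the cluster changes.
# preview_helm_upgrade is a hypothetical helper name.
preview_helm_upgrade() {
  chart_version="${1:-}"   # optional chart version to pin, e.g. 0.2.0

  # Values stored on the current release; these are what --reuse-values keeps.
  helm get values costpilot --namespace costpilot || return 1

  # Same upgrade command as in step 3, plus --dry-run.
  if [ -n "$chart_version" ]; then
    helm upgrade costpilot costpilot/agent --namespace costpilot \
      --reuse-values --dry-run --version "$chart_version"
  else
    helm upgrade costpilot costpilot/agent --namespace costpilot \
      --reuse-values --dry-run
  fi
}
```

Calling preview_helm_upgrade 0.2.0 previews the pinned upgrade; calling it with no argument previews an upgrade to the latest chart version in the repository.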
4. Verify
kubectl rollout status deployment/cost-pilot-operator -n costpilot
kubectl get pods -n costpilot
The Operator will reconcile within a few seconds and update the agent ReplicaSet to use the new image.
Upgrading with Kustomize
Option A — Update your image tag (overlay)
In your overlay’s kustomization.yaml:
images:
- name: ghcr.io/smrt-devops/cost-pilot/agent
newTag: "v0.2.0" # update to the new version
Then apply:
kubectl apply -k kustomize/overlays/production/
Option B — Update the base reference
If you reference the remote base by Git tag:
# my-cluster/kustomization.yaml
resources:
- github.com/smrt-devops/cost-pilot-agent//kustomize/base?ref=v0.2.0
Re-apply:
kubectl apply -k my-cluster/
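With either option, the pending changes can be previewed before kubectl apply -k is run. The sketch below wraps kubectl diff -k, which exits with 0 when there is nothing to change, 1 when differences exist, and a higher code on error. The function name preview_kustomize is hypothetical, and the directory argument is whichever overlay or cluster directory you apply from.

```shell
# Sketch: show what `kubectl apply -k` would change, without applying it.
# preview_kustomize is a hypothetical helper name.
preview_kustomize() {
  dir="${1:-}"   # e.g. kustomize/overlays/production/ or my-cluster/

  # Capture the exit code without aborting under `set -e`.
  rc=0
  kubectl diff -k "$dir" || rc=$?

  case "$rc" in
    0) echo "No changes to apply." ;;
    1) echo "Changes pending; review the diff above before applying." ;;
    *) echo "kubectl diff failed (exit code $rc)." >&2; return "$rc" ;;
  esac
}
```

For example, preview_kustomize my-cluster/ prints the diff for the remote-base setup in Option B.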
Verify
kubectl rollout status deployment/cost-pilot-operator -n costpilot
kubectl get pods -n costpilot
What happens during an upgrade
- Helm or kubectl apply updates the Operator Deployment image.
- Kubernetes performs a rolling restart of the Operator pod.
- The new Operator pod starts and reads its own image SHA from its pod status.
- The Operator compares its image SHA against the SHA currently set on the agent pods.
- If they differ, the Operator updates the agent ReplicaSet’s pod template with the new image — triggering a rolling restart of agent pods.
- Agent pods are replaced one at a time. The remaining running replicas continue collecting metrics during the rollout.
Metric collection continues uninterrupted during an upgrade. With three agent replicas and leader election, at least one replica is always collecting while others restart.
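One way to watch the agent rollout described above is to list the distinct images the agent pods are running: two images while pods are being replaced, one when the rollout has settled. This is a rough sketch under assumptions, not the Operator's own logic. It assumes agent pods carry the label app=costpilot-agent (the selector this guide uses for agent logs) and that the agent is the first container in each pod. Both function names are hypothetical.

```shell
# Sketch: list the distinct images that agent pods currently run.
# Assumes label app=costpilot-agent and that the agent is container 0.
agent_images() {
  kubectl get pods -n costpilot -l app=costpilot-agent \
    -o jsonpath='{range .items[*]}{.spec.containers[0].image}{"\n"}{end}' \
    | sort -u
}

# Succeeds only when every agent pod reports the same image.
agent_rollout_settled() {
  # awk counts non-empty lines and always prints a plain number.
  count="$(agent_images | awk 'NF { n++ } END { print n + 0 }')"
  [ "$count" -eq 1 ]
}
```

Running agent_images repeatedly during an upgrade shows the old image disappearing as pods are replaced one at a time.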
Rollback
Helm rollback
helm rollback costpilot -n costpilot
To roll back to a specific revision:
helm history costpilot -n costpilot # list revisions
helm rollback costpilot 2 -n costpilot
Kustomize rollback
Update the image tag back to the previous version and re-apply.
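For the overlay setup in Option A, the tag change can also be scripted. The sketch below assumes the standalone kustomize CLI is installed and uses its edit set image subcommand to rewrite the images entry in place. The function name rollback_kustomize is hypothetical, and the previous tag is something you supply; it is not recorded anywhere by this script. For Option B, change the ref in the base URL back by hand instead.

```shell
# Sketch: set the agent image tag back to a previous version and re-apply.
# Assumes the standalone kustomize CLI and the image name from Option A.
# rollback_kustomize is a hypothetical helper name.
rollback_kustomize() {
  dir="${1:-}"            # directory containing kustomization.yaml
  previous_tag="${2:-}"   # the tag you were running before the upgrade

  if [ -z "$dir" ] || [ -z "$previous_tag" ]; then
    echo "usage: rollback_kustomize <dir> <previous-tag>" >&2
    return 2
  fi

  # Rewrites the images: entry in kustomization.yaml in place.
  ( cd "$dir" && kustomize edit set image \
      "ghcr.io/smrt-devops/cost-pilot/agent:${previous_tag}" ) || return 1

  kubectl apply -k "$dir"
}
```

If the kustomization is tracked in Git, reverting the commit that changed the tag and re-applying achieves the same result and keeps the history explicit.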
Verifying metric collection after upgrade
After the upgrade completes, confirm metrics are flowing:
# Check agent logs for shipping activity
kubectl logs -n costpilot -l app=costpilot-agent --prefix --tail=20
# Check the Operator reconciled successfully
kubectl logs -n costpilot -l app=cost-pilot-operator --tail=20
You should see metrics shipped successfully in the agent logs within 15–30 seconds of the agents coming up. The CostPilot Dashboard shows a warning banner if no metrics are received for 15 minutes — if this appears, check the logs.
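The log check can be turned into a pass/fail test, for example in a post-upgrade script. This is a minimal sketch: the function name metrics_flowing is hypothetical, and the default pattern is an assumption taken from the wording above, so adjust it to match what your agents actually print.

```shell
# Sketch: succeed if recent agent logs contain the success message.
# The default pattern is an assumption based on this guide's wording.
# metrics_flowing is a hypothetical helper name.
metrics_flowing() {
  pattern="${1:-metrics shipped successfully}"
  kubectl logs -n costpilot -l app=costpilot-agent --prefix --tail=50 \
    | grep -i -q -- "$pattern"
}
```

A script could call metrics_flowing a few times with a short sleep between attempts, since the message is expected only once the agents have been up for 15–30 seconds.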