Kustomize Reference
This page is the complete reference for the CostPilot Kustomize manifests. For the installation walkthrough, see Kustomize Installation.
Manifest structure
deploy/kustomize/
├── base/
│   ├── kustomization.yaml      # Resource list and namespace
│   ├── namespace.yaml          # costpilot Namespace
│   ├── serviceaccount.yaml     # Operator ServiceAccount
│   ├── clusterrole.yaml        # Operator ClusterRole (cluster-scoped permissions)
│   ├── clusterrolebinding.yaml # Binds ClusterRole to Operator ServiceAccount
│   ├── role.yaml               # Operator Role (namespace-scoped permissions)
│   ├── rolebinding.yaml        # Binds Role to Operator ServiceAccount
│   ├── configmap.yaml          # Agent configuration (edit this)
│   └── deployment.yaml         # Operator Deployment
└── overlays/
    └── production/
        └── kustomization.yaml  # Production overlay (pinned image tag)
ConfigMap reference
The cp-agent-config ConfigMap is the primary configuration interface. The Operator watches it and reconciles automatically when values change.
The ConfigMap must have labels app: costpilot-agent and component: config. The Operator uses these labels to discover the ConfigMap. Do not remove or rename them.
| Key | Default | Description |
|---|---|---|
| ingesterEndpoint | eu | Ingester region. Must be eu or us. The Operator builds the full ingester URL from this value. |
| clusterApiKeySecretName | cp-agent-secret | Name of the Kubernetes Secret containing the cluster API key. The Secret must have a key named cluster-api-key. |
| image | ghcr.io/smrt-devops/cost-pilot/agent:latest | Container image for agent pods. The Operator uses this to create and update the agent ReplicaSet. |
| imagePullPolicy | IfNotPresent | Kubernetes image pull policy for agent pods. Use Always when using latest or a mutable tag. |
| intervalSeconds | 15 | Metric collection interval in seconds. Minimum value is 15. Lower values increase data freshness but add load on the Kubernetes API. |
| replicas | 3 | Number of agent replicas. Minimum of 2 recommended in production for leader election continuity. |
| peerPort | 18443 | Reserved port for internal agent communication. Do not change. |
| restartSignal | "" | RFC3339 timestamp. When set to a value newer than an agent’s start time, the agent restarts. Updated automatically by the CostPilot dashboard. See Agent Architecture: Restart signal. |
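
The restartSignal behaviour described in the table can be exercised by hand. The following is a minimal sketch, assuming that writing the key directly is acceptable in your environment (the dashboard normally updates it). The helper names restart_signal_patch and now_rfc3339 are hypothetical and not part of CostPilot.

```shell
# Build a JSON merge patch that sets restartSignal to the given RFC3339 timestamp.
restart_signal_patch() {
  printf '{"data":{"restartSignal":"%s"}}\n' "$1"
}

# Current UTC time in RFC3339 form, for example 2025-01-31T12:00:00Z.
now_rfc3339() {
  date -u +%Y-%m-%dT%H:%M:%SZ
}

# Usage (requires cluster access):
#   kubectl patch configmap cp-agent-config -n costpilot --type merge \
#     -p "$(restart_signal_patch "$(now_rfc3339)")"
```

A merge patch touches only the restartSignal key and leaves the other ConfigMap values as they are.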
Deployment reference
The deployment.yaml configures the Operator Deployment.
| Field | Default | Notes |
|---|---|---|
| spec.replicas | 1 | Number of Operator replicas. Additional replicas act as standby. |
| spec.template.spec.containers[0].image | ghcr.io/smrt-devops/cost-pilot/agent:latest | Override with a pinned tag for production. |
| args[--leader-elect] | false | Set to true if running multiple Operator replicas. |
| resources.requests.cpu | 50m | - |
| resources.requests.memory | 64Mi | - |
| resources.limits.cpu | 100m | - |
| resources.limits.memory | 128Mi | - |
Agent pod resource limits (CPU 160m, memory 128Mi) are managed by the Operator and are not configurable via the ConfigMap or manifests. They are set in the Operator’s reconciliation logic.
RBAC reference
The Kustomize base creates two RBAC roles for the Operator, each with a matching binding to the Operator ServiceAccount:
ClusterRole
Grants cluster-scoped read access to resources the agent needs for cost collection, plus permission to create ClusterRoles and ClusterRoleBindings for the agent:
| Resource | Verbs | Purpose |
|---|---|---|
| clusterroles, clusterrolebindings | get, list, watch, create, update, patch | Create agent’s cluster-scoped RBAC |
| nodes | get, list | Node capacity and provider labels |
| pods | get, list | Pod specs across all namespaces |
| persistentvolumeclaims | get, list | Storage cost allocation |
| metrics.k8s.io/pods, metrics.k8s.io/nodes | get, list | CPU/memory usage |
Role (namespace-scoped)
Grants namespace-scoped access within the costpilot namespace:
| Resource | Verbs | Purpose |
|---|---|---|
| configmaps | get, list, watch, update | Agent config + config sync |
| secrets | get, list, watch, create, update | API key + mTLS certificates |
| serviceaccounts | get, list, watch, create, update | Agent ServiceAccount |
| pods | get, list, watch | Agent pod listing |
| replicasets (apps) | get, list, watch, create, update, patch | Agent ReplicaSet management |
| roles, rolebindings | get, list, watch, create, update, patch | Agent namespace RBAC |
| leases (coordination.k8s.io) | get, list, watch, create, update, patch | Operator leader election |
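
To spot-check that the permissions in these tables are actually in effect, kubectl auth can-i can impersonate the Operator ServiceAccount. This is a sketch rather than an exhaustive audit: it samples a few verb and resource pairs, the function name check_operator_rbac is hypothetical, and the ServiceAccount name must be read from base/serviceaccount.yaml because it is not listed on this page. Your own user needs impersonation rights for --as to work.

```shell
# Ask the API server whether the Operator ServiceAccount holds a sample of the
# documented permissions. $1 is the ServiceAccount name (placeholder: take the
# real name from base/serviceaccount.yaml).
check_operator_rbac() {
  sa="system:serviceaccount:costpilot:$1"
  # Each entry is "<verb> <resource>", taken from the RBAC tables.
  for check in "list nodes" "list pods" "list persistentvolumeclaims" \
               "update configmaps" "create secrets" "create replicasets.apps" \
               "create leases.coordination.k8s.io"; do
    # $check is intentionally unquoted so it splits into verb and resource.
    answer=$(kubectl auth can-i $check --as="$sa" -n costpilot 2>/dev/null) || true
    printf '%-40s %s\n' "$check" "${answer:-unknown}"
  done
}

# Usage (requires cluster access):
#   check_operator_rbac <operator-serviceaccount-name>
```

Any line that prints no points at a missing rule or binding; re-applying the base manifests recreates them.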
Common overlay patterns
Pin the image tag
# overlays/production/kustomization.yaml
images:
  - name: ghcr.io/smrt-devops/cost-pilot/agent
    newTag: "v0.1.0"
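One way to confirm the pin took effect is to render the overlay and list every image reference it contains. The sketch below assumes the overlay path from the manifest structure above; rendered_images is a hypothetical helper name. Note that the images entry targets container image fields, so the image key in cp-agent-config would be expected to keep its own value unless it is patched separately.

```shell
# List every distinct image reference in a rendered manifest read from stdin.
# Matches container "image:" fields and the ConfigMap "image" data key,
# but not "imagePullPolicy:".
rendered_images() {
  sed -n 's/^[[:space:]-]*image:[[:space:]]*//p' | tr -d '"' | sort -u
}

# Usage (requires kubectl):
#   kubectl kustomize deploy/kustomize/overlays/production | rendered_images
```

If the output still shows a latest tag, that reference is not covered by the overlay.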
Change the ingester region to US
patches:
  - target:
      kind: ConfigMap
      name: cp-agent-config
    patch: |-
      - op: replace
        path: /data/ingesterEndpoint
        value: "us"
Increase collection interval (lower-traffic clusters)
patches:
  - target:
      kind: ConfigMap
      name: cp-agent-config
    patch: |-
      - op: replace
        path: /data/intervalSeconds
        value: "30"
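Before committing an overlay that changes ingesterEndpoint or intervalSeconds, it can help to check the new values against the constraints in the ConfigMap reference (region must be eu or us, interval at least 15). This is a minimal sketch; validate_agent_config is a hypothetical helper and does not reflect how the Operator itself validates input.

```shell
# Validate the two ConfigMap values that have documented constraints.
# Usage: validate_agent_config <ingesterEndpoint> <intervalSeconds>
validate_agent_config() {
  region=$1
  interval=$2
  case $region in
    eu|us) ;;
    *) echo "ingesterEndpoint must be eu or us, got: $region" >&2; return 1 ;;
  esac
  case $interval in
    ''|*[!0-9]*) echo "intervalSeconds must be a whole number, got: $interval" >&2; return 1 ;;
  esac
  if [ "$interval" -lt 15 ]; then
    echo "intervalSeconds must be at least 15, got: $interval" >&2
    return 1
  fi
  echo "ok"
}
```

For example, validate_agent_config us 30 prints ok, while validate_agent_config eu 10 fails with a message on stderr.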
Use a private image registry
patches:
  - target:
      kind: ConfigMap
      name: cp-agent-config
    patch: |-
      - op: replace
        path: /data/image
        value: "my-registry.example.com/cost-pilot/agent:v0.1.0"
Use a custom secret name
If your secrets manager creates the API key Secret with a different name:
patches:
  - target:
      kind: ConfigMap
      name: cp-agent-config
    patch: |-
      - op: replace
        path: /data/clusterApiKeySecretName
        value: "my-custom-secret-name"
Secrets (not managed by Kustomize)
The cluster API key Secret is intentionally not included in the Kustomize manifests, because secrets should not be committed to version control alongside the application manifests.
Create the Secret before applying the manifests (the costpilot namespace must already exist):
kubectl create secret generic cp-agent-secret \
  --namespace costpilot \
  --from-literal=cluster-api-key=YOUR_API_KEY_HERE
External Secrets Operator
If you use External Secrets Operator:
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: cp-agent-secret
  namespace: costpilot
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: my-secret-store
    kind: SecretStore
  target:
    name: cp-agent-secret
    creationPolicy: Owner
  data:
    - secretKey: cluster-api-key
      remoteRef:
        key: costpilot/cluster-api-key
Sealed Secrets
kubectl create secret generic cp-agent-secret \
  --namespace costpilot \
  --from-literal=cluster-api-key=YOUR_API_KEY_HERE \
  --dry-run=client -o yaml \
  | kubeseal --controller-namespace kube-system -o yaml \
  > sealed-cp-agent-secret.yaml
Troubleshooting
Operator pod is running but no agent pods appear
Check Operator logs:
kubectl logs -n costpilot -l app=cost-pilot-operator
Common causes:
- ConfigMap not found: The ConfigMap must have labels app: costpilot-agent, component: config.
- Secret not found: The Secret named in clusterApiKeySecretName does not exist.
- RBAC insufficient: The Operator ServiceAccount lacks permissions. Re-apply the manifests to recreate RBAC.
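The first two causes can be checked directly against the cluster. The sketch below is one possible approach, not the Operator's own discovery logic: check_agent_prereqs is a hypothetical helper, and falling back to cp-agent-secret when the key is unset is an assumption based on the documented default.

```shell
# Check the two prerequisites the Operator needs before it creates agent pods.
check_agent_prereqs() {
  ns=costpilot
  # 1. A ConfigMap carrying both discovery labels must exist.
  cm=$(kubectl get configmap -n "$ns" -l app=costpilot-agent,component=config \
         -o name 2>/dev/null | head -n 1)
  if [ -z "$cm" ]; then
    echo "no ConfigMap with labels app=costpilot-agent,component=config"
    return 1
  fi
  # 2. The Secret named by clusterApiKeySecretName must exist.
  #    Assumption: an unset key means the documented default, cp-agent-secret.
  secret=$(kubectl get "$cm" -n "$ns" \
             -o jsonpath='{.data.clusterApiKeySecretName}' 2>/dev/null) || true
  secret=${secret:-cp-agent-secret}
  if ! kubectl get secret "$secret" -n "$ns" -o name >/dev/null 2>&1; then
    echo "Secret $secret not found"
    return 1
  fi
  echo "prerequisites present: $cm, secret/$secret"
}
```

If both checks pass and agent pods still do not appear, RBAC is the remaining candidate from the list above.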
Agent pods are in CrashLoopBackOff
kubectl logs -n costpilot <agent-pod-name>
Common causes:
- Metrics Server not installed: The agent requires the Kubernetes Metrics API (metrics.k8s.io). Install the Kubernetes Metrics Server if it is not present.
- Invalid API key: The agent will log "authentication failed" on startup. Rotate the API key in Settings → Clusters and update the Secret.
- Network egress blocked: The agent must be able to reach ingest.eu.cost-pilot.com:443 (or ingest.us.cost-pilot.com:443 for the US region). Check network policies and firewall rules.
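For the egress case, a quick connectivity probe can narrow things down. This is a sketch under assumptions: ingester_host and check_egress are hypothetical helpers, the host names come from the list above, and the probe runs from wherever you execute it. To reflect in-cluster network policies, it would need to run from a pod in the costpilot namespace.

```shell
# Map an ingesterEndpoint value to the documented ingester host.
ingester_host() {
  case $1 in
    eu|us) echo "ingest.$1.cost-pilot.com" ;;
    *) echo "unknown region: $1" >&2; return 1 ;;
  esac
}

# Try an HTTPS connection to the ingester on port 443.
# curl exits non-zero when the connection or TLS handshake fails;
# any HTTP response, whatever its status code, counts as reachable.
check_egress() {
  host=$(ingester_host "$1") || return 1
  curl --silent --show-error --output /dev/null --max-time 10 "https://$host/" \
    && echo "reachable: $host:443"
}

# Usage:
#   check_egress eu
```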
How do I check the mTLS certificate status?
kubectl get secret -n costpilot | grep mtls
kubectl describe secret -n costpilot cp-agent-mtls
The certificate’s Not After date is visible in the Secret’s annotations. The Operator automatically renews it seven days before expiry.
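Given the seven-day renewal rule, a small check can tell you whether a certificate should already have been renewed. The sketch below works on Unix timestamps only; in_renewal_window is a hypothetical helper, and the annotation that holds the Not After date is not named on this page, so read it from the kubectl describe output first.

```shell
# Report where a certificate stands relative to the seven-day renewal window.
# Usage: in_renewal_window <not_after_epoch_seconds> <now_epoch_seconds>
in_renewal_window() {
  not_after=$1
  now=$2
  window=$((7 * 24 * 60 * 60))   # seven days in seconds
  if [ "$now" -ge "$not_after" ]; then
    echo "expired"
  elif [ $((not_after - now)) -le "$window" ]; then
    echo "renewal due"
  else
    echo "valid"
  fi
}

# Usage (assumes GNU date for parsing the Not After value you copied):
#   in_renewal_window "$(date -u -d "<not-after-value>" +%s)" "$(date +%s)"
```

A result of renewal due or expired that persists suggests the Operator is not reconciling the certificate; its logs are the next place to look.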