When integrating Kubecost with an existing Prometheus, we recommend first installing Kubecost with its bundled Prometheus (instructions) as a dry run, and only then integrating with the external Prometheus deployment. For assistance, get in touch by email ([email protected]) or via our Slack community.
The Kubecost Prometheus deployment is used both as a source and store of metrics. It’s optimized to not interfere with other observability instrumentation and by default only contains metrics that are useful to the Kubecost product. This results in 70-90% fewer metrics than a Prometheus deployment using default settings.
For the best experience, we generally recommend teams use the bundled prometheus-server & grafana but reuse their existing kube-state-metrics and node-exporter deployments if they already exist. This setup allows for the easiest installation process, easiest ongoing maintenance, minimal duplication of metrics, and more flexible metric retention.
Note: the Kubecost team provides best-effort support for free/community users when integrating with an existing Prometheus deployment.
Kubecost requires the following minimum versions:
- kube-state-metrics - v1.6.0+ (May 19)
- cAdvisor - kubelet v1.11.0+ (May 18)
- node-exporter - v0.16+ (May 18) [Optional]
Pass the following parameters in your helm values file:
- global.prometheus.fqdn: set to match your local Prometheus service address, using the format http://<prometheus-server-service-name>.<prometheus-server-namespace>.svc
- global.prometheus.enabled: set to false
Pass this updated file to the Kubecost helm install command with --values values.yaml. Alternatively, add the following to the end of your helm install command:
--set global.prometheus.fqdn=http://<prometheus-server-service-name>.<prometheus-server-namespace>.svc --set global.prometheus.enabled=false
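As a minimal sketch, the two parameters above would nest as follows in a values.yaml file. The placeholders are kept exactly as in the format string and must be replaced with your own Prometheus service name and namespace; any other keys your values file needs are omitted here.

```yaml
# Sketch of a values.yaml fragment for the Kubecost helm install.
# Placeholders in angle brackets are not real values; substitute your own.
global:
  prometheus:
    # Disable the bundled Prometheus so Kubecost uses the external one.
    enabled: false
    # In-cluster address of the existing Prometheus server service.
    fqdn: http://<prometheus-server-service-name>.<prometheus-server-namespace>.svc
```

The dotted names used with --set (global.prometheus.fqdn, global.prometheus.enabled) correspond to this nesting, so either form should produce the same configuration.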
Have your Prometheus scrape the cost-model /metrics endpoint. These metrics are needed for reporting accurate pricing data. Here is an example scrape config:
- job_name: kubecost
  honor_labels: true
  scrape_interval: 1m
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  dns_sd_configs:
  - names:
    - kubecost-cost-analyzer.<namespace-of-your-kubecost>
    type: 'A'
    port: 9003
This config needs to be added to extraScrapeConfigs in the Prometheus configuration. Example: extraScrapeConfigs.yaml
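One way this might look, assuming your Prometheus was installed from a Helm chart that accepts extraScrapeConfigs as a multi-line string value (an assumption; check your chart's values reference), is the sketch below. The file layout is illustrative, and the namespace placeholder must be replaced with your own.

```yaml
# Sketch of a Prometheus chart values fragment (assumed chart layout).
# extraScrapeConfigs is passed as a block string, so the job is indented
# beneath the "|" marker rather than parsed as a nested YAML list.
extraScrapeConfigs: |
  - job_name: kubecost
    honor_labels: true
    scrape_interval: 1m
    scrape_timeout: 10s
    metrics_path: /metrics
    scheme: http
    dns_sd_configs:
    - names:
      - kubecost-cost-analyzer.<namespace-of-your-kubecost>
      type: 'A'
      port: 9003
```

If your Prometheus is configured directly through prometheus.yml instead, the job entry would go under scrape_configs in that file.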
To confirm this job is successfully scraped by Prometheus, you can view the Targets page in Prometheus and look for a job named kubecost.
Note that the following node exporter step is optional and only impacts certain efficiency metrics. View issue/556 for a description of what will be missing if this step is skipped.
To complete it, add the following relabel config to the job that scrapes the node exporter DaemonSet.
- job_name: 'kubernetes-service-endpoints'
  kubernetes_sd_configs:
    - role: endpoints
  relabel_configs:
    - source_labels: [__meta_kubernetes_pod_node_name]
      action: replace
      target_label: kubernetes_node
Note that this does not override the source label; it creates a new label called "kubernetes_node" and copies the value of the source label (the name of the node the pod runs on) into it.
Visiting <your-kubecost-endpoint>/diagnostics.html provides diagnostic information on this integration.
Common issues include the following:
- Wrong Prometheus FQDN: evidenced by the pod error message "No valid prometheus config file at ..." and the init pods hanging. We recommend running curl <your_prometheus_url>/api/v1/status/config from a pod in the cluster to confirm that your Prometheus config is returned. Here is an example, but it needs to be updated based on your pod name and Prometheus address:
kubectl exec kubecost-cost-analyzer-db55d88f6-fr6kc -c cost-analyzer-frontend -n kubecost \
-- curl http://kubecost-prometheus-server.kubecost/api/v1/status/config
If the config file is not returned, this is an indication that an incorrect Prometheus address has been provided. If a config file is returned from one pod in the cluster but not the Kubecost pod, then the Kubecost pod likely has its access restricted by a network policy, service mesh, etc.
- Prometheus throttling: ensure Prometheus isn't being CPU throttled due to a low resource request.
- Wrong dependency version: see the minimum version requirements listed above.
- Missing scrape configs: visit the Prometheus Targets page and confirm that the kubecost job is listed.
- Data incorrectly appears in a single namespace: make sure that honor_labels is enabled in the kubecost scrape config.
You can visit Settings in Kubecost to see basic diagnostic information on these Prometheus metrics.