How to Create a Prometheus Alert for Missing Traces for a Specific Component in Tempo? #4322
Comments
Hi, I believe the span metrics from the Metrics Generator can help you achieve what you want: https://grafana.com/docs/tempo/latest/metrics-generator/span_metrics/ These metrics include additional labels based on the trace data, for instance the name of the service that generated the span. You can even define custom labels.
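A rough sketch of how span metrics could drive such an alert: the query below returns services that were producing spans an hour ago but are not producing any now. It assumes the span-metrics processor is enabled and exports `traces_spanmetrics_calls_total` with a `service` label, as described in the linked docs. The 5-minute window and the 1-hour lookback are arbitrary illustrative values, not recommendations.

```promql
# Services with a non-zero span rate one hour ago...
sum by (service) (rate(traces_spanmetrics_calls_total[5m] offset 1h)) > 0
unless
# ...that still have a non-zero span rate now.
sum by (service) (rate(traces_spanmetrics_calls_total[5m])) > 0
```

Comparing against an earlier window rather than testing `== 0` also catches a service whose series has disappeared completely, since a missing series never matches an equality check.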
I will also point out that we've recently added "usage trackers", which will be in Tempo 2.7. These will allow you to break down received bytes/second by any span or resource labels (namespace, cluster, etc.) and publish those metrics directly from the distributor (no metrics generator or Prometheus required).
Hi @joe-elliott, thank you for your response. I will explore the usage trackers and get back to you.
Hi @joe-elliott, any idea when Tempo 2.7 will be released?
Likely December or January.
Hi everyone,
I’m working on a Prometheus alert to trigger when traces are missing for any component in Tempo. Currently, I have the following query, which triggers an alert when there are no traces available for a specific time window (e.g., 5 minutes):
sum by (cluster, namespace) (avg_over_time(tempo_ingester_live_traces[5m])) == 0
This works well for triggering an alert when no traces are ingested for the entire system (across any components) within the specified time window. However, I need to modify the query so that the alert is triggered when traces are missing for any component within a specific namespace or cluster.
How can I modify the query so that it evaluates each component separately and fires when any single component within a cluster or namespace stops sending traces, rather than only when traces are missing globally?
I am using Tempo for trace ingestion and Prometheus for monitoring.
The metric I’m working with is tempo_ingester_live_traces, which is labeled by component, namespace, and cluster.
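For illustration, here is a minimal sketch of a per-component variant of the query above. It takes the label set described in the question at face value (`component`, `namespace`, `cluster` on `tempo_ingester_live_traces`). Whether `component` actually identifies the application sending traces, rather than a Tempo process, depends on how the metric is scraped and relabeled, so treat that as an assumption. The 1-hour offset is an arbitrary illustrative value.

```promql
# Same check as the original query, grouped per component instead of per namespace.
sum by (cluster, namespace, component) (avg_over_time(tempo_ingester_live_traces[5m])) == 0

# Components whose series existed an hour ago but are absent now.
count by (cluster, namespace, component) (tempo_ingester_live_traces offset 1h)
unless
count by (cluster, namespace, component) (tempo_ingester_live_traces)
```

The second expression is there because `== 0` only matches series that still exist and report zero. If a component's series stops being exported altogether, the first expression returns nothing for it and the alert would stay silent.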