How to Create a Prometheus Alert for Missing Traces for a Specific Component in Tempo? #4322

Open
rajushrajan opened this issue Nov 14, 2024 · 5 comments

Comments

@rajushrajan

Hi everyone,

I’m working on a Prometheus alert to trigger when traces are missing for any component in Tempo. Currently, I have the following query, which triggers an alert when there are no traces available for a specific time window (e.g., 5 minutes):

sum by (cluster, namespace) (avg_over_time(tempo_ingester_live_traces[5m])) == 0

This works well when no traces are ingested for the entire system (across all components) within the specified time window. However, I need the alert to fire when traces are missing for any single component within a specific namespace or cluster.

How can I modify the query so that it checks for missing traces per component, rather than only globally or for one hard-coded component?

I am using Tempo for trace ingestion and Prometheus for monitoring.
The metric I’m working with is tempo_ingester_live_traces, which is labeled by component, namespace, and cluster.
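
For reference, the most literal per-component variant of the query above adds `component` to the grouping. This is only a sketch, and it assumes, as described above, that `tempo_ingester_live_traces` really carries a `component` label in this setup:

```promql
# Same check as the original query, but evaluated separately for each
# (cluster, namespace, component) combination instead of per namespace.
sum by (cluster, namespace, component) (
  avg_over_time(tempo_ingester_live_traces[5m])
) == 0
```

Note that this only fires while a component's series still exists and reports 0. If the series disappears entirely, the expression returns nothing for that component, so no alert is raised.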

@javiermolinar
Contributor

Hi,

I believe the span metrics from the Metrics Generator can help you achieve what you want:

https://grafana.com/docs/tempo/latest/metrics-generator/span_metrics/

These metrics include additional labels derived from the trace data, such as the name of the service that generated the span. You can also define custom labels.
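
As a rough sketch of how span metrics could drive such an alert: the expression below assumes the span-metrics processor is enabled and exposes a call counter named `traces_spanmetrics_calls_total` with a `service` label (both names depend on your metrics-generator configuration, so treat them as assumptions). It returns every service that produced spans one hour ago but has produced none in the last 5 minutes. The one-hour offset and 5-minute window are illustrative values, not recommendations.

```promql
# Services that were emitting spans one hour ago ...
(sum by (service) (rate(traces_spanmetrics_calls_total[5m] offset 1h)) > 0)
unless
# ... and are still emitting spans in the last 5 minutes.
(sum by (service) (rate(traces_spanmetrics_calls_total[5m])) > 0)
```

The comparison against an earlier window is used because a plain `== 0` check returns nothing once a series has disappeared altogether. If your span metrics also carry `cluster` or `namespace` labels, they can be added to both `by (...)` clauses.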

@joe-elliott
Member

I will also point out we've recently added "usage trackers" which will be in Tempo 2.7:

#4162

These will allow you to break down received bytes/second by any span or resource labels (namespace, cluster, etc.) and publish those metrics directly from the distributor (no metrics generator or Prometheus required).

@rajushrajan
Author

Hi @joe-elliott, thank you for your response. I will explore the usage trackers and get back to you.

@rajushrajan
Author

Hi @joe-elliott, any idea when Tempo 2.7 will be released?

@joe-elliott
Member

Likely December or January.


3 participants