Should we use OpenTelemetry or OpenMetrics semantic conventions? #37
Unfortunately, OpenTelemetry and OpenMetrics standardized metrics in slightly different ways, so we either need to pick one or make the queries we write configurable.

Differences between OpenTelemetry and OpenMetrics

As far as I can tell, the main differences between the two standards that are relevant to autometrics are as follows:

Separators

OpenTelemetry uses
In the Rust implementation we currently handle this difference by using the

Counter suffix

OpenTelemetry:
Replies: 9 comments 15 replies
Isn't it possible to handle both cases with a switch? As far as I understand, as a user of Autometrics, I'm going to target either an OpenMetrics setup or an OpenTelemetry setup when I start publishing metrics. A global flag in the library could therefore select which convention (OT/OM) we want, so that only the metric names and units matching it are generated.
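A minimal sketch of such a switch, in Python. It assumes only two well-known differences: OpenTelemetry namespaces metric names with dots, while OpenMetrics/Prometheus names use underscores and counters end in `_total`. The names `Convention`, `ACTIVE_CONVENTION`, and `metric_name` are hypothetical and not part of any Autometrics library.

```python
from enum import Enum


class Convention(Enum):
    OPENTELEMETRY = "opentelemetry"
    OPENMETRICS = "openmetrics"


# Hypothetical library-wide flag; a real library might read this from
# configuration or a build-time feature instead.
ACTIVE_CONVENTION = Convention.OPENMETRICS


def metric_name(parts, is_counter=False, convention=None):
    """Build a metric name from its parts under the chosen convention."""
    convention = convention or ACTIVE_CONVENTION
    if convention is Convention.OPENTELEMETRY:
        # OpenTelemetry namespaces metric names with dots.
        return ".".join(parts)
    # OpenMetrics/Prometheus names use underscores; counters end in _total.
    name = "_".join(parts)
    if is_counter and not name.endswith("_total"):
        name += "_total"
    return name
```

With this sketch, `metric_name(["function", "calls"], is_counter=True)` gives an underscore-separated name with the `_total` suffix under the default flag, and the dotted form when `convention=Convention.OPENTELEMETRY` is passed. The queries in dashboards and alerts would still have to be generated from the same flag.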
I would go with OTEL. It's got a lot of traction with many different vendors and is being actively worked on. The metrics specification is stable.
Just to add to what @IvanMerrill wrote, we are also focusing on OpenTelemetry for our logging functionality, which is reflected, for instance, in the ability to add OTEL metadata to our
There is an option to expose a Prometheus endpoint when using the OpenTelemetry SDKs. Also, we are working on removing the
A slight advantage of using the OpenTelemetry SDKs is that you support push out of the box, not just pull.
Earlier I also suggested that the OTel route would be preferable, but for pragmatic reasons I no longer think we should focus too much on it. We use Prometheus clients for pushing metrics, and they are getting in the way of full OTel compatibility because they seem to largely embrace OpenMetrics instead. I also don't think we should wait and hope for this situation to resolve itself: Prometheus storage plays a central part in our strategy, and the lack of a standard on our side is already leading to divergence between our implementations. This effectively leads us to:

To resolve this situation, I would suggest embracing OpenMetrics and effectively following the Python and TypeScript implementations. This seems relatively easy to accomplish for both the Rust and the Go implementations (whereas the other way around is a lot more work and would also settle them on a metric name that follows neither standard).

Note that this does not mean I would suggest giving up on OTel altogether. I think it's a nice feature of the Rust implementation that it allows OTel exports as well, and other clients should strive to implement that too. But for Prometheus exports specifically, I think OpenMetrics is the way to go.
FYI, it is part of the OTel to Prometheus spec that counters should have a
The collector will now append the units and
@gagbo brought up a good point that Prometheus/OpenMetrics naming conventions also include the units in the metric names. This suggests that the name of our histogram, at least as far as Prometheus is concerned, should be
OpenTelemetry says:
This suggests that the metric can continue to be called
Luckily for us, OpenTelemetry changed some of their recommendations in such a way that it's easier for us to conform to both OpenTelemetry and OpenMetrics/Prometheus naming conventions! 🎉

Here are the conclusions:

function.calls, which is exported to Prometheus as function_calls_total
function.calls.duration, which is exported to Prometheus as function_calls_duration_seconds

We're going to support the old values for some time in shared resources like dashboards and alerting rules by using regexes to match the names, but eventually we'll remove the regexes and just use these names.

The Autometrics spec was updated with these changes in #60.
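The transitional regex matching described above could look roughly like the following Python sketch. It builds a PromQL selector on `__name__` that accepts both the new histogram name and an older one. The legacy name `function_calls_duration` and the helpers `name_matcher`, `selector`, and `matches` are assumptions made for illustration; the actual shared dashboards and alerting rules may be written differently.

```python
import re

# New histogram name as exported to Prometheus (from the conclusions above).
NEW_HISTOGRAM = "function_calls_duration_seconds"
# Hypothetical legacy name, used here for illustration only.
LEGACY_HISTOGRAMS = ["function_calls_duration"]


def name_matcher(new_name, legacy_names):
    """Return a regex alternation matching the new name or any legacy name.

    Prometheus metric names contain only letters, digits, underscores and
    colons, so no regex escaping is needed.
    """
    return "|".join([new_name, *legacy_names])


def selector(new_name, legacy_names):
    """Build a PromQL selector on __name__ that accepts old and new names."""
    return '{__name__=~"%s"}' % name_matcher(new_name, legacy_names)


def matches(pattern, name):
    """Mimic Prometheus regex matchers, which are fully anchored."""
    return re.fullmatch(pattern, name) is not None
```

Once the old names are retired, the alternation collapses to the single new name and the regex matcher can be replaced by a plain metric name in the queries.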