Monitor checks or custom task for state change detection #2969
Comments
@nathanielc @aanthony1243 could one of you take a look and see if you can help with this query?
Hi! There are a few tricks you can use. First, you'll want to use so you can do something like:

Next, you want to detect whether `last_cpu_usage` has any rows and switch to the default if not.

Finally, based on
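The snippets in this comment didn't survive the copy, but one common way to fall back to a default when `last_cpu_usage` is empty — the CSV approach referred to below — is to union the live query with a hand-written default row and take the latest record per series. A minimal sketch (bucket, measurement, tag, and column names are assumptions, not from the thread):

```flux
import "csv"

// A single hand-written default row per series (assumed schema)
default_state = csv.from(csv: "
#datatype,string,long,dateTime:RFC3339,string,double
#group,false,false,false,true,false
#default,_result,,,,
,result,table,_time,node_id,_value
,,0,1970-01-01T00:00:00Z,node-1,0.0
")

last_cpu_usage = from(bucket: "telegraf")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_user")
    |> keep(columns: ["_time", "node_id", "_value"])

// If last_cpu_usage has no rows for a node, the default row wins last()
union(tables: [default_state, last_cpu_usage])
    |> group(columns: ["node_id"])
    |> last()
```

The default row's epoch timestamp guarantees any real data point is newer, so `last()` only returns the default when no real rows exist.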
Hi @aanthony1243, thanks for your tips. My problem with the CSV approach, however, is that I don't know what nodes will come around in the future as they are provisioned. `node_id` is a tag in my case, and it will get new values as nodes come online. To clarify why I need this initial state, the measurement I need to have records for the

My running thought is that I could still use
To exemplify, if a given run of this task finds the values
Let me know if my question needs clarification.
Hi @invernizzie, thanks so much for describing your use case in detail. It's super helpful to us. We are thinking over the best way to solve your problem, and will respond as soon as we have a concrete recommendation.
@invernizzie I wonder if the `stateChangesOnly()` function would help here. Here's an example of its use:

This example is over the

This approach should be robust to new nodes coming online. One caveat, however, is that you would need to query back over two checks each time your task is run in order to detect a state change. Would something like this work for you?
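The example itself didn't survive the copy; a minimal sketch of `monitor.stateChangesOnly()` applied to the statuses a check writes (the check name and time range are assumptions) might look like:

```flux
import "influxdata/influxdb/monitor"

// Query back far enough to cover at least two check runs,
// so a change between consecutive statuses is visible
monitor.from(start: -2h)
    |> filter(fn: (r) => r._check_name == "cpu_check")
    |> monitor.stateChangesOnly()
```

`monitor.from()` reads from the `_monitoring.statuses` measurement, and `stateChangesOnly()` keeps only the rows where `_level` differs from the previous row in the same series.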
@Anaisdg can you please take a look and see if you should consider documenting this specific use case (aside from specific documentation on stateChangesOnly() which the docs team will add)? |
Hello @invernizzie here is documentation on stateChangesOnly. Does that help? |
This issue has had no recent activity and will be closed soon. |
I am migrating fault detection scripts from Kapacitor and I am running into multiple issues.
My use case is to feed events from fault detection into a stream processing system that will create alerts that are presented in a web UI and trigger notifications to end users through different channels. This is an application feature, not a systems monitoring use case.
Using monitor checks
My main problem is to split events into different buckets per environment. The built-in monitor checks only write to the _monitoring.statuses measurement, so I find the need to fan the records out. For example, for this simple CPU usage check:
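The check body didn't survive the copy; a hedged sketch of what such a threshold check looks like in Flux (the check ID, bucket, field, and threshold are placeholders, not my actual conditions):

```flux
import "influxdata/influxdb/monitor"
import "influxdata/influxdb/v1"

option task = {name: "CPU Check", every: 1m}

// Metadata the check attaches to every status it writes
check = {
    _check_id: "cpu_check_0001",
    _check_name: "CPU Check",
    _type: "threshold",
    tags: {},
}
crit = (r) => r.usage_user > 90.0
messageFn = (r) => "CPU user usage is ${string(v: r.usage_user)}%"

from(bucket: "telegraf")
    |> range(start: -task.every)
    |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_user")
    |> v1.fieldsAsCols()
    |> monitor.check(data: check, messageFn: messageFn, crit: crit)
```

`monitor.check()` evaluates the predicates, assigns a `_level`, and writes the resulting statuses to `_monitoring.statuses`.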
I will need to create multiple alerts, one per environment (which means I'll have to update them all in the event of requirement changes). And to land the records in the right measurement I need one helper task like the one below also per environment:
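The helper task body didn't survive the copy either; a sketch of such a per-environment fan-out task (the `env` tag, bucket names, and schedule are assumptions) could be:

```flux
option task = {name: "fanout-prod", every: 1m}

// Copy one environment's statuses out of the shared _monitoring bucket
from(bucket: "_monitoring")
    |> range(start: -task.every)
    |> filter(fn: (r) => r._measurement == "statuses" and r.env == "prod")
    |> to(bucket: "alerts-prod")
```

One such task would be needed per environment, which is exactly the duplication being described.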
(BTW this is an example check, and not related to my data or the conditions I'm trying to detect.)
Environment separation is important for access control, among other things, so we're not willing to compromise on it. Is there a better way around this problem using monitor checks?
Additionally, I would like to further customize the data that gets written to the destination measurement. As you can see in the task code, I'd like to rename `_check_id` to `fault_type`, and probably remove other fields that are irrelevant. Is this possible?

Lastly, the way `monitor.check` allows for the data object to be defined seems rigid. For some faults, I need to keep columns from the input measurement itself (e.g. `cpu.user` and `cpu.system`) as snapshots of the record that triggered the check state change. Is there any way to achieve this?

Using a custom task
Because of the problems outlined above, using a custom Flux task seems like a better approach in principle. However, I'm facing an issue that I have been unable to overcome: alert state initialization.
If the following Flux task is created, it will never write anything to the output measurement. This is because the join between the processed input and the output measurement is an inner join, and there is no initial state in the output measurement. Note that in this example, as in my actual use case, new nodes can be created in the system, producing data with a new tag value for `node_id`; it's not possible to just initialize state upfront.

Is there a known solution to this problem? There's a proposal for time interpolation (#2428) that may help make the join outer and allow detecting the lack of an initial state, but there's been no response to the proposal.
Other issues
As you can see, in both approaches it would be useful to have parameterized tasks, to avoid creating multiple tasks with the same code but different arguments (such as `env` in this case). This could be achieved with a field for custom/extra values in the task options.