Our node does not perform as good as other nodes #940

volovyks · 2024-11-26T21:56:58Z

Description

This behavior was observed on both Testnet and Mainnet. It can lead to failures in all protocols.

volovyks · 2024-11-26T21:57:36Z

PS: this is a modified dashboard. I will add it soon.

auto-mausx · 2024-11-27T19:09:37Z

I did notice that this started when we moved our node over. Because we destroyed our node and rebuilt it, it has technically been running for a shorter time than the others. I attributed the difference to that, so perhaps it comes from the way the metric is exported.

For clarity's sake, this node has exactly the same machine size, disk size, and networking configuration as the rest of the partner nodes. I mirrored the environment from Pagoda one-for-one to avoid any issues.

auto-mausx · 2024-11-27T19:19:21Z

Here's my theory:

This line of code increments that metric's count:

crate::metrics::PROTOCOL_ITER_CNT
    .with_label_values(&[my_account_id.as_str()])
    .inc();

I hypothesize that Grafana calculates the rate per hour (increase()) by dividing the total count by 60 minutes. Since our node is "newer" than the other nodes, there will be a significant difference between the total number of iterations on each of the other nodes and the total on this node. There are months of iterations on the other nodes, and we only have about 27 days' worth of iterations.

That is also the reason the other nodes are not exactly aligned with each other, since it took about a week for all of our partners to update.

volovyks · 2024-11-27T21:30:29Z

Let's see how it behaves after the release. I hope increase() means the number of new iterations that happened in the last hour.

auto-mausx · 2024-11-27T21:53:04Z

That is what the docs say it means, so maybe we do have an issue. I am not sure what that may be, though.

https://prometheus.io/docs/prometheus/latest/querying/functions/#increase
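
To make the documented meaning concrete, here is a minimal sketch of a windowed increase over a counter. It is not Prometheus's actual implementation: the `Sample` type, the `window_increase` function, and all the numbers are made up for illustration, and the real increase() additionally extrapolates to the edges of the time range. The sketch only shows that the result depends on the samples inside the window, not on the counter's lifetime total, and that a drop in the counter is treated as a reset.

```rust
/// One scraped sample of a counter: (unix timestamp in seconds, counter value).
/// Hypothetical type, used only for this illustration.
type Sample = (u64, f64);

/// Simplified windowed increase: sums the per-sample deltas that fall inside
/// [now - window_secs, now]. A decrease is treated as a counter reset, so the
/// new value itself is counted as the delta. Boundary extrapolation is omitted.
fn window_increase(samples: &[Sample], now: u64, window_secs: u64) -> f64 {
    let start = now.saturating_sub(window_secs);
    let in_window: Vec<f64> = samples
        .iter()
        .filter(|s| s.0 >= start && s.0 <= now)
        .map(|s| s.1)
        .collect();

    let mut total = 0.0;
    for pair in in_window.windows(2) {
        let (prev, cur) = (pair[0], pair[1]);
        total += if cur >= prev { cur - prev } else { cur };
    }
    total
}

fn main() {
    // A long-running node with a large lifetime total (made-up numbers).
    let old_node = [(0, 1_000_000.0), (1800, 1_001_800.0), (3600, 1_003_600.0)];
    // A recently rebuilt node with a much smaller lifetime total.
    let new_node = [(0, 10_000.0), (1800, 11_800.0), (3600, 13_600.0)];
    // A node that restarted inside the window, so its counter dropped.
    let restarted = [(0, 500.0), (1800, 700.0), (3600, 100.0)];

    println!("old node:  {}", window_increase(&old_node, 3600, 3600));
    println!("new node:  {}", window_increase(&new_node, 3600, 3600));
    println!("restarted: {}", window_increase(&restarted, 3600, 3600));
}
```

Under these simplified semantics the old and new nodes report the same hourly increase despite very different totals, so if the dashboard query really uses increase(), the node's shorter uptime alone would not explain the gap.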

volovyks added the Near BOS and Emerging Tech labels Nov 26, 2024

github-project-automation bot added this to Emerging Technologies Nov 26, 2024

github-project-automation bot moved this to Backlog in Emerging Technologies Nov 26, 2024

volovyks mentioned this issue Nov 26, 2024

Stability #818

Open

volovyks assigned kmaus-near and auto-mausx Nov 27, 2024
