
Something seems to go wrong with the groups feature when the server crashes #44

Open
for2gles opened this issue Mar 22, 2023 · 14 comments

Comments

@for2gles

for2gles commented Mar 22, 2023

Hi

I set
group concurrency: 1
global concurrency: 15
While one job was being processed, the server got restarted for some reason. The job seems to go back to waiting, but it never becomes active again. Even if I delete the stalled job and add it again with the same group id, the job doesn't run.

It seems that specific group id gets stuck and doesn't return to normal when the server crashes during processing.
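(For reference, a minimal sketch of the kind of setup described above; the queue name, job data and group id are placeholders, and it assumes BullMQ Pro's group options on the job and worker.)

```ts
// Sketch only: queue name, data and group id are hypothetical.
import { QueuePro, WorkerPro } from '@taskforcesh/bullmq-pro';

const connection = { host: 'localhost', port: 6379 };

const queue = new QueuePro('grouped-queue', { connection });

// Jobs are added to a group; the worker processes at most 1 job per group
// at a time, and at most 15 jobs in total (global concurrency).
const worker = new WorkerPro(
  'grouped-queue',
  async (job) => {
    // ... process the job
  },
  {
    connection,
    concurrency: 15,           // global concurrency
    group: { concurrency: 1 }, // per-group concurrency
  },
);

async function enqueue() {
  await queue.add('task', { payload: 42 }, { group: { id: 'group-a' } });
}

enqueue().catch(console.error);
```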

@for2gles
Author

When I check the group status, it returns maxed, but it's not processing at all.

@manast
Contributor

manast commented Mar 22, 2023

Which version of BullMQ Pro are you using?

@manast
Contributor

manast commented Mar 22, 2023

So to summarize your issue:

  1. A job belonging to a group G was being processed.
  2. The server was restarted while the job was being processed.
  3. The job was correctly moved back to wait, but the group is still in "maxed" status.
  4. No other job is being processed (active) in that group.

Can you confirm?

@for2gles
Author

Yes, that's right.
Let me explain the situation in detail:

  1. I'm using BullMQ Pro version 5.1.14.
  2. I'm also using standard BullMQ version 1.76.6 alongside it, for a legacy queue.
  3. So I also use QueueScheduler, and I still create a QueueScheduler for the old queue (I don't know whether that has any effect).
  4. The Node server runs in Docker and is restarted by CI/CD.
  5. The weird thing is that retrying does work when I run it locally and kill the Node pid (I don't use Docker locally): it keeps processing the next jobs properly. (I used a local Redis for that test.)

Please ask me anything if you need more info.
BTW it's almost 8 PM here, so I may reply tomorrow.
thank you 🙏

@manast
Contributor

manast commented Mar 22, 2023

Ok, an explanation for this behavior could be that the standard BullMQ (not Pro) is actually also using the new queues. For example, if a Pro worker crashes or is re-started, the standard BullMQ could move that job to wait, but since it does not know about groups, the group will stay at "maxed".
Also I wonder, why not upgrade to the latest BullMQ, or even better use BullMQ Pro for all queues? (With the newest version you do not need the QueueScheduler either, so it is easier.)
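(For context, a rough sketch of the legacy pattern referred to here, assuming BullMQ v1.x, where a QueueScheduler instance had to be created alongside each queue to handle delayed and stalled jobs; the queue name is hypothetical. Newer BullMQ versions no longer need this class.)

```ts
// Legacy BullMQ v1.x pattern: a QueueScheduler per queue (not needed in newer versions).
import { Queue, QueueScheduler, Worker } from 'bullmq';

const connection = { host: 'localhost', port: 6379 };

const legacyQueue = new Queue('legacy-queue', { connection });
const legacyScheduler = new QueueScheduler('legacy-queue', { connection });

const legacyWorker = new Worker(
  'legacy-queue',
  async (job) => {
    // ... legacy processing
  },
  { connection },
);
```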

@manast
Contributor

manast commented Mar 22, 2023

Another thing. By any chance, do you share a Redis connection between the standard and the Pro version?

@for2gles
Author

for2gles commented Mar 23, 2023

Ok, an explanation for this behavior could be that the standard BullMQ (not Pro) is actually also using the new queues. For example, if a Pro worker crashes or is re-started, the standard BullMQ could move that job to wait, but since it does not know about groups, the group will stay at "maxed".

I don't think so, because I completely separate the standard BullMQ queues/workers from the BullMQ Pro queues/workers.
So I don't think it's possible that any of these jobs were created by a standard BullMQ queue.

Also I wonder, why not upgrade to the latest BullMQ, or even better use BullMQ Pro for all queues? (With the newest version you do not need the QueueScheduler either, so it is easier.)

It's just that I don't want to change something that is already working well.

Another thing. By any chance, do you share a Redis connection between the standard and the Pro version?

No, I use separate connections. I don't know why, but I couldn't even start the Node server when I used the same connection.
This is my comment
BTW, do you know why it's not possible to use the same connection?

@manast
Contributor

manast commented Mar 23, 2023

BTW, do you know why it's not possible to use the same connection?

Because when BullMQ starts, it loads a bunch of Lua scripts with that connection, and I think that if you use two different versions with the same connection, the scripts get mixed up.

@manast
Contributor

manast commented Mar 23, 2023

By connection I mean an IORedis instance, btw.
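(A minimal sketch of keeping the two versions on separate connections, as discussed above: each library gets its own IORedis instance, so each loads its own Lua scripts independently. Queue names are hypothetical.)

```ts
import IORedis from 'ioredis';
import { Queue } from 'bullmq';                      // standard BullMQ (legacy queues)
import { QueuePro } from '@taskforcesh/bullmq-pro';  // BullMQ Pro (grouped queues)

// One dedicated IORedis instance per library version.
const standardConnection = new IORedis();
const proConnection = new IORedis();

const legacyQueue = new Queue('legacy-queue', { connection: standardConnection });
const groupedQueue = new QueuePro('grouped-queue', { connection: proConnection });
```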

@manast
Contributor

manast commented Mar 23, 2023

Also, I am releasing a "repairMaxedGroup" function to Pro and exposing it in Taskforce.sh so that you can fix the maxed groups manually. This should never happen, but at least if it happens now you can do something about it. We will need to investigate it further to discover the cause behind it.

@manast
Contributor

manast commented Mar 23, 2023

It is released now, please give it a try:


@for2gles
Author

for2gles commented Mar 24, 2023

Oh, thank you. It works!
BTW, is the function a heavy operation?

I just made a worker that runs every 10 minutes to check every BullMQ Pro queue, like below:

  1. Get the groups list for each BullMQ Pro queue
  2. Filter only the groups whose status is maxed
  3. Run the function that you released
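(A rough sketch of that periodic check. repairMaxedGroup is the function mentioned above; its exact signature on QueuePro, and the getGroups() call, are assumptions here rather than confirmed API, so check the Pro docs/typings.)

```ts
import { QueuePro } from '@taskforcesh/bullmq-pro';

// Hypothetical list of the Pro queues to check.
const allProQueues: QueuePro[] = [/* ... */];

async function repairMaxedGroups(queues: QueuePro[]) {
  for (const queue of queues) {
    // 1. Get the groups list for this queue.
    // NOTE: getGroups() and repairMaxedGroup() are assumed names; the casts to
    // `any` are only because this is an unchecked sketch -- use the real typings.
    const groups: Array<{ id: string; status: string }> = await (queue as any).getGroups();

    // 2. Keep only the groups whose status is "maxed".
    const maxed = groups.filter((group) => group.status === 'maxed');

    // 3. Run the repair function for each maxed group (signature assumed).
    for (const group of maxed) {
      await (queue as any).repairMaxedGroup(group.id);
    }
  }
}

// Run every 10 minutes.
setInterval(() => repairMaxedGroups(allProQueues).catch(console.error), 10 * 60 * 1000);
```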

@manast
Contributor

manast commented Mar 24, 2023

The function is not designed to be used frequently, as this issue should never happen :) If you are able to reproduce this issue frequently, then please provide some code that reproduces it and we will fix it instead.

@for2gles
Author

OK, thank you.
