[Suggestion] Move jobs between queues, change the job body, custom metadata for a better failure handling #174

Open
Revisor opened this issue Feb 28, 2016 · 2 comments


Revisor commented Feb 28, 2016

Hi,
this suggestion is connected to #170 in that both concern the handling of failed jobs.

I would like to handle failed jobs as follows:

  • NACK the failed job with an ever-growing delay ([Request] NACK with a delay #170)
  • If the number of retries is higher than X, move the job to a failure queue (dead letter queue) with a new TTL, so that it can be inspected manually and acted upon

Neither of these actions is possible in Disque right now, and the workaround (adding a new, copied job) loses the job ID as well as the NACK and additional-deliveries counters.
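
As an illustration only, the handling policy described above could be sketched in Python like this. Nothing here is Disque API: `next_action`, `Action`, and the constants `BASE_DELAY`, `MAX_DELAY` and `MAX_RETRIES` (the "X" above) are hypothetical names with assumed values, and exponential backoff is just one possible reading of "ever-growing delay".

```python
from dataclasses import dataclass

# All values below are assumptions chosen for the example.
BASE_DELAY = 5      # seconds before the first retry
MAX_DELAY = 3600    # upper bound on the delay, in seconds
MAX_RETRIES = 5     # the "X" threshold from the text


@dataclass(frozen=True)
class Action:
    kind: str       # "nack" or "dead_letter"
    delay: int = 0  # seconds before redelivery (only meaningful for "nack")


def next_action(retries: int) -> Action:
    """Decide what to do with a job that has now failed `retries` times."""
    if retries > MAX_RETRIES:
        # Too many attempts: hand the job to the failure queue.
        return Action("dead_letter")
    # Double the delay on every retry, capped at MAX_DELAY.
    delay = min(BASE_DELAY * 2 ** max(retries - 1, 0), MAX_DELAY)
    return Action("nack", delay)
```

A worker would call `next_action` with the job's retry count after a failure and then either NACK with the returned delay or move the job to the failure queue, which are exactly the two operations that are missing today.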

That's why I would like to propose four enhancements (proposals 3. and 4. are different solutions of the same problem):

  1. Allow NACKing a job with a delay ([Request] NACK with a delay #170)
  2. Allow moving a job to a different queue with a new TTL
  3. Allow callers to change the job body
  4. OR even better, if feasible: Implement custom job metadata support, like NACKs and additional-deliveries but user-defined and mutable

Ad 3. We use the job body to store job metadata. We use this metadata to work around the missing features 1. and 2.: we store the original job ID as well as the total number of retries there. It could also be helpful to save, e.g., the exact time and reason the job failed. This requires changing the existing job body.
Supporting custom, mutable job metadata as a first-class citizen in Disque would be even better.
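
For readers unfamiliar with this kind of workaround, here is a minimal sketch of a job body that carries its own metadata. The JSON envelope layout and the names `wrap` and `record_failure` are assumptions made for the example; the actual format used in the original setup is not described in this thread.

```python
import json
import time


def wrap(payload, original_id=None, retries=0):
    """Build a job body: the real payload plus a metadata envelope."""
    meta = {"original_id": original_id, "retries": retries, "failures": []}
    return json.dumps({"meta": meta, "payload": payload})


def record_failure(body, job_id, reason, now=None):
    """Return a new body with one more failure recorded in the metadata."""
    envelope = json.loads(body)
    meta = envelope["meta"]
    if meta["original_id"] is None:
        # Remember the first ID, since a copied job gets a new one.
        meta["original_id"] = job_id
    meta["retries"] += 1
    failed_at = now if now is not None else time.time()
    meta["failures"].append({"at": failed_at, "reason": reason})
    return json.dumps(envelope)
```

Because the body of an existing job cannot be changed, the updated body returned by `record_failure` has to be added as a new job, which is where the original ID and the built-in counters get lost.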

The point of all these suggestions is to keep the ID of a job intact throughout its lifetime while allowing for more complex handling (delayed NACKing, moving between queues, storing extra details).

What do you think? Are the suggestions too complex? Are they useful?

@mathieulongtin

I kind of like the BURY and KICK commands of Beanstalkd for that. When a job
is problematic, you bury it, it stays in the queue but is never
distributed. If you fix the problem, you can kick it and it will be
distributed again.

https://github.com/kr/beanstalkd/blob/v1.3/doc/protocol.txt
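
A toy in-memory model of the bury/kick idea, in Python, just to show the state transitions. `TinyQueue` is a hypothetical class written for this example; it is neither Beanstalkd's nor Disque's implementation and ignores priorities, TTLs and timeouts.

```python
from collections import deque


class TinyQueue:
    """Toy queue with a 'ready' list and a 'buried' list."""

    def __init__(self):
        self.ready = deque()
        self.buried = deque()

    def put(self, job):
        self.ready.append(job)

    def reserve(self):
        """Hand out the next ready job, or None if nothing is ready."""
        return self.ready.popleft() if self.ready else None

    def bury(self, job):
        """Park a problematic job; it is kept but never distributed."""
        self.buried.append(job)

    def kick(self, bound):
        """Move up to `bound` buried jobs back to ready; return how many moved."""
        kicked = 0
        while self.buried and kicked < bound:
            self.ready.append(self.buried.popleft())
            kicked += 1
        return kicked
```

The job object stays the same through bury and kick, so its identity survives the failure handling.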

Another option for Disque would be to stay pretty bare-bones but allow Lua
functions to be loaded for customized behaviour like you're describing. For
example, a queue might have a Lua callback on NACK that sets the retry
time or, if too many retries have been done, pushes the job elsewhere.

@misiek08

Lua callbacks sound just sexy. They would allow practically unlimited features to be added.
If the Lua callback implementation supported multiple callbacks or a callback chain (each callback calling the next one, passed to it as an argument), that would be really great.
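
A rough sketch of what such a callback chain could look like, written in Python as a stand-in for Lua. Everything here is hypothetical: Disque has no such hook, and `build_chain`, `dead_letter_after`, `backoff`, the job dict with a `nacks` field and the returned action strings are all invented for the example.

```python
def build_chain(callbacks, default):
    """Compose callbacks so each one receives the job and the next handler."""
    def link(callback, nxt):
        return lambda job: callback(job, nxt)

    handler = default
    for callback in reversed(callbacks):
        handler = link(callback, handler)
    return handler


def dead_letter_after(limit):
    """Callback factory: divert jobs that were NACKed more than `limit` times."""
    def callback(job, nxt):
        if job["nacks"] > limit:
            return "move:failed"
        return nxt(job)  # not our case, pass the job down the chain
    return callback


def backoff(job, nxt):
    """Callback: always answer with a delay that doubles per NACK."""
    return "retry_in:" + str(2 ** job["nacks"])


on_nack = build_chain([dead_letter_after(3), backoff],
                      default=lambda job: "requeue")
```

With this chain, `on_nack({"nacks": 2})` yields a retry delay, while a job with more than three NACKs is diverted before `backoff` ever sees it.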
