Skip to content
This repository has been archived by the owner on Sep 26, 2023. It is now read-only.

Work On A Way To Avoid Blocking For 1ms When Sequence Wraps #2

Open
nathankleyn opened this issue Apr 2, 2015 · 6 comments
Open

Work On A Way To Avoid Blocking For 1ms When Sequence Wraps #2

nathankleyn opened this issue Apr 2, 2015 · 6 comments

Comments

@nathankleyn
Copy link
Contributor

Inspired by a discussion with @dubek, I would like to keep investigating any ways we can possibly avoid blocking ID generation on a host for 1ms when the sequence wraps. Here's an excerpt of the conversation I had with @dubek:

We spent a good long while trying various ways to avoid that problem, but it gets really difficult due to restrictions in Redis/Lua scripting.

We have to block until at least the next millisecond as if you generate an ID with a sequence of 4095 and immediately after (and in the same millisecond) one with 0, the one with 0 would be ordered after the one with 4095 and that breaks our k-ordering guarantees (which basically say that we'll guarantee the IDs will be in exact order by time generated, and rough order across many nodes as the logical shard ID could be anything).

Ideally, you'd be able to just block for as long as it takes to make it until the next millisecond, but in Redis Lua scripts there's no way to tell when a millisecond comes (the purity restrictions of the Redis Lua scripts say that TIME, as a command that returns an unpredictable value, must be the last thing you call). So we have to block for at least 1ms even if we only need to block for, maybe, 100ns - which kind of sucks, but isn't the end of the world.

The first thought we had was that if we were to roll the sequence back to 0 on every millisecond that would at least allow us to generate a few more IDs per millisecond before blocking - but as it turns out, this is really difficult. The expiry of the keys in Redis isn't rounded to the millisecond like we thought - instead it can happen part of the way through a millisecond. Therefore, we will still end up wrapping part way through a millisecond and come up against the same ordering restrictions we had without the expiry (which is why we decided in the end to just drop it and keep it safe but slower for now).

Best case scenario would be that Redis added a new command that allowed us to set the expiry and check the TTL remaining for a key in nanoseconds instead of milliseconds - after (not much) pondering about it, I think that allow us to tell how far through the millisecond we are and would allow us to expire the sequence at exactly when a millisecond starts/ends.

@dubek
Copy link

dubek commented Apr 2, 2015

Another command that can be useful is "block until a key is expired" (= "block while a key exists")? That will replace your original EXISTS call at the top of the Lua script.

I looked at the Instragram solution and it seems to me that:

  1. They just wrap the sequence numbers, which means that: (a) they will produce duplicate IDs if they generate more than 1024 IDs in the same millisecond. (b) even if not reaching duplicates, they lose the "ordered within one shard" property.
  2. It looks like they generate the current time milliseconds by calling clock_timestamp() and multiplying by 1000, which sounds problematic to me (I'm no Postgres expert; maybe I'm mis-reading).

So in that sense I don't think we can learn from their solution.

@dubek
Copy link

dubek commented Apr 2, 2015

Hmm, but if we "wait until expired" and there's only one script that executes at once, then we'll have a deadlock. Scratch that.

@philmes
Copy link

philmes commented Apr 3, 2015

It's worth asking if this is actually an issue under real world conditions. A quick benchmark suggests that, on a relatively powerful workstation (4Ghz i7), I can generate about 50 ids per millisecond using the default Jedis implementation. That's some way short of causing a wraparound.

@dubek
Copy link

dubek commented Apr 3, 2015

@philmes seuqences keep increasing even as time advances. So let's say you produce 20 new IDs each millisecond: The first IDs will have sequences 0-19, then on the next millisecond the sequence part will be 20-39, and so on. At some point you're going to hit the 4095 limit, which will start returning errors for one whole millisecond.

@philmes
Copy link

philmes commented Apr 3, 2015

Ah, yes... and because of the purity constraints, it's not possible to check to see if the sequence should be reset and act accordingly.

@nathankleyn
Copy link
Contributor Author

It is very likely that the new Redis v4 modules may be the solution to this. https://redis.io/topics/modules-intro

We may at some point look at converting Icicle to a Redis module for Redis > 4.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants