A Python library for generating Time-Sorted Unique Identifiers (TSID) as defined in https://github.com/f4b6a3/tsid-creator.
This library is a port of the original Java code by Fabio Lima.
pip install tsidpy
The term TSID stands for (roughly) Time-Sorted ID. A TSID is a value formed from its creation time combined with a random value.
It brings together ideas from Twitter's Snowflake and the ULID spec.
In summary:
- Sorted by generation time.
- Can be stored as a 64-bit integer.
- Can be stored as a 13-character string.
- The string format is encoded in Crockford's base32.
- String format is URL safe, case insensitive and has no hyphens.
- Shorter than other unique identifiers, like UUID, ULID and KSUID.
A TSID has 2 components:
- A time component (42 bits), consisting of the elapsed milliseconds since 2020-01-01 00:00:00 UTC (this epoch can be configured).
- A random component (22 bits), containing 2 sub-parts:
  - A node identifier (can use 0 to 20 bits)
  - A counter (can use 2 to 22 bits)
Note: The counter length depends on the node identifier length.
For example, if we use 10 bits for the node representation:
- The counter is limited to 12 bits.
- The maximum node value is 2^10-1 = 1023.
- The maximum counter value is 2^12-1 = 4095, so the maximum number of TSIDs that can be generated per millisecond is 4096.
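The same arithmetic applies to any node length. The following is a minimal sketch in plain Python, assuming only the 22-bit random component described above; the helper name `tsid_limits` is illustrative and is not part of tsidpy.

```python
RANDOM_BITS = 22  # node bits + counter bits, as described above


def tsid_limits(node_bits: int) -> dict:
    """Derive the counter size and the maximum values for a node length."""
    if not 0 <= node_bits <= 20:
        raise ValueError("node_bits must be between 0 and 20")
    counter_bits = RANDOM_BITS - node_bits
    return {
        "counter_bits": counter_bits,
        "max_node": (1 << node_bits) - 1,
        "max_counter": (1 << counter_bits) - 1,
        "ids_per_ms": 1 << counter_bits,
    }


print(tsid_limits(10))
# {'counter_bits': 12, 'max_node': 1023, 'max_counter': 4095, 'ids_per_ms': 4096}
```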
This is the default TSID structure:
                                            adjustable
                                           <---------->
|------------------------------------------|----------|------------|
       time (msecs since 2020-01-01)           node      counter
                42 bits                       10 bits    12 bits
- time: 2^42 = ~69 to ~139 years with adjustable epoch (see notes below)
- node: up to 2^20 values with adjustable bits.
- counter: 2^2..2^22 with adjustable bits and randomized values every millisecond.
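To make the layout concrete, here is a rough sketch of how the three fields could be packed into, and recovered from, one 64-bit integer. It follows the diagram above; whether tsidpy arranges the bits in exactly this way internally is an assumption, and the names `pack` and `unpack` are hypothetical.

```python
TIME_BITS = 42
RANDOM_BITS = 22  # node bits + counter bits


def pack(time_ms: int, node: int, counter: int, node_bits: int = 10) -> int:
    """Combine time, node and counter into a single 64-bit integer."""
    counter_bits = RANDOM_BITS - node_bits
    if not 0 <= time_ms < (1 << TIME_BITS):
        raise ValueError("time does not fit in 42 bits")
    if not 0 <= node < (1 << node_bits):
        raise ValueError("node does not fit in node_bits")
    if not 0 <= counter < (1 << counter_bits):
        raise ValueError("counter does not fit in the remaining bits")
    return (time_ms << RANDOM_BITS) | (node << counter_bits) | counter


def unpack(number: int, node_bits: int = 10) -> tuple:
    """Split a 64-bit integer back into (time_ms, node, counter)."""
    counter_bits = RANDOM_BITS - node_bits
    time_ms = number >> RANDOM_BITS
    node = (number >> counter_bits) & ((1 << node_bits) - 1)
    counter = number & ((1 << counter_bits) - 1)
    return time_ms, node, counter


number = pack(time_ms=123456789, node=5, counter=42)
print(unpack(number))  # (123456789, 5, 42)
```

Because the time field occupies the most significant bits, comparing two such integers compares their creation times first, which is what keeps the identifiers sorted.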
Notes:
- The time component can be used for ~69 years if stored in a SIGNED 64-bit integer field (41 usable bits) or ~139 years if stored in an UNSIGNED 64-bit integer field (42 usable bits).
- By default, new TSID generators use 10 bits for the node identifier and 12 bits for the counter. It's possible to adjust the node identifier length to a value between 0 and 20.
- The time component can be 1 ms or more ahead of the system time when necessary to maintain monotonicity and generation speed.
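The ~69 and ~139 year figures can be checked with a quick calculation. This sketch assumes an average year of 365.25 days; `usable_years` is an illustrative helper, not a library function.

```python
MS_PER_YEAR = 365.25 * 24 * 60 * 60 * 1000  # average year, in milliseconds


def usable_years(time_bits: int) -> float:
    """Years that can be counted in milliseconds with the given bit width."""
    return (1 << time_bits) / MS_PER_YEAR


print(round(usable_years(41), 1))  # 69.7  (signed 64-bit field)
print(round(usable_years(42), 1))  # 139.4 (unsigned 64-bit field)
```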
The simplest way to avoid collisions is to make sure that each generator has an exclusive node ID.
The node ID can be passed to the TSIDGenerator constructor. If no node ID is passed, the generator will use a random value.
- The best UUID type for a database Primary Key
- The primary key dilemma: ID vs UUID and some practical solutions
- Primary keys in the DB - what to use? ID vs UUID or is there something else?
Related to the original library:
- FAQ wiki page
- Javadocs
- How to not use TSID factories
- The best way to generate a TSID entity identifier with JPA and Hibernate
Create a TSID:
from tsidpy import TSID
tsid: TSID = TSID.create()
Create a TSID as an int:
>>> TSID.create().number
432511671823499267
Create a TSID as a str:
>>> str(TSID.create())
'0C04Q2BR40003'
Create a TSID as a hexadecimal str:
>>> TSID.create().to_string('x')
'06009712f0400003'
Note: TSID generators are thread-safe.
The TSID::number property simply unwraps the internal int value of a TSID.
>>> from tsidpy import TSID
>>> TSID.create(432511671823499267).number
432511671823499267
Sequence of TSIDs:
38352658567418867
38352658567418868
38352658567418869
38352658567418870
38352658567418871
38352658567418872
38352658567418873
38352658567418874
38352658573940759 < millisecond changed
38352658573940760
38352658573940761
38352658573940762
38352658573940763
38352658573940764
38352658573940765
38352658573940766
         ^      ^ look
|--------|------|
   time   random
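The jump at the "millisecond changed" line can be made visible by splitting two of the sample values into their time and random parts. This is a sketch that assumes the 42/22-bit split described earlier; `split` is a hypothetical helper.

```python
RANDOM_BITS = 22


def split(number: int) -> tuple:
    """Return (time, random) for a 64-bit TSID number."""
    return number >> RANDOM_BITS, number & ((1 << RANDOM_BITS) - 1)


# Last value before and first value after the millisecond change
print(split(38352658567418874))  # (9143986360, 1725434)
print(split(38352658573940759))  # (9143986361, 4053015)
```

The time part advances by exactly one millisecond, while the random part restarts from a new value.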
The TSID::to_string() method encodes a TSID as a Crockford's base32 string. The returned string is 13 characters long.
>>> from tsidpy import TSID
>>> TSID.create().to_string()
'0C04Q2BR40004'
Or, alternatively:
>>> str(TSID.create())
'0C04Q2BR40004'
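For readers curious about what the canonical string contains, the following is a standalone sketch of Crockford's base32 applied to a 64-bit integer. It is not tsidpy's implementation: `encode` and `decode` are illustrative names, and the sketch skips Crockford's optional aliases (such as reading `O` as `0`).

```python
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"  # Crockford's base32: no I, L, O, U


def encode(number: int) -> str:
    """Encode a 64-bit integer as 13 base32 characters (5 bits each)."""
    if not 0 <= number < (1 << 64):
        raise ValueError("number must fit in 64 bits")
    chars = []
    for _ in range(13):
        chars.append(ALPHABET[number & 0x1F])  # take the lowest 5 bits
        number >>= 5
    return "".join(reversed(chars))


def decode(text: str) -> int:
    """Decode a 13-character string, ignoring letter case."""
    if len(text) != 13:
        raise ValueError("expected 13 characters")
    number = 0
    for char in text.upper():
        number = (number << 5) | ALPHABET.index(char)
    return number


print(encode(432511671823499267))  # 0C04Q2BR40003
print(decode("0c04q2br40003"))     # 432511671823499267
```

Thirteen characters carry 65 bits, so the first character only ever holds the top 4 bits of the number.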
Sequence of TSID strings:
01226N0640J7K
01226N0640J7M
01226N0640J7N
01226N0640J7P
01226N0640J7Q
01226N0640J7R
01226N0640J7S
01226N0640J7T
01226N0693HDA < millisecond changed
01226N0693HDB
01226N0693HDC
01226N0693HDD
01226N0693HDE
01226N0693HDF
01226N0693HDG
01226N0693HDH
        ^   ^ look
|-------|---|
   time random
The string format can be useful for languages that store numbers in IEEE 754 double-precision binary floating-point format, such as JavaScript, where integers are only exact up to 2^53.
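The precision problem is easy to reproduce in Python by forcing a TSID number through a `float`, which is an IEEE 754 double. The sample value is the one used earlier in this README.

```python
tsid_number = 432511671823499267  # sample value from this README

as_double = float(tsid_number)  # what a double-precision number can hold

print(tsid_number > 2**53)            # True
print(int(as_double))                 # 432511671823499264
print(int(as_double) == tsid_number)  # False
```

The last digits are lost in the conversion, so two different TSIDs could collapse into the same value. Passing the 13-character string instead avoids this.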
Create a TSID using the default generator:
from tsidpy import TSID
tsid: TSID = TSID.create()
Create a TSID from a canonical string (13 chars):
from tsidpy import TSID
tsid: TSID = TSID.from_string('0123456789ABC')
Convert a TSID into a canonical string in lower case:
>>> tsid.to_string('s')
'0123456789abc'
Get the creation timestamp of a TSID:
>>> tsid.timestamp
1680948418241.0 # datetime.datetime(2023, 4, 8, 12, 6, 58, 241000)
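As an illustration of where the timestamp comes from, the creation time can also be derived by hand from the integer value. This sketch assumes the default epoch (2020-01-01 00:00:00 UTC) and the 22-bit random component; `creation_time` is a hypothetical helper and not part of tsidpy.

```python
from datetime import datetime, timedelta, timezone

TSID_EPOCH = datetime(2020, 1, 1, tzinfo=timezone.utc)  # default epoch
RANDOM_BITS = 22


def creation_time(number: int, epoch: datetime = TSID_EPOCH) -> datetime:
    """Recover the creation time from the upper 42 bits of a TSID number."""
    elapsed_ms = number >> RANDOM_BITS
    return epoch + timedelta(milliseconds=elapsed_ms)


# Sample number from the sequence shown earlier
print(creation_time(38352658567418867).isoformat())
# 2020-04-15T19:59:46.360000+00:00
```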
Encode a TSID to base-62:
>>> tsid.to_string('z')
'0T5jFDIkmmy'
A TSIDGenerator that creates TSIDs similar to Twitter Snowflakes:
- Twitter snowflakes use 10 bits for node id: 5 bits for datacenter ID (max 31) and 5 bits for worker ID (max 31)
- Epoch starts on 2010-11-04T01:42:54.657Z
- Counter uses 12 bits and starts at 0 (max value: 4095, i.e. up to 4096 values per millisecond)
from datetime import datetime
from tsidpy import TSID, TSIDGenerator
datacenter: int = 1
worker: int = 1
node: int = datacenter << 5 | worker
# a trailing 'Z' is only accepted by fromisoformat() on Python 3.11+
epoch: datetime = datetime.fromisoformat('2010-11-04T01:42:54.657Z')
twitter_generator: TSIDGenerator = TSIDGenerator(node=node, node_bits=10,
                                                 epoch=epoch.timestamp() * 1000,
                                                 random_fn=lambda n: 0)
# use the generator
tsid: TSID = twitter_generator.create()
A TSIDGenerator that creates TSIDs similar to Discord Snowflakes:
- Discord snowflakes use 10 bits for node id: 5 bits for worker ID (max 31) and 5 bits for process ID (max 31)
- Epoch starts on 2015-01-01T00:00:00.000Z
- Counter uses 12 bits and starts at a random value.
from datetime import datetime
from tsidpy import TSID, TSIDGenerator
worker: int = 1
process: int = 1
node: int = worker << 5 | process
# a trailing 'Z' is only accepted by fromisoformat() on Python 3.11+
epoch: datetime = datetime.fromisoformat("2015-01-01T00:00:00.000Z")
discord_generator: TSIDGenerator = TSIDGenerator(node=node, node_bits=10,
                                                 epoch=epoch.timestamp() * 1000)
# use the generator
tsid: TSID = discord_generator.create()
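The two ingredients of these Snowflake-style generators, the epoch in milliseconds and the combined 10-bit node ID, can be computed without the library. This sketch uses integer arithmetic to avoid float rounding and does not depend on the Python 3.11 parsing of a trailing `Z`; `epoch_millis` and `snowflake_node` are illustrative names only.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_millis(moment: datetime) -> int:
    """Whole milliseconds between the Unix epoch and an aware datetime."""
    return (moment - UNIX_EPOCH) // timedelta(milliseconds=1)


def snowflake_node(high: int, low: int) -> int:
    """Pack two 5-bit IDs (each 0..31) into one 10-bit node ID."""
    if not (0 <= high <= 31 and 0 <= low <= 31):
        raise ValueError("each ID must fit in 5 bits")
    return (high << 5) | low


twitter_epoch = datetime(2010, 11, 4, 1, 42, 54, 657000, tzinfo=timezone.utc)
discord_epoch = datetime(2015, 1, 1, tzinfo=timezone.utc)

print(epoch_millis(twitter_epoch))  # 1288834974657
print(epoch_millis(discord_epoch))  # 1420070400000
print(snowflake_node(1, 1))         # 33
```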
Make TSID.create() use the previous Discord generator:
TSID.set_default_generator(discord_generator)
# at this point, you can use the default TSID.create()
tsid: TSID = TSID.create()
# or the generator
tsid: TSID = discord_generator.create()
When creating a TSIDGenerator, remember you can't use a node id greater than 2^node_bits - 1. For example, if you need to use a node id greater than 7, you need to use more than 3 bits for the node id:
from tsidpy import TSIDGenerator
gen0 = TSIDGenerator(node=0, node_bits=3) # ok
gen1 = TSIDGenerator(node=1, node_bits=3) # ok
...
gen7 = TSIDGenerator(node=7, node_bits=3) # ok
# error: can't represent 8 with 3 bits
gen8 = TSIDGenerator(node=8, node_bits=3)
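If you want to check a node id before constructing a generator, the smallest workable bit count follows directly from the rule above. This is a plain-Python sketch; `min_node_bits` and `node_fits` are hypothetical helpers, not tsidpy functions.

```python
def min_node_bits(node: int) -> int:
    """Smallest number of bits able to represent the given node id."""
    if node < 0:
        raise ValueError("node must be non-negative")
    return node.bit_length()


def node_fits(node: int, node_bits: int) -> bool:
    """True if node is within 0 .. 2^node_bits - 1."""
    return 0 <= node <= (1 << node_bits) - 1


print(node_fits(7, 3))      # True
print(node_fits(8, 3))      # False
print(min_node_bits(8))     # 4
print(min_node_bits(1023))  # 10
```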
Ports, forks and implementations:
Language | Name |
---|---|
Go | vishal-bihani/go-tsid |
Java | vladmihalcea/hypersistence-tsid |
Java | vincentdaogithub/tsid |
.NET | kgkoutis/TSID.Creator.NET |
PHP | odan/tsid |
Python | luismedel/tsid-python |
Rust | jakudlaty/tsid |
TypeScript | yubintw/tsid-ts |
Other OSS:
Language | Name |
---|---|
Java | fillumina/id-encryptor |
.NET | ullmark/hashids.net |
This library is Open Source software released under the MIT license.