Support range partitioning #75

Open
coryfoo opened this issue Feb 9, 2015 · 1 comment

coryfoo commented Feb 9, 2015

Due to specific customer demands, we need the ability to partition using pre-defined ranges of values. For us, it is long values, but I suppose for others it could be alphabetical ranges, or something else entirely. A couple of issues immediately come to mind with this problem:

  1. AFAIK, the addition of more shards after the initial process is not supported. This would likely need to change, as pre-determining all the ranges would be impractical in many scenarios.
  2. One could imagine a scenario where multiple ranges might want to map to the same shard (see the sketch after this list). I think this is supported in the table metadata structure, but if not, then this could be an issue, too.
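To make point 2 concrete, here is a minimal sketch in plain Python of what a range-to-shard lookup could look like if several disjoint ranges were allowed to resolve to the same shard. All names here are hypothetical illustrations, not pg_shard code or metadata.

```python
# Hypothetical sketch (not pg_shard's actual metadata API): a minimal
# range-to-shard map over long keys, where several disjoint ranges may
# point at the same shard id.

from bisect import bisect_right

class RangeShardMap:
    def __init__(self):
        # Each entry: (min_value, max_value, shard_id); ranges are
        # half-open [min_value, max_value) and kept sorted by min_value.
        self._ranges = []

    def add_range(self, min_value, max_value, shard_id):
        if min_value >= max_value:
            raise ValueError("empty range")
        self._ranges.append((min_value, max_value, shard_id))
        self._ranges.sort(key=lambda r: r[0])

    def shard_for(self, key):
        # Find the last range whose lower bound is <= key.
        idx = bisect_right([r[0] for r in self._ranges], key) - 1
        if idx >= 0:
            lo, hi, shard_id = self._ranges[idx]
            if lo <= key < hi:
                return shard_id
        return None  # no shard covers this key

# Two disjoint ranges deliberately mapped to the same shard (point 2 above).
m = RangeShardMap()
m.add_range(0, 1_000, "shard_1")
m.add_range(1_000, 2_000, "shard_2")
m.add_range(5_000, 6_000, "shard_1")
assert m.shard_for(500) == "shard_1"
assert m.shard_for(5_500) == "shard_1"
assert m.shard_for(3_000) is None
```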
jasonmp85 changed the title from "Support Range Partitioning" to "Support range partitioning" on Feb 28, 2015
jasonmp85 (Collaborator) commented:

Range partitioning is commonly brought up in the context of time-series data. Specifically, users often wish to define a partition "width" and add new partitions for the "current" data as time progresses (i.e. there is no need to predefine the entire range of time). Another variant is to use a special bucket capped at "infinity" and change its upper bound to something reasonable once it has accumulated a certain amount of data.
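A rough sketch of that pattern in Python (the names, epoch, and width are illustrative assumptions, not anything pg_shard provides today):

```python
# Hypothetical sketch of the time-series pattern described above:
# fixed-width partitions created lazily as time advances, plus an
# open-ended "to infinity" bucket whose upper bound is tightened later.

from datetime import datetime, timedelta, timezone

PARTITION_WIDTH = timedelta(days=7)                      # assumed width
EPOCH = datetime(2015, 1, 1, tzinfo=timezone.utc)        # assumed origin

def partition_bounds(ts):
    """Return the [lower, upper) bounds of the fixed-width partition covering ts."""
    n = (ts - EPOCH) // PARTITION_WIDTH
    lower = EPOCH + n * PARTITION_WIDTH
    return lower, lower + PARTITION_WIDTH

def ensure_partition(existing, ts):
    """Create the partition for ts only when first needed (no predefined range)."""
    bounds = partition_bounds(ts)
    if bounds not in existing:
        existing.add(bounds)   # stand-in for "create shard/partition" DDL
    return bounds

partitions = set()
ensure_partition(partitions, datetime(2015, 2, 9, tzinfo=timezone.utc))
ensure_partition(partitions, datetime(2015, 2, 28, tzinfo=timezone.utc))
print(sorted(partitions))

# Variant: an open-ended bucket capped at "infinity"...
INFINITY = datetime.max.replace(tzinfo=timezone.utc)
open_bucket = (datetime(2015, 3, 1, tzinfo=timezone.utc), INFINITY)
# ...whose upper bound is later lowered once it holds enough data, with a
# new open-ended bucket taking over from that point.
open_bucket = (open_bucket[0], datetime(2015, 4, 1, tzinfo=timezone.utc))
```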

> the addition of more shards after the initial process is not supported

With long values, that pattern doesn't seem to apply. How would you add shards? Presumably, if the data can take on any long value, you'll need shards covering every possible value from the start, so the concept of "adding" shards doesn't really apply. Would you need to "split" an existing shard instead?
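For concreteness, a splitting-based approach might look roughly like this. This is purely illustrative (pg_shard has no such operation today), and the names are hypothetical:

```python
# Hypothetical sketch of "splitting" a shard over long values: one
# covering range is replaced by two sub-ranges at a chosen split point.

def split_range(ranges, shard_id, split_point, new_shard_id):
    """Replace (lo, hi, shard_id) with (lo, split, shard_id) and (split, hi, new_shard_id)."""
    for i, (lo, hi, sid) in enumerate(ranges):
        if sid == shard_id and lo < split_point < hi:
            ranges[i:i + 1] = [(lo, split_point, sid), (split_point, hi, new_shard_id)]
            return
    raise ValueError("split point not inside the named shard's range")

# Start with a single shard covering the whole signed 64-bit space,
# then split it as data accumulates.
ranges = [(-2**63, 2**63 - 1, "shard_1")]
split_range(ranges, "shard_1", 0, "shard_2")
print(ranges)  # shard_1 keeps the negative half, shard_2 takes the rest
```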

> One could imagine a scenario where multiple ranges might want to map to the same shard

As for a scenario where "multiple ranges map to the same shard"… can you elaborate? Do you mean a node will have multiple ranges sitting on it (that's already supported), or is it crucial that a single shard cover multiple ranges? Keep in mind that pg_shard places many distinct shards on each node, so it's already easy to have two shards covering different ranges on the same node.
