Auto-partitioning built-in feature #62
pg_partman can help with automating table-based partition management, but yes, having user-configurable auto-partitioning is something we would like to add to the storage layer as well.
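For reference, a minimal sketch of what pg_partman-driven time partitioning looks like, assuming pg_partman 4.x with native declarative partitioning (the `events` table and column names are hypothetical, not from this thread):

```sql
-- Hypothetical table partitioned by a created_at timestamp.
CREATE TABLE public.events (
    id         bigint NOT NULL,
    created_at timestamptz NOT NULL,
    payload    jsonb
) PARTITION BY RANGE (created_at);

CREATE SCHEMA partman;
CREATE EXTENSION pg_partman SCHEMA partman;

-- pg_partman 4.x: create daily native partitions; run partman.run_maintenance()
-- periodically (or enable the background worker) to pre-create future partitions.
SELECT partman.create_parent(
    p_parent_table := 'public.events',
    p_control      := 'created_at',
    p_type         := 'native',
    p_interval     := 'daily'
);
```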
Given that queries on Hydra as-is cannot cope with multiple concurrent queries in a timely fashion, and that TimescaleDB released version 2.11 with support for UPDATE/DELETE on compressed data, I have started comparing the two again from the point of view of reading the data. The reason I am writing this on this issue is that I believe another setting in my TimescaleDB setup contributes a lot to its good read performance: space partitioning. In the setup above I configured 10 tenants, which divide the customers of each table into 10 slices, so a query for a given client only has to hit 1/10 of the total chunks. In conclusion, I believe the auto-partitioning feature requested here might also help a lot with the query performance from the related issue; if a simple division applied (probably not), 1/10 × 5.5 seconds would give roughly 0.55 seconds, comparable to TimescaleDB.
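For context, a sketch of the kind of TimescaleDB space-partitioning setup described above (table and column names are hypothetical):

```sql
CREATE TABLE metrics (
    time      timestamptz NOT NULL,
    tenant_id int         NOT NULL,
    value     double precision
);

-- Time dimension first, then a hash "space" dimension on tenant_id with
-- 10 slices, so a single tenant's query touches roughly 1/10 of the chunks
-- for a given time range.
SELECT create_hypertable('metrics', 'time');
SELECT add_dimension('metrics', 'tenant_id', number_partitions => 10);
```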
In the near term, native Postgres partitioning is the way to do this for Hydra. We don't plan to add this as a primitive to the storage layer for quite some time. For "partition this into X slices by ID", you can use hash partitioning (PARTITION BY HASH). It's just a matter of writing the SQL, which is a bit repetitive but fortunately only needs to be done once. Here's an example:
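(The example from the original comment did not survive extraction; below is a sketch of what hash-partitioning into 10 slices by ID might look like, with hypothetical table and column names, assuming each partition's access method is chosen with USING columnar.)

```sql
-- Hypothetical parent table, hash-partitioned on customer_id into 10 slices.
CREATE TABLE events (
    customer_id bigint NOT NULL,
    created_at  timestamptz NOT NULL,
    payload     jsonb
) PARTITION BY HASH (customer_id);

-- One partition per slice; the columnar access method is picked per partition
-- (omit USING columnar if columnar is already your default_table_access_method).
CREATE TABLE events_p0 PARTITION OF events
    FOR VALUES WITH (MODULUS 10, REMAINDER 0) USING columnar;
CREATE TABLE events_p1 PARTITION OF events
    FOR VALUES WITH (MODULUS 10, REMAINDER 1) USING columnar;
-- ... repeat through REMAINDER 9.
```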
The thing Timescale does here that's pretty tricky to replicate is partitioning by both time and a second dimension; in this case, it would be like partitioning by time first, and then by hash on ID. This is possible to do with native partitioning, but it gets tricky to manage as there are many partitions to create, and I'm not aware of pg_partman helping with this case.
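A sketch of that two-level layout with native partitioning (hypothetical names, separate from the sketch above): each time-range partition is itself partitioned by hash, so every new time range needs another 10 sub-partitions.

```sql
-- Range on time first...
CREATE TABLE events_by_time (
    customer_id bigint NOT NULL,
    created_at  timestamptz NOT NULL,
    payload     jsonb
) PARTITION BY RANGE (created_at);

-- ...then each monthly partition is itself partitioned by hash on customer_id.
CREATE TABLE events_2023_07 PARTITION OF events_by_time
    FOR VALUES FROM ('2023-07-01') TO ('2023-08-01')
    PARTITION BY HASH (customer_id);

-- Only the leaf partitions hold data.
CREATE TABLE events_2023_07_p0 PARTITION OF events_2023_07
    FOR VALUES WITH (MODULUS 10, REMAINDER 0) USING columnar;
-- Repeat for REMAINDER 1..9, and again for every new month, which is why
-- the number of partitions to create and manage grows quickly.
```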
So if I define the 10-tenant manual partitioning on a columnar table and then insert the data via ETL into the main table every ~5 minutes, will the columnar stripes/chunks follow the rules and actually split the batch data into 10 chunks behind it?
Yes to both. The only caveat I can think of is that
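One way to sanity-check that routing after an ETL batch lands in the parent table (a hypothetical query against the sketch tables above, not from the thread):

```sql
-- Each inserted row is routed to a partition by the hash of customer_id;
-- this shows how many rows of the batch landed in each of the 10 partitions.
SELECT tableoid::regclass AS partition, count(*) AS row_count
FROM events
GROUP BY 1
ORDER BY 1;
```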
Hi,
Timescaledb has built-in auto-partitioning on 1 or 2 dimensions (time and tenant/deviceid/etc.).
Your documentation states that partitioning should be used for time-based data, but it can still only be done manually.
It would be great if you implemented something for auto-creating partitions based on a configured interval and/or a secondary criterion. This would help a lot with the GDPR-cleanup use case, where dropping older data would be a lot faster/easier.
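For example (a sketch with hypothetical partition names from above), retiring expired data from a time-partitioned table is a cheap partition drop rather than a slow DELETE:

```sql
-- Expiring a whole month for retention/GDPR purposes; detaching first is
-- optional (e.g. to archive the data before removal).
ALTER TABLE events_by_time DETACH PARTITION events_2023_07;
DROP TABLE events_2023_07;
```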