feat: concurrent commit changes to PartitionWriter
#2997
base: main
Conversation
ACTION NEEDED: delta-rs follows the Conventional Commits specification for release automation. The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.
Codecov Report
Attention: Patch coverage is

Additional details and impacted files:

```
@@            Coverage Diff            @@
##             main    #2997     +/-   ##
=========================================
  Coverage   72.35%   72.36%
=========================================
  Files         128      128
  Lines       40467    40464       -3
  Branches    40467    40464       -3
=========================================
- Hits        29281    29280       -1
  Misses       9308     9308
+ Partials     1878     1876       -2
```

☔ View full report in Codecov by Sentry.
What is the motivation behind this change? Can you give more details about what you are trying to accomplish here?
I am building an ingest from a third-party Kafka-like piece of software. Essentially, this just expands the default behavior to allow a multi-ingest system that concurrently updates a table, instead of needing to reopen a new writer every time you want to commit the newly updated data. It doesn't break previous versions, and it adds a new feature for future development.
You have to open a new one every time because you shouldn't be committing partitions concurrently, even with a locking service or mechanism. I do not think this is correct in that case, because if you are sharing a partition writer between commits then you could end up putting add actions in the wrong commits. They are meant to be ordered and conflict-free, which I don't know that this would accomplish.
When I say concurrent, I mean a multi-system setup; I don't actually need the locking mechanism. If I remove the locking mechanism, will that resolve your concern so the PR can be merged?
There wouldn't be anything left in your PR if you removed the locking mechanism here, right?
This would still be left:

```rust
pub async fn drain_files_written(&self) -> Vec<Add> {
    self.files_written.write().await.drain(..).collect()
}
```
added `drain_files_written()` in order to do concurrent updates while writing

Signed-off-by: dan-j-d <[email protected]>
Signed-off-by: stretchadito <[email protected]>
@hntd187 I have removed the lock implementation.
```rust
let mut actions = Vec::new();
for (_, writer) in writers {
    let writer_actions = writer.close().await?;
    actions.extend(writer_actions);
}
```
@Dan-J-D I'm not sure I understand the motivation of this change. I believe @alexwilcoxson-rel or @wjones127 introduced the original code above to ensure that the process running the writer would be able to concurrently produce the `Vec<Action>`.
Based on my understanding of this change, it walks back that performance improvement, but I cannot figure out for what gain 🤔
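For reference, a minimal sketch of the concurrent pattern the reviewer is describing, using `futures::future::try_join_all`; the `Writer` and error types below are stand-ins for illustration, not the actual delta-rs code under discussion:

```rust
use futures::future::try_join_all;

// Stand-ins for delta-rs internals, for illustration only.
struct Add;
struct Writer;
#[derive(Debug)]
struct WriteError;

impl Writer {
    // In the real code this flushes remaining buffers and returns the
    // Add actions for the files this writer produced.
    async fn close(self) -> Result<Vec<Add>, WriteError> {
        Ok(Vec::new())
    }
}

// Close every writer concurrently and flatten the results, instead of
// awaiting each close() in a sequential loop as in the diff above.
async fn close_all(writers: Vec<Writer>) -> Result<Vec<Add>, WriteError> {
    let per_writer = try_join_all(writers.into_iter().map(Writer::close)).await?;
    Ok(per_writer.into_iter().flatten().collect())
}
```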
I was on an older version of the library and just copy-pasted my local copy into GitHub, but there were newer changes already made, so I added them back.
```diff
@@ -427,10 +419,15 @@ impl PartitionWriter {
         Ok(())
     }

+    /// Retrieves the list of [Add] actions associated with the files written.
+    pub async fn drain_files_written(&mut self) -> Vec<Add> {
```
Is the goal for this function to allow data files (`.parquet`) to be written without the process committing to the Delta table? I have read the discourse with @hntd187 but I'm still struggling to understand how this public API would be used.
So let's say a server has written a lot of data into the table but hasn't committed the additional files yet, and there is a power outage. The files will be in storage but not in the metadata. This can be used to prevent that data loss: while writing data, you also periodically create commits of the newly added files.
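As a hypothetical illustration of that usage, a background task could drain and commit on a timer while the writer stays open; `commit_adds`, `WriterSketch`, and the one-minute interval below are assumptions for the sketch, not delta-rs APIs:

```rust
use std::time::Duration;
use tokio::sync::RwLock;

// Stand-ins for delta-rs types, for illustration only.
struct Add;
#[derive(Debug)]
struct CommitError;

struct WriterSketch {
    // Files flushed to storage but not yet committed to the Delta log.
    files_written: RwLock<Vec<Add>>,
}

impl WriterSketch {
    async fn drain_files_written(&self) -> Vec<Add> {
        self.files_written.write().await.drain(..).collect()
    }
}

// Hypothetical commit path: the real application would create a Delta
// commit containing these Add actions (with conflict resolution as needed).
async fn commit_adds(_adds: Vec<Add>) -> Result<(), CommitError> {
    Ok(())
}

// Drain and commit pending files once a minute while the writer stays open,
// so a crash loses at most one interval's worth of uncommitted files.
async fn checkpoint_loop(writer: &WriterSketch) -> Result<(), CommitError> {
    let mut ticker = tokio::time::interval(Duration::from_secs(60));
    loop {
        ticker.tick().await;
        let adds = writer.drain_files_written().await;
        if !adds.is_empty() {
            commit_adds(adds).await?;
        }
    }
}
```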
Would something be calling drain periodically while writes are still happening then? What would be the difference vs closing the writer periodically and creating a new one?
It saves the cost of reallocation and re-initialization. Also, if you are closing after a certain time frame, it is possible to create a lot of small files which aren't optimized, which will then need an OPTIMIZE.
> It saves the cost of reallocation and re-initialization.

Understood; I don't know how costly it is to create a new writer.

> if you are closing after a certain time frame, it is possible to create a lot of small files which aren't optimized

As you write over time, parquet is flushed to the object store at a default size of 100MB files, I think. Then close flushes the rest and returns the actions. Is this just trying to avoid flushing any buffered small data to the object store, then?

Is your writer open for a long period of time, constantly receiving data?
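To make that flush behavior concrete, here is a rough sketch of size-threshold flushing; the 100MB figure comes from the comment above, and all names and details are illustrative assumptions, not the actual delta-rs defaults or internals:

```rust
// Illustrative target size; the comment above says ~100MB, but the real
// delta-rs default may differ.
const TARGET_FILE_SIZE: usize = 100 * 1024 * 1024;

struct BufferedWriter {
    buffer: Vec<u8>,            // stand-in for buffered Arrow/parquet data
    files_written: Vec<String>, // paths of files flushed but not committed
}

impl BufferedWriter {
    // Buffer incoming bytes and flush a file whenever the target size is hit.
    fn write(&mut self, bytes: &[u8]) {
        self.buffer.extend_from_slice(bytes);
        if self.buffer.len() >= TARGET_FILE_SIZE {
            self.flush();
        }
    }

    // Upload the buffered data as one parquet file and record its path.
    fn flush(&mut self) {
        if self.buffer.is_empty() {
            return;
        }
        let path = format!("part-{:05}.parquet", self.files_written.len());
        // ... upload self.buffer to object storage at `path` ...
        self.buffer.clear();
        self.files_written.push(path);
    }

    // close() flushes whatever remains, which is why frequent close/reopen
    // cycles on a slow stream tend to produce many small files.
    fn close(mut self) -> Vec<String> {
        self.flush();
        self.files_written
    }
}
```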
Yes, I made an alternative to delta-kafka-ingest for my specific use case and it will be receiving data all day.
Description
Added `drain_files_written()` to `PartitionWriter` to drain the not-yet-committed changes.

Related Issue(s)
None

Documentation
None