feat: concurrent commit changes to PartitionWriter
#2997
base: main
Changes from 5 commits
c3e01a2
a5e132b
03506c4
e8c02d7
e3f8b53
4d0b957
@@ -6,12 +6,12 @@ use arrow_array::RecordBatch;
 use arrow_schema::{ArrowError, SchemaRef as ArrowSchemaRef};
 use bytes::Bytes;
 use delta_kernel::expressions::Scalar;
-use futures::{StreamExt, TryStreamExt};
 use indexmap::IndexMap;
 use object_store::{path::Path, ObjectStore};
 use parquet::arrow::ArrowWriter;
 use parquet::basic::Compression;
 use parquet::file::properties::WriterProperties;
 use tokio::sync::RwLock;
 use tracing::debug;

 use crate::crate_version;
@@ -218,18 +218,11 @@ impl DeltaWriter {
     /// This will flush all remaining data.
     pub async fn close(mut self) -> DeltaResult<Vec<Add>> {
         let writers = std::mem::take(&mut self.partition_writers);
-        let actions = futures::stream::iter(writers)
-            .map(|(_, writer)| async move {
-                let writer_actions = writer.close().await?;
-                Ok::<_, DeltaTableError>(writer_actions)
-            })
-            .buffered(num_cpus::get())
-            .try_fold(Vec::new(), |mut acc, actions| {
-                acc.extend(actions);
-                futures::future::ready(Ok(acc))
-            })
-            .await?;
-
+        let mut actions = Vec::new();
+        for (_, writer) in writers {
+            let writer_actions = writer.close().await?;
+            actions.extend(writer_actions);
+        }
         Ok(actions)
     }
 }
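The hunk above replaces a concurrent close (up to `num_cpus::get()` writers at a time via `buffered`) with a sequential loop. The behavioural trade-off can be sketched with a plain-threads analogue; `MockWriter` and the 50 ms sleep below are hypothetical stand-ins, not delta-rs types (the real code uses async writers doing object-store I/O):

```rust
use std::thread;
use std::time::Duration;

// Hypothetical stand-in for PartitionWriter; close() simulates slow
// object-store I/O with a sleep and returns its "Add actions".
struct MockWriter(u64);

impl MockWriter {
    fn close(self) -> Vec<u64> {
        thread::sleep(Duration::from_millis(50));
        vec![self.0]
    }
}

// The new behaviour: close writers one at a time, like the for loop.
fn close_sequential(writers: Vec<MockWriter>) -> Vec<u64> {
    let mut actions = Vec::new();
    for w in writers {
        actions.extend(w.close());
    }
    actions
}

// The old behaviour: close writers in parallel, like the buffered stream.
fn close_concurrent(writers: Vec<MockWriter>) -> Vec<u64> {
    let handles: Vec<_> = writers
        .into_iter()
        .map(|w| thread::spawn(move || w.close()))
        .collect();
    handles.into_iter().flat_map(|h| h.join().unwrap()).collect()
}

fn main() {
    let seq = close_sequential((0..4).map(MockWriter).collect());
    let mut conc = close_concurrent((0..4).map(MockWriter).collect());
    conc.sort();
    // Same actions either way; only wall-clock time differs
    // (roughly 4 x 50 ms sequentially vs ~50 ms concurrently here).
    assert_eq!(seq, conc);
    println!("{seq:?}");
}
```

Both variants produce the same set of `Add` actions; the concurrent version only changes how long `close()` takes when there are many partitions.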
@@ -378,19 +371,18 @@ impl PartitionWriter {
         // write file to object store
         self.object_store.put(&path, buffer.into()).await?;

-        self.files_written.push(
-            create_add(
-                &self.config.partition_values,
-                path.to_string(),
-                file_size,
-                &metadata,
-                self.num_indexed_cols,
-                &self.stats_columns,
-            )
-            .map_err(|err| WriteError::CreateAdd {
-                source: Box::new(err),
-            })?,
-        );
+        let add_metadata = create_add(
+            &self.config.partition_values,
+            path.to_string(),
+            file_size,
+            &metadata,
+            self.num_indexed_cols,
+            &self.stats_columns,
+        )
+        .map_err(|err| WriteError::CreateAdd {
+            source: Box::new(err),
+        })?;
+        self.files_written.push(add_metadata);

         Ok(())
     }
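The `map_err(|err| WriteError::CreateAdd { source: Box::new(err) })?` pattern in this hunk wraps the underlying failure as a boxed source error so callers can walk the error chain. A self-contained sketch of the same pattern using only stdlib types; `CreateAddError` and this `create_add` are hypothetical stand-ins, not delta-rs APIs:

```rust
use std::error::Error;
use std::fmt;

// Hypothetical analogue of WriteError::CreateAdd: carries the underlying
// failure as a boxed `source` so callers can inspect the error chain.
#[derive(Debug)]
struct CreateAddError {
    source: Box<dyn Error>,
}

impl fmt::Display for CreateAddError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "failed to create Add action")
    }
}

impl Error for CreateAddError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(self.source.as_ref())
    }
}

// Hypothetical fallible constructor standing in for create_add().
fn create_add(file_size: i64) -> Result<String, std::io::Error> {
    if file_size < 0 {
        return Err(std::io::Error::new(
            std::io::ErrorKind::InvalidInput,
            "negative file size",
        ));
    }
    Ok(format!("add:{file_size}"))
}

fn main() {
    // Mirrors the diff: map the inner error into the wrapper before `?`.
    let err = create_add(-1)
        .map_err(|err| CreateAddError { source: Box::new(err) })
        .unwrap_err();
    assert_eq!(err.to_string(), "failed to create Add action");
    assert_eq!(err.source().unwrap().to_string(), "negative file size");
}
```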
@@ -427,10 +419,15 @@ impl PartitionWriter { | |
Ok(()) | ||
} | ||
|
||
/// Retrieves the list of [Add] actions associated with the files written. | ||
pub async fn drain_files_written(&mut self) -> Vec<Add> { | ||
Review thread on drain_files_written:

Reviewer: Is the goal for this function to allow for data files (

Dan-J-D: So let's say a server has written a lot of data into the table but hasn't committed the additional files yet, and there is a power outage. The files will be in storage but not in the metadata. So this can be used to prevent data loss: while writing data, also create commits of the newly added files.

Reviewer: Would something be calling drain periodically while writes are still happening, then? What would be the difference vs. closing the writer periodically and creating a new one?

Dan-J-D: It saves the cost of reallocation and re-initialization, and if you close on a fixed time interval it is possible to create a lot of small files which aren't optimized, which then need an OPTIMIZE.

Reviewer: Is your writer open for a long period of time, constantly receiving data?

Dan-J-D: Yes, I made an alternative to delta-kafka-ingest for my specific use case, and it will be receiving data all day.
+        self.files_written.drain(..).collect()
+    }
+
     /// Close the writer and get the new [Add] actions.
     pub async fn close(mut self) -> DeltaResult<Vec<Add>> {
         self.flush_arrow_writer().await?;
-        Ok(self.files_written)
+        Ok(self.files_written.clone())
Dan-J-D marked this conversation as resolved.
     }
 }
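The review discussion around drain_files_written turns on `Vec::drain` semantics: `drain(..)` moves the accumulated `Add` actions out while leaving the writer's buffer empty (and its allocation reusable), so the writer can stay open across periodic commits. A minimal stdlib sketch; this `Writer` is a hypothetical stand-in with strings in place of `Add` actions:

```rust
// Hypothetical stand-in for PartitionWriter: accumulates "Add" actions
// (plain strings here) as files are flushed to storage.
struct Writer {
    files_written: Vec<String>,
}

impl Writer {
    fn new() -> Self {
        Self { files_written: Vec::new() }
    }

    // Simulate flushing a data file to object storage.
    fn write_file(&mut self, name: &str) {
        self.files_written.push(format!("add:{name}"));
    }

    // Like drain_files_written in the diff: hand back everything written
    // so far, leaving the buffer empty and the writer still usable.
    fn drain_files_written(&mut self) -> Vec<String> {
        self.files_written.drain(..).collect()
    }
}

fn main() {
    let mut w = Writer::new();

    // First commit interval: two files written, then drained for a commit.
    w.write_file("part-0.parquet");
    w.write_file("part-1.parquet");
    let commit_1 = w.drain_files_written();
    assert_eq!(commit_1, vec!["add:part-0.parquet", "add:part-1.parquet"]);
    assert!(w.files_written.is_empty()); // buffer empty, writer keeps going

    // Second interval: the writer is still open and accumulating.
    w.write_file("part-2.parquet");
    let commit_2 = w.drain_files_written();
    assert_eq!(commit_2, vec!["add:part-2.parquet"]);
}
```

This is why `close()` now returns `self.files_written.clone()`: once actions can be drained mid-lifetime, the final close only needs whatever remains in the buffer.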
@Dan-J-D I'm not sure I understand the motivation of this change. I believe @alexwilcoxson-rel or @wjones127 introduced the original code above to ensure that the process running the writer could concurrently produce the Vec<Action>. Based on my understanding of this change, it walks back that performance improvement, but I cannot figure out for what gain 🤔
Dan-J-D: I was on an older version of the library and copy-pasted my local copy into GitHub, but newer changes had already been made there, so I added them back.