-
Notifications
You must be signed in to change notification settings - Fork 185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fragment list consolidation for tiledb cloud arrays #5059
base: dev
Are you sure you want to change the base?
Conversation
677f049
to
f776d91
Compare
throw_if_not_ok(fragment_consolidator->consolidate_fragments( | ||
array_name, encryption_type, encryption_key, key_length, fragment_uris)); | ||
// Consolidate | ||
auto fragment_consolidator = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed dynamic_cast
that should be avoided when possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not do that now. If we're going to do this, let's change all the places that use Consolidator::create for something better and consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why shouldn't we do that immediate improvement? This is the only occurrence of a dynamic_cast
in consolidation code, and the reason is that consolidate_fragments
is a method of FragmentConsolidator
and not of its parent class Consolidator
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussing offline, I am reverting this change, although I think it's improving the current situation, to get this long standing PR finally unblocked and merged.
// Check if array exists | ||
ObjectType obj_type; | ||
throw_if_not_ok( | ||
object_type(storage_manager->resources(), array_uri, &obj_type)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved this in the else
case so that it's not executed for remote arrays and cause 2 extra REST requests (is_group
and is_array
)
f776d91
to
124ce98
Compare
124ce98
to
4e092ea
Compare
6b273c4
to
155a23c
Compare
array_consolidation_request_builder->initFragments( | ||
fragment_uris->size()); | ||
for (size_t i = 0; i < fragment_uris->size(); i++) { | ||
fragment_list_builder.set(i, (*fragment_uris)[i]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
those need to become relative to cross the wire, otherwise it's a security concern. Use serialize_array_uri_to_relative
and deserialize_array_uri_to_absolute
utility functions
155a23c
to
3ab70bd
Compare
@KiterLuc we should discuss before merging this in case we first need to address this one first: https://app.shortcut.com/tiledb-inc/story/52302/increase-number-of-max-fragments-in-tiledb-fragment-list-consolidation |
@@ -113,7 +118,8 @@ array_consolidation_request_from_capnp( | |||
auto fragment_reader = array_consolidation_request_reader.getFragments(); | |||
fragment_uris.reserve(fragment_reader.size()); | |||
for (const auto& fragment_uri : fragment_reader) { | |||
fragment_uris.emplace_back(fragment_uri); | |||
fragment_uris.emplace_back(deserialize_array_uri_to_absolute( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
REST CI is failing because the path normalization here is not happening on REST since go-capnp handles deserializing the request from generated source code. So we are serializing absolute URIs as relative URIs from core client-side, but REST does not call deserialize_consolidation_request
and the switch back to absolute never happens.
We could add a tiledb_handle_array_consolidation_request
much like tiledb_handle_load_array_schema_request
that would just call deserialize_consolidation_request
, but we'll need a TileDB-Go binding and subsequent REST PR to update the route to use it.
Another option without the handler and Go binding is to handle a second type of relative URIs in FragmentConsolidator::consolidate_fragments that is relative to the array_uri/
directory. The relative URIs handled in consolidate_fragments
linked above is currently relative to the array_uri/__fragments
directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about reconstructing the absolute URIs in HandleConsolidateArrayRequest on REST Server side by prepeding the array URI to arrayConsolidationRequest.Fragments
before calling ConsolidateFragments ? Would that be possible? Looks like the least "invasive" solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened this to track, we discussed and this is a feasible REST-Server side solution: https://app.shortcut.com/tiledb-inc/story/52679/fix-fragment-list-consolidation-handler-to-convert-relative-uris-to-absolute
Let's get this reviewed in the meantime when you have some spare time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is done and I merged this branch to fix conflicts. I'll watch for CI failures but LGTM 👍
Co-authored-by: Shaun M Reed <[email protected]>
a4c8c71
to
65c1812
Compare
@KiterLuc this is now ready for merging. |
[sc-48774]
This PR implements the needed changes on Core side for supporting fragment list cloud consolidation for
tiledb://
arrays.In order for this feature to work end-to-end changes are also needed on the REST server side to deserialize the capnp request and call the right method (
consolidate_fragments
) for actually doing the consolidation.TYPE: FEATURE
DESC: Fragment list consolidation for tiledb cloud arrays