Add partial file retrieval when using GridFS stream download #874

Open
wants to merge 1 commit into
base: main

@laleksiunas laleksiunas commented May 10, 2023

Implemented support for partial file retrieval, as described in the GridFS spec, for GridFsBucket::open_download_stream and GridFsBucket::open_download_stream_by_name.
Unfortunately, this is a breaking change to the GridFsBucket::open_download_stream interface: it now takes an additional options argument that accepts the start/end range.

  • start and end are non-negative (guaranteed by the type, since both are u64)
  • start is validated to be less than or equal to end
  • start and end are validated to be less than or equal to the file length
  • the file's chunk size is used to skip and limit chunks when performing partial reads
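
The checks and the chunk skip/limit arithmetic listed above can be sketched roughly as follows. This is a minimal illustration and not the code from this pull request: `RangeError`, `ChunkPlan`, and `plan_chunks` are hypothetical names, and the sketch assumes that `end` is exclusive and that `chunk_size` is non-zero.

```rust
/// Hypothetical error type for this sketch (not the driver's error type).
#[derive(Debug, PartialEq)]
enum RangeError {
    /// `start` is greater than `end`.
    StartAfterEnd,
    /// `start` or `end` is greater than the file length.
    OutOfBounds,
}

/// Hypothetical description of which chunks a partial read needs.
#[derive(Debug, PartialEq)]
struct ChunkPlan {
    /// Index `n` of the first chunk to fetch (chunks to skip before it).
    first_chunk: u64,
    /// Number of consecutive chunks to fetch (the limit).
    chunk_count: u64,
    /// Bytes to discard at the beginning of the first fetched chunk.
    skip_in_first: u64,
    /// Total number of bytes the stream should yield.
    total_bytes: u64,
}

/// Validates the byte range `[start, end)` and maps it onto chunk indexes.
/// Assumption: `end` is exclusive and `chunk_size > 0`.
fn plan_chunks(
    start: u64,
    end: u64,
    file_len: u64,
    chunk_size: u64,
) -> Result<ChunkPlan, RangeError> {
    if start > end {
        return Err(RangeError::StartAfterEnd);
    }
    if start > file_len || end > file_len {
        return Err(RangeError::OutOfBounds);
    }

    let first_chunk = start / chunk_size;
    let total_bytes = end - start;

    // An empty range needs no chunks at all.
    if total_bytes == 0 {
        return Ok(ChunkPlan {
            first_chunk,
            chunk_count: 0,
            skip_in_first: 0,
            total_bytes: 0,
        });
    }

    // `end` is exclusive, so the last byte read is at offset `end - 1`.
    let last_chunk = (end - 1) / chunk_size;

    Ok(ChunkPlan {
        first_chunk,
        chunk_count: last_chunk - first_chunk + 1,
        skip_in_first: start % chunk_size,
        total_bytes,
    })
}
```

With a chunk size of 10 and a 35-byte file, a request for bytes 12 to 27 touches only chunks 1 and 2, which is why a partial read can avoid fetching most of a large file.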

@isabelatkinson
Contributor

Hi @laleksiunas, thanks for creating this pull request! The team is currently focused on other priorities, but we will take a look at this when we have more bandwidth in the future. One piece of immediate feedback I have is that making a breaking change here means that we would not be able to merge this work until we do a major release, which we don't currently have a time frame for. One way to avoid this breaking change would be to add a GridFsBucket::open_download_stream_with_range method rather than change the signature for the existing open_download_stream method.

@laleksiunas
Author

laleksiunas commented May 22, 2023

@isabelatkinson Thanks for the review. I will add GridFsBucket::open_download_stream_with_range in the coming days.

@nathanfranke

nathanfranke commented Oct 30, 2024

I rebased this to upstream/main: nathanfranke@bed9f9c (two commits)
Tests aren't passing for me, even on unmodified upstream/v3.5.1 or upstream/main, so this may be a problem on my end.
Some code was moved between files and I made some members public, notably actions::gridfs::download::DownloadRange and those modules.

Otherwise, the async by_name API seems to work well; I did not test with IDs or sync.


> One way to avoid this breaking change would be to add a GridFsBucket::open_download_stream_with_range method rather than change the signature for the existing open_download_stream method.

This should be resolved now. open_download_stream and open_download_stream_by_name use the builder pattern (e.g. .open_download_stream_by_name("test").start(0).end(100))
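
To illustrate why a builder avoids the breaking change, here is a self-contained sketch of the pattern. It is not the driver's actual type: `OpenDownloadStreamByName`, its fields, and `resolve` are hypothetical stand-ins, and treating an unset `end` as the file length is an assumption made for this example.

```rust
/// Hypothetical stand-in for the action returned by
/// `open_download_stream_by_name`; not the driver's real type.
#[derive(Debug, Clone, PartialEq)]
struct OpenDownloadStreamByName {
    filename: String,
    /// Optional start offset; `None` means "from the beginning".
    start: Option<u64>,
    /// Optional end offset; `None` means "to the end of the file".
    end: Option<u64>,
}

impl OpenDownloadStreamByName {
    /// Creates the action with no range set, matching the old behavior.
    fn new(filename: impl Into<String>) -> Self {
        Self {
            filename: filename.into(),
            start: None,
            end: None,
        }
    }

    /// Sets the start offset and returns the builder for chaining.
    fn start(mut self, value: u64) -> Self {
        self.start = Some(value);
        self
    }

    /// Sets the end offset and returns the builder for chaining.
    fn end(mut self, value: u64) -> Self {
        self.end = Some(value);
        self
    }

    /// Resolves the optional offsets into a concrete `(start, end)` pair.
    /// Assumption: unset offsets default to the whole file.
    fn resolve(&self, file_len: u64) -> (u64, u64) {
        (self.start.unwrap_or(0), self.end.unwrap_or(file_len))
    }
}
```

Because the range setters are optional, a call such as `OpenDownloadStreamByName::new("test")` keeps working without any range, while `.start(0).end(100)` opts in to a partial read.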


Test project:
rust-mongodb.zip

Note: The mongodb dependency path is hardcoded in this project.

rust mongodb: 100 x 1MiB
    WRITE: 534ms
    READ FULL: 320ms
    READ PARTIAL: 3ms
