Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[api] Improve normal file upload design for public APIs #3878

Merged
merged 10 commits into from
Nov 21, 2024
Merged

Conversation

Harshg999
Copy link
Collaborator

@Harshg999 Harshg999 commented Nov 5, 2024

What changes were proposed in this pull request?

  • The original file upload implementation is super old and directly uploads the file by checking the available upload handlers set in settings.py when the request reaches the first middleware.
  • This flow did not follow the current Hue filebrowser design of routing the FS calls via ProxyFS.
  • The design also had historical flaws of uploading the file even if request was meant for other file operation (if you send a file with /copy call, it will still upload the file and then try doing the copy operation).
  • There was no flexibility for upload pre-checks and error handling. Most of such stuff was shifted post file upload.

  • The new implementation helps in solving all the above problem along with cleaner code, improved design, upload on-demand, greater control and performance improvements.
  • The new implementation is only available in API form and does not affect or modifies the older implementation (both are available in parallel for the time being) because we are in the process of phasing out the old filebrowser.
  • Once the new filebrowser is upto speed, the old implementation will be removed.

How was this patch tested?

  • Manually
  • Added new unit tests

@Harshg999 Harshg999 changed the title [api] Refactor /upload API [WIP}[api] Refactor /upload API Nov 5, 2024
@Harshg999 Harshg999 changed the title [WIP}[api] Refactor /upload API [WIP][api] Refactor /upload API Nov 5, 2024
@Harshg999 Harshg999 changed the title [WIP][api] Refactor /upload API [api] Refactor /upload API Nov 13, 2024
Copy link

⚠️ No unit test files modified. Please ensure that changes are properly tested. ⚠️

Copy link

⚠️ No unit test files modified. Please ensure that changes are properly tested. ⚠️

@Harshg999 Harshg999 changed the title [api] Refactor /upload API [api] Improve normal file upload design for public APIs Nov 14, 2024
@Harshg999 Harshg999 self-assigned this Nov 14, 2024
Copy link
Collaborator

@bjornalm bjornalm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, I have a couple of thoughts without having looked to deep into the python code.

  1. I think I have raised this before, but why are we not taking the opportunity while building completely new APIs to implement the new Error handling? Both the UI and the API are being rebuilt so I doubt there will be a better opportunity.

https://docs.google.com/document/d/1PLaOEZnDHI0GiCqCzkRpnVnrenPE8LbGm1WlaJTn0cg/edit?usp=sharing

  1. Since this is a professional file service API, we should consider the difference between Base-2 and Base-10 when dealing with file sizes, e.g. how to calculate and how to communicate sizes in a correct and consistent manner. I don't know how big of an issue this is for our new filebrowser, but I think we should to look into it.

For instance, I notice that FILE_UPLOAD_CHUNK_SIZE is calculated as Base-2 and should be expressed as KiB, MiB, GiB etc but is communicated as base-10.

We should look into which files systems use Base-2 (like macOS Terminal and unix) and which use base-10 (like S3) and have a strategy for dealing with that. My fear is that in the context of big data these differences can have a large impact if we don't get it right.

  1. We should also add unit tests to this new API

apps/filebrowser/src/filebrowser/api.py Outdated Show resolved Hide resolved
apps/filebrowser/src/filebrowser/api.py Show resolved Hide resolved
apps/filebrowser/src/filebrowser/api.py Show resolved Hide resolved
Copy link

⚠️ No unit test files modified. Please ensure that changes are properly tested. ⚠️

@Harshg999
Copy link
Collaborator Author

Nice work, I have a couple of thoughts without having looked to deep into the python code.

  1. I think I have raised this before, but why are we not taking the opportunity while building completely new APIs to implement the new Error handling? Both the UI and the API are being rebuilt so I doubt there will be a better opportunity.

https://docs.google.com/document/d/1PLaOEZnDHI0GiCqCzkRpnVnrenPE8LbGm1WlaJTn0cg/edit?usp=sharing

  1. Since this is a professional file service API, we should consider the difference between Base-2 and Base-10 when dealing with file sizes, e.g. how to calculate and how to communicate sizes in a correct and consistent manner. I don't know how big of an issue this is for our new filebrowser, but I think we should to look into it.

For instance, I notice that FILE_UPLOAD_CHUNK_SIZE is calculated as Base-2 and should be expressed as KiB, MiB, GiB etc but is communicated as base-10.

We should look into which files systems use Base-2 (like macOS Terminal and unix) and which use base-10 (like S3) and have a strategy for dealing with that. My fear is that in the context of big data these differences can have a large impact if we don't get it right.

  1. We should also add unit tests to this new API
  1. Will start on improving error messages further once the core logic is refactored, stable and checked-in.
  2. With the new APIs, we are now only dealing with bytes and also sending filesize as bytes instead of earlier human readable format. UI can convert the new file size in bytes to their desired format.
  3. Added unit tests for new upload API.

@Harshg999 Harshg999 enabled auto-merge (squash) November 20, 2024 17:54
Copy link
Collaborator

@bjornalm bjornalm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

@Harshg999 Harshg999 merged commit 577b140 into master Nov 21, 2024
6 checks passed
@Harshg999 Harshg999 deleted the fix-upload branch November 21, 2024 07:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants