Digest-less images always re-download after last-modified time changes on server #2902
Labels
bug
Something isn't working
Comments
nirs added a commit to nirs/lima that referenced this issue (Nov 13, 2024):
We tested that a modified last-modified time on the server causes a redownload, but we did not test that the cache is updated after the redownload, so that the next attempt uses the cached download. Add a failing test verifying the issue.
Part-of: lima-vm#2902
Signed-off-by: Nir Soffer <[email protected]>
nirs added a commit to nirs/lima that referenced this issue (Nov 13, 2024):
We tested that a modified last-modified time on the server causes a redownload, but we did not test that the cache is updated after the redownload, so that the next attempt uses the cached download. Add a failing test verifying the issue, and improve the test comments and configuration to make them clearer.
Part-of: lima-vm#2902
Signed-off-by: Nir Soffer <[email protected]>
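The scenario this test covers can be sketched as follows. This is a minimal in-memory model, not Lima's actual test or downloader code: `FakeCache`, its `replace_stale_time` flag, and the `"t1"`/`"t2"` values are hypothetical names made up for illustration.

```python
class FakeCache:
    """In-memory stand-in for the download cache (hypothetical)."""

    def __init__(self, replace_stale_time: bool) -> None:
        # False models the reported bug, True models the expected behavior.
        self.replace_stale_time = replace_stale_time
        self.cached_time = None  # last-modified value stored with the cached file

    def download(self, server_time: str) -> str:
        if self.cached_time == server_time:
            return "cached"
        # The cached time is always stored for the first download; for later
        # downloads it is only stored if stale times are replaced.
        if self.cached_time is None or self.replace_stale_time:
            self.cached_time = server_time
        return "downloaded"


def scenario(cache: FakeCache) -> list:
    return [
        cache.download("t1"),  # empty cache
        cache.download("t1"),  # unchanged on the server
        cache.download("t2"),  # last-modified time changed on the server
        cache.download("t2"),  # should be served from the cache again
    ]
```

With `replace_stale_time=False` the last request is a download again, which is the failure the new test is meant to catch.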
nirs changed the title from "Caching downloaded images not updated when last modified time changes" to "Digest-less images always re-download after last-modified time changes on server" (Nov 14, 2024)
nirs added a commit to nirs/lima that referenced this issue (Nov 14, 2024):
To solve the races during concurrent downloads and avoid unneeded work and bandwidth, we allow only one concurrent download of the same image.

When a limactl process tries to access the cache, it takes a lock on the file cache directory. If multiple processes try to get the lock at the same time, only one takes the lock and the others block.

The process that took the lock tries to get the file from the cache. This is the fast path and the common case. It can fail if the file is not in the cache, the digest does not match, or the cached last-modified time does not match the last-modified time returned by the server.

If the process cannot get the file from the cache, it downloads the file from the remote server and updates the cached data and metadata files.

Finally, the process releases the lock on the cache directory. Other limactl processes waiting on the lock wake up and take the lock. In the common case they find the image in the cache and release the lock quickly.

Since we have exactly one concurrent download, updating the metadata files is trivial and we don't need the writeFirst() helper.

Fixes: lima-vm#2902
Fixes: lima-vm#2732
Signed-off-by: Nir Soffer <[email protected]>
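The flow described in this commit message can be sketched roughly as below. This is an illustration in Python, not the Go code from the commit: `fetch`, the `data` and `time` file names, and the `download` callback are assumptions, the digest check is left out, and `fcntl.flock` is only available on Unix-like systems.

```python
import fcntl
import os
import tempfile
from typing import Callable, Tuple


def fetch(cache_dir: str, remote_last_modified: str,
          download: Callable[[], bytes]) -> Tuple[bytes, str]:
    """Return (data, status), where status is "cached" or "downloaded"."""
    os.makedirs(cache_dir, exist_ok=True)
    data_path = os.path.join(cache_dir, "data")  # hypothetical file name
    time_path = os.path.join(cache_dir, "time")  # hypothetical file name

    # Take an exclusive lock on the cache directory itself; other
    # processes calling fetch() for the same directory block here.
    fd = os.open(cache_dir, os.O_RDONLY)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)
        # Fast path: cached data with a matching last-modified time.
        if os.path.exists(data_path) and os.path.exists(time_path):
            with open(time_path) as f:
                if f.read() == remote_last_modified:
                    with open(data_path, "rb") as d:
                        return d.read(), "cached"
        # Slow path: download, then overwrite both data and metadata.
        # Only the lock holder gets here, so plain overwrites are enough.
        data = download()
        with open(data_path, "wb") as d:
            d.write(data)
        with open(time_path, "w") as f:
            f.write(remote_last_modified)
        return data, "downloaded"
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


def demo() -> list:
    """Run four requests against a temporary cache directory."""
    with tempfile.TemporaryDirectory() as cache_dir:
        return [fetch(cache_dir, t, lambda: b"image")[1]
                for t in ("t1", "t1", "t2", "t2")]
```

In this sketch the third request in `demo()` downloads again because the last-modified value changed, and the fourth is served from the cache because the metadata was overwritten under the lock.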
Previously any cached image would remain until deleted; we might need an "imagePullPolicy" some day?

images:
# Try to use release-yyyyMMdd image if available. Note that release-yyyyMMdd will be removed after several months.
- location: "https://cloud-images.ubuntu.com/releases/24.10/release-20241023/ubuntu-24.10-server-cloudimg-amd64.img"
  arch: "x86_64"
  digest: "sha256:ee070d95a2ba5a1500264e75b3e14aa85518220c24d25f1535407c55f0e33e4d"
# Fallback to the latest release image.
# Hint: run `limactl prune` to invalidate the cache
- location: "https://cloud-images.ubuntu.com/releases/24.10/release/ubuntu-24.10-server-cloudimg-amd64.img"
  arch: "x86_64"

But if it does decide to download a new file, then the "timestamp" (file) should be updated as well...
Indeed, fixed in #2903
Description
When downloading a cached image, if the last-modified time from the server differs from the cached time, we download the image again (good), but the cached time is not updated, so we download the image again for every new instance.
Example:
I think this is caused by the change to fix concurrent downloads - we store the new time once. If the file exists, we don't replace it. We probably need to replace the file when we know that the old time is stale.
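To illustrate the suspected cause, here is a small sketch contrasting write-once and replace semantics for the stored time. It is not Lima's code: `write_first` is a hypothetical Python analogue of a write-once helper, `write_replace` is one possible fix, and the `time` file name and date strings are made up.

```python
import os
import tempfile


def write_first(path: str, data: str) -> None:
    """Write data only if path does not exist yet (write-once)."""
    try:
        # Mode "x" raises FileExistsError if the file already exists.
        with open(path, "x") as f:
            f.write(data)
    except FileExistsError:
        pass  # The old, possibly stale, content is kept.


def write_replace(path: str, data: str) -> None:
    """Replace path with data via a temporary file and an atomic rename."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as f:
        f.write(data)
    os.replace(tmp, path)


def stored_time_after_two_downloads(writer) -> str:
    """Store the time after a first download and again after a redownload."""
    with tempfile.TemporaryDirectory() as d:
        time_file = os.path.join(d, "time")
        writer(time_file, "Wed, 13 Nov 2024 00:00:00 GMT")  # first download
        writer(time_file, "Thu, 14 Nov 2024 00:00:00 GMT")  # server changed
        with open(time_file) as f:
            return f.read()
```

With `write_first` the stored value stays at the first time, so every later comparison against the server's newer time fails and triggers another download; with `write_replace` the stored value follows the server.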
Can be reproduced with:
Workaround
Pruning the cache will fix this until the next last-modified time change on the server:
limactl prune