Digest-less images always re-download after last-modified time changes on server #2902

Open
nirs opened this issue Nov 13, 2024 · 2 comments · May be fixed by #2903
Labels
bug Something isn't working

Comments

@nirs
Member

nirs commented Nov 13, 2024

Description

When using a cached digest-less image, if the last-modified time reported by the server differs from the cached time, we download the image again (good), but the cached time is not updated, so we download the image again for every new instance.

Example:

% limactl create --tty=0 test.yaml
...
INFO[0000] Re-downloading digest-less image: last-modified mismatch (cached: "Wed, 09 Oct 2024 00:31:27 GMT", remote: "Thu, 07 Nov 2024 14:31:23 GMT") 
Downloading the image (ubuntu-24.04-server-cloudimg-arm64.img)
571.32 MiB / 571.32 MiB [----------------------------------] 100.00% 60.94 MiB/s
...

% limactl create --tty=0 test.yaml --name test2
...
INFO[0000] Re-downloading digest-less image: last-modified mismatch (cached: "Wed, 09 Oct 2024 00:31:27 GMT", remote: "Thu, 07 Nov 2024 14:31:23 GMT") 
Downloading the image (ubuntu-24.04-server-cloudimg-arm64.img)
571.32 MiB / 571.32 MiB [----------------------------------] 100.00% 50.72 MiB/s

% limactl create --tty=0 test.yaml --name test3
...
INFO[0000] Re-downloading digest-less image: last-modified mismatch (cached: "Wed, 09 Oct 2024 00:31:27 GMT", remote: "Thu, 07 Nov 2024 14:31:23 GMT") 
Downloading the image (ubuntu-24.04-server-cloudimg-arm64.img)
571.32 MiB / 571.32 MiB [----------------------------------] 100.00% 58.23 MiB/s

I think this is caused by the change that fixed concurrent downloads: we store the new time only once. If the file already exists, we don't replace it. We probably need to replace the file when we know that the old time is stale.
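To illustrate the suspected cause, here is a minimal Go sketch, not the actual Lima code. The helpers `writeIfAbsent`, `replaceFile` and `cachedTimeAfter`, and the metadata file name `time`, are hypothetical names chosen for this example. It assumes a "write once" helper that keeps an existing file untouched, and compares it with a helper that replaces the file.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// writeIfAbsent is a hypothetical "write once" helper: it creates the file
// only if it does not exist yet, and keeps any existing content untouched.
func writeIfAbsent(path, value string) error {
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0o644)
	if errors.Is(err, fs.ErrExist) {
		return nil // the file exists: the (possibly stale) value is kept
	}
	if err != nil {
		return err
	}
	if _, err := f.WriteString(value); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// replaceFile is a hypothetical helper that always stores the new value,
// writing a temporary file first and renaming it over the old one.
func replaceFile(path, value string) error {
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, []byte(value), 0o644); err != nil {
		return err
	}
	return os.Rename(tmp, path)
}

// cachedTimeAfter seeds a temporary cache directory with the cached time,
// stores the remote time using the given write function (as would happen
// after a re-download), and returns what the cache contains afterwards.
func cachedTimeAfter(write func(path, value string) error, cached, remote string) (string, error) {
	dir, err := os.MkdirTemp("", "cache-sketch")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "time") // assumed metadata file name
	if err := os.WriteFile(path, []byte(cached), 0o644); err != nil {
		return "", err
	}
	if err := write(path, remote); err != nil {
		return "", err
	}
	b, err := os.ReadFile(path)
	return string(b), err
}

func main() {
	const cached = "Wed, 09 Oct 2024 00:31:27 GMT"
	const remote = "Thu, 07 Nov 2024 14:31:23 GMT"
	stale, err := cachedTimeAfter(writeIfAbsent, cached, remote)
	if err != nil {
		panic(err)
	}
	fresh, err := cachedTimeAfter(replaceFile, cached, remote)
	if err != nil {
		panic(err)
	}
	fmt.Println("write once:", stale)
	fmt.Println("replace:   ", fresh)
}
```

With the write-once helper the cache keeps the October time, so the next comparison with the server mismatches again. With the replacing helper the cache holds the November time and the next instance can use the cached image.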

Can be reproduced with:

images:
  - location: "https://cloud-images.ubuntu.com/releases/24.04/release/ubuntu-24.04-server-cloudimg-arm64.img"
    arch: "aarch64"
  - location: "https://cloud-images.ubuntu.com/releases/24.04/release/ubuntu-24.04-server-cloudimg-amd64.img"
    arch: "x86_64"
vmType: vz
plain: true

Workaround

Pruning the cache fixes this until the next last-modified time change on the server:

limactl prune
nirs added a commit to nirs/lima that referenced this issue Nov 13, 2024
We tested that a changed last-modified time on the server causes a
re-download, but we did not test that we update the cache after the
re-download, so that the next attempt uses the cached download.

Add a failing test verifying the issue.

Part-of: lima-vm#2902
Signed-off-by: Nir Soffer <[email protected]>
@nirs nirs linked a pull request Nov 13, 2024 that will close this issue
nirs added a commit to nirs/lima that referenced this issue Nov 13, 2024
We tested that a changed last-modified time on the server causes a
re-download, but we did not test that we update the cache after the
re-download, so that the next attempt uses the cached download.

Add a failing test verifying the issue, and improve the test comments
and configuration to make them clearer.

Part-of: lima-vm#2902
Signed-off-by: Nir Soffer <[email protected]>
@nirs nirs changed the title Caching downloaded images not updated when last modified time changes Digest-less images always re-download after last-modified time changes on server Nov 14, 2024
nirs added a commit to nirs/lima that referenced this issue Nov 14, 2024
To solve the races during concurrent downloads and avoid unneeded work
and bandwidth, we allow only one download of the same image at a time.

When a limactl process tries to access the cache, it takes a lock on the
file cache directory. If multiple processes try to get the lock at the
same time, only one will take the lock, and the others will block.

The process that took the lock tries to get the file from the cache.
This is the fast path and the common case. This can fail if the file is
not in the cache, the digest does not match, or the cached last-modified
time does not match the last-modified time returned from the server.

If the process cannot get the file from the cache, it downloads the file
from the remote server, and updates the cached data and metadata files.

Finally the process releases the lock on the cache directory. Other
limactl processes waiting on the lock wake up and take the lock. In the
common case they will find the image in the cache and will release the
lock quickly.

Since at most one process downloads a given image at a time, updating
the metadata files is trivial and we don't need the writeFirst() helper.

Fixes: lima-vm#2902
Fixes: lima-vm#2732
Signed-off-by: Nir Soffer <[email protected]>
@afbjorklund
Member

Previously, any cached image would remain until deleted. We might need an "imagePullPolicy" some day?

images:
# Try to use release-yyyyMMdd image if available. Note that release-yyyyMMdd will be removed after several months.
- location: "https://cloud-images.ubuntu.com/releases/24.10/release-20241023/ubuntu-24.10-server-cloudimg-amd64.img"
  arch: "x86_64"
  digest: "sha256:ee070d95a2ba5a1500264e75b3e14aa85518220c24d25f1535407c55f0e33e4d"
# Fallback to the latest release image.
# Hint: run `limactl prune` to invalidate the cache
- location: "https://cloud-images.ubuntu.com/releases/24.10/release/ubuntu-24.10-server-cloudimg-amd64.img"
  arch: "x86_64"

But if it does decide to download a new file, then the "timestamp" (file) should be updated as well...

@nirs
Member Author

nirs commented Nov 16, 2024

> But if it does decide to download a new file, then the "timestamp" (file) should be updated as well...

Indeed, fixed in #2903

@nirs nirs added the bug Something isn't working label Nov 19, 2024