Forget after backup - stalls #168

Open
psyciknz opened this issue Jun 20, 2023 · 9 comments

@psyciknz

```
restic-backup            | Dirs:         5485 new,   415 changed, 193964 unmodified
restic-backup            | Data Blobs:   2789 new
restic-backup            | Tree Blobs:   5585 new
restic-backup            | Added to the repository: 1.215 GiB (1.215 GiB stored)
restic-backup            |
restic-backup            | processed 1212781 files, 270.371 GiB in 29:20
restic-backup            | snapshot d278d44c saved
restic-backup            | Backup successful
restic-backup            | Forget about old snapshots based on RESTIC_FORGET_ARGS = --keep-last 10 --keep-daily 2 --keep-weekly 1 --keep-monthly 1 --tag Drogo
restic-backup            | Fatal: unable to create lock in backend: repository is already locked by PID 129 on raspups by root (UID 0, GID 0)
restic-backup            | lock was created at 2023-06-20 13:15:14 (4m35.219845879s ago)
restic-backup            | storage ID a07f901e
```

I'm trying to resolve this situation: this is a forget running after a backup, and it doesn't execute the post_command when this happens. When this scenario occurs, does it attempt to retry, or will it complete at some point?

@thierrybla

thierrybla commented Jun 22, 2023

I also run into this issue.
It does not retry; it restarts the container.

@mentos1386

I have the same issue. My container is constantly restarting due to this.

@thierrybla

I fixed it by doing a manual unlock; now the forget/prune works as expected.
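
For anyone else hitting this, the manual unlock can be run inside the container; the container name here is just taken from the log output above, so adjust it for your setup:

```sh
# Remove stale locks, i.e. locks whose owning process is no longer alive.
# Plain `restic unlock` leaves locks of still-running processes in place;
# `restic unlock --remove-all` would remove those too, so use it with care.
docker exec restic-backup restic unlock
```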

@psyciknz
Author

psyciknz commented Jul 4, 2023

Yeah, that's the only way I can fix it as well. Ideally a retry or something similar would be handy.
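
As a rough illustration of what a retry could look like (just a sketch, not how this image currently behaves; the attempt count and sleep interval are made up):

```sh
# Hypothetical retry loop around the forget step.
for attempt in 1 2 3; do
  restic forget $RESTIC_FORGET_ARGS && break
  echo "forget attempt $attempt failed (repository probably locked), retrying in 5m"
  sleep 300
done
```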

djmaze added the bug label Jul 6, 2023
@djmaze
Owner

djmaze commented Jul 6, 2023

Not sure if a retry is always the right thing to do here. I have situations where a long-running prune is still ongoing or has even crashed. In those cases waiting does not solve the problem; you need to manually unlock.
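
To tell those cases apart, you can inspect the existing locks before deciding whether an unlock is safe (run wherever the repository is configured):

```sh
# List the IDs of all locks currently present in the repository.
restic list locks

# Show who created a given lock and when (host, PID, timestamp);
# replace <lock-id> with an ID from the previous command.
restic cat lock <lock-id>
```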

I think the forget is not that important. If it fails, the container should just go back to its schedule and wait for the next backup cron trigger, IMO.

@thierrybla

Would it be possible to add an unlock on a schedule?
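
Until something like that exists in the image, a host-side cron entry could approximate it. This is only a sketch; the container name and schedule are assumptions:

```sh
# Hypothetical crontab entry: clear stale locks every day at 04:00,
# before the nightly backup/forget run.
0 4 * * * docker exec restic-backup restic unlock
```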

@thierrybla

> I think the forget is not that important. If it fails, the container should just go back to its schedule and wait for the next backup cron trigger, IMO.

Yes it will, but even the next forget, or any forget after the first failed one, does not work. You really have to manually unlock first; then the next forget works once or twice before the repository goes back to staying locked.

Is it not possible to run an unlock before the forget command comes through?
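
I.e. something along these lines in the script that runs the forget (a sketch; note that plain `restic unlock` only removes locks it considers stale, so a lock held by a still-running prune would survive it and the forget would still fail):

```sh
# Hypothetical: drop stale locks just before forgetting.
restic unlock
restic forget $RESTIC_FORGET_ARGS
```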

@djmaze
Owner

djmaze commented Aug 6, 2023

I think I don't really understand. @thierrybla If your repository stays locked every few runs, something is broken in your setup. Maybe you have prunes running in parallel while trying to backup/forget?

@thierrybla

> I think I don't really understand. @thierrybla If your repository stays locked every few runs, something is broken in your setup. Maybe you have prunes running in parallel while trying to backup/forget?

I am not sure what I am doing wrong or how to debug it. I am running one container with the following flags:

[screenshot: container configuration flags]

Any idea on how to debug this?
