Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

content: call content.flush within content-checkout.put #6242

Open
chu11 opened this issue Aug 28, 2024 · 5 comments · May be fixed by #6260
Open

content: call content.flush within content-checkout.put #6242

chu11 opened this issue Aug 28, 2024 · 5 comments · May be fixed by #6260
Assignees

Comments

@chu11
Copy link
Member

chu11 commented Aug 28, 2024

Per comment / idea in #6240

Pro: no need to require user to call before calling content-checkpoint.put. If backing store isn't loaded, the content module will know not to call content.flush.

Con: only work if user doesn't use BYPASS flag. but presumably user will know what they are doing if they use BYPASS.

@flux-framework flux-framework deleted a comment Aug 28, 2024
@flux-framework flux-framework deleted a comment Aug 28, 2024
@chu11 chu11 self-assigned this Aug 29, 2024
@chu11
Copy link
Member Author

chu11 commented Aug 30, 2024

doh! flux restore calls kvs_checkpoint_commit() to reset the checkpoint to sequence number 0, i.e. content.flush not needed. Could just special case handle that?

@garlick
Copy link
Member

garlick commented Aug 30, 2024

Ah maybe we just need a new flag like KVS_CHECKPOINT_NO_FLUSH.

@chu11
Copy link
Member Author

chu11 commented Aug 30, 2024

Ah maybe we just need a new flag like KVS_CHECKPOINT_NO_FLUSH.

Yeah, we could do that too. Unfortunately no flags passed in the RPC (flags are only used within kvs_checkpoint_commit()) so gotta pass it through.

Edit: Ehhh I guess adding flags maybe easier, b/c the sequence number is hidden within an object.

chu11 added a commit to chu11/flux-core that referenced this issue Aug 30, 2024
Problem: Before checkpointing, users need to remember to call
content.flush, to ensure data has been flushed to the backing store.
It is easy to forget this.

Within the content module, call content.flush before checkpointing.

Fixes flux-framework#6242
@chu11
Copy link
Member Author

chu11 commented Aug 30, 2024

Ugh ... so should FLUX_KVS_SYNC fail if a backing store does not exist? At the moment, it does. if we move the content.flush into the content module, we can now decide if it should/shouldn't. I originally called content.flush only if the backing module existed, but that broke some tests.

If we also need a KVS_CHECKPOINT_REQUIRE_FLUSH flag, this refactor probably isn't worth it.

@chu11
Copy link
Member Author

chu11 commented Aug 31, 2024

Thinking about this a bit more, maybe we should think about design of the content + backing module + sync / checkpoint with our new goals in mind. Right now ...

  • KVS_SYNC fails if a backing module is not loaded
  • KVS checkpoint succeeds if a backing module is not loaded

This feels inconsistent and appears counter intuitive. The nuanced test in t0028-content-backing-none.t where it checkpoints w/o a backing module and the KVS module can be reloaded is probably not a realistic use case to be supported??

I should note that for some reason this was very explicitly called out for support in #4492. For some reason back then, supporting checkpointing on "none" was desired, perhaps for backwards compatibility in places.

I did notice this in rc1

if test "$(backing_module)" != "none"; then
    flux content flush || exit_rc=1
fi

So if we are doing a "none" backing module, we are just skipping this sort of stuff anyways already.

chu11 added a commit to chu11/flux-core that referenced this issue Sep 4, 2024
Problem: Before checkpointing, users need to remember to call
content.flush, to ensure data has been flushed to the backing store.
It is easy to forget this.

Within the content module, call content.flush before checkpointing.

Fixes flux-framework#6242
chu11 added a commit to chu11/flux-core that referenced this issue Sep 4, 2024
Problem: Before checkpointing, users need to remember to call
content.flush, to ensure data has been flushed to the backing store.
It is easy to forget this.

Within the content module, call content.flush before checkpointing.

Fixes flux-framework#6242
@chu11 chu11 linked a pull request Sep 5, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants