Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Modernize start method #135

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

wenningerk
Copy link

As the name says focus is bringing sbd from a forking model to either 'simple' or 'notice' start-model in systemd.
Discussion started off in #134.
PR atm still contains changes from #134 - will be rebased once #134 is merged.

@wenningerk
Copy link
Author

Continuing discussion started in #134:

I wasn't thinking of putting anything pacemaker-related in sbd's public API, since that's just for sbd's internal use. The public API would just have low-level stuff for external code to check on sbd. It should be segregated in a separate directory and not use any pacemaker APIs.

That wasn't my point. sbd already has pacemaker as build-time requirement.
So building a library as part of sbd-build that pacemaker is supposed to build
against creates some kind of build-order-issue.

Ah right, of course. Hmm.

Isolating the pacemaker-related stuff into the pacemaker project or even its own project could be worthwhile. I believe at one point we had discussed making the interface more generic so the corosync/pacemaker stuff wasn't so tightly integrated. I'm not sure if that would open any demand for other sbd "plugins" but it could be interesting. The question is whether it's worth the effort ...

Discussion regarding integration of the components was controversal as well suggesting tight integration of sbd into corosync.
We do have 3 types of watchers in sbd - disk, cluster & pacemaker
How sbd is handling those is quite different and interwoven.
But maybe trying to find some generic abstraction could as well be helpful to make sbd-code more understandable.
Finding an abstraction for the watchdog that can be implemented as a plugin maybe interesting as well although it isn't relevant for the build-order thing.

exit_status = inquisitor();
} else {
sbd_detach();
inquisitor_child();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I follow correctly, with "forking", the parent doesn't exit until it gets SIG_LIVENESS. Do we need to go all the way to "sd_notify" to keep the behavior similar?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As to my knowledge I'm afraid yes.
Unless of course we totally rely on the daemons talking to each other.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment in inquisitor_decouple btw. is a lie - I'll remove/correct it in the course of this PR - the inquisitor decouples once cluster-servant is online - meaning it has received it's own membership.

@@ -9,11 +9,16 @@ RefuseManualStop=true
RefuseManualStart=true

[Service]
Type=forking
PIDFile=@localstatedir@/run/sbd.pid
Type=simple

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we make the unit file changes a configure option defaulting to forking (for now at least), then we have backward compatibility with older pacemaker

"exec" seems more useful than "simple" if we don't want to go all the way to "sd_notify"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, slightly as it needs the initial fork & exec to succeed but I guess not to an extent that it is really useful for us here.

@gao-yan
Copy link
Member

gao-yan commented Oct 5, 2021

Type=forking is a so-called traditional way, but it doesn't necessarily mean it's legacy or bad? Especially for handling ordered startups of sbd and pacemaker in our case, it's probably one of the most suitable ways...

Although we now have startup synchronization between sbd and pacemaker, but if to switch to Type=simple, AFAICS SBD_SYNC_RESOURCE_STARTUP=yes being enabled/defaulted would have to be required to get the correct startup behavior, not to mention that would be another trap for upgrade of existing setups where there might be the explicit setting SBD_SYNC_RESOURCE_STARTUP=no in sysconfig ...

Type=notify might be another option, but that'd require to introduce dependencies on systemd/dbus for sbd. So eventually it'd be more of a question whether that'd have real benefits so that it'd be worth it ...

@wenningerk
Copy link
Author

Type=simple was just a first try and it became obvious quite quickly that Type=notify would probably be the way to go.
Guess we should still be able to build without systemd or dbus but on the other hand on a system under systemd control using the benefits of it wouldn't be bad. If sbdy insists on building sbd without library-dependency to dbus or systemd we could still fall back to calling sd_notify-tool from the C-code.
And fiddling around with pid-files and where the system prefers them to currently be isn't that elegant either.
Furthermore checking for a pid-file from pacemaker is in my eyes anyway a weak point of the whole synchronization. I'd definitely prefer asking systemd if sbd is enabled or running and leave the current code just as a fallback - weather to be controlled via cmdline and/or compile-switch is open for discussion.
Anyway I guess there are a couple of points her to discuss and we shouldn't rush it.

@gao-yan
Copy link
Member

gao-yan commented Oct 6, 2021

Type=simple was just a first try and it became obvious quite quickly that Type=notify would probably be the way to go. Guess we should still be able to build without systemd or dbus but on the other hand on a system under systemd control using the benefits of it wouldn't be bad. If sbdy insists on building sbd without library-dependency to dbus or systemd we could still fall back to calling sd_notify-tool from the C-code.

Yes, it'd for sure make sense to have a build detection/option.

And fiddling around with pid-files and where the system prefers them to currently be isn't that elegant either. Furthermore checking for a pid-file from pacemaker is in my eyes anyway a weak point of the whole synchronization. I'd definitely prefer asking systemd if sbd is enabled or running and leave the current code just as a fallback

Good point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants