Fix: do not rely on DAC_OVERRIDE capability on Linux despite being root #69
In light of the ongoing effort to make the pacemaker API more accessible (moving code to libraries, rearranging libraries, API server, ...), wouldn't it make sense to have most of that code in a central place where potential users of the API would have it handy (libqb?)? Not saying it would necessarily have to happen behind the scenes without code changes in sbd, though.
CAP_SETGID is required to use initgroups(). Given that this is a trivial bypass of CAP_DAC_OVERRIDE, I think it's reasonable to assume both capabilities could be taken away, and we shouldn't rely on it. I think a comprehensive solution would be to figure out all the capabilities currently needed by each cluster component, and make sure they are explicitly listed in the relevant SELinux policies. We can set a goal to avoid the need for some of those, but I think that should be more carefully and holistically considered than reactive.
For the particular issue at hand, I'm thinking the simplest solution is for pacemaker's spec file to add root to the haclient group. I haven't tested it but I think it would work (except for any sbd process started before the pacemaker package was installed, but that shouldn't be any trouble to deal with). It does give access to all root processes rather than selected ones that need it, but that's also the effect of the typical example solution given for similar problems (i.e. make the file owned by root group).
It turns out, for example, that SELinux will prevent this capability for confined processes running as root. It then means that sbd cannot access, as a client, files used for joining two local communication sides within libqb-arranged IPC mechanism in case those files do not have permissions to explicitly allow file-based access with credentials of this client -- which is exactly what happens when the IPC servers are pacemaker daemons not run as root on their own.

Solution is two-phased:
1. have sbd add respective non-privileged group corresponding to the server side of the IPC -- this patch
2. ensure this server side (pacemaker) does allow group-derived access (i.e., the access permissions for group are as relaxed as needed, umask notwithstanding) -- outside of the sbd's scope

Signed-off-by: Jan Pokorný <[email protected]>
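Phase 1 of the commit message could be sketched roughly as follows. This is an illustrative sketch, not sbd's actual code: the helper name is made up, and it replaces the supplementary group list outright, whereas the real patch may well use initgroups() to preserve existing groups. Joining the group requires CAP_SETGID.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <grp.h>
#include <unistd.h>

/* Hypothetical sketch: look up the group owning the server side of the
 * IPC (e.g. "haclient" for pacemaker) and adopt it as a supplementary
 * group.  Returns 0 on success, -1 with errno set otherwise. */
static int
join_server_group(const char *group_name)
{
    struct group *grp;

    errno = 0;
    grp = getgrnam(group_name);
    if (grp == NULL) {
        if (errno == 0) {
            errno = ENOENT;  /* no such group */
        }
        return -1;
    }

    /* Requires CAP_SETGID; replaces the supplementary group list with
     * just the server's group. */
    return setgroups(1, &grp->gr_gid);
}
```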
On 14/02/19 23:37 -0800, wenningerk wrote:
In the light of ongoing effort to make the pacemaker-API more
accessible (moving code to libraries, rearrange libraries,
API-server, ...) wouldn't it make sense to have most of that code
in a central place where potential users of the API would have it
handy (libqb?).
I agree with the reusability sentiment, however there are some catches:
- being dependent on particularly new version of the dependencies,
which is bound to cause pains until some years later
(OK if you have the right version of the dependency, or sorry,
out of luck, you cannot solve your immediate problem easily
otherwise)
- universality implies heavy adapting/boilerplating to the
specific conventions of the client side (mapping of the
exit codes, etc.) -- here, the code saved by having a dedicated
universal implementation is of about the same size as said
adaptor layer would have
- sbd, IIUIC, tries to only use a minimal set of pacemaker API,
and there can still be use-cases for sbd to be used outside
of pacemaker?
hence I don't see any particular conflict here.
Not saying it would necessarily have to happen behind the scenes
without code-changes in sbd though.
Libqb is currently passed only a very minimal set of intended uses
up front. This could be systemically improved by making libqb
externally configurable (drop-in configuration files setting
precedence, in the style of /etc/dbus-1/system.d/, for instance),
avoiding any new "hardwired" (recompilation-requiring, hence
undesired) burden.
--
Jan (Poki)
On 15/02/19 09:30 -0800, Ken Gaillot wrote:
CAP_SETGID is required to use initgroups(). Given that this is a
trivial bypass of CAP_DAC_OVERRIDE, I think it's reasonable to
assume both capabilities could be taken away, and we shouldn't
rely on it.
By that logic, shall we assume that root is totally helpless?
I hope not, and unlike the initiative to drop dac_override,
which has been going on for years already, I am not aware of anything
similar regarding setgid, which has been enabled for 6+ years:
https://github.com/fedora-selinux/selinux-policy-contrib/blob/55adef764d74757501c61d33898c526e44c245b3/rhcs.te#L149
and it would not be very wise either, since it would possibly affect
every and each daemon process run directly with `pacemaker-execd`.
I think a comprehensive solution would be to figure out all the
capabilities currently needed by each cluster component,
including unpredictable scope of services to be run from within
pacemaker?
and make sure they are explicitly listed in the relevant SELinux
policies. We can set a goal to avoid the need for some of those, but
I think that should be more carefully and holistically considered
than reactive.
Unless there's some one-touch solution for users to add
ad-hoc policies as detected when running their services, this
would work only in theory. Even if that solution existed, who
is then to validate that all the policies were distributed
everywhere they are required (after failovers, etc.)?
Sounds rather scary and anti-HA.
--
Jan (Poki)
No, we should explicitly list the capabilities we need in the cluster SELinux policy, both because it is correct and because we can't predict what the default set will be in the future.
The cluster SELinux policy (which is not limited to sbd) should include all capabilities needed by all cluster components throughout the stack. These are no more or less predictable than everything that's already in the policy.
I don't follow what you're trying to say here. I'm suggesting that the SELinux policy for cluster components should explicitly grant the capabilities required by those components, which I believe is the purpose of having the policy. Any user running a cluster with SELinux enabled already has to verify that their SELinux policies on every node allow their services to run, that doesn't change.
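Explicitly listing the required capabilities in the cluster policy would look roughly like this in .te policy-module syntax. The rule below is an illustrative sketch in the style of the rhcs.te file linked above; the actual capability set would have to come from auditing each component, not from this example.

```
# Illustrative only: grant the confined cluster domain (cluster_t, as in
# the referenced Fedora rhcs policy module) the capabilities it needs,
# rather than relying on whatever the default set happens to be.
allow cluster_t self:capability { setgid setuid dac_override };
```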
Klaus mentioned that the original problem description says that some of the relevant IPC files have mode 0600, which makes both group-based approaches (adding root to the haclient group, or the process taking on haclient privileges) insufficient. It looks like adding the capability to the SELinux policy will be the only solution (technically, one of the group approaches could be combined with modifying the IPC permissions, but I suspect that will be impractical).
Btw. are there any systematic approaches for exposing the capabilities required by a repo - preferably in a form easy to parse for automated handling?
It should be reasonably simple to get a baseline based on what system calls the code uses. Dreaming a little bit, that could be applied to common libraries to generate a knowledge database for their calls as well. A quick search doesn't show anything existing, but these are of interest anyway:
Hmm - I guess tooling - especially static analysis - would have to be quite smart to spit out a list that would be sufficiently restrictive and comprehensible.
Right, that's why I was thinking just a baseline to start from (e.g. anything that uses setgid(), initgroups() or setgroups() requires CAP_SETGID). Maybe one could list those as "definite" and others as "possibly required under certain circumstances", which CAP_DAC_OVERRIDE would probably always qualify for.
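As a complement to the static baseline idea, one can at least inspect which capabilities a running component actually holds by checking the CapEff bitmask in /proc/&lt;pid&gt;/status. A minimal sketch, with helper names made up for illustration:

```c
#include <stdio.h>
#include <stdlib.h>

/* Capability bit numbers from <linux/capability.h>, repeated here so
 * the sketch is self-contained. */
#define CAP_DAC_OVERRIDE 1
#define CAP_SETGID       6

/* Return 1 if capability bit `capno` is set in a hex mask string such
 * as the value after "CapEff:" in /proc/<pid>/status, else 0. */
static int
cap_in_mask(const char *hexmask, int capno)
{
    unsigned long long mask = strtoull(hexmask, NULL, 16);

    return (int) ((mask >> capno) & 1ULL);
}

/* Read this process's effective capability mask into buf (>= 64 bytes).
 * Returns 0 on success, -1 otherwise. */
static int
read_cap_eff(char buf[64])
{
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    int found = 0;

    if (f == NULL) {
        return -1;
    }
    while (fgets(line, sizeof(line), f) != NULL) {
        if (sscanf(line, "CapEff: %63s", buf) == 1) {
            found = 1;
            break;
        }
    }
    fclose(f);
    return found ? 0 : -1;
}
```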
On 18/02/19 17:04 +0000, Ken Gaillot wrote:
On 18/02/19 15:27 +0100, Jan Pokorný wrote:
> On 15/02/19 09:30 -0800, Ken Gaillot wrote:
>> CAP_SETGID is required to use initgroups(). Given that this is a
>> trivial bypass of CAP_DAC_OVERRIDE, I think it's reasonable to
>> assume both capabilities could be taken away, and we shouldn't
>> rely on it.
>
> By that logic, shall we assume that root is totally helpless?
>
> I hope not, and unlike with initiative to drop dac_override
> that goes for years already, I am not aware of anything similar
> regarding setgid, which is enabled for 6+ years:
> https://github.com/fedora-selinux/selinux-policy-contrib/blob/55adef764d74757501c61d33898c526e44c245b3/rhcs.te#L149
> and would not be very wise either, since it would possibly affect
> every and each daemon process run directly with `pacemaker-execd`.
No, we should explicitly list the capabilities we need in the
cluster SELinux policy, both because it is correct and because we
can't predict what the default set will be in the future.
Which cluster SELinux policy do you have in mind?
The one I linked above? Do you want to re-review it?
Or write a new one from scratch?
>> I think a comprehensive solution would be to figure out all the
>> capabilities currently needed by each cluster component,
>
> including unpredictable scope of services to be run from within
> pacemaker?
The cluster SELinux policy (which is not limited to sbd)
sbd currently has a detached policy in the referenced project
(sbd_t vs. cluster_t), which likely means you want to write
a new one from scratch
should include all capabilities needed by all cluster components
throughout the stack. These are no more or less predictable than
everything that's already in the policy.
A logical error, likely stemming from missing the "since it would
possibly affect every and each daemon process run directly with
`pacemaker-execd`" part above. That needs to be absorbed first.
Then you'll see that there's indeed a predictable and non-predictable
part of that, and that some (unpredictable!) daemons shall rather make
do without DAC_OVERRIDE capability when run as a cluster resource
already (otherwise cluster_t in Fedora's upstream SELinux policy
will need to get that back as well, meaning weakening the containment).
Cluster proper components alone are, on the contrary, indeed
predictable, but we can just as well assume that the currently
assigned set in the said upstream policy "just works unless disproved".
>> and make sure they are explicitly listed in the relevant SELinux
>> policies. We can set a goal to avoid the need for some of those, but
>> I think that should be more carefully and holistically considered
>> than reactive.
>
> Unless there's some way for one-touch solution for users to add
> ad-hoc policies as detected when running their services, this
> would work only in theory. Even if that solution existed, who
> is then to validate that all the policies were distributed
> everywhere that they are required (after failovers, etc.).
>
> Sounds rather scary and anti-HA.
I don't follow what you're trying to say here.
likely because the already restated point didn't make it across
I'm suggesting that the SELinux policy for cluster components should
explicitly grant the capabilities required by those components,
which I believe is the purpose of having the policy.
True, but the applicability of the policy is highly limited by the
inability to enumerate all the cluster stack use cases, as opposed
to self-hosted boundaries like httpd_t -- observe: adding new httpd
modules is rather an exception (easy to follow the suite and give it
a proper label), whereas involving custom daemons is one of typical use
cases for the cluster stack (good luck with tutoring these users to
come up with brand new policy modules and respective transitions from
cluster_t just because you want as tight SELinux boundaries left for
the inner cluster stack as possible).
Any user running a cluster with SELinux enabled already has to
verify that their SELinux policies on every node allow their
services to run, that doesn't change.
Again, this responds to something not fully thought through.
From a brief test, cluster_t is inherited by whatever spawned
process, regardless of whether or how the respective binary
is labelled. With your envisioned pedantic use of the
security boundary, that could start to be a real problem.
At least, systemd resources are isolated from this :-)
Partial solutions would be:
- make pacemaker-execd able to meddle with the SELinux context
on its own so as to somehow neutralize the narrow scope you
envision (note that cluster_t is shared also with corosync,
amongst others; also you would need to have the power to do that,
which sort of defeats the separation), possibly doable with
transition rules directly in SELinux policy
... or even better ...
- drop capabilities programmatically directly in the daemons
themselves (between fork and execve in pacemakerd?, except
for pacemaker-execd, indeed) and seal that possibly also
with `NO_NEW_PRIVS`:
https://www.kernel.org/doc/html/latest/userspace-api/no_new_privs.html
The latter seems most reasonable to me -- and it's both
a documentation and the effectively actionable mark in the code.
Sadly, due to the multi-daemon arrangement, we cannot use systemd
as the catch-all solution for that currently (which would also
conveniently open BPF-based syscall filtering to us if you want
to micromanage the security to the tiniest bit).
But without major redesigning relevant SELinux policies or writing
one or more replacement modules from scratch, your original plan
won't fly, I am afraid.
--
Poki
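The second option above, dropping capabilities programmatically and sealing with `NO_NEW_PRIVS`, could be sketched roughly as follows. The helper name is an assumption; note that PR_CAPBSET_DROP requires CAP_SETPCAP, so the bounding-set drop is best-effort when unprivileged.

```c
#include <stdio.h>
#include <sys/prctl.h>
#include <linux/capability.h>

/* Hypothetical helper, to be called between fork() and execve() in the
 * spawning daemon: drop CAP_DAC_OVERRIDE from the bounding set and
 * seal the child with NO_NEW_PRIVS.  Returns 0 iff NO_NEW_PRIVS was
 * successfully set. */
static int
harden_child(void)
{
    /* Best-effort: dropping from the bounding set needs CAP_SETPCAP,
     * so this fails with EPERM for an unprivileged process. */
    if (prctl(PR_CAPBSET_DROP, CAP_DAC_OVERRIDE, 0, 0, 0) != 0) {
        perror("PR_CAPBSET_DROP");
    }

    /* Irrevocable: no execve() from here on can grant new privileges
     * (setuid bits, file capabilities, ...). */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0) {
        perror("PR_SET_NO_NEW_PRIVS");
        return -1;
    }
    return 0;
}
```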
Btw. thanks for
I'm still not following you. Subsidiary processes are run as their own type when relevant, e.g.:
Perhaps you mean custom daemons that don't have any SELinux labelling? But that's no different than running such a daemon without a cluster on a SELinux-enabled host. The user is responsible for that, with or without a cluster. Am I misunderstanding your point?
Yes, "every and each daemon process run directly with And, per https://bugzilla.redhat.com/show_bug.cgi?id=1677325#c9, relaxed Now back to free-standing But I don't agree |
Users are responsible for all aspects of custom services, e.g. providing an LSB or OCF script, and configuring firewall rules on all nodes -- SELinux policies for custom services fall into this same category.
Except that some IPC sockets do not have any group permissions, so changing groups via any means won't matter. We'd also have to find every place that those permissions are set, verify that group access is not a security risk, and modify them to add group read/write. I think dac_override is a simpler and more appropriate solution.
Looks like that's a consequence of a shoddy usage of
Should we still consider any changes here, or can we close this as resolved for Fedora via selinux-policy?
Can one of the admins verify this patch?