Fix: do not rely on DAC_OVERRIDE capability on Linux despite being root #69
In light of the ongoing effort to make the pacemaker API more accessible (moving code to libraries, rearranging libraries, API server, ...), wouldn't it make sense to have most of that code in a central place where potential users of the API would have it handy (libqb?)? Not saying it would necessarily have to happen behind the scenes without code changes in sbd, though.
CAP_SETGID is required to use initgroups(). Given that this is a trivial bypass of CAP_DAC_OVERRIDE, I think it's reasonable to assume both capabilities could be taken away, and we shouldn't rely on it. I think a comprehensive solution would be to figure out all the capabilities currently needed by each cluster component, and make sure they are explicitly listed in the relevant SELinux policies. We can set a goal to avoid the need for some of those, but I think that should be more carefully and holistically considered than reactive.
For the particular issue at hand, I'm thinking the simplest solution is for pacemaker's spec file to add root to the haclient group. I haven't tested it but I think it would work (except for any sbd process started before the pacemaker package was installed, but that shouldn't be any trouble to deal with). It does give access to all root processes rather than selected ones that need it, but that's also the effect of the typical example solution given for similar problems (i.e. make the file owned by root group).
It turns out, for example, that SELinux will prevent this capability for confined processes running as root. It then means that sbd cannot access, as a client, files used for joining two local communication sides within libqb-arranged IPC mechanism in case those files do not have permissions to explicitly allow file-based access with credentials of this client -- which is exactly what happens when the IPC servers are pacemaker daemons not run as root on their own.

Solution is two-phased:
1. have sbd add respective non-privileged group corresponding to the server side of the IPC -- this patch
2. ensure this server side (pacemaker) does allow group-derived access (i.e., the access permissions for group are as relaxed as needed, umask notwithstanding) -- outside of the sbd's scope

Signed-off-by: Jan Pokorný <[email protected]>
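Phase 1 of the commit message could be sketched roughly as follows. This is an illustrative sketch, not sbd's actual code: the helper name is made up, and it replaces the supplementary group list outright, whereas the real patch may well use initgroups() to preserve existing groups. Joining the group requires CAP_SETGID.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <grp.h>
#include <unistd.h>

/* Hypothetical sketch: look up the group owning the server side of the
 * IPC (e.g. "haclient" for pacemaker) and adopt it as a supplementary
 * group.  Returns 0 on success, -1 with errno set otherwise. */
static int
join_server_group(const char *group_name)
{
    struct group *grp;

    errno = 0;
    grp = getgrnam(group_name);
    if (grp == NULL) {
        if (errno == 0) {
            errno = ENOENT;  /* no such group */
        }
        return -1;
    }

    /* Requires CAP_SETGID; replaces the supplementary group list with
     * just the server's group. */
    return setgroups(1, &grp->gr_gid);
}
```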
On 14/02/19 23:37 -0800, wenningerk wrote:
In the light of ongoing effort to make the pacemaker-API more
accessible (moving code to libraries, rearrange libraries,
API-server, ...) wouldn't it make sense to have most of that code
in a central place where potential users of the API would have it
handy (libqb?).
I agree with the reusability sentiment, however there are some catches:
- being dependent on particularly new version of the dependencies,
which is bound to cause pains until some years later
(OK if you have the right version of the dependency, or sorry,
out of luck, you cannot solve your immediate problem easily
otherwise)
- universality implies heavy adapting/boilerplating to the
specific conventions of the client side (mapping of the
exit codes, etc.) -- here, the code saved by having a dedicated
universal implementation is of about the same size as said
adaptor layer would have
- sbd, IIUIC, tries to only use a minimal set of pacemaker API,
and there can still be use-cases for sbd to be used outside
of pacemaker?
hence I don't see any particular conflict here.
Not saying it would necessarily have to happen behind the scenes
without code-changes in sbd though.
Libqb is currently passed only a very minimal set of intended uses
up front. This could be systemically improved by making libqb
externally configurable (drop-in configuration files setting
precedence, in the style of /etc/dbus-1/system.d/, for instance),
avoiding any new "hardwired" (recompilation-requiring, hence
undesired) burden.
--
Jan (Poki)
On 15/02/19 09:30 -0800, Ken Gaillot wrote:
CAP_SETGID is required to use initgroups(). Given that this is a
trivial bypass of CAP_DAC_OVERRIDE, I think it's reasonable to
assume both capabilities could be taken away, and we shouldn't
rely on it.
By that logic, shall we assume that root is totally helpless?
I hope not, and unlike the initiative to drop dac_override,
which has been going on for years already, I am not aware of anything
similar regarding setgid, which has been enabled for 6+ years:
https://github.com/fedora-selinux/selinux-policy-contrib/blob/55adef764d74757501c61d33898c526e44c245b3/rhcs.te#L149
and it would not be very wise either, since it would possibly affect
every and each daemon process run directly with `pacemaker-execd`.
I think a comprehensive solution would be to figure out all the
capabilities currently needed by each cluster component,
including unpredictable scope of services to be run from within
pacemaker?
and make sure they are explicitly listed in the relevant SELinux
policies. We can set a goal to avoid the need for some of those, but
I think that should be more carefully and holistically considered
than reactive.
Unless there's some one-touch solution for users to add
ad-hoc policies as detected when running their services, this
would work only in theory. Even if that solution existed, who
is then to validate that all the policies were distributed
everywhere they are required (after failovers, etc.)?
Sounds rather scary and anti-HA.
--
Jan (Poki)
No, we should explicitly list the capabilities we need in the cluster SELinux policy, both because it is correct and because we can't predict what the default set will be in the future.
The cluster SELinux policy (which is not limited to sbd) should include all capabilities needed by all cluster components throughout the stack. These are no more or less predictable than everything that's already in the policy.
I don't follow what you're trying to say here. I'm suggesting that the SELinux policy for cluster components should explicitly grant the capabilities required by those components, which I believe is the purpose of having the policy. Any user running a cluster with SELinux enabled already has to verify that their SELinux policies on every node allow their services to run, that doesn't change.
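Explicitly listing the required capabilities in the cluster policy would look roughly like this in .te policy-module syntax. The rule below is an illustrative sketch in the style of the rhcs.te file linked above; the actual capability set would have to come from auditing each component, not from this example.

```
# Illustrative only: grant the confined cluster domain (cluster_t, as in
# the referenced Fedora rhcs policy module) the capabilities it needs,
# rather than relying on whatever the default set happens to be.
allow cluster_t self:capability { setgid setuid dac_override };
```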
Klaus mentioned that the original problem description says that some of the relevant IPC files have mode 0600, which makes both group-based approaches (adding root to the haclient group, or the process taking on haclient privileges) insufficient. It looks like adding the capability to the SELinux policy will be the only solution (technically, one of the group approaches could be combined with modifying the IPC permissions, but I suspect that will be impractical).
Btw. are there any systematic approaches for exposing the capabilities required by a repo - preferably in a form easy to parse for automated handling?
It should be reasonably simple to get a baseline based on what system calls the code uses. Dreaming a little bit, that could be applied to common libraries to generate a knowledge database for their calls as well. A quick search doesn't show anything existing, but these are of interest anyway:
Hmm - I guess tooling - especially static analysis - would have to be quite smart to spit out a list that would be sufficiently restrictive and comprehensible.
Right, that's why I was thinking just a baseline to start from (e.g. anything that uses setgid(), initgroups() or setgroups() requires CAP_SETGID). Maybe one could list those as "definite" and others as "possibly required under certain circumstances", which CAP_DAC_OVERRIDE would probably always qualify for.
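As a complement to the static baseline idea, one can at least inspect which capabilities a running component actually holds by checking the CapEff bitmask in /proc/&lt;pid&gt;/status. A minimal sketch, with helper names made up for illustration:

```c
#include <stdio.h>
#include <stdlib.h>

/* Capability bit numbers from <linux/capability.h>, repeated here so
 * the sketch is self-contained. */
#define CAP_DAC_OVERRIDE 1
#define CAP_SETGID       6

/* Return 1 if capability bit `capno` is set in a hex mask string such
 * as the value after "CapEff:" in /proc/<pid>/status, else 0. */
static int
cap_in_mask(const char *hexmask, int capno)
{
    unsigned long long mask = strtoull(hexmask, NULL, 16);

    return (int) ((mask >> capno) & 1ULL);
}

/* Read this process's effective capability mask into buf (>= 64 bytes).
 * Returns 0 on success, -1 otherwise. */
static int
read_cap_eff(char buf[64])
{
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    int found = 0;

    if (f == NULL) {
        return -1;
    }
    while (fgets(line, sizeof(line), f) != NULL) {
        if (sscanf(line, "CapEff: %63s", buf) == 1) {
            found = 1;
            break;
        }
    }
    fclose(f);
    return found ? 0 : -1;
}
```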
On 18/02/19 17:04 +0000, Ken Gaillot wrote:
On 18/02/19 15:27 +0100, Jan Pokorný wrote:
> On 15/02/19 09:30 -0800, Ken Gaillot wrote:
>> CAP_SETGID is required to use initgroups(). Given that this is a
>> trivial bypass of CAP_DAC_OVERRIDE, I think it's reasonable to
>> assume both capabilities could be taken away, and we shouldn't
>> rely on it.
>
> By that logic, shall we assume that root is totally helpless?
>
> I hope not, and unlike with initiative to drop dac_override
> that goes for years already, I am not aware of anything similar
> regarding setgid, which is enabled for 6+ years:
> https://github.com/fedora-selinux/selinux-policy-contrib/blob/55adef764d74757501c61d33898c526e44c245b3/rhcs.te#L149
> and would not be very wise either, since it would possibly affect
> every and each daemon process run directly with `pacemaker-execd`.
No, we should explicitly list the capabilities we need in the
cluster SELinux policy, both because it is correct and because we
can't predict what the default set will be in the future.
Which cluster SELinux policy do you have in mind?
The one I linked above? Do you want to re-review it?
Or write a new one from scratch?
>> I think a comprehensive solution would be to figure out all the
>> capabilities currently needed by each cluster component,
>
> including unpredictable scope of services to be run from within
> pacemaker?
The cluster SELinux policy (which is not limited to sbd)
sbd currently has a detached policy in the referenced project
(sbd_t vs. cluster_t), which likely means you want to write
a new one from scratch
should include all capabilities needed by all cluster components
throughout the stack. These are no more or less predictable than
everything that's already in the policy.
A logical error, likely stemming from missing the "since it would
possibly affect every and each daemon process run directly with
`pacemaker-execd`" part above. That needs to be absorbed first.
Then you'll see that there's indeed a predictable and non-predictable
part of that, and that some (unpredictable!) daemons shall rather make
do without DAC_OVERRIDE capability when run as a cluster resource
already (otherwise cluster_t in Fedora's upstream SELinux policy
will need to get that back as well, meaning weakening the containment).
Cluster proper components alone are, on the contrary, indeed
predictable, but we can just as well assume that the currently
assigned set in the said upstream policy "just works unless disproved".
>> and make sure they are explicitly listed in the relevant SELinux
>> policies. We can set a goal to avoid the need for some of those, but
>> I think that should be more carefully and holistically considered
>> than reactive.
>
> Unless there's some way for one-touch solution for users to add
> ad-hoc policies as detected when running their services, this
> would work only in theory. Even if that solution existed, who
> is then to validate that all the policies were distributed
> everywhere that they are required (after failovers, etc.).
>
> Sounds rather scary and anti-HA.
I don't follow what you're trying to say here.
likely because the already restated point didn't make it across
I'm suggesting that the SELinux policy for cluster components should
explicitly grant the capabilities required by those components,
which I believe is the purpose of having the policy.
True, but the applicability of the policy is highly limited by the
inability to enumerate all the cluster stack use cases, as opposed
to self-hosted boundaries like httpd_t -- observe: adding new httpd
modules is rather an exception (easy to follow the suite and give it
a proper label), whereas involving custom daemons is one of typical use
cases for the cluster stack (good luck with tutoring these users to
come up with brand new policy modules and respective transitions from
cluster_t just because you want as tight SELinux boundaries left for
the inner cluster stack as possible).
Any user running a cluster with SELinux enabled already has to
verify that their SELinux policies on every node allow their
services to run, that doesn't change.
Again, this responds to something not fully thought through.
From a brief test, cluster_t is inherited by whatever spawned
process, regardless of whether or how the respective binary
is labelled. With your envisioned pedantic use of the
security boundary, that could start to be a real problem.
At least, systemd resources are isolated from this :-)
Partial solutions would be:
- make pacemaker-execd able to meddle with the SELinux context
on its own so as to somehow neutralize the narrow scope you
envision (note that cluster_t is shared also with corosync,
amongst others; also you would need to have the power to do that,
which sort of defeats the separation), possibly doable with
transition rules directly in SELinux policy
... or even better ...
- drop capabilities programmatically directly in the daemons
themselves (between fork and execve in pacemakerd?, except
for pacemaker-execd, indeed) and seal that possibly also
with `NO_NEW_PRIVS`:
https://www.kernel.org/doc/html/latest/userspace-api/no_new_privs.html
The latter seems most reasonable to me -- and it's both
a documentation and the effectively actionable mark in the code.
Sadly, due to the multi-daemon arrangement, we cannot use systemd
as the catch-all solution for that currently (which would also
conveniently open BPF-based syscall filtering to us if you want
to micromanage the security to the tiniest bit).
But without major redesigning relevant SELinux policies or writing
one or more replacement modules from scratch, your original plan
won't fly, I am afraid.
--
Poki
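The second option above, dropping capabilities programmatically and sealing with `NO_NEW_PRIVS`, could be sketched roughly as follows. The helper name is an assumption; note that PR_CAPBSET_DROP requires CAP_SETPCAP, so the bounding-set drop is best-effort when unprivileged.

```c
#include <stdio.h>
#include <sys/prctl.h>
#include <linux/capability.h>

/* Hypothetical helper, to be called between fork() and execve() in the
 * spawning daemon: drop CAP_DAC_OVERRIDE from the bounding set and
 * seal the child with NO_NEW_PRIVS.  Returns 0 iff NO_NEW_PRIVS was
 * successfully set. */
static int
harden_child(void)
{
    /* Best-effort: dropping from the bounding set needs CAP_SETPCAP,
     * so this fails with EPERM for an unprivileged process. */
    if (prctl(PR_CAPBSET_DROP, CAP_DAC_OVERRIDE, 0, 0, 0) != 0) {
        perror("PR_CAPBSET_DROP");
    }

    /* Irrevocable: no execve() from here on can grant new privileges
     * (setuid bits, file capabilities, ...). */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0) {
        perror("PR_SET_NO_NEW_PRIVS");
        return -1;
    }
    return 0;
}
```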
Btw. thanks for
I'm still not following you. Subsidiary processes are run as their own type when relevant, e.g.:
Perhaps you mean custom daemons that don't have any SELinux labelling? But that's no different than running such a daemon without a cluster on a SELinux-enabled host. The user is responsible for that, with or without a cluster. Am I misunderstanding your point?
Yes, "every and each daemon process run directly with And, per https://bugzilla.redhat.com/show_bug.cgi?id=1677325#c9, relaxed Now back to free-standing But I don't agree |
Users are responsible for all aspects of custom services, e.g. providing an LSB or OCF script, and configuring firewall rules on all nodes -- SELinux policies for custom services fall into this same category.
Except that some IPC sockets do not have any group permissions, so changing groups via any means won't matter. We'd also have to find every place that those permissions are set, verify that group access is not a security risk, and modify them to add group read/write. I think dac_override is a simpler and more appropriate solution.
Looks like that's a consequence of a shoddy usage of
Should we still consider any changes here, or can we close this as resolved for Fedora via selinux-policy?
Can one of the admins verify this patch?