Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: do not rely on DAC_OVERRIDE capability on Linux despite being root #69

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

jnpkrn
Copy link

@jnpkrn jnpkrn commented Feb 14, 2019

It turns out, for example, that SELinux will prevent this capability
for confined processes running as root. It then means that sbd cannot
access, as a client, files used for joining two local communication sides
within libqb-arranged IPC mechanism in case those files do not have
permissions to explicitly allow file-based access with credentials
of this client -- which is exactly what happens when the IPC servers
are pacemaker daemons not run as root on their own.

Solution is two-phased:

  1. have sbd add respective non-privileged group corresponding to the
    server side of the IPC -- this patch
  2. ensure this server side (pacemaker) does allow group-derived access
    (i.e., the access permissions for group are as relaxed as needed,
    umask notwithstanding) -- outside of the sbd's scope

@wenningerk
Copy link

In the light of ongoing effort to make the pacemaker-API more accessible (moving code to libraries, rearrange libraries, API-server, ...) wouldn't it make sense to have most of that code in a central place where potential users of the API would have it handy (libqb?). Not saying it would necessarily have to happen behind the scenes without code-changes in sbd though.

src/sbd-pacemaker.c Outdated Show resolved Hide resolved
@kgaillot
Copy link

CAP_SETGID is required to use initgroups(). Given that this is a trivial bypass of CAP_DAC_OVERRIDE, I think it's reasonable to assume both capabilities could be taken away, and we shouldn't rely on it.

I think a comprehensive solution would be to figure out all the capabilities currently needed by each cluster component, and make sure they are explicitly listed in the relevant SELinux policies. We can set a goal to avoid the need for some of those, but I think that should be more carefully and holistically considered than reactive.

@kgaillot
Copy link

For the particular issue at hand, I'm thinking the simplest solution is for pacemaker's spec file to add root to the haclient group. I haven't tested it but I think it would work (except for any sbd process started before the pacemaker package was installed, but that shouldn't be any trouble to deal with). It does give access to all root processes rather than selected ones that need it, but that's also the effect of the typical example solution given for similar problems (i.e. make the file owned by root group).

It turns out, for example, that SELinux will prevent this capability
for confined processes running as root.  It then means that sbd cannot
access, as a client, files used for joining two local communication sides
within libqb-arranged IPC mechanism in case those files do not have
permissions to explicitly allow file-based access with credentials
of this client -- which is exactly what happens when the IPC servers
are pacemaker daemons not run as root on their own.

Solution is two-phased:
1. have sbd add respective non-privileged group corresponding to the
   server side of the IPC -- this patch
2. ensure this server side (pacemaker) does allow group-derived access
   (i.e., the access permissions for group are as relaxed as needed,
   umask notwithstanding) -- outside of the sbd's scope

Signed-off-by: Jan Pokorný <[email protected]>
@jnpkrn
Copy link
Author

jnpkrn commented Feb 18, 2019 via email

@jnpkrn
Copy link
Author

jnpkrn commented Feb 18, 2019 via email

@kgaillot
Copy link

On 15/02/19 09:30 -0800, Ken Gaillot wrote: CAP_SETGID is required to use initgroups(). Given that this is a trivial bypass of CAP_DAC_OVERRIDE, I think it's reasonable to assume both capabilities could be taken away, and we shouldn't rely on it.
By that logic, shall we assume that root is totally helpless? I hope not, and unlike with initiative to drop dac_override that goes for years already, I am not aware of anything similar regarding setgid, which is enabled for 6+ years: https://github.com/fedora-selinux/selinux-policy-contrib/blob/55adef764d74757501c61d33898c526e44c245b3/rhcs.te#L149 and would not be very wise either, since it would possibly affect every and each daemon process run directly with pacemaker-execd.

No, we should explicitly list the capabilities we need in the cluster SELinux policy, both because it is correct and because we can't predict what the default set will be in the future.

I think a comprehensive solution would be to figure out all the capabilities currently needed by each cluster component,
including unpredictable scope of services to be run from within pacemaker?

The cluster SELinux policy (which is not limited to sbd) should include all capabilities needed by all cluster components throughout the stack. These are no more or less predictable than everything that's already in the policy.

and make sure they are explicitly listed in the relevant SELinux policies. We can set a goal to avoid the need for some of those, but I think that should be more carefully and holistically considered than reactive.
Unless there's some way for one-touch solution for users to add ad-hoc policies as detected when running their services, this would work only in theory. Even if that solution existed, who is then to validate that all the policies were distributed everywhere that they are required (after failovers, etc.). Sounds rather scary and anti-HA.

I don't follow what you're trying to say here. I'm suggesting that the SELinux policy for cluster components should explicitly grant the capabilities required by those components, which I believe is the purpose of having the policy. Any user running a cluster with SELinux enabled already has to verify that their SELinux policies on every node allow their services to run, that doesn't change.

@kgaillot
Copy link

kgaillot commented Feb 18, 2019

Klaus mentioned that the original problem description says that some of the relevant IPC files have mode 0600, which makes both group-based approaches (adding root to the haclient group, or the process taking on haclient privileges) insufficient. It looks like adding the capability to the SELinux policy will be the only solution (technically, one of the group approaches could be combined with modifying the IPC permissions, but I suspect that will be impractical).

@wenningerk
Copy link

Btw. are there any systematic approaches how to expose capabilities required in a repo - preferably easy to parse to automatized handling?

@kgaillot
Copy link

Btw. are there any systematic approaches how to expose capabilities required in a repo - preferably easy to parse to automatized handling?

It should be reasonably simple to get a baseline based on what system calls the code uses. Dreaming a little bit, that could be applied to common libraries to generate a knowledge database for their calls as well. A quick search doesn't show anything existing, but these are of interest anyway:

@wenningerk
Copy link

It should be reasonably simple to get a baseline based on what system calls the code uses. Dreaming a little bit, that could be applied to common libraries to generate a knowledge database for their calls as well. A quick search doesn't show anything existing, but these are of interest anyway:

Hmm - guess tooling - especially static analysis - would have to be quite smart to spit out a list that would be sufficiently restrictive and comprehensible.
But giving hints which capabilities in a manually administered list aren't used respectively which capabilities might be required in addition seems reasonable. Maybe with the possibility to add suppressions ...

@kgaillot
Copy link

Hmm - guess tooling - especially static analysis - would have to be quite smart to spit out a list that would be sufficiently restrictive and comprehensible.
But giving hints which capabilities in a manually administered list aren't used respectively which capabilities might be required in addition seems reasonable. Maybe with the possibility to add suppressions ...

Right, that's why I was thinking just a baseline to start from (e.g. anything that uses setgid(), initgroups() or setgroups() requires CAP_SETGID). Maybe one could list those as "definite" and others as "possibly required under certain circumstances", which CAP_DAC_OVERRIDE would probably always qualify for.

@jnpkrn
Copy link
Author

jnpkrn commented Feb 18, 2019 via email

@jnpkrn
Copy link
Author

jnpkrn commented Feb 18, 2019

Btw. thanks for pscap tip, but it doesn't look that SELinux and
the respective logic (like pretend as if dac_override doesn't exist
unless permitted with SELinux explictly for a root-level process)
gets reflected there. And the other tool needs significant run-time
coverage for it to be useful at all. And somewhat similar to that
is latrace tool, although it didn't work well for me on Fedora
last time I tried: https://bugzilla.redhat.com/show_bug.cgi?id=1628206

@kgaillot
Copy link

Logical error likely stemming from missing the "since it would possibly affect every and each daemon process run directly with pacemaker-execd" part above. That needs to be absorbed first.

I'm still not following you. Subsidiary processes are run as their own type when relevant, e.g.:

# pcs resource show web
 Resource: web (class=ocf provider=heartbeat type=apache)
  Operations: monitor interval=10s timeout=20s (web-monitor-interval-10s)
              start interval=0s timeout=40s (web-start-interval-0s)
              stop interval=0s timeout=60s (web-stop-interval-0s)
# ps axZ | grep 'http[d]'
system_u:system_r:httpd_t:s0      9100 ?        Ss     0:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run/httpd.pid
system_u:system_r:httpd_t:s0      9101 ?        S      0:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run/httpd.pid
system_u:system_r:httpd_t:s0      9102 ?        S      0:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run/httpd.pid
system_u:system_r:httpd_t:s0      9103 ?        S      0:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run/httpd.pid
system_u:system_r:httpd_t:s0      9104 ?        S      0:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run/httpd.pid
system_u:system_r:httpd_t:s0      9105 ?        S      0:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run/httpd.pid

Perhaps you mean custom daemons that don't have any SELinux labelling? But that's no different than running such a daemon without a cluster on a SELinux-enabled host. The user is responsible for that, with or without a cluster.

Am I misunderstanding your point?

@jnpkrn
Copy link
Author

jnpkrn commented Feb 20, 2019

Perhaps you mean custom daemons that don't have any SELinux
labelling? But that's no different than running such a daemon
without a cluster on a SELinux-enabled host. The user is responsible
for that, with or without a cluster.

Yes, "every and each daemon process run directly with pacemaker-execd"
could be totally custom (we have no interest to limit what the cluster
stack is used for, quite the contrary as long as declared axioms of
use hold) and lacking their own policies, and the very type that pacemaker
daemons run as (cluster_t at least in Fedora/RHEL derived systems, which
we talk about all the time now, anyway) is derived for these.

And, per https://bugzilla.redhat.com/show_bug.cgi?id=1677325#c9, relaxed
unconfined_domain_type attribute correlates on the allowed operations
-- likely to make it all work in said circumstances. That's why I am
trying to show hardening on this SELinux front for the whole cluster
stack together with proactively retrofitting the tightest boundaries
possible is hence rather pointless unless you want a complete rewrite
of the policy. Contrary to the other, orthogonal, very targeted (SELinux
policy's cluster_t is rather catch-all low effort) measures I proposed.


Now back to free-standing sbd_t type, I owe an apology here that
I missed that this one indeed lacks setgid and that would indeed need
to be added for this patch (only first part of the overall solution) to
work even with the lack of dac_override capability.

But I don't agree setgid is as strong as dac_override except for
the minority cases -- not even here unless the second part of the
solution is applied. Since group permissions for file can still
prevent any sort of an access, and lacking fowner (as is the case
with sbd), such guarded root-privileged process cannot change the
permissions/ownership of the file. So, allowing setgid in exchange
for dac_override would be a welcome step towards more security,
which is what we surely want.

@kgaillot
Copy link

Yes, "every and each daemon process run directly with pacemaker-execd"
could be totally custom

Users are responsible for all aspects of custom services, e.g. providing an LSB or OCF script, and configuring firewall rules on all nodes -- SELinux policies for custom services fall into this same category.

So, allowing setgid in exchange
for dac_override would be a welcome step towards more security,
which is what we surely want.

Except that some IPC sockets do not have any group permissions, so changing groups via any means won't matter. We'd also have to find every place that those permissions are set, verify that group access is not a security risk, and modify them to add group read/write. I think dac_override is a simpler and more appropriate solution.

@jnpkrn
Copy link
Author

jnpkrn commented Feb 20, 2019

Except that some IPC sockets do not have any group permissions.

Looks like that's a consequence of a shoddy usage of umask
in pacemaker where opening files with particular permissions
would do instead :-/

@wenningerk
Copy link

Should we still consider any changes here or can we close this being resolved for fedora via selinux-policy.
Don't know for other distributions but we had no complaints so far ...

@knet-ci-bot
Copy link

Can one of the admins verify this patch?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants