bootctl install only ESP binaries #167

Open
wants to merge 96 commits into
base: master

dbnicholson
Member

We'd like to use bootctl install so that bootctl update recognizes the installation and makes updates. However, the install as performed now could cause issues with the loader.sln fake symlink scheme used for handling atomic ostree updates. Strip down the installation so it just manages the ESP binaries. None of the other parts are required and our systems haven't used them to this point.

https://phabricator.endlessm.com/T34703

DaanDeMeyer and others added 30 commits May 24, 2022 15:27
This backports the same fix from 6e91653
in systemd upstream that we can't backport directly because that commit
introduces a new feature.
This reverts commit 75d7b59.

This commit was confirmed to have introduced a regression with LUKS,
so revert it for now.

systemd/systemd#23429

Conflicts:
	src/core/device.c
kernel's 'make install' invokes install.sh which calls /sbin/installkernel.
Thus we are invoked as e.g.
  /sbin/installkernel 5.18.0 arch/x86/boot/bzImage System.map /boot
The last two arguments would be passed as "initrds".

Previously, we would just quietly ignore
/boot, because it doesn't pass the 'test -f' test, and possibly try to do
something with System.map. 742561e tightened
the check, so we now throw an error.

It seems that the correct thing is to ignore those two arguments, because
our plugin syntax has no notion of System.map. And the installation directory
we can figure out ourselves better. Effectively, this makes things behave
like before, but less by accident.

Fixes #23490.

(cherry picked from commit 620ecc9)
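
The argument handling described above can be illustrated with a minimal Python sketch. This is not the actual script from the commit: the function name `split_installkernel_args` is hypothetical, and the sketch assumes the calling convention from the example invocation (version, image, map file, installation directory).

```python
def split_installkernel_args(argv):
    """Return (version, image), ignoring any trailing arguments.

    Assumed calling convention, taken from the example invocation:
        installkernel <version> <image> <map-file> <install-dir>
    """
    if len(argv) < 2:
        raise ValueError("expected at least <version> and <image>")
    version, image = argv[0], argv[1]
    # The map file and the installation directory, if present, are dropped
    # on purpose instead of being treated as initrds.
    return version, image
```

With the invocation shown in the commit message, `System.map` and `/boot` would simply be discarded rather than failing a file check.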
This should cover cases regarding devices with `OPTIONS+="db_persist"`
during initrd->sysroot transition.

See:
  * systemd/systemd#23429
  * systemd/systemd#23218
  * systemd/systemd#23489
  * https://bugzilla.redhat.com/show_bug.cgi?id=2087225
(cherry picked from commit 1fb7f8e)
Co-Authored-By: Yu Watanabe <[email protected]>
(cherry picked from commit b22d90e)
dm-crypt device units generated by systemd-cryptsetup-generator
have BindsTo= dependencies on their backend devices. The dm-crypt
devices have the db_persist flag set, and thus survive the udev db
cleanup while switching root. But backend devices usually don't survive.
These devices are neither mounted nor used for swap, thus they will
be seen as DEVICE_NOT_FOUND after switching root.

The BindsTo dependency will cause systemd to schedule a stop
job for the dm-crypt device, breaking boot:

[   68.929457] krypton systemd[1]: systemd-cryptsetup@cr_root.service: Unit is stopped because bound to inactive unit dev-disk-by\x2duuid-3bf91f73\x2d1ee8\x2d4cfc\x2d9048\x2d93ba349b786d.device.
[   68.945660] krypton systemd[1]: systemd-cryptsetup@cr_root.service: Trying to enqueue job systemd-cryptsetup@cr_root.service/stop/replace
[   69.473459] krypton systemd[1]: systemd-cryptsetup@cr_root.service: Installed new job systemd-cryptsetup@cr_root.service/stop as 343

Avoid this by not setting the state of the backend devices to
DEVICE_DEAD.

Fixes the LUKS setup issue reported in #23429.

(cherry picked from commit cf1ac0c)
On switching root, a device may have a persistent database. In that case,
Device.enumerated_found may have DEVICE_FOUND_UDEV flag, and it is not
necessary to downgrade the Device.deserialized_found and
Device.deserialized_state. Otherwise, the state of the device unit may
be changed plugged -> dead -> plugged, if the device has not been mounted.

Fixes #23429.

[mwilck: cherry-picked from #23437]

(cherry picked from commit 4fc69e8)
shmat() requires the CAP_IPC_OWNER capability. When running test-seccomp
in environments with root + CAP_SYS_ADMIN, but not CAP_IPC_OWNER,
memory_deny_write_execute_shmat would fail. This fixes it.

(cherry picked from commit 7e46a5c)
Fixes #21832.

(cherry picked from commit 223a359)
Fixes #23401

(cherry picked from commit 5ee38ad)
Fixes #22816.

(cherry picked from commit 8f24777)
Fixes #22966. Since there are competing conventions, let's not
change our code, but make the docs match what is implemented.

(cherry picked from commit b72308d)
as it may take a bit longer on slower machines:

```
[  OK  ] Reached target System Reboot.
Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
Failed to open watchdog device /dev/watchdog0, ignoring: No such file or directory
binfmt_misc is not mounted, not detaching entries.
Sending SIGTERM to remaining processes...
ERROR:test-shutdown:Timeout exceeded.
<pexpect.pty_spawn.spawn object at 0x7f3d4bcd20b0>
command: /systemd-meson-build/systemd-nspawn
<...snip...>
buffer (last 100 chars): 'mbinfmt_misc is not mounted, not detaching entries.\x1b[0m\r\nSending SIGTERM to remaining processes...\r\n'
before (last 100 chars): 'mbinfmt_misc is not mounted, not detaching entries.\x1b[0m\r\nSending SIGTERM to remaining processes...\r\n'
after: <class 'pexpect.exceptions.TIMEOUT'>
match: None
match_index: None
exitstatus: None
flag_eof: False
pid: 572528
child_fd: 5
closed: False
timeout: 30
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1
searcher: searcher_re:
    0: re.compile('H login: ')
INFO:test-shutdown:killing child pid 572528
E: nspawn failed with exit code 1
```

(cherry picked from commit 3e624bb)
Include this header to fix errors when including hwdb-internal.h:
  ../src/libsystemd/sd-hwdb/hwdb-internal.h:16:21: error: field ‘st’ has incomplete type
     16 |         struct stat st;

(cherry picked from commit 9745b51)
Fixes #23486.

(cherry picked from commit 89b6a3f)
Some compiler wrappers like honggfuzz pass -fno-builtin explicitly
and because of that the tests where fabs is used fail to compile
with something like
```
FAILED: test-bus-marshal
...
/usr/bin/ld: test-bus-marshal.p/src_libsystemd_sd-bus_test-bus-marshal.c.o: undefined reference to symbol 'fabs@@GLIBC_2.2.5'
/usr/bin/ld: /usr/lib64/libm.so.6: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
```

Fun fact: it took honggfuzz less than a minute to discover
GHSA-gmc7-pqv9-966m used by
systemd to compress/decompress some stuff.

(cherry picked from commit f232c83)
… after SIGKILL but processes still remain

After sending a SIGKILL to a process, the process might disappear from
`cgroup.threads` but still show up in `cgroup.procs`, so it still remains in the
cgroup and causes migrating new processes to `Delegate=yes` cgroups to fail with
`-EBUSY`. This is especially likely for heavyweight processes that consume more
kernel CPU time to clean up.

Fix this by only returning 0 when both `cgroup.threads` and
`cgroup.procs` are empty.

(cherry picked from commit 37f0289)
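
As a rough illustration of the check described above, the following Python sketch treats a cgroup as empty only when both files list nothing. It is not the systemd implementation (which is C); the function name `cgroup_is_empty` and the idea of passing the cgroup directory as a path are assumptions made for the example.

```python
import os


def cgroup_is_empty(cgroup_dir):
    """Return True only if both cgroup.threads and cgroup.procs list nothing."""
    for name in ("cgroup.threads", "cgroup.procs"):
        with open(os.path.join(cgroup_dir, name)) as f:
            # Any remaining ID in either file means something is still
            # attached to the cgroup, even after SIGKILL was sent.
            if f.read().strip():
                return False
    return True
```

Checking only `cgroup.threads` would report "empty" too early in the situation the commit describes, where a killed process is still listed in `cgroup.procs`.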
This just adds an unused parameter for future use. No change in
behaviour.

(cherry picked from commit 1661833)
Fixes #23520. Replaces #23555.

The problem started with cdf3706 and
90b1ec0 which together started printing the
wall message in more cases. The motivation for those changes was reasonable, but
this clearly causes problems described in #23520: users are getting unexpected
wall messages. Xterm, urxvt, (anything using libutempter?), and tmux (in some
configurations), register local pty sessions in utmp.

So let's try to suppress the message for local pseudo-terminal logins. This
patch is based on #23538, but instead of filtering just on /dev/pts, it uses the
.ut_addr_v6 to only filter out local entries.

(cherry picked from commit 51a2b57)
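
A minimal sketch of the filtering idea, written in Python for illustration only. The function name `is_local_pty_entry` is hypothetical, and the exact conditions are an assumption: here an entry counts as a local pseudo-terminal login when its line is under `pts/` and its `ut_addr_v6` field (four integers in a utmp record) is all zeros.

```python
def is_local_pty_entry(ut_line, ut_addr_v6):
    """Guess whether a utmp entry is a local pseudo-terminal login.

    ut_line is the terminal name without the /dev/ prefix, and ut_addr_v6
    is the four-integer remote address field of the record.
    """
    on_pty = ut_line.startswith("pts/")
    # An all-zero address is taken to mean that no remote host was recorded.
    has_remote_addr = any(word != 0 for word in ut_addr_v6)
    return on_pty and not has_remote_addr
```

Under these assumptions a terminal emulator session on `pts/3` with a zero address would be skipped, while an SSH login on a pty with a recorded remote address would still receive the wall message.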
…hibernate

Fixes: #23520

[zjs: I added the comment and tweaked the patch a bit.

The call to reset_scheduled_shutdown() is moved down a bit to allow the
callback to have access to information about the operation being cancelled.
This all happens within the same function, so there should be no observable
change in behaviour.]

(cherry picked from commit ea74f39)
DnsPacket.ifindex=1 (loopback) is normalized to 0 whenever a message is
received on the loopback iface, so for both listeners, 127.0.0.53 and
127.0.0.54, the ifindex will be set to 0 by manager_recv() for queries
that have a local origin.

Replies to such local messages need to set a proper ifindex in any
case, as the supplied source-address would otherwise be ignored in
manager_ipv4_send() (CMSG generation is skipped due to ifindex > 0 check).

Note that this change only forces `ifindex` to loopback if it was actually
normalized to `0` before (due to a loopback detection) in order to keep the
nat-to-127.0.0.54-from-another-interface use case that was described in
a8d0906 intact.
Also note that nat is not supported for the main stub 127.0.0.53 which is
why forcing LOOPBACK_IFINDEX was/is fine for that case.

Fixes #23495

(cherry picked from commit dfa14e2)
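
The rule described above can be sketched as follows. This is an illustrative Python fragment, not the resolved code: `reply_ifindex` is a hypothetical helper, and the only facts it relies on are that loopback has ifindex 1 and that a value of 0 means the index was normalized after loopback detection.

```python
LOOPBACK_IFINDEX = 1  # loopback interface index, as stated in the commit message


def reply_ifindex(packet_ifindex):
    """Pick the interface index to use when replying to a query."""
    if packet_ifindex == 0:
        # 0 means the index was normalized because the query arrived on
        # loopback, so restore it for the reply.
        return LOOPBACK_IFINDEX
    # Any other index is kept, which preserves the NAT-from-another-interface
    # case for 127.0.0.54.
    return packet_ifindex
```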
Carlo Caione and others added 20 commits July 26, 2022 11:04
udisk (and udiskctl) relies on ID_FS_TYPE key when detecting the fs on
the CDROM. In our current setup ID_FS_TYPE is not set and udisk is
unable to detect any filesystem on the CDROM, refusing to mount it
using udiskctl (or programmatically).

ID_FS_TYPE (among other things) is set by 'blkid' in
60-persistent-storage.rules with the rule:

KERNEL=="sr*", ENV{DISK_EJECT_REQUEST}!="?*", \
 ENV{ID_CDROM_MEDIA_TRACK_COUNT_DATA}=="?*", \
 ENV{ID_CDROM_MEDIA_SESSION_LAST_OFFSET}=="", \
 IMPORT{builtin}="blkid --noraid"

Unfortunately this rule is never matched/executed since
ID_CDROM_MEDIA_TRACK_COUNT_DATA is never set.

ID_CDROM_MEDIA_TRACK_COUNT_DATA should be set by 'cdrom_id' in
60-cdrom_id.rules but on our system 'cdrom_id /dev/sr0' fails with
-EBUSY.

The problem is that 'cdrom_id' tries to open /dev/sr0 using O_EXCL.
O_EXCL on block devices fails with EBUSY if, and only if, there's
someone else also opening the device with O_EXCL [1]. In our case this call
fails since we are already booting from /dev/sr0 keeping it already
opened with O_EXCL.

As suggested here [2] we fix this by not relying anymore on
ID_CDROM_MEDIA_TRACK_COUNT_DATA for executing 'blkid' and importing
ID_FS_*.

[1] https://lists.freedesktop.org/archives/systemd-devel/2014-September/023160.html
[2] https://bugs.freedesktop.org/show_bug.cgi?id=52474

Signed-off-by: Carlo Caione <[email protected]>
Arduino devices expose a USB modem interface for communication. We want
these to be accessible by all users in the system. Ideally we would add
all real users to the dialout group, but this is not straightforward on
an OSTree-based system at the moment.

https://phabricator.endlessm.com/T21435
ostree uses symlinks on the boot filesystem. This isn't great when that
filesystem is vfat, which can't do symlinks. Since the EFI ESP must be
vfat on most implementations, this makes ostree incompatible with sd-boot.

To allow ostree to keep making symlinks, we make a fake symlink that's
just a text file with the name of the file that would be linked to.

https://phabricator.endlessm.com/T27040
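
To make the fake symlink idea concrete, here is a small Python sketch of how such a file could be resolved. It is an illustration, not the sd-boot or ostree code: the function name `resolve_fake_symlink` is hypothetical, and it assumes the file holds a single name relative to its own directory, possibly followed by a newline.

```python
import os


def resolve_fake_symlink(path):
    """Return the path that a fake symlink file points to.

    The file is plain text containing the name of the link target.
    """
    with open(path, encoding="utf-8") as f:
        target = f.read().strip()
    # Assumption: the target is relative to the directory holding the file,
    # the way a relative symlink would be resolved.
    return os.path.join(os.path.dirname(path), target)
```

For example, a file `loader.sln` containing `loader.1` would resolve to the `loader.1` directory next to it.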
We want a combination of the built in parameters from the efi image
and safe parameters from the loader config.

https://phabricator.endlessm.com/T27591
…ot mode

If secure boot is off it could be useful to pass arbitrary parameters
for debug purposes.

https://phabricator.endlessm.com/T27591
We've encountered firmware that searches the boot loader for the string
"Microsoft" and uses that to determine which ACPI table to deliver to
the kernel.

Make sure that string is present so these computers do the right thing.

https://phabricator.endlessm.com/T27753
This service stores a random-seed in the ESP so it can be passed to the
kernel by systemd-boot on the next boot, to seed the kernel's entropy
pool. This unit is only active if a boot loader fully supporting the
Boot Loader Specification is detected (via a LoaderFeatures EFI var),
which currently is only true for Endless PAYG images, which use
systemd-boot instead of GRUB. This random seed is stored in
/boot/loader/random-seed, with /boot/loader being created if it does not
exist.

The problem here is that in our systemd-boot + OSTree setup on PAYG
images we need /boot/loader to be a symbolic link pointing to either
/boot/loader.1 or /boot/loader.0 (OSTree requirement) living in the ESP
(systemd-boot requirement) which is FAT32 (UEFI spec) and does not
support symlinks. To solve that we implemented a fake symlink as a file
in /boot/loader.lnk containing the path that should be the /boot/loader
target, and taught OSTree about it, giving higher precedence to the real
/boot/loader in case it exists. So if systemd-boot-system-token.service
creates /boot/loader, most OSTree operations break, because the entries/
directory is not found.

Let's disable this service here to avoid that problem. This unit is
enabled by the build system at install time instead of using the more
traditional approach of having an [Install] section and using systemctl
and the preset system, so we have to disable it in units/meson.build.

There is also an accompanying commit in the packaging branch that
removes the installation of the symbolic link in
sysinit.target.wants/systemd-boot-system-token.service.

https://phabricator.endlessm.com/T29475
…espace"

This reverts commit af918c4.

This fixes a test-mountpoint-util failure. The output from the failed test:

354/574 test-mountpoint-util                      FAIL           0.64s (killed by signal 6 SIGABRT)

--- command ---
18:44:42 SYSTEMD_KBD_MODEL_MAP='/build/src/src/locale/kbd-model-map' PATH='/build/src/_build:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin' SYSTEMD_LANGUAGE_FALLBACK_MAP='/build/src/src/locale/language-fallback-map' /build/src/_build/test-mountpoint-util
--- stderr ---
Found container virtualization none.
Assertion 'mount(NULL, "/", NULL, MS_PRIVATE | MS_REC, NULL) >= 0' failed at src/test/test-mountpoint-util.c:301, function main(). Aborting.
-------

https://phabricator.endlessm.com/T31222
For some reason passing this through debian/rules is not working.

https://phabricator.endlessm.com/T33712
Revert "[Endless] sd-boot: Work around odd behaviour in some firmware"
This reverts commit 9746125.

Debian's policy is to never clean up /var/tmp to keep consistency with
the SysV init system. Flatpak creates temporary files in /var/tmp during
app updates but does not remove them on error, to avoid re-downloading
them on a future update attempt, and expects these files to be
automatically cleaned up by the system eventually, according to the
site's policy. With this policy in place these files are never removed,
wasting the user's storage space.

Revert this commit back to upstream's default policy of cleaning up /tmp
every 10 days and /var/tmp every 30 days.

https://phabricator.endlessm.com/T23762
https://phabricator.endlessm.com/T33887
Revert "[DEB] Bring tmpfiles.d/tmp.conf in line with Debian defaults"
Note that -O0 is deliberately filtered out as we have to compile with at
least -O1 due to #24202.

Fixes: #24323
(cherry picked from commit 7aa4762)
meson: Fix build with --optimization=plain
If LoaderDevicePartUUID isn't set because the boot loader doesn't support it,
assume that the ESP partition on the root disk is the booted ESP. This is a
weaker guarantee but likely the same for the vast majority of systems. Allowing
the ESP automount in this case helps break a dependency loop. Existing boot
loaders can be changed to set LoaderDevicePartUUID, but they can't be delivered
to existing systems if the ESP is not mounted.

Upstream: systemd/systemd#26430

https://phabricator.endlessm.com/T29930
In order to make ostree deployments work on the ESP VFAT filesystem, we use a
fake `loader.sln` symlink (see 9806a3d). That means that on our systemd-boot
systems, `/boot/loader` doesn't exist.

Rather than confuse the situation by letting `bootctl` create files in `/boot`
that will either not be used or cause errors, change the `update`/`install`
actions to only install the ESP binaries. systemd-boot works fine without any
of the other files.

This can be dropped if ostree deployments on the ESP are ever reworked to not
need our fake symlink hacks.

https://phabricator.endlessm.com/T34703
This was a mistake in 9806a3d as the intended usage was to make reading of
`loader/loader.conf` also support the `loader.sln` fake symlink. In practice
this isn't an issue since we don't install `loader.conf` and it would be lost
on every ostree deployment, but it should be corrected.

This can be squashed into 9806a3d on the next rebase.

https://phabricator.endlessm.com/T34703
@dbnicholson
Member Author

retest this please
