Troubleshooting the Apocalypse

I log into my Fedora-based desktop today to a notification that updates are available. No big deal, just drop into a terminal and run `sudo dnf update`, right?

All goes well until I get here:

Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Segmentation fault

Great. A segfault in the worst possible place.

I search around to no avail, so I decide to do some investigating on my own. DNF is written in Python 3, so I fetch the python3 package from mirrors.kernel.org and run `rpm -U python3-3.4.3-5.fc23.x86_64.rpm`.

BDB2053 Freeing read locks for locker 0x77b: 20496/139644554041088
BDB2053 Freeing read locks for locker 0x77d: 20496/139644554041088
Segmentation fault

At this point, the good news is that the problem isn’t in Python or DNF. The bad news, worse than before, is that it’s in the package manager itself. Those who know me can take satisfaction in knowing this is the one time I wish I were running Debian (there, I could just download the apt package and extract it with ar and tar).

I find an article on how to extract RPM packages without installing them, and use it to unpack the rpm package itself (rpm-4.13.0-0.rc1.7.fc23.x86_64.rpm), fetched from mirrors.kernel.org.
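
The article's exact steps aren't reproduced here, but a common way to unpack an RPM by hand is to pipe `rpm2cpio` into `cpio`. What follows is a minimal sketch under a couple of assumptions: the helper name `extract_rpm` and the default destination directory are made up for illustration, and both `rpm2cpio` and `cpio` need to still work on the affected machine (as far as I know `rpm2cpio` uses the rpm libraries itself, so that is not guaranteed on a system this broken).

```shell
#!/bin/sh
# Hypothetical helper: unpack an RPM's payload into a directory
# without installing it. Assumes rpm2cpio and cpio are available.
extract_rpm() {
    pkg="$1"
    dest="${2:-./extracted}"

    if [ ! -f "$pkg" ]; then
        echo "extract_rpm: no such package: $pkg" >&2
        return 1
    fi

    # Resolve the package path before changing directory.
    pkg_abs="$(cd "$(dirname "$pkg")" && pwd)/$(basename "$pkg")" || return 1
    mkdir -p "$dest" || return 1

    # rpm2cpio writes the payload as a cpio archive on stdout;
    # cpio -i extracts, -d creates directories, -m keeps mtimes.
    (cd "$dest" && rpm2cpio "$pkg_abs" | cpio -idm)
}
```

Something like `extract_rpm rpm-4.13.0-0.rc1.7.fc23.x86_64.rpm ./rpm` would leave the package's file tree under `./rpm`, where the unpacked binary can be run without touching the installed one.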

As I’m not a big fan of blanket replacements of one of the core parts of the distribution, I enlist gdb for a few hints.

# gdb ./rpm
GNU gdb (GDB) Fedora 7.10.1-30.fc23
Copyright (C) 2015 Free Software Foundation, Inc.
...
(gdb) run -U ../rpm-4.13.0-0.rc1.7.fc23.x86_64.rpm
Starting program: ./rpm/bin/rpm -U ../rpm-4.13.0-0.rc1.7.fc23.x86_64.rpm
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.22-7.fc23.x86_64
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00007fffed1e7405 in process_file () from /lib64/libselinux.so.1

That last line is a pretty good lead, so I replace /usr/lib64/libselinux.so.1 with a copy from another Fedora box. It is at this point that I’m glad WordPress autosaves, as the next thing I see on my screen is:

systemd[1]: Caught <SEGV>, dumped core as pid 21117.

Lesson learned: don’t try to overwrite core SELinux libraries while the system is running. Luckily, I have a live CD handy.
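
For what it's worth, a common explanation for this kind of crash is that copying over a library in place rewrites the very file that running processes have mapped. That is an assumption here, not something confirmed from the core dump. The usual gentler pattern is to write the new file next to the old one and rename it into place, so existing processes keep the old inode. A minimal sketch, with a hypothetical helper name `replace_atomically`; for something as central as libselinux, the live CD is still the safer route.

```shell
#!/bin/sh
# Hypothetical helper: replace a file by rename instead of overwriting it.
replace_atomically() {
    src="$1"
    dest="$2"
    tmp="$dest.new.$$"

    if [ ! -f "$src" ]; then
        echo "replace_atomically: no such file: $src" >&2
        return 1
    fi

    # Copy next to the destination so the rename stays on one filesystem.
    # -p carries over the source file's mode and timestamps.
    cp -p "$src" "$tmp" || { rm -f "$tmp"; return 1; }

    # The rename swaps the directory entry in one step; processes that
    # already mapped the old file keep using the old inode until they exit.
    mv -f "$tmp" "$dest"
}
```

Only processes started after the rename pick up the new file, so a reboot is still needed before everything is running against the replacement.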

After replacing the file and rebooting, the upgrade completes successfully. Apocalypse resolved. Time for lunch.

BONUS: While I was troubleshooting this issue, I came across this article. Definitely worth a read.