Commit 40b95326 authored by Harald Sitter

add coredumpd support

This only does something when used with a new enough KCrash.

Coredumpd is a coredump handler that comes with systemd. When a process
dumps its core it is sent to coredumpd which records the crash in the
systemd journal and stores the core on disk. This allows us to pick up
the crash after the fact and file a bug report, for example when
software crashes on session logout.

To facilitate bug reporting, KCrash writes the metadata we ordinarily get
through ARGV to disk as an INI file. Since we still want to support both
operation modes, this commit introduces a large amount of extra tooling
whose purpose is to connect coredumpd crashes to their metadata and
turn them into drkonqi argvs. All of this depends on systemd. It
generally works with version 245, but 248 is strongly recommended
because of various refinements and bugfixes.

Architecturally a coredumpd crash works like this:

KCrash
======

The app crashes. KCrash's signal handler runs. It records the metadata
to a file in `~/.cache`, then re-raises the signal to trigger a core
dump.
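
The sequence above (record metadata, restore the default signal action,
re-raise) can be sketched as follows. This is only an illustration of
the ordering, not KCrash's actual handler, which is native code and has
to stay async-signal-safe. The section name, key names and function
names are hypothetical.

```python
import configparser
import os
import signal


def write_metadata(path, pid, signum, appname):
    """Write crash metadata as an INI file (section/key names are made up)."""
    ini = configparser.ConfigParser(interpolation=None)
    ini["KCrash"] = {
        "pid": str(int(pid)),
        "signal": str(int(signum)),
        "appname": appname,
    }
    with open(path, "w") as f:
        ini.write(f)


def install_crash_handler(path, appname, signals=(signal.SIGABRT,)):
    """Install a handler that records metadata, then re-raises the signal."""

    def handler(signum, frame):
        write_metadata(path, os.getpid(), signum, appname)
        # Restore the default action so the re-raised signal really
        # terminates the process and the kernel can hand the core to
        # its configured dump handler.
        signal.signal(signum, signal.SIG_DFL)
        os.kill(os.getpid(), signum)

    for signum in signals:
        signal.signal(signum, handler)
```

Restoring `SIG_DFL` before re-raising is the important part: without it
the handler would simply run again instead of producing a core.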

coredumpd
=========

Coredumpd gets invoked by the kernel, captures the core, and records the
crash with all the metadata it has available (proc maps, pid, time,
etc.) to journald. It does this by invoking an instance of
`systemd-coredump@.service`.

`drkonqi-coredump-processor@.service`
===================================

This is wanted by `systemd-coredump@.service` and instantiated using the
same instance "name" as `systemd-coredump@` (this is what allows us to
find the correct crash). The processor connects to journald and waits
for the crash record of the matching `systemd-coredump@` instance to
appear in the journal. Once the crash record has been found a
connection to a user-scope socket is opened...
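
A minimal sketch of the matching step, assuming journal entries are
available as dictionaries (for instance parsed from `journalctl -o json`
output). Matching on `_SYSTEMD_UNIT` plus the presence of `COREDUMP_PID`
is an assumption made for illustration; the processor's real match
expression may differ, and the function name is hypothetical.

```python
def find_coredump_entry(entries, instance):
    """Return the first crash record logged by the given coredump@ instance.

    `entries` is an iterable of dicts mapping journal field names to values.
    Returns None when no matching record has appeared yet, in which case a
    caller would wait for new journal data and search again.
    """
    unit = f"systemd-coredump@{instance}.service"
    for entry in entries:
        if entry.get("_SYSTEMD_UNIT") == unit and "COREDUMP_PID" in entry:
            return entry
    return None
```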

`drkonqi-coredump-launcher.socket`
================================

Is a user-scope socket that purely exists for
`drkonqi-coredump-processor@.service` to talk to. When a connection is
opened an instance of `drkonqi-coredump-launcher@.service` is spun up to
deal with the traffic.

`drkonqi-coredump-launcher@.service`
==================================

Is the actual launcher service; it is socket-activated from the system
scope. On the socket it gets the crash metadata streamed from the
system-level processor (thereby eliminating the need to talk to journald
again - the processor forwards the data it looked up).
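
The hand-over between processor and launcher could look roughly like
this. The wire format (a single JSON object, terminated by closing the
write side) is an assumption chosen for the sketch and is not taken
from this commit; both function names are hypothetical.

```python
import json
import socket


def send_metadata(sock, fields):
    """Processor side: stream the looked-up journal fields, then signal EOF."""
    sock.sendall(json.dumps(fields).encode("utf-8"))
    sock.shutdown(socket.SHUT_WR)


def receive_metadata(sock):
    """Launcher side: read until EOF and decode the forwarded fields."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
    return json.loads(b"".join(chunks).decode("utf-8"))
```

For a quick local experiment `socket.socketpair()` provides two
connected sockets that can stand in for the systemd-managed socket.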

The launcher then glues the coredumpd metadata into the same file as the
KCrash metadata, turning the .ini file into a comprehensive record of
the crash.
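
Gluing the coredumpd metadata into the KCrash file might be sketched
like this. The `Journal` section name is hypothetical; the commit does
not state how the file is laid out.

```python
import configparser


def merge_coredump_metadata(ini_path, journal_fields, section="Journal"):
    """Add the journal fields to an existing metadata file, keeping its content."""
    # interpolation=None so values containing '%' are stored verbatim.
    ini = configparser.ConfigParser(interpolation=None)
    ini.optionxform = str  # keep key case, e.g. COREDUMP_EXE
    ini.read(ini_path)
    if not ini.has_section(section):
        ini.add_section(section)
    for key, value in journal_fields.items():
        ini.set(section, key, str(value))
    with open(ini_path, "w") as f:
        ini.write(f)
```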

Once the file is complete it forks drkonqi with the same arguments as
though KCrash had invoked it directly so the user can file a crash
report.
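
Turning the completed file back into a command line could work along
these lines. The one-to-one mapping of INI keys to `--key value`
arguments and the `KCrash` section name are assumptions for the sake of
the example; the real launcher may map the data differently.

```python
import configparser


def drkonqi_argv(ini_path, section="KCrash", program="drkonqi"):
    """Build an argv list from the metadata file (hypothetical key mapping)."""
    ini = configparser.ConfigParser(interpolation=None)
    ini.optionxform = str  # keep key case as written
    ini.read(ini_path)
    argv = [program]
    for key, value in ini[section].items():
        argv += [f"--{key}", value]
    return argv
```

The resulting list can be handed to `subprocess.Popen` to start the
dialog as a separate process.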

Drkonqi
=======

Drkonqi itself has grown a new CoredumpBackend analogous to the KCrash
backend. Its main concern is preparing the core for tracing. Depending
on the systemd version that is either delegated to coredumpctl (the CLI
for coredumpd) or partially done on our end. In either case coredumpctl
is a runtime requirement, so that we do not have to concern ourselves
with where coredumpd actually stores a core (it could be compressed, on
disk, or in the journal).

gdbrc now also supports the coredump backend by extending the commandline
templates with core-based tracing, for the coredump backend only.
As a side effect, debuggers can now have a corefile template variable,
which is the path to the on-disk corefile in the event that the legacy
coredumpd backend is used. Newer coredumpd (248+) allows us to invoke gdb
through coredumpctl directly, eliminating the need to faff about with
core files manually on our end.
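
The version split can be illustrated with a small helper that returns
the commands to run. It is a sketch under assumptions: the function and
its parameters are hypothetical, and the exact command lines drkonqi
generates from its templates are not shown in this commit. It relies
only on `coredumpctl dump --output=`, `coredumpctl debug` with
`--debugger-arguments=` (available from systemd 248), and gdb accepting
an executable followed by a core file.

```python
import shlex


def tracing_commands(systemd_version, pid, exe, core_path, gdb_args):
    """Return a list of argv lists needed to trace the crash of `pid`."""
    if systemd_version >= 248:
        # coredumpctl locates the core and starts the debugger itself.
        return [[
            "coredumpctl", "debug",
            f"--debugger-arguments={shlex.join(gdb_args)}",
            str(pid),
        ]]
    # Legacy path: extract the core to disk first, then point gdb at it.
    return [
        ["coredumpctl", "dump", f"--output={core_path}", str(pid)],
        ["gdb", *gdb_args, exe, core_path],
    ]
```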

Everything else stays the same. As far as the UI bits are concerned
nothing changes between a kcrash backend and a coredump backend.

The presence of a metadata file currently doubles as a "this crash has
not been dealt with" indicator. As such, metadata files are only cleaned
up if the user somehow interacts with drkonqi to discard the dialog.
This is to assist with future development to implement "an application
has crashed in the past" style behavior (e.g. when apps crashed on
logout).

`drkonqi-coredump-cleanup.{service,timer}`
========================================

Is a cleanup system in case crashes fall through the cracks and don't
get their metadata files cleaned up. This is largely a stop-gap measure
because this commit does not deal with actually picking up crashes that
happened at logout - this requires additional UI engineering first.
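
What such a periodic cleanup does can be sketched as below. The age
threshold, the `.ini` suffix filter and the function name are
assumptions for illustration; the commit does not state the actual
retention policy.

```python
import os
import time


def cleanup_stale_metadata(directory, max_age_seconds, now=None):
    """Delete metadata files older than max_age_seconds; return their names."""
    now = time.time() if now is None else now
    removed = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not name.endswith(".ini") or not os.path.isfile(path):
            continue
        if now - os.path.getmtime(path) > max_age_seconds:
            os.remove(path)
            removed.append(name)
    return removed
```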
parent 74962561