From: Christian Brauner <brauner@kernel.org>
To: Alexander Mikhalitsyn <alexander@mihalicyn.com>
Cc: linux-fsdevel@vger.kernel.org, "Jann Horn" <jannh@google.com>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"Kuniyuki Iwashima" <kuniyu@amazon.com>,
"Eric Dumazet" <edumazet@google.com>,
"Oleg Nesterov" <oleg@redhat.com>,
"David S. Miller" <davem@davemloft.net>,
"Alexander Viro" <viro@zeniv.linux.org.uk>,
"Daan De Meyer" <daan.j.demeyer@gmail.com>,
"David Rheinsberg" <david@readahead.eu>,
"Jakub Kicinski" <kuba@kernel.org>, "Jan Kara" <jack@suse.cz>,
"Lennart Poettering" <lennart@poettering.net>,
"Luca Boccassi" <bluca@debian.org>, "Mike Yuan" <me@yhndnzj.com>,
"Paolo Abeni" <pabeni@redhat.com>,
"Simon Horman" <horms@kernel.org>,
"Zbigniew Jędrzejewski-Szmek" <zbyszek@in.waw.pl>,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
linux-security-module@vger.kernel.org
Subject: Re: [PATCH v7 4/9] coredump: add coredump socket
Date: Fri, 16 May 2025 10:30:28 +0200
Message-ID: <20250516-verplanen-bewaffnen-85a3b0a1e941@brauner>
In-Reply-To: <CAJqdLrpjoUQERfbRfxUNC=WN8iQQySPNBUAvvTiub=8p_iJTuA@mail.gmail.com>
On Thu, May 15, 2025 at 03:47:33PM +0200, Alexander Mikhalitsyn wrote:
> On Thu, May 15, 2025 at 00:04, Christian Brauner
> <brauner@kernel.org> wrote:
> >
> > Coredumping currently supports two modes:
> >
> > (1) Dumping directly into a file somewhere on the filesystem.
> > (2) Dumping into a pipe connected to a usermode helper process
> > spawned as a child of the system_unbound_wq or kthreadd.
> >
> > For simplicity I'm mostly ignoring (1). There's probably still some
> > users of (1) out there but processing coredumps in this way can be
> > considered adventurous especially in the face of set*id binaries.
> >
> > The most common option should be (2) by now. It works by allowing
> > userspace to put a string into /proc/sys/kernel/core_pattern like:
> >
> > |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h
> >
> > The "|" at the beginning indicates to the kernel that a pipe must be
> > used. The path following the pipe indicator is a path to a binary that
> > will be spawned as a usermode helper process. Any additional parameters
> > pass information about the task that is generating the coredump to the
> > binary that processes the coredump.
> >
> > In the example core_pattern shown above systemd-coredump is spawned as a
> > usermode helper. There are various conceptual consequences of this
> > (non-exhaustive list):
> >
> > - systemd-coredump is spawned with file descriptor number 0 (stdin)
> > connected to the read-end of the pipe. All other file descriptors are
> > closed. That specifically includes 1 (stdout) and 2 (stderr). This has
> > already caused bugs because userspace assumed that this cannot happen
> > (whether or not this is a sane assumption is irrelevant).
> >
> > - systemd-coredump will be spawned as a child of system_unbound_wq. So
> > it is not a child of any userspace process and specifically not a
> > child of PID 1. It cannot be waited upon and is a weird hybrid
> > upcall, which is difficult for userspace to control correctly.
> >
> > - systemd-coredump is spawned with full kernel privileges. This
> > necessitates all kinds of weird privilege dropping exercises in
> > userspace to make this safe.
> >
> > - A new usermode helper has to be spawned for each crashing process.
> >
> > This series adds a new mode:
> >
> > (3) Dumping into an AF_UNIX socket.
> >
> > Userspace can set /proc/sys/kernel/core_pattern to:
> >
> > @/path/to/coredump.socket
> >
> > The "@" at the beginning indicates to the kernel that an AF_UNIX
> > coredump socket will be used to process coredumps.
> >
> > The coredump socket must be located in the initial mount namespace.
> > When a task coredumps it opens a client socket in the initial network
> > namespace and connects to the coredump socket.
> >
> > - The coredump server uses SO_PEERPIDFD to get a stable handle on the
> > connected crashing task. The retrieved pidfd will provide a stable
> > reference even if the crashing task gets SIGKILLed while generating
> > the coredump.
> >
> > - By setting core_pipe_limit non-zero userspace can guarantee that the
> > crashing task cannot be reaped behind its back and thus process all
> > necessary information in /proc/<pid>. The SO_PEERPIDFD can be used to
> > detect whether /proc/<pid> still refers to the same process.
> >
> > The core_pipe_limit isn't used to rate-limit connections to the
> > socket. This can simply be done via AF_UNIX sockets directly.
> >
> > - The pidfd for the crashing task will grow new information about how
> > the task coredumps.
> >
> > - The coredump server should mark itself as non-dumpable.
> >
> > - A container coredump server in a separate network namespace can simply
> > bind to another well-known address and systemd-coredump forwards
> > coredumps to the container.
> >
> > - Coredumps could in the future also be handled via per-user/session
> > coredump servers that run only with that user's privileges.
> >
> > The coredump server listens on the coredump socket and accepts a
> > new coredump connection. It then retrieves SO_PEERPIDFD for the
> > client, inspects uid/gid and hands the accepted client to the user's
> > own coredump handler which runs with the user's privileges only
> > (it must of course pay close attention to not forward crashing suid
> > binaries).
> >
> > The new coredump socket will allow userspace to not have to rely on
> > usermode helpers for processing coredumps and provides a safer way to
> > handle them instead of relying on super privileged coredumping helpers
> > that have caused and continue to cause significant CVEs.
> >
> > This will also be significantly more lightweight since no fork()+exec()
> > for the usermodehelper is required for each crashing process. The
> > coredump server in userspace can e.g., just keep a worker pool.
> >
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
>
> Reviewed-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
>
> > ---
> > fs/coredump.c | 133 ++++++++++++++++++++++++++++++++++++++++++++++++----
> > include/linux/net.h | 1 +
> > net/unix/af_unix.c | 53 ++++++++++++++++-----
> > 3 files changed, 166 insertions(+), 21 deletions(-)
> >
> > diff --git a/fs/coredump.c b/fs/coredump.c
> > index a70929c3585b..e1256ebb89c1 100644
> > --- a/fs/coredump.c
> > +++ b/fs/coredump.c
> > @@ -44,7 +44,11 @@
> > #include <linux/sysctl.h>
> > #include <linux/elf.h>
> > #include <linux/pidfs.h>
> > +#include <linux/net.h>
> > +#include <linux/socket.h>
> > +#include <net/net_namespace.h>
> > #include <uapi/linux/pidfd.h>
> > +#include <uapi/linux/un.h>
> >
> > #include <linux/uaccess.h>
> > #include <asm/mmu_context.h>
> > @@ -79,6 +83,7 @@ unsigned int core_file_note_size_limit = CORE_FILE_NOTE_SIZE_DEFAULT;
> > enum coredump_type_t {
> > COREDUMP_FILE = 1,
> > COREDUMP_PIPE = 2,
> > + COREDUMP_SOCK = 3,
> > };
> >
> > struct core_name {
> > @@ -232,13 +237,16 @@ static int format_corename(struct core_name *cn, struct coredump_params *cprm,
> > cn->corename = NULL;
> > if (*pat_ptr == '|')
> > cn->core_type = COREDUMP_PIPE;
> > + else if (*pat_ptr == '@')
> > + cn->core_type = COREDUMP_SOCK;
> > else
> > cn->core_type = COREDUMP_FILE;
> > if (expand_corename(cn, core_name_size))
> > return -ENOMEM;
> > cn->corename[0] = '\0';
> >
> > - if (cn->core_type == COREDUMP_PIPE) {
> > + switch (cn->core_type) {
> > + case COREDUMP_PIPE: {
> > int argvs = sizeof(core_pattern) / 2;
> > (*argv) = kmalloc_array(argvs, sizeof(**argv), GFP_KERNEL);
> > if (!(*argv))
> > @@ -247,6 +255,33 @@ static int format_corename(struct core_name *cn, struct coredump_params *cprm,
> > ++pat_ptr;
> > if (!(*pat_ptr))
> > return -ENOMEM;
> > + break;
> > + }
> > + case COREDUMP_SOCK: {
> > + /* skip the @ */
> > + pat_ptr++;
>
> nit: I would add
> if (!(*pat_ptr))
> return -ENOMEM;
> as we do for the COREDUMP_PIPE case above,
> just in case something changes in cn_printf(), to eliminate any
> chance of crashes in there.
Ok.
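
For illustration only, the prefix handling with the suggested check can be
sketched in userspace C as below. This is a hypothetical helper, not the
kernel code: classify_core_pattern() and its parameters are made up for
the example, and only the enum values, the '|' and '@' prefixes, and the
-ENOMEM return for an empty remainder are taken from the quoted diff.

```c
#include <errno.h>
#include <stddef.h>

/* Same values as enum coredump_type_t in the quoted patch. */
enum coredump_type {
	COREDUMP_FILE = 1,
	COREDUMP_PIPE = 2,
	COREDUMP_SOCK = 3,
};

/*
 * Hypothetical userspace helper (not a kernel function): classify a
 * core_pattern string by its first character and hand back a pointer
 * to the text following the '|' or '@' prefix.
 *
 * Returns 0 on success, -EINVAL on NULL arguments, and -ENOMEM when
 * nothing follows the prefix. -ENOMEM is used only to mirror what the
 * COREDUMP_PIPE branch in the quoted diff returns.
 */
static int classify_core_pattern(const char *pat, enum coredump_type *type,
				 const char **rest)
{
	if (!pat || !type || !rest)
		return -EINVAL;

	switch (*pat) {
	case '|':
		*type = COREDUMP_PIPE;
		pat++; /* skip the | */
		break;
	case '@':
		*type = COREDUMP_SOCK;
		pat++; /* skip the @ */
		break;
	default:
		/* No prefix: plain file pattern, nothing to skip. */
		*type = COREDUMP_FILE;
		*rest = pat;
		return 0;
	}

	/* The check from the review: reject an empty helper or socket path. */
	if (!(*pat))
		return -ENOMEM;

	*rest = pat;
	return 0;
}
```

With this sketch a bare "@" or "|" is rejected up front, while
"@/path/to/coredump.socket" yields COREDUMP_SOCK and the path without
the prefix.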