From: Benjamin Drung <benjamin.drung@canonical.com>
To: Christian Brauner <brauner@kernel.org>
Cc: linux-fsdevel@vger.kernel.org, "Oleg Nesterov" <oleg@redhat.com>,
"Luca Boccassi" <luca.boccassi@gmail.com>,
"Lennart Poettering" <lennart@poettering.net>,
"Daan De Meyer" <daan.j.demeyer@gmail.com>,
"Mike Yuan" <me@yhndnzj.com>,
"Zbigniew Jędrzejewski-Szmek" <zbyszek@in.waw.pl>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 3/3] coredump: hand a pidfd to the usermode coredump helper
Date: Wed, 30 Apr 2025 13:39:58 +0200
Message-ID: <43933af22cdbff81e75a07461b29b91afacedadc.camel@canonical.com>
In-Reply-To: <20250425-erbschaft-nummer-8ddbe420ae22@brauner>
On Fri, 2025-04-25 at 18:49 +0200, Christian Brauner wrote:
> On Fri, Apr 25, 2025 at 02:03:34PM +0200, Benjamin Drung wrote:
> > On Fri, 2025-04-25 at 13:57 +0200, Christian Brauner wrote:
> > > On Fri, Apr 25, 2025 at 01:31:56PM +0200, Benjamin Drung wrote:
> > > > Hi,
> > > >
> > > > On Mon, 2025-04-14 at 15:55 +0200, Christian Brauner wrote:
> > > > > Give userspace a way to instruct the kernel to install a pidfd into the
> > > > > usermode helper process. This makes coredump handling a lot more
> > > > > reliable for userspace. In parallel with this commit we already have
> > > > > systemd adding support for this in [1].
> > > > >
> > > > > We create a pidfs file for the coredumping process when we process the
> > > > > corename pattern. When the usermode helper process is forked we then
> > > > > > install the pidfs file as file descriptor three into the usermode
> > > > > > helper's file descriptor table so it's available to the exec'd program.
> > > > >
> > > > > Since usermode helpers are either children of the system_unbound_wq
> > > > > workqueue or kthreadd we know that the file descriptor table is empty
> > > > > and can thus always use three as the file descriptor number.
> > > > >
> > > > > > Note that we'll install a pidfd for the thread-group leader even if a
> > > > > subthread is calling do_coredump(). We know that task linkage hasn't
> > > > > been removed due to delay_group_leader() and even if this @current isn't
> > > > > the actual thread-group leader we know that the thread-group leader
> > > > > cannot be reaped until @current has exited.
> > > > >
> > > > > Link: https://github.com/systemd/systemd/pull/37125 [1]
> > > > > Tested-by: Luca Boccassi <luca.boccassi@gmail.com>
> > > > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > > > ---
> > > > > fs/coredump.c | 59 ++++++++++++++++++++++++++++++++++++++++++++----
> > > > > include/linux/coredump.h | 1 +
> > > > > 2 files changed, 56 insertions(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/fs/coredump.c b/fs/coredump.c
> > > > > index 9da592aa8f16..403be0ff780e 100644
> > > > > --- a/fs/coredump.c
> > > > > +++ b/fs/coredump.c
> > > > > @@ -43,6 +43,9 @@
> > > > > #include <linux/timekeeping.h>
> > > > > #include <linux/sysctl.h>
> > > > > #include <linux/elf.h>
> > > > > +#include <linux/pidfs.h>
> > > > > +#include <uapi/linux/pidfd.h>
> > > > > +#include <linux/vfsdebug.h>
> > > > >
> > > > > #include <linux/uaccess.h>
> > > > > #include <asm/mmu_context.h>
> > > > > @@ -60,6 +63,12 @@ static void free_vma_snapshot(struct coredump_params *cprm);
> > > > > #define CORE_FILE_NOTE_SIZE_DEFAULT (4*1024*1024)
> > > > > /* Define a reasonable max cap */
> > > > > #define CORE_FILE_NOTE_SIZE_MAX (16*1024*1024)
> > > > > +/*
> > > > > + * File descriptor number for the pidfd for the thread-group leader of
> > > > > + * the coredumping task installed into the usermode helper's file
> > > > > + * descriptor table.
> > > > > + */
> > > > > +#define COREDUMP_PIDFD_NUMBER 3
> > > > >
> > > > > static int core_uses_pid;
> > > > > static unsigned int core_pipe_limit;
> > > > > @@ -339,6 +348,27 @@ static int format_corename(struct core_name *cn, struct coredump_params *cprm,
> > > > > case 'C':
> > > > > err = cn_printf(cn, "%d", cprm->cpu);
> > > > > break;
> > > > > + /* pidfd number */
> > > > > + case 'F': {
> > > > > + /*
> > > > > + * Installing a pidfd only makes sense if
> > > > > + * we actually spawn a usermode helper.
> > > > > + */
> > > > > + if (!ispipe)
> > > > > + break;
> > > > > +
> > > > > + /*
> > > > > + * Note that we'll install a pidfd for the
> > > > > + * thread-group leader. We know that task
> > > > > + * linkage hasn't been removed yet and even if
> > > > > + * this @current isn't the actual thread-group
> > > > > + * leader we know that the thread-group leader
> > > > > + * cannot be reaped until @current has exited.
> > > > > + */
> > > > > + cprm->pid = task_tgid(current);
> > > > > + err = cn_printf(cn, "%d", COREDUMP_PIDFD_NUMBER);
> > > > > + break;
> > > > > + }
> > > > > default:
> > > > > break;
> > > > > }
> > > > >
> > > >
> > > > I tried this change with Apport: I took the Ubuntu mainline kernel build
> > > > https://kernel.ubuntu.com/mainline/daily/2025-04-24/ (that refers to
> > > > mainline commit e54f9b0410347c49b7ffdd495578811e70d7a407) and applied
> > > > these three patches on top. Then I modified Apport to take the
> > > > additional `-F%F` and tested that on Ubuntu 25.04 (plucky). The result
> > > > is that the coredump failed as long as there was `-F%F` on
> > >
> > > I have no clue what -F%F is and whether that leading -F is something
> > > specific to Apport but the specifier is %F not -F%F. For example:
> > >
> > > > cat /proc/sys/kernel/core_pattern
> > > |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h %F
> > >
> > > And note that this requires the pipe logic to be used, aka "|" needs to
> > > be specified. Without it this doesn't make sense.
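
The two conditions above (a leading "|" and a %F specifier) can be checked on a pattern string before it is written to /proc/sys/kernel/core_pattern. The following is a minimal sketch, not code from the patch: pattern_requests_pidfd() is a hypothetical name, and the sketch assumes that "%%" stands for a literal percent sign and therefore does not start a specifier.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace check (not part of the patch): return true if
 * a core_pattern string uses the pipe form and contains the %F specifier.
 */
bool pattern_requests_pidfd(const char *pattern)
{
	/* %F only has an effect when a usermode helper is spawned. */
	if (pattern == NULL || pattern[0] != '|')
		return false;

	for (const char *p = pattern + 1; *p != '\0'; p++) {
		if (*p != '%')
			continue;
		p++;			/* look at the specifier character */
		if (*p == '\0')
			break;		/* trailing '%' with no specifier */
		if (*p == 'F')
			return true;
		/* any other specifier, including "%%", is skipped */
	}
	return false;
}
```

With this sketch, the systemd-coredump pattern quoted above is accepted, while a plain file pattern such as "/tmp/core.%F" is rejected because it lacks the leading "|".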
> >
> > Apport takes short option parameters. They match the kernel template
> > specifiers. The failing pattern:
> >
> > $ cat /proc/sys/kernel/core_pattern
> > |/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -F%F -- %E
> >
> > Once I drop %F Apport is called without issues:
> >
> > $ cat /proc/sys/kernel/core_pattern
> > |/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E
>
> You must have CONFIG_DEBUG_VFS=y enabled where we trample the pidfs
> file we just allocated. It's a debug-only assert. I've removed it now
> and pushed it to vfs-6.16.coredump. Can you either try with that or
> simply unset CONFIG_DEBUG_VFS and retest?
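
Whether a given build has CONFIG_DEBUG_VFS=y can usually be read from the kernel's config file (commonly /boot/config-$(uname -r) on Ubuntu, or /proc/config.gz when the kernel exposes it). A minimal sketch of such a check follows; config_option_enabled() is a hypothetical helper that works on the already-loaded text of the config file, so reading and decompressing the file are left out.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if `config_text` (the contents of a kernel .config style
 * file) contains a line that is exactly "<option>=y".
 */
bool config_option_enabled(const char *config_text, const char *option)
{
	size_t optlen = strlen(option);
	const char *line = config_text;

	while (line != NULL && *line != '\0') {
		const char *eol = strchr(line, '\n');
		size_t len = eol ? (size_t)(eol - line) : strlen(line);

		/* Match "<option>=y" covering the whole line. */
		if (len == optlen + 2 &&
		    strncmp(line, option, optlen) == 0 &&
		    line[optlen] == '=' && line[optlen + 1] == 'y')
			return true;

		line = eol ? eol + 1 : NULL;
	}
	return false;
}
```

A line such as "# CONFIG_DEBUG_VFS is not set" does not match, which is how a disabled option normally appears in these files.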
Yes, the Ubuntu mainline builds have CONFIG_DEBUG_VFS=y set. I tried
your vfs-6.16.coredump and it works. Thanks.
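
For reference, a helper that is invoked with -F%F has to turn that argument back into a descriptor number before it can use the pidfd. The following is a minimal sketch under stated assumptions, not Apport's or systemd's actual code: parse_pidfd_arg() and fd_is_open() are hypothetical names, and the "-F<n>" form is taken from the core_pattern shown in this thread. On a kernel that does not know %F, the specifier falls into the default case of the quoted diff and expands to nothing, so the helper would see a bare "-F", which this sketch rejects.

```c
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper-side parsing: extract the descriptor number from an
 * argument of the form "-F<n>", as produced by "-F%F" in core_pattern.
 * Returns the number, or -1 if the argument is not of that form.
 */
int parse_pidfd_arg(const char *arg)
{
	char *end;
	long value;

	if (arg == NULL || strncmp(arg, "-F", 2) != 0 || arg[2] == '\0')
		return -1;
	/* Reject signs and whitespace that strtol() would accept. */
	if (arg[2] < '0' || arg[2] > '9')
		return -1;

	errno = 0;
	value = strtol(arg + 2, &end, 10);
	if (errno != 0 || *end != '\0' || value > INT_MAX)
		return -1;
	return (int)value;
}

/* Return true if `fd` refers to an open file descriptor in this process. */
bool fd_is_open(int fd)
{
	return fcntl(fd, F_GETFD) != -1;
}
```

A helper would typically call parse_pidfd_arg() on the matching argv entry and then fd_is_open() on the result before treating the descriptor as a pidfd.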
--
Benjamin Drung
Debian & Ubuntu Developer