* [PATCH] pidfd: prevent creation of pidfds for kthreads
@ 2024-07-31 10:01 Christian Brauner
2024-07-31 14:51 ` Oleg Nesterov
2024-08-18 3:58 ` Eric Biggers
0 siblings, 2 replies; 13+ messages in thread
From: Christian Brauner @ 2024-07-31 10:01 UTC (permalink / raw)
To: linux-fsdevel, linux-kernel
Cc: Christian Brauner, Oleg Nesterov, Aleksa Sarai, Tycho Andersen,
Daan De Meyer, Tejun Heo, stable
It's currently possible to create pidfds for kthreads but it is unclear
what that is supposed to mean. Until we have use-cases for it and we
figured out what behavior we want block the creation of pidfds for
kthreads.
Fixes: 32fcb426ec00 ("pid: add pidfd_open()")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
kernel/fork.c | 25 ++++++++++++++++++++++---
1 file changed, 22 insertions(+), 3 deletions(-)
diff --git a/kernel/fork.c b/kernel/fork.c
index cc760491f201..18bdc87209d0 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2053,11 +2053,24 @@ static int __pidfd_prepare(struct pid *pid, unsigned int flags, struct file **re
*/
int pidfd_prepare(struct pid *pid, unsigned int flags, struct file **ret)
{
- bool thread = flags & PIDFD_THREAD;
-
- if (!pid || !pid_has_task(pid, thread ? PIDTYPE_PID : PIDTYPE_TGID))
+ if (!pid)
return -EINVAL;
+ scoped_guard(rcu) {
+ struct task_struct *tsk;
+
+ if (flags & PIDFD_THREAD)
+ tsk = pid_task(pid, PIDTYPE_PID);
+ else
+ tsk = pid_task(pid, PIDTYPE_TGID);
+ if (!tsk)
+ return -EINVAL;
+
+ /* Don't create pidfds for kernel threads for now. */
+ if (tsk->flags & PF_KTHREAD)
+ return -EINVAL;
+ }
+
return __pidfd_prepare(pid, flags, ret);
}
@@ -2403,6 +2416,12 @@ __latent_entropy struct task_struct *copy_process(
if (clone_flags & CLONE_PIDFD) {
int flags = (clone_flags & CLONE_THREAD) ? PIDFD_THREAD : 0;
+ /* Don't create pidfds for kernel threads for now. */
+ if (args->kthread) {
+ retval = -EINVAL;
+ goto bad_fork_free_pid;
+ }
+
/* Note that no task has been attached to @pid yet. */
retval = __pidfd_prepare(pid, flags, &pidfile);
if (retval < 0)
--
2.43.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] pidfd: prevent creation of pidfds for kthreads
2024-07-31 10:01 [PATCH] pidfd: prevent creation of pidfds for kthreads Christian Brauner
@ 2024-07-31 14:51 ` Oleg Nesterov
2024-08-01 6:58 ` Christian Brauner
2024-08-18 3:58 ` Eric Biggers
1 sibling, 1 reply; 13+ messages in thread
From: Oleg Nesterov @ 2024-07-31 14:51 UTC (permalink / raw)
To: Christian Brauner
Cc: linux-fsdevel, linux-kernel, Aleksa Sarai, Tycho Andersen,
Daan De Meyer, Tejun Heo, stable
On 07/31, Christian Brauner wrote:
>
> It's currently possible to create pidfds for kthreads but it is unclear
> what that is supposed to mean. Until we have use-cases for it and we
> figured out what behavior we want block the creation of pidfds for
> kthreads.
Hmm... could you explain your concerns? Why do you think we should disallow
pidfd_open(pid-of-kthread) ?
> @@ -2403,6 +2416,12 @@ __latent_entropy struct task_struct *copy_process(
> if (clone_flags & CLONE_PIDFD) {
> int flags = (clone_flags & CLONE_THREAD) ? PIDFD_THREAD : 0;
>
> + /* Don't create pidfds for kernel threads for now. */
> + if (args->kthread) {
> + retval = -EINVAL;
> + goto bad_fork_free_pid;
Do we really need this check? Userspace can't use args->kthread != NULL,
the kernel users should not use CLONE_PIDFD.
Oleg.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] pidfd: prevent creation of pidfds for kthreads
2024-07-31 14:51 ` Oleg Nesterov
@ 2024-08-01 6:58 ` Christian Brauner
2024-08-01 8:01 ` Oleg Nesterov
0 siblings, 1 reply; 13+ messages in thread
From: Christian Brauner @ 2024-08-01 6:58 UTC (permalink / raw)
To: Oleg Nesterov
Cc: linux-fsdevel, linux-kernel, Aleksa Sarai, Tycho Andersen,
Daan De Meyer, Tejun Heo, stable
On Wed, Jul 31, 2024 at 04:51:33PM GMT, Oleg Nesterov wrote:
> On 07/31, Christian Brauner wrote:
> >
> > It's currently possible to create pidfds for kthreads but it is unclear
> > what that is supposed to mean. Until we have use-cases for it and we
> > figured out what behavior we want block the creation of pidfds for
> > kthreads.
>
> Hmm... could you explain your concerns? Why do you think we should disallow
> pidfd_open(pid-of-kthread) ?
It basically just works now and it's not intentional - at least not on
my part. You can't send signals to them, you may or may not get notified
via poll when a kthread exits. If we ever want this to be useful I would
like to enable it explicitly.
Plus, this causes confusion in userspace. When you have qemu running
with kvm support then kvm creates several kthreads (that inherit the
cgroup of the calling process). If you try to kill those instances via
systemctl kill or systemctl stop then pidfds for these kthreads are
opened but sending a signal to them is meaningless.
(So imho this causes more confusion then it is actually helpful. If we
add supports for kthreads I'd also like pidfs to gain a way to identify
them via statx() or fdinfo.)
> > @@ -2403,6 +2416,12 @@ __latent_entropy struct task_struct *copy_process(
> > if (clone_flags & CLONE_PIDFD) {
> > int flags = (clone_flags & CLONE_THREAD) ? PIDFD_THREAD : 0;
> >
> > + /* Don't create pidfds for kernel threads for now. */
> > + if (args->kthread) {
> > + retval = -EINVAL;
> > + goto bad_fork_free_pid;
>
> Do we really need this check? Userspace can't use args->kthread != NULL,
> the kernel users should not use CLONE_PIDFD.
Yeah, I know. That's really just proactive so that user of e.g.,
copy_process() such as vhost or so on don't start handing out pidfds for
stuff without requring changes to the helper itself.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] pidfd: prevent creation of pidfds for kthreads
2024-08-01 6:58 ` Christian Brauner
@ 2024-08-01 8:01 ` Oleg Nesterov
2024-08-01 13:48 ` Christian Brauner
0 siblings, 1 reply; 13+ messages in thread
From: Oleg Nesterov @ 2024-08-01 8:01 UTC (permalink / raw)
To: Christian Brauner
Cc: linux-fsdevel, linux-kernel, Aleksa Sarai, Tycho Andersen,
Daan De Meyer, Tejun Heo, stable
OK, I won't argue, but ....
On 08/01, Christian Brauner wrote:
>
> On Wed, Jul 31, 2024 at 04:51:33PM GMT, Oleg Nesterov wrote:
> > On 07/31, Christian Brauner wrote:
> > >
> > > It's currently possible to create pidfds for kthreads but it is unclear
> > > what that is supposed to mean. Until we have use-cases for it and we
> > > figured out what behavior we want block the creation of pidfds for
> > > kthreads.
> >
> > Hmm... could you explain your concerns? Why do you think we should disallow
> > pidfd_open(pid-of-kthread) ?
>
> It basically just works now and it's not intentional - at least not on
> my part. You can't send signals to them,
Yes, you can't send signals to kthread. So what?
You can't send signals to the normal processes if check_kill_permission()
fails. And even if you are root, you can't send an unhandled signal via
pidfd = pidfd_open(1).
> you may or may not get notified
> via poll when a kthread exits.
Why? the exiting kthread should not differ in this respect?
> (So imho this causes more confusion then it is actually helpful. If we
> add supports for kthreads I'd also like pidfs to gain a way to identify
> them via statx() or fdinfo.)
/proc/$pid/status has a "Kthread" field...
> > > @@ -2403,6 +2416,12 @@ __latent_entropy struct task_struct *copy_process(
> > > if (clone_flags & CLONE_PIDFD) {
> > > int flags = (clone_flags & CLONE_THREAD) ? PIDFD_THREAD : 0;
> > >
> > > + /* Don't create pidfds for kernel threads for now. */
> > > + if (args->kthread) {
> > > + retval = -EINVAL;
> > > + goto bad_fork_free_pid;
> >
> > Do we really need this check? Userspace can't use args->kthread != NULL,
> > the kernel users should not use CLONE_PIDFD.
>
> Yeah, I know. That's really just proactive so that user of e.g.,
> copy_process() such as vhost or so on don't start handing out pidfds for
> stuff without requring changes to the helper itself.
Then I'd suggest WARN_ON_ONCE(args->kthread).
But as I said I won't argue. I see nothing wrong in this patch.
Oleg.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] pidfd: prevent creation of pidfds for kthreads
2024-08-01 8:01 ` Oleg Nesterov
@ 2024-08-01 13:48 ` Christian Brauner
2024-08-01 13:59 ` Oleg Nesterov
0 siblings, 1 reply; 13+ messages in thread
From: Christian Brauner @ 2024-08-01 13:48 UTC (permalink / raw)
To: Oleg Nesterov
Cc: linux-fsdevel, linux-kernel, Aleksa Sarai, Tycho Andersen,
Daan De Meyer, Tejun Heo, stable
On Thu, Aug 01, 2024 at 10:01:20AM GMT, Oleg Nesterov wrote:
> OK, I won't argue, but ....
>
> On 08/01, Christian Brauner wrote:
> >
> > On Wed, Jul 31, 2024 at 04:51:33PM GMT, Oleg Nesterov wrote:
> > > On 07/31, Christian Brauner wrote:
> > > >
> > > > It's currently possible to create pidfds for kthreads but it is unclear
> > > > what that is supposed to mean. Until we have use-cases for it and we
> > > > figured out what behavior we want block the creation of pidfds for
> > > > kthreads.
> > >
> > > Hmm... could you explain your concerns? Why do you think we should disallow
> > > pidfd_open(pid-of-kthread) ?
> >
> > It basically just works now and it's not intentional - at least not on
> > my part. You can't send signals to them,
>
> Yes, you can't send signals to kthread. So what?
>
> You can't send signals to the normal processes if check_kill_permission()
> fails. And even if you are root, you can't send an unhandled signal via
> pidfd = pidfd_open(1).
>
> > you may or may not get notified
> > via poll when a kthread exits.
>
> Why? the exiting kthread should not differ in this respect?
Why do you want to allow it? I see zero reason to get a reference to a
kthread if there's no use-case for it. kthreads are mostly a kernel
thing so why give userspace handles to it. And as I said before, there's
userspace out there that's already confused why they can get references
to them in the first place.
>
> > (So imho this causes more confusion then it is actually helpful. If we
> > add supports for kthreads I'd also like pidfs to gain a way to identify
> > them via statx() or fdinfo.)
>
> /proc/$pid/status has a "Kthread" field...
Going forward, I don't want to force people to parse basic stuff out of
procfs. Ideally, they'll be able to mostly rely on pidfd operations
only.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] pidfd: prevent creation of pidfds for kthreads
2024-08-01 13:48 ` Christian Brauner
@ 2024-08-01 13:59 ` Oleg Nesterov
0 siblings, 0 replies; 13+ messages in thread
From: Oleg Nesterov @ 2024-08-01 13:59 UTC (permalink / raw)
To: Christian Brauner
Cc: linux-fsdevel, linux-kernel, Aleksa Sarai, Tycho Andersen,
Daan De Meyer, Tejun Heo, stable
On 08/01, Christian Brauner wrote:
>
> On Thu, Aug 01, 2024 at 10:01:20AM GMT, Oleg Nesterov wrote:
> > OK, I won't argue, but ....
> >
> > > you may or may not get notified
> > > via poll when a kthread exits.
> >
> > Why? the exiting kthread should not differ in this respect?
>
> Why do you want to allow it?
Again, I didn't try to "nack" your patch. Just tried to understand
your motivation.
And "may not get notified" above doesn't look right to me...
> > /proc/$pid/status has a "Kthread" field...
>
> Going forward, I don't want to force people to parse basic stuff out of
> procfs. Ideally, they'll be able to mostly rely on pidfd operations
> only.
Agreed.
Oleg.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] pidfd: prevent creation of pidfds for kthreads
2024-07-31 10:01 [PATCH] pidfd: prevent creation of pidfds for kthreads Christian Brauner
2024-07-31 14:51 ` Oleg Nesterov
@ 2024-08-18 3:58 ` Eric Biggers
2024-08-19 8:41 ` Christian Brauner
1 sibling, 1 reply; 13+ messages in thread
From: Eric Biggers @ 2024-08-18 3:58 UTC (permalink / raw)
To: Christian Brauner
Cc: linux-fsdevel, linux-kernel, Oleg Nesterov, Aleksa Sarai,
Tycho Andersen, Daan De Meyer, Tejun Heo, stable
Hi Christian,
On Wed, Jul 31, 2024 at 12:01:12PM +0200, Christian Brauner wrote:
> It's currently possible to create pidfds for kthreads but it is unclear
> what that is supposed to mean. Until we have use-cases for it and we
> figured out what behavior we want block the creation of pidfds for
> kthreads.
>
> Fixes: 32fcb426ec00 ("pid: add pidfd_open()")
> Cc: stable@vger.kernel.org
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> ---
> kernel/fork.c | 25 ++++++++++++++++++++++---
> 1 file changed, 22 insertions(+), 3 deletions(-)
Unfortunately this commit broke systemd-shutdown's ability to kill processes,
which makes some filesystems no longer get unmounted at shutdown.
It looks like systemd-shutdown relies on being able to create a pidfd for any
process listed in /proc (even a kthread), and if it gets EINVAL it treats it a
fatal error and stops looking for more processes...
This is what shows up in the system log:
systemd[1]: Shutting down.
systemd-shutdown[1]: Syncing filesystems and block devices.
systemd-shutdown[1]: Sending SIGTERM to remaining processes...
systemd-shutdown[1]: Failed to enumerate /proc/: Invalid argument
systemd-shutdown[1]: Sending SIGKILL to remaining processes...
systemd-shutdown[1]: Failed to enumerate /proc/: Invalid argument
systemd-shutdown[1]: Unmounting file systems.
(sd-umount)[17359]: Unmounting '/run/credentials/systemd-vconsole-setup.service'.
(sd-umount)[17360]: Unmounting '/run/credentials/systemd-journald.service'.
(sd-remount)[17361]: Remounting '/' read-only with options ''.
(sd-remount)[17361]: Failed to remount '/' read-only: Device or resource busy
(sd-remount)[17362]: Remounting '/' read-only with options ''.
(sd-remount)[17362]: Failed to remount '/' read-only: Device or resource busy
systemd-shutdown[1]: Not all file systems unmounted, 1 left.
- Eric
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] pidfd: prevent creation of pidfds for kthreads
2024-08-18 3:58 ` Eric Biggers
@ 2024-08-19 8:41 ` Christian Brauner
2024-08-20 19:34 ` Eric Biggers
2024-08-23 5:23 ` Linux regression tracking (Thorsten Leemhuis)
0 siblings, 2 replies; 13+ messages in thread
From: Christian Brauner @ 2024-08-19 8:41 UTC (permalink / raw)
To: Eric Biggers
Cc: linux-fsdevel, linux-kernel, Oleg Nesterov, Aleksa Sarai,
Tycho Andersen, Daan De Meyer, Tejun Heo, stable
[-- Attachment #1: Type: text/plain, Size: 1252 bytes --]
On Sat, Aug 17, 2024 at 08:58:18PM GMT, Eric Biggers wrote:
> Hi Christian,
>
> On Wed, Jul 31, 2024 at 12:01:12PM +0200, Christian Brauner wrote:
> > It's currently possible to create pidfds for kthreads but it is unclear
> > what that is supposed to mean. Until we have use-cases for it and we
> > figured out what behavior we want block the creation of pidfds for
> > kthreads.
> >
> > Fixes: 32fcb426ec00 ("pid: add pidfd_open()")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > ---
> > kernel/fork.c | 25 ++++++++++++++++++++++---
> > 1 file changed, 22 insertions(+), 3 deletions(-)
>
> Unfortunately this commit broke systemd-shutdown's ability to kill processes,
> which makes some filesystems no longer get unmounted at shutdown.
>
> It looks like systemd-shutdown relies on being able to create a pidfd for any
> process listed in /proc (even a kthread), and if it gets EINVAL it treats it a
> fatal error and stops looking for more processes...
Thanks for the report!
I talked to Daan De Meyer who made that change and he said that this
must a systemd version that hasn't gotten his fixes yet. In any case, if
this causes regression then I'll revert it right now. See the appended
revert.
[-- Attachment #2: 0001-Revert-pidfd-prevent-creation-of-pidfds-for-kthreads.patch --]
[-- Type: text/x-diff, Size: 2038 bytes --]
From 780d60bac21ebee5c2465d21fe51b67bb9b054db Mon Sep 17 00:00:00 2001
From: Christian Brauner <brauner@kernel.org>
Date: Mon, 19 Aug 2024 10:38:23 +0200
Subject: [PATCH] Revert "pidfd: prevent creation of pidfds for kthreads"
This reverts commit 3b5bbe798b2451820e74243b738268f51901e7d0.
Eric reported that systemd-shutdown gets broken by blocking the creating
of pidfds for kthreads as older versions seems to rely on being able to
create a pidfd for any process in /proc.
Reported-by: Eric Biggers <ebiggers@kernel.org>
Link: https://lore.kernel.org/r/20240818035818.GA1929@sol.localdomain
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
kernel/fork.c | 25 +++----------------------
1 file changed, 3 insertions(+), 22 deletions(-)
diff --git a/kernel/fork.c b/kernel/fork.c
index 18bdc87209d0..cc760491f201 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2053,23 +2053,10 @@ static int __pidfd_prepare(struct pid *pid, unsigned int flags, struct file **re
*/
int pidfd_prepare(struct pid *pid, unsigned int flags, struct file **ret)
{
- if (!pid)
- return -EINVAL;
-
- scoped_guard(rcu) {
- struct task_struct *tsk;
-
- if (flags & PIDFD_THREAD)
- tsk = pid_task(pid, PIDTYPE_PID);
- else
- tsk = pid_task(pid, PIDTYPE_TGID);
- if (!tsk)
- return -EINVAL;
+ bool thread = flags & PIDFD_THREAD;
- /* Don't create pidfds for kernel threads for now. */
- if (tsk->flags & PF_KTHREAD)
- return -EINVAL;
- }
+ if (!pid || !pid_has_task(pid, thread ? PIDTYPE_PID : PIDTYPE_TGID))
+ return -EINVAL;
return __pidfd_prepare(pid, flags, ret);
}
@@ -2416,12 +2403,6 @@ __latent_entropy struct task_struct *copy_process(
if (clone_flags & CLONE_PIDFD) {
int flags = (clone_flags & CLONE_THREAD) ? PIDFD_THREAD : 0;
- /* Don't create pidfds for kernel threads for now. */
- if (args->kthread) {
- retval = -EINVAL;
- goto bad_fork_free_pid;
- }
-
/* Note that no task has been attached to @pid yet. */
retval = __pidfd_prepare(pid, flags, &pidfile);
if (retval < 0)
--
2.43.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] pidfd: prevent creation of pidfds for kthreads
2024-08-19 8:41 ` Christian Brauner
@ 2024-08-20 19:34 ` Eric Biggers
2024-08-21 7:41 ` Christian Brauner
2024-08-23 5:23 ` Linux regression tracking (Thorsten Leemhuis)
1 sibling, 1 reply; 13+ messages in thread
From: Eric Biggers @ 2024-08-20 19:34 UTC (permalink / raw)
To: Christian Brauner
Cc: linux-fsdevel, linux-kernel, Oleg Nesterov, Aleksa Sarai,
Tycho Andersen, Daan De Meyer, Tejun Heo, stable
On Mon, Aug 19, 2024 at 10:41:15AM +0200, Christian Brauner wrote:
> On Sat, Aug 17, 2024 at 08:58:18PM GMT, Eric Biggers wrote:
> > Hi Christian,
> >
> > On Wed, Jul 31, 2024 at 12:01:12PM +0200, Christian Brauner wrote:
> > > It's currently possible to create pidfds for kthreads but it is unclear
> > > what that is supposed to mean. Until we have use-cases for it and we
> > > figured out what behavior we want block the creation of pidfds for
> > > kthreads.
> > >
> > > Fixes: 32fcb426ec00 ("pid: add pidfd_open()")
> > > Cc: stable@vger.kernel.org
> > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > ---
> > > kernel/fork.c | 25 ++++++++++++++++++++++---
> > > 1 file changed, 22 insertions(+), 3 deletions(-)
> >
> > Unfortunately this commit broke systemd-shutdown's ability to kill processes,
> > which makes some filesystems no longer get unmounted at shutdown.
> >
> > It looks like systemd-shutdown relies on being able to create a pidfd for any
> > process listed in /proc (even a kthread), and if it gets EINVAL it treats it a
> > fatal error and stops looking for more processes...
>
> Thanks for the report!
> I talked to Daan De Meyer who made that change and he said that this
> must a systemd version that hasn't gotten his fixes yet. In any case, if
> this causes regression then I'll revert it right now. See the appended
> revert.
Thanks for queueing up a revert.
This was on systemd 256.4 which was released less than a month ago.
I'm not sure what systemd fix you are talking about. Looking at killall() in
src/shared/killall.c on the latest "main" branch of systemd, it calls
proc_dir_read_pidref() => pidref_set_pid() => pidfd_open(), and EINVAL gets
passed back up to killall() and treated as a fatal error. ignore_proc() skips
kernel threads but is executed too late. I didn't test it, so I could be wrong,
but based on the code it does not appear to be fixed.
- Erici
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] pidfd: prevent creation of pidfds for kthreads
2024-08-20 19:34 ` Eric Biggers
@ 2024-08-21 7:41 ` Christian Brauner
2024-08-21 7:47 ` Daan De Meyer
0 siblings, 1 reply; 13+ messages in thread
From: Christian Brauner @ 2024-08-21 7:41 UTC (permalink / raw)
To: Eric Biggers
Cc: linux-fsdevel, linux-kernel, Oleg Nesterov, Aleksa Sarai,
Tycho Andersen, Daan De Meyer, Tejun Heo, stable
On Tue, Aug 20, 2024 at 12:34:14PM GMT, Eric Biggers wrote:
> On Mon, Aug 19, 2024 at 10:41:15AM +0200, Christian Brauner wrote:
> > On Sat, Aug 17, 2024 at 08:58:18PM GMT, Eric Biggers wrote:
> > > Hi Christian,
> > >
> > > On Wed, Jul 31, 2024 at 12:01:12PM +0200, Christian Brauner wrote:
> > > > It's currently possible to create pidfds for kthreads but it is unclear
> > > > what that is supposed to mean. Until we have use-cases for it and we
> > > > figured out what behavior we want block the creation of pidfds for
> > > > kthreads.
> > > >
> > > > Fixes: 32fcb426ec00 ("pid: add pidfd_open()")
> > > > Cc: stable@vger.kernel.org
> > > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > > ---
> > > > kernel/fork.c | 25 ++++++++++++++++++++++---
> > > > 1 file changed, 22 insertions(+), 3 deletions(-)
> > >
> > > Unfortunately this commit broke systemd-shutdown's ability to kill processes,
> > > which makes some filesystems no longer get unmounted at shutdown.
> > >
> > > It looks like systemd-shutdown relies on being able to create a pidfd for any
> > > process listed in /proc (even a kthread), and if it gets EINVAL it treats it a
> > > fatal error and stops looking for more processes...
> >
> > Thanks for the report!
> > I talked to Daan De Meyer who made that change and he said that this
> > must a systemd version that hasn't gotten his fixes yet. In any case, if
> > this causes regression then I'll revert it right now. See the appended
> > revert.
>
> Thanks for queueing up a revert.
>
> This was on systemd 256.4 which was released less than a month ago.
>
> I'm not sure what systemd fix you are talking about. Looking at killall() in
> src/shared/killall.c on the latest "main" branch of systemd, it calls
> proc_dir_read_pidref() => pidref_set_pid() => pidfd_open(), and EINVAL gets
> passed back up to killall() and treated as a fatal error. ignore_proc() skips
> kernel threads but is executed too late. I didn't test it, so I could be wrong,
> but based on the code it does not appear to be fixed.
Yeah, I think you're right. What they fixed is
ead48ec35c86 ("cgroup-util: Don't try to open pidfd for kernel threads")
when reading pids from cgroup.procs. Daan is currently prepping a fix
for reading pids from /proc as well.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] pidfd: prevent creation of pidfds for kthreads
2024-08-21 7:41 ` Christian Brauner
@ 2024-08-21 7:47 ` Daan De Meyer
0 siblings, 0 replies; 13+ messages in thread
From: Daan De Meyer @ 2024-08-21 7:47 UTC (permalink / raw)
To: Christian Brauner
Cc: Eric Biggers, linux-fsdevel, linux-kernel, Oleg Nesterov,
Aleksa Sarai, Tycho Andersen, Tejun Heo, stable
Hi,
I just prepped https://github.com/systemd/systemd/pull/34058 which
will apply the same fix from ead48ec35c86 ("cgroup-util: Don't try to
open pidfd for kernel threads") for pids read from /proc as well.
Cheers,
Daan
On Wed, 21 Aug 2024 at 09:41, Christian Brauner <brauner@kernel.org> wrote:
>
> On Tue, Aug 20, 2024 at 12:34:14PM GMT, Eric Biggers wrote:
> > On Mon, Aug 19, 2024 at 10:41:15AM +0200, Christian Brauner wrote:
> > > On Sat, Aug 17, 2024 at 08:58:18PM GMT, Eric Biggers wrote:
> > > > Hi Christian,
> > > >
> > > > On Wed, Jul 31, 2024 at 12:01:12PM +0200, Christian Brauner wrote:
> > > > > It's currently possible to create pidfds for kthreads but it is unclear
> > > > > what that is supposed to mean. Until we have use-cases for it and we
> > > > > figured out what behavior we want block the creation of pidfds for
> > > > > kthreads.
> > > > >
> > > > > Fixes: 32fcb426ec00 ("pid: add pidfd_open()")
> > > > > Cc: stable@vger.kernel.org
> > > > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > > > ---
> > > > > kernel/fork.c | 25 ++++++++++++++++++++++---
> > > > > 1 file changed, 22 insertions(+), 3 deletions(-)
> > > >
> > > > Unfortunately this commit broke systemd-shutdown's ability to kill processes,
> > > > which makes some filesystems no longer get unmounted at shutdown.
> > > >
> > > > It looks like systemd-shutdown relies on being able to create a pidfd for any
> > > > process listed in /proc (even a kthread), and if it gets EINVAL it treats it a
> > > > fatal error and stops looking for more processes...
> > >
> > > Thanks for the report!
> > > I talked to Daan De Meyer who made that change and he said that this
> > > must a systemd version that hasn't gotten his fixes yet. In any case, if
> > > this causes regression then I'll revert it right now. See the appended
> > > revert.
> >
> > Thanks for queueing up a revert.
> >
> > This was on systemd 256.4 which was released less than a month ago.
> >
> > I'm not sure what systemd fix you are talking about. Looking at killall() in
> > src/shared/killall.c on the latest "main" branch of systemd, it calls
> > proc_dir_read_pidref() => pidref_set_pid() => pidfd_open(), and EINVAL gets
> > passed back up to killall() and treated as a fatal error. ignore_proc() skips
> > kernel threads but is executed too late. I didn't test it, so I could be wrong,
> > but based on the code it does not appear to be fixed.
>
> Yeah, I think you're right. What they fixed is
> ead48ec35c86 ("cgroup-util: Don't try to open pidfd for kernel threads")
> when reading pids from cgroup.procs. Daan is currently prepping a fix
> for reading pids from /proc as well.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] pidfd: prevent creation of pidfds for kthreads
2024-08-19 8:41 ` Christian Brauner
2024-08-20 19:34 ` Eric Biggers
@ 2024-08-23 5:23 ` Linux regression tracking (Thorsten Leemhuis)
2024-08-23 6:12 ` Greg KH
1 sibling, 1 reply; 13+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2024-08-23 5:23 UTC (permalink / raw)
To: stable, Greg KH, Sasha Levin
Cc: linux-fsdevel, linux-kernel, Oleg Nesterov, Aleksa Sarai,
Tycho Andersen, Daan De Meyer, Tejun Heo, Eric Biggers,
Christian Brauner, Linux kernel regressions list
On 19.08.24 10:41, Christian Brauner wrote:
> On Sat, Aug 17, 2024 at 08:58:18PM GMT, Eric Biggers wrote:
>> On Wed, Jul 31, 2024 at 12:01:12PM +0200, Christian Brauner wrote:
>>> It's currently possible to create pidfds for kthreads but it is unclear
>>> what that is supposed to mean. Until we have use-cases for it and we
>>> figured out what behavior we want block the creation of pidfds for
>>> kthreads.
>>>
>>> Fixes: 32fcb426ec00 ("pid: add pidfd_open()")
>>> Cc: stable@vger.kernel.org
>>> Signed-off-by: Christian Brauner <brauner@kernel.org>
>>> ---
>>> kernel/fork.c | 25 ++++++++++++++++++++++---
>>> 1 file changed, 22 insertions(+), 3 deletions(-)
>>
>> Unfortunately this commit broke systemd-shutdown's ability to kill processes,
>> which makes some filesystems no longer get unmounted at shutdown.
>>
>> It looks like systemd-shutdown relies on being able to create a pidfd for any
>> process listed in /proc (even a kthread), and if it gets EINVAL it treats it a
>> fatal error and stops looking for more processes...
>
> Thanks for the report!
> I talked to Daan De Meyer who made that change and he said that this
> must a systemd version that hasn't gotten his fixes yet. In any case, if
> this causes regression then I'll revert it right now. See the appended
> revert.
Greg, Sasha, JFYI in case you are not already aware of it: I by
chance[1] noticed that the patch Christian plans to revert is still in
the 6.10-queue. You might want to drop it (or apply the revert as well,
which is in -next, but not yet in mainline afaics).
Ciao, Thorsten
[1] after I noticed this thread a third time, this time via some SELinux
problems it caused in Fedora land:
https://bugzilla.redhat.com/show_bug.cgi?id=2306818
https://bugzilla.redhat.com/show_bug.cgi?id=2305270
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] pidfd: prevent creation of pidfds for kthreads
2024-08-23 5:23 ` Linux regression tracking (Thorsten Leemhuis)
@ 2024-08-23 6:12 ` Greg KH
0 siblings, 0 replies; 13+ messages in thread
From: Greg KH @ 2024-08-23 6:12 UTC (permalink / raw)
To: Linux regressions mailing list
Cc: stable, Sasha Levin, linux-fsdevel, linux-kernel, Oleg Nesterov,
Aleksa Sarai, Tycho Andersen, Daan De Meyer, Tejun Heo,
Eric Biggers, Christian Brauner
On Fri, Aug 23, 2024 at 07:23:12AM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 19.08.24 10:41, Christian Brauner wrote:
> > On Sat, Aug 17, 2024 at 08:58:18PM GMT, Eric Biggers wrote:
> >> On Wed, Jul 31, 2024 at 12:01:12PM +0200, Christian Brauner wrote:
> >>> It's currently possible to create pidfds for kthreads but it is unclear
> >>> what that is supposed to mean. Until we have use-cases for it and we
> >>> figured out what behavior we want block the creation of pidfds for
> >>> kthreads.
> >>>
> >>> Fixes: 32fcb426ec00 ("pid: add pidfd_open()")
> >>> Cc: stable@vger.kernel.org
> >>> Signed-off-by: Christian Brauner <brauner@kernel.org>
> >>> ---
> >>> kernel/fork.c | 25 ++++++++++++++++++++++---
> >>> 1 file changed, 22 insertions(+), 3 deletions(-)
> >>
> >> Unfortunately this commit broke systemd-shutdown's ability to kill processes,
> >> which makes some filesystems no longer get unmounted at shutdown.
> >>
> >> It looks like systemd-shutdown relies on being able to create a pidfd for any
> >> process listed in /proc (even a kthread), and if it gets EINVAL it treats it a
> >> fatal error and stops looking for more processes...
> >
> > Thanks for the report!
> > I talked to Daan De Meyer who made that change and he said that this
> > must a systemd version that hasn't gotten his fixes yet. In any case, if
> > this causes regression then I'll revert it right now. See the appended
> > revert.
>
> Greg, Sasha, JFYI in case you are not already aware of it: I by
> chance[1] noticed that the patch Christian plans to revert is still in
> the 6.10-queue. You might want to drop it (or apply the revert as well,
> which is in -next, but not yet in mainline afaics).
I was hoping it would get into Linus's tree "soon" so I could take the
revert too. As it's in -next, I'll grab it from there when I get a
chance.
thanks,
greg k-h
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2024-08-23 6:12 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-31 10:01 [PATCH] pidfd: prevent creation of pidfds for kthreads Christian Brauner
2024-07-31 14:51 ` Oleg Nesterov
2024-08-01 6:58 ` Christian Brauner
2024-08-01 8:01 ` Oleg Nesterov
2024-08-01 13:48 ` Christian Brauner
2024-08-01 13:59 ` Oleg Nesterov
2024-08-18 3:58 ` Eric Biggers
2024-08-19 8:41 ` Christian Brauner
2024-08-20 19:34 ` Eric Biggers
2024-08-21 7:41 ` Christian Brauner
2024-08-21 7:47 ` Daan De Meyer
2024-08-23 5:23 ` Linux regression tracking (Thorsten Leemhuis)
2024-08-23 6:12 ` Greg KH
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).