From: Oleg Nesterov <oleg@redhat.com>
To: Christian Brauner <brauner@kernel.org>
Cc: linux-fsdevel@vger.kernel.org, Jeff Layton <jlayton@kernel.org>,
Lennart Poettering <lennart@poettering.net>,
Daan De Meyer <daan.j.demeyer@gmail.com>,
Mike Yuan <me@yhndnzj.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC 3/4] pidfd: improve uapi when task isn't found
Date: Fri, 4 Apr 2025 14:37:38 +0200 [thread overview]
Message-ID: <20250404123737.GC3720@redhat.com> (raw)
In-Reply-To: <20250403-work-pidfd-fixes-v1-3-a123b6ed6716@kernel.org>
On 04/03, Christian Brauner wrote:
>
> We currently report EINVAL whenever a struct pid has no tasked attached
> anymore thereby conflating two concepts:
>
> (1) The task has already been reaped.
> (2) The caller requested a pidfd for a thread-group leader but the pid
> actually references a struct pid that isn't used as a thread-group
> leader.
>
> This is causing issues for non-threaded workloads as in [1].
>
> This patch tries to allow userspace to distinguish between (1) and (2).
> This is racy of course but that shouldn't matter.
>
> Link: https://github.com/systemd/systemd/pull/36982 [1]
> Signed-off-by: Christian Brauner <brauner@kernel.org>
For this series:
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
But I have a couple of cosmetic nits...
> int pidfd_prepare(struct pid *pid, unsigned int flags, struct file **ret)
> {
> - bool thread = flags & PIDFD_THREAD;
> + int err = 0;
>
> - if (!pid_has_task(pid, thread ? PIDTYPE_PID : PIDTYPE_TGID))
> - return -EINVAL;
> + if (!(flags & PIDFD_THREAD)) {
> + /*
> + * If this is struct pid isn't used as a thread-group
> + * leader pid but the caller requested to create a
> + * thread-group leader pidfd then report ENOENT to the
> + * caller as a hint.
> + */
> + if (!pid_has_task(pid, PIDTYPE_TGID))
> + err = -ENOENT;
> + }
> +
> + /*
> + * If this wasn't a thread-group leader struct pid or the task
> + * got reaped in the meantime report -ESRCH to userspace.
> + *
> + * This is racy of course. This could've not been a thread-group
> + * leader struct pid and we set ENOENT above but in the meantime
> + * the task got reaped. Or there was a multi-threaded-exec by a
> + * subthread and we were a thread-group leader but now got
> + * killed.
The comment about the multi-threaded-exec looks a bit misleading to me.
If this pid is a group-leader-pid and we race with de_thread() which does
exchange_tids(tsk, leader);
transfer_pid(leader, tsk, PIDTYPE_TGID);
nothing "bad" can happen, both pid_has_task(PIDTYPE_PID) or
pid_has_task(PIDTYPE_TGID) can't return NULL during (or after) this
transition.
hlists_swap_heads_rcu() or hlist_replace_rcu() can't make
hlist_head->first == NULL during this transition...
Or I misunderstood the comment?
And... the code looks a bit overcomplicated to me, why not simply
int pidfd_prepare(struct pid *pid, unsigned int flags, struct file **ret)
{
if (!pid_has_task(pid, PIDTYPE_PID))
return -ESRCH;
if (!(flags & PIDFD_THREAD) && !pid_has_task(pid, PIDTYPE_TGID))
return -ENOENT;
return __pidfd_prepare(pid, flags, ret);
}
? Of course, the comments should stay.
But again, this is cosmetic/subjective, please do what you like more.
Oleg.
next prev parent reply other threads:[~2025-04-04 12:38 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-03 14:09 [PATCH RFC 0/4] pidfd: improve uapi when task isn't found Christian Brauner
2025-04-03 14:09 ` [PATCH RFC 1/4] selftests/pidfd: adapt to recent changes Christian Brauner
2025-04-03 14:09 ` [PATCH RFC 2/4] pidfd: remove unneeded NULL check from pidfd_prepare() Christian Brauner
2025-04-03 14:09 ` [PATCH RFC 3/4] pidfd: improve uapi when task isn't found Christian Brauner
2025-04-04 12:37 ` Oleg Nesterov [this message]
2025-04-04 13:38 ` Christian Brauner
2025-04-04 14:53 ` Oleg Nesterov
2025-04-09 15:38 ` Christian Brauner
2025-04-09 18:18 ` [RFC PATCH] pidfs: ensure consistent ENOENT/ESRCH reporting Christian Brauner
2025-04-09 18:40 ` Oleg Nesterov
2025-04-10 10:18 ` Oleg Nesterov
2025-04-10 10:43 ` Christian Brauner
2025-04-10 13:10 ` Oleg Nesterov
2025-04-10 20:05 ` Christian Brauner
2025-04-10 20:24 ` Christian Brauner
2025-04-11 11:08 ` Christian Brauner
2025-04-11 11:25 ` Oleg Nesterov
2025-04-11 11:41 ` Oleg Nesterov
2025-04-03 14:09 ` [PATCH RFC 4/4] selftest/pidfd: add test for thread-group leader pidfd open for thread Christian Brauner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250404123737.GC3720@redhat.com \
--to=oleg@redhat.com \
--cc=brauner@kernel.org \
--cc=daan.j.demeyer@gmail.com \
--cc=jlayton@kernel.org \
--cc=lennart@poettering.net \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=me@yhndnzj.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.