From: Oleg Nesterov <oleg@redhat.com>
To: Christian Brauner <brauner@kernel.org>
Cc: 高翔 <gaoxiang17@xiaomi.com>, "Al Viro" <viro@zeniv.linux.org.uk>,
"Xiang Gao" <gxxa03070307@gmail.com>,
"mjguzik@gmail.com" <mjguzik@gmail.com>,
"Liam.Howlett@oracle.com" <Liam.Howlett@oracle.com>,
"joel.granados@kernel.org" <joel.granados@kernel.org>,
"lorenzo.stoakes@oracle.com" <lorenzo.stoakes@oracle.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] pid: Add a judgment for ns null in pid_nr_ns
Date: Sun, 10 Aug 2025 17:42:55 +0200 [thread overview]
Message-ID: <20250810154255.GA11928@redhat.com> (raw)
In-Reply-To: <20250808-erleichtern-hinreichend-35335e412d9e@brauner>
On 08/08, Christian Brauner wrote:
>
> On Tue, Aug 05, 2025 at 02:43:01PM +0200, Oleg Nesterov wrote:
> >
> > After the quick grep I don't see the problematic users, but if a zombie
> > task T does task_ppid_nr_ns(current, NULL) the kernel can crash:
> >
> > - pid_alive() succeeds, the task is not reaped yet
> >
> > - the parent/debugger does wait()->release_task(T), T->thread_pid
> > is NULL now
> >
> > - T calls task_tgid_nr_ns()-> ... pid_nr_ns(ns => NULL) because
> > task_active_pid_ns(T) returns NULL
> >
> > Do you think this worth fixing?
>
> If it's not too much work and it is an actual real-world concern then I
> think we should fix it. But I trust your judgement here!
Well, I don't really know what to do ;)
Initially I was going to do something like below to "fix" task_tgid_nr_ns()
--- a/include/linux/pid.h
+++ b/include/linux/pid.h
@@ -299,7 +299,9 @@ static inline pid_t task_ppid_nr_ns(const struct task_struct *tsk, struct pid_na
pid_t pid = 0;
rcu_read_lock();
- if (pid_alive(tsk))
+ if (!ns)
+ ns = task_active_pid_ns(current);
+ if (ns && pid_alive(tsk))
pid = task_tgid_nr_ns(rcu_dereference(tsk->real_parent), ns);
rcu_read_unlock();
and then add WARN_ON() into pid_nr_ns(). But note that we should not check 'ns'
before 'pid', we need
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -491,7 +491,13 @@ pid_t pid_nr_ns(struct pid *pid, struct pid_namespace *ns)
struct upid *upid;
pid_t nr = 0;
- if (pid && ns->level <= pid->level) {
+ if (!pid)
+ return nr;
+
+ if (WARN_ON_ONCE(!ns)
+ return nr;
+
+ if (ns->level <= pid->level) {
upid = &pid->numbers[ns->level];
if (upid->ns == ns)
nr = upid->nr;
Think of task_pid_vnr(current) from a zombie, it should return 0 without warning.
But this looks overcomplicated to me. So I am going to send the patch which simply
changes __task_pid_nr_ns()
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -514,7 +514,8 @@ pid_t __task_pid_nr_ns(struct task_struct *task, enum pid_type type,
rcu_read_lock();
if (!ns)
ns = task_active_pid_ns(current);
- nr = pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns);
+ if (ns)
+ nr = pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns);
rcu_read_unlock();
and leave pid_nr_ns() alone.
People will do mistakes, it is better not to crash and just return 0 if, say,
a zombie task does task_pid_vnr(another_task). Currently it is not simple to
do this correctly because, again, the pid_alive(current) check can't help.
pid_nr_ns() is more "low-level", its users should know what they are doing.
Although the quick grep suggests many of cleanups. Say, why kvm_create_pit()
does get_pid? why can't it simply use task_tgid_vnr(current). Even do_getpgid()
and sys_getsid() look overcomplicated. Later.
Oleg.
next prev parent reply other threads:[~2025-08-10 15:44 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-02 2:21 [PATCH] pid: Add a judgment for ns null in pid_nr_ns Xiang Gao
2025-08-02 2:25 ` Al Viro
[not found] ` <15b18541f37447dd8d5dbd8012662f67@xiaomi.com>
2025-08-02 5:52 ` 答复: [External Mail]Re: " Al Viro
2025-08-02 5:54 ` Al Viro
[not found] ` <c7968242db914979953277226fe55fc8@xiaomi.com>
2025-08-02 8:04 ` 答复: " Al Viro
2025-08-02 8:45 ` Oleg Nesterov
[not found] ` <80be47cb31d14ffc9f9a7d8d4408ab0a@xiaomi.com>
2025-08-04 11:49 ` Oleg Nesterov
2025-08-04 12:14 ` Christian Brauner
2025-08-04 12:44 ` Oleg Nesterov
2025-08-05 12:43 ` Oleg Nesterov
2025-08-08 14:56 ` Christian Brauner
2025-08-10 15:42 ` Oleg Nesterov [this message]
[not found] ` <aa5272ddcec944e2a35ca7104f6a86bf@xiaomi.com>
2025-08-05 19:43 ` 答复: [External Mail]Re: " Oleg Nesterov
2025-08-08 14:54 ` Christian Brauner
2025-08-02 8:43 ` Oleg Nesterov
2025-08-10 17:36 ` [PATCH 1/4] pid: make __task_pid_nr_ns(ns => NULL) safe for zombie callers Oleg Nesterov
2025-08-10 17:36 ` [PATCH 2/4] pid: introduce task_ppid_vnr() Oleg Nesterov
2025-08-10 17:36 ` [PATCH 3/4] pid: change bacct_add_tsk() to use task_ppid_nr_ns() Oleg Nesterov
2025-08-10 17:36 ` [PATCH 4/4] pid: change task_state() " Oleg Nesterov
2025-08-19 11:40 ` [PATCH] pid: Add a judgment for ns null in pid_nr_ns Christian Brauner
2025-08-19 14:25 ` Oleg Nesterov
2025-09-01 15:30 ` Oleg Nesterov
2025-09-01 15:44 ` Mateusz Guzik
2025-09-01 15:55 ` Mateusz Guzik
2025-09-02 14:37 ` Oleg Nesterov
2026-01-04 7:25 ` Qing Wang
-- strict thread matches above, loose matches on Subject: below --
2025-02-11 6:17 Xiang Gao
2025-02-11 6:49 ` Baoquan He
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250810154255.GA11928@redhat.com \
--to=oleg@redhat.com \
--cc=Liam.Howlett@oracle.com \
--cc=brauner@kernel.org \
--cc=gaoxiang17@xiaomi.com \
--cc=gxxa03070307@gmail.com \
--cc=joel.granados@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=mjguzik@gmail.com \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.