* [PATCH] pid: Add a judgment for ns null in pid_nr_ns
@ 2025-08-02 2:21 Xiang Gao
2025-08-02 2:25 ` Al Viro
` (5 more replies)
0 siblings, 6 replies; 25+ messages in thread
From: Xiang Gao @ 2025-08-02 2:21 UTC (permalink / raw)
To: brauner, oleg, mjguzik, Liam.Howlett, viro
Cc: joel.granados, lorenzo.stoakes, linux-kernel, gaoxiang17
From: gaoxiang17 <gaoxiang17@xiaomi.com>
__task_pid_nr_ns
ns = task_active_pid_ns(current);
pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns);
if (pid && ns->level <= pid->level) {
Sometimes null is returned for task_active_pid_ns. Then it will trigger kernel panic in pid_nr_ns.
For example:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000058
Mem abort info:
ESR = 0x0000000096000007
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x07: level 3 translation fault
Data abort info:
ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
CM = 0, WnR = 0, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
user pgtable: 4k pages, 39-bit VAs, pgdp=00000002175aa000
[0000000000000058] pgd=08000002175ab003, p4d=08000002175ab003, pud=08000002175ab003, pmd=08000002175be003, pte=0000000000000000
pstate: 834000c5 (Nzcv daIF +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
pc : __task_pid_nr_ns+0x74/0xd0
lr : __task_pid_nr_ns+0x24/0xd0
sp : ffffffc08001bd10
x29: ffffffc08001bd10 x28: ffffffd4422b2000 x27: 0000000000000001
x26: ffffffd442821168 x25: ffffffd442821000 x24: 00000f89492eab31
x23: 00000000000000c0 x22: ffffff806f5693c0 x21: ffffff806f5693c0
x20: 0000000000000001 x19: 0000000000000000 x18: 0000000000000000
x17: 00000000529c6ef0 x16: 00000000529c6ef0 x15: 00000000023a1adc
x14: 0000000000000003 x13: 00000000007ef6d8 x12: 001167c391c78800
x11: 00ffffffffffffff x10: 0000000000000000 x9 : 0000000000000001
x8 : ffffff80816fa3c0 x7 : 0000000000000000 x6 : 49534d702d535449
x5 : ffffffc080c4c2c0 x4 : ffffffd43ee128c8 x3 : ffffffd43ee124dc
x2 : 0000000000000000 x1 : 0000000000000001 x0 : ffffff806f5693c0
Call trace:
__task_pid_nr_ns+0x74/0xd0
...
__handle_irq_event_percpu+0xd4/0x284
handle_irq_event+0x48/0xb0
handle_fasteoi_irq+0x160/0x2d8
generic_handle_domain_irq+0x44/0x60
gic_handle_irq+0x4c/0x114
call_on_irq_stack+0x3c/0x74
do_interrupt_handler+0x4c/0x84
el1_interrupt+0x34/0x58
el1h_64_irq_handler+0x18/0x24
el1h_64_irq+0x68/0x6c
account_kernel_stack+0x60/0x144
exit_task_stack_account+0x1c/0x80
do_exit+0x7e4/0xaf8
...
get_signal+0x7bc/0x8d8
do_notify_resume+0x128/0x828
el0_svc+0x6c/0x70
el0t_64_sync_handler+0x68/0xbc
el0t_64_sync+0x1a8/0x1ac
Code: 35fffe54 911a02a8 f9400108 b4000128 (b9405a69)
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Oops: Fatal exception in interrupt
Signed-off-by: gaoxiang17 <gaoxiang17@xiaomi.com>
---
kernel/pid.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/pid.c b/kernel/pid.c
index c45a28c16cd2..14e908f2f0cb 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -491,7 +491,7 @@ pid_t pid_nr_ns(struct pid *pid, struct pid_namespace *ns)
struct upid *upid;
pid_t nr = 0;
- if (pid && ns->level <= pid->level) {
+ if (pid && ns && ns->level <= pid->level) {
upid = &pid->numbers[ns->level];
if (upid->ns == ns)
nr = upid->nr;
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH] pid: Add a judgment for ns null in pid_nr_ns
2025-08-02 2:21 [PATCH] pid: Add a judgment for ns null in pid_nr_ns Xiang Gao
@ 2025-08-02 2:25 ` Al Viro
2025-08-02 8:43 ` Oleg Nesterov
[not found] ` <15b18541f37447dd8d5dbd8012662f67@xiaomi.com>
2025-08-10 17:36 ` [PATCH 1/4] pid: make __task_pid_nr_ns(ns => NULL) safe for zombie callers Oleg Nesterov
` (4 subsequent siblings)
5 siblings, 2 replies; 25+ messages in thread
From: Al Viro @ 2025-08-02 2:25 UTC (permalink / raw)
To: Xiang Gao
Cc: brauner, oleg, mjguzik, Liam.Howlett, joel.granados,
lorenzo.stoakes, linux-kernel, gaoxiang17
On Sat, Aug 02, 2025 at 10:21:23AM +0800, Xiang Gao wrote:
> From: gaoxiang17 <gaoxiang17@xiaomi.com>
>
> __task_pid_nr_ns
> ns = task_active_pid_ns(current);
> pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns);
> if (pid && ns->level <= pid->level) {
>
> Sometimes null is returned for task_active_pid_ns. Then it will trigger kernel panic in pid_nr_ns.
In which conditions does that happen?
> __task_pid_nr_ns+0x74/0xd0
> ...
> __handle_irq_event_percpu+0xd4/0x284
> handle_irq_event+0x48/0xb0
Huh? Just what is it doing inside an IRQ handler?
Hell, the notion of current process is not usable in those,
let alone any properties of such...
Details, please.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 答复: [External Mail]Re: [PATCH] pid: Add a judgment for ns null in pid_nr_ns
[not found] ` <15b18541f37447dd8d5dbd8012662f67@xiaomi.com>
@ 2025-08-02 5:52 ` Al Viro
2025-08-02 5:54 ` Al Viro
[not found] ` <c7968242db914979953277226fe55fc8@xiaomi.com>
2025-08-02 8:45 ` Oleg Nesterov
1 sibling, 2 replies; 25+ messages in thread
From: Al Viro @ 2025-08-02 5:52 UTC (permalink / raw)
To: 高翔
Cc: Xiang Gao, brauner@kernel.org, oleg@redhat.com, mjguzik@gmail.com,
Liam.Howlett@oracle.com, joel.granados@kernel.org,
lorenzo.stoakes@oracle.com, linux-kernel@vger.kernel.org
On Sat, Aug 02, 2025 at 04:32:39AM +0000, 高翔 wrote:
> > __task_pid_nr_ns+0x74/0xd0
> > ...
> > __handle_irq_event_percpu+0xd4/0x284
> > handle_irq_event+0x48/0xb0
>
> Huh? Just what is it doing inside an IRQ handler?
> Hell, the notion of current process is not usable in those,
> let alone any properties of such...
>
> Details, please.
>
>
> Obtain the current process pid in the ufs compl command. This scene is possible.
Nothing of that sort in the mainline (or -next, for that matter), but...
> Call trace:
> __task_pid_nr_ns+0x74/0xd0
> get_common_info+0x9c/0x1c0 [io_xxx 39b55c95a0fe9416f7d7be396be0fd1d6f590f17]
> io_monitor_global_log+0x1a0/0x294 [io_xxx 39b55c95a0fe9416f7d7be396be0fd1d6f590f17]
> cb_android_vh_ufs_compl_command+0x304/0x578 [io_xxx 39b55c95a0fe9416f7d7be396be0fd1d6f590f17]
> __traceiter_android_vh_ufs_compl_command+0x54/0x78
> ...
> __handle_irq_event_percpu+0xd4/0x284
... for asynchronous callbacks if that sort there is no such thing as the current
process. At all. Using current, let alone looking for its PID, userns, etc. in
such context is a bug. Don't do it. It's a bug in your module.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 答复: [External Mail]Re: [PATCH] pid: Add a judgment for ns null in pid_nr_ns
2025-08-02 5:52 ` 答复: [External Mail]Re: " Al Viro
@ 2025-08-02 5:54 ` Al Viro
[not found] ` <c7968242db914979953277226fe55fc8@xiaomi.com>
1 sibling, 0 replies; 25+ messages in thread
From: Al Viro @ 2025-08-02 5:54 UTC (permalink / raw)
To: 高翔
Cc: Xiang Gao, brauner@kernel.org, oleg@redhat.com, mjguzik@gmail.com,
Liam.Howlett@oracle.com, joel.granados@kernel.org,
lorenzo.stoakes@oracle.com, linux-kernel@vger.kernel.org
On Sat, Aug 02, 2025 at 06:52:52AM +0100, Al Viro wrote:
> ... for asynchronous callbacks if that sort there is no such thing as the current
of, even...
> process. At all. Using current, let alone looking for its PID, userns, etc. in
> such context is a bug. Don't do it. It's a bug in your module.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 答复: 答复: [External Mail]Re: [PATCH] pid: Add a judgment for ns null in pid_nr_ns
[not found] ` <c7968242db914979953277226fe55fc8@xiaomi.com>
@ 2025-08-02 8:04 ` Al Viro
0 siblings, 0 replies; 25+ messages in thread
From: Al Viro @ 2025-08-02 8:04 UTC (permalink / raw)
To: 高翔
Cc: Xiang Gao, brauner@kernel.org, oleg@redhat.com, mjguzik@gmail.com,
Liam.Howlett@oracle.com, joel.granados@kernel.org,
lorenzo.stoakes@oracle.com, linux-kernel@vger.kernel.org
On Sat, Aug 02, 2025 at 06:38:20AM +0000, 高翔 wrote:
> I see, but does adding the judgment for ns make the code more robust?
It's an impossible condition for anything that runs in a context where
get_current() is allowed, so... I don't see any benefit in adding it.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] pid: Add a judgment for ns null in pid_nr_ns
2025-08-02 2:25 ` Al Viro
@ 2025-08-02 8:43 ` Oleg Nesterov
[not found] ` <15b18541f37447dd8d5dbd8012662f67@xiaomi.com>
1 sibling, 0 replies; 25+ messages in thread
From: Oleg Nesterov @ 2025-08-02 8:43 UTC (permalink / raw)
To: Al Viro
Cc: Xiang Gao, brauner, mjguzik, Liam.Howlett, joel.granados,
lorenzo.stoakes, linux-kernel, gaoxiang17
On 08/02, Al Viro wrote:
>
> In which conditions does that happen?
>
> > __task_pid_nr_ns+0x74/0xd0
> > ...
> > __handle_irq_event_percpu+0xd4/0x284
> > handle_irq_event+0x48/0xb0
>
> Huh? Just what is it doing inside an IRQ handler?
> Hell, the notion of current process is not usable in those,
> let alone any properties of such...
Well, at least get_curent() should work in this case...
But we can forget about IRQ. I guess the problem is that if the exiting
task has already passed exit_notify() and it was (auto)reaped, then, say,
task_pid_vnr(&init_task) will crash.
Not that I think the exiting task should do anything like this...
Oleg.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 答复: [External Mail]Re: [PATCH] pid: Add a judgment for ns null in pid_nr_ns
[not found] ` <15b18541f37447dd8d5dbd8012662f67@xiaomi.com>
2025-08-02 5:52 ` 答复: [External Mail]Re: " Al Viro
@ 2025-08-02 8:45 ` Oleg Nesterov
[not found] ` <80be47cb31d14ffc9f9a7d8d4408ab0a@xiaomi.com>
1 sibling, 1 reply; 25+ messages in thread
From: Oleg Nesterov @ 2025-08-02 8:45 UTC (permalink / raw)
To: 高翔
Cc: Al Viro, Xiang Gao, brauner@kernel.org, mjguzik@gmail.com,
Liam.Howlett@oracle.com, joel.granados@kernel.org,
lorenzo.stoakes@oracle.com, linux-kernel@vger.kernel.org
On 08/02, 高翔 wrote:
>
> Obtain the current process pid in the ufs compl command. This scene is possible.
How exactly your module tries to obtain the current process pid?
task_pid_vnr(current) should work and return 0 if the task was reaped.
Oleg.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] pid: Add a judgment for ns null in pid_nr_ns
[not found] ` <80be47cb31d14ffc9f9a7d8d4408ab0a@xiaomi.com>
@ 2025-08-04 11:49 ` Oleg Nesterov
2025-08-04 12:14 ` Christian Brauner
0 siblings, 1 reply; 25+ messages in thread
From: Oleg Nesterov @ 2025-08-04 11:49 UTC (permalink / raw)
To: 高翔
Cc: Al Viro, Xiang Gao, brauner@kernel.org, mjguzik@gmail.com,
Liam.Howlett@oracle.com, joel.granados@kernel.org,
lorenzo.stoakes@oracle.com, linux-kernel@vger.kernel.org
On 08/04, 高翔 wrote:
>
> struct task_struct *tsk = current;
>
> struct task_struct *parent;
>
> ...
>
> info->pid = task_pid_vnr(tsk);
> rcu_read_lock();
> parent = rcu_dereference(tsk->real_parent);
> get_task_struct(parent);
> rcu_read_unlock();
> info->ppid = task_tgid_vnr(parent);
> strncpy(info->ptask_name, parent->comm, TASK_COMM_LEN);
> put_task_struct(parent);
So I guess the kernel crashes when you try to obtain another process's pid, not
the current process's pid. This is was I suspected.
This code is buggy. tsk->real_parent points to nowhere if tsk = current was reaped.
rcu_read_lock() alone can't help. Even get_task_struct(parent) is not safe. And it
is not needed.
You need something like
info->pid = info->ppid = 0;
rcu_read_lock();
if (pid_alive(tsk)) {
info->pid = task_pid_vnr(tsk);
info->ppid = task_tgid_vnr(tsk->real_parent);
}
rcu_read_unlock();
Oleg.
>
>
>
> ________________________________
> 发件人: Oleg Nesterov <oleg@redhat.com>
> 发送时间: 2025年8月2日 16:45:26
> 收件人: 高翔
> 抄送: Al Viro; Xiang Gao; brauner@kernel.org; mjguzik@gmail.com; Liam.Howlett@oracle.com; joel.granados@kernel.org; lorenzo.stoakes@oracle.com; linux-kernel@vger.kernel.org
> 主题: Re: 答复: [External Mail]Re: [PATCH] pid: Add a judgment for ns null in pid_nr_ns
>
> [外部邮件] 此邮件来源于小米公司外部,请谨慎处理。若对邮件安全性存疑,请将邮件转发给misec@xiaomi.com进行反馈
>
> On 08/02, 高翔 wrote:
> >
> > Obtain the current process pid in the ufs compl command. This scene is possible.
>
> How exactly your module tries to obtain the current process pid?
>
> task_pid_vnr(current) should work and return 0 if the task was reaped.
>
> Oleg.
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] pid: Add a judgment for ns null in pid_nr_ns
2025-08-04 11:49 ` Oleg Nesterov
@ 2025-08-04 12:14 ` Christian Brauner
2025-08-04 12:44 ` Oleg Nesterov
[not found] ` <aa5272ddcec944e2a35ca7104f6a86bf@xiaomi.com>
0 siblings, 2 replies; 25+ messages in thread
From: Christian Brauner @ 2025-08-04 12:14 UTC (permalink / raw)
To: Oleg Nesterov
Cc: 高翔, Al Viro, Xiang Gao, mjguzik@gmail.com,
Liam.Howlett@oracle.com, joel.granados@kernel.org,
lorenzo.stoakes@oracle.com, linux-kernel@vger.kernel.org
On Mon, Aug 04, 2025 at 01:49:01PM +0200, Oleg Nesterov wrote:
> On 08/04, 高翔 wrote:
> >
> > struct task_struct *tsk = current;
> >
> > struct task_struct *parent;
> >
> > ...
> >
> > info->pid = task_pid_vnr(tsk);
> > rcu_read_lock();
> > parent = rcu_dereference(tsk->real_parent);
> > get_task_struct(parent);
> > rcu_read_unlock();
> > info->ppid = task_tgid_vnr(parent);
> > strncpy(info->ptask_name, parent->comm, TASK_COMM_LEN);
> > put_task_struct(parent);
>
> So I guess the kernel crashes when you try to obtain another process's pid, not
> the current process's pid. This is was I suspected.
>
> This code is buggy. tsk->real_parent points to nowhere if tsk = current was reaped.
> rcu_read_lock() alone can't help. Even get_task_struct(parent) is not safe. And it
> is not needed.
>
> You need something like
>
> info->pid = info->ppid = 0;
>
> rcu_read_lock();
> if (pid_alive(tsk)) {
> info->pid = task_pid_vnr(tsk);
> info->ppid = task_tgid_vnr(tsk->real_parent);
> }
> rcu_read_unlock();
I distinctly remember having seen a similar patch about 5 years ago that
did the exact same fix for some out-of-tree abuse:
20201201024811.GA72235@ip-172-31-62-0.us-west-2.compute.internal
Where in the kernel is that code supposed to live? Is this again an
out-of-tree module?
__task_pid_nr_ns+0x74/0xd0
get_common_info+0x9c/0x1c0 [io_xxx 39b55c95a0fe9416f7d7be396be0fd1d6f590f17]
io_monitor_global_log+0x1a0/0x294 [io_xxx 39b55c95a0fe9416f7d7be396be0fd1d6f590f17]
cb_android_vh_ufs_compl_command+0x304/0x578 [io_xxx 39b55c95a0fe9416f7d7be396be0fd1d6f590f17]
__traceiter_android_vh_ufs_compl_command+0x54/0x78
all looks to me like some out-of-tree stuff...
So no, not taking that.
The only thing I can see as a sensible addition is:
diff --git a/kernel/pid.c b/kernel/pid.c
index c45a28c16cd2..f27cbc208c5e 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -491,6 +491,9 @@ pid_t pid_nr_ns(struct pid *pid, struct pid_namespace *ns)
struct upid *upid;
pid_t nr = 0;
+ if (unlikely(WARN_ON_ONCE(!ns)))
+ return 0;
+
if (pid && ns->level <= pid->level) {
upid = &pid->numbers[ns->level];
if (upid->ns == ns)
So that we yell at abusers of pid_nr_ns(). But even that...
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH] pid: Add a judgment for ns null in pid_nr_ns
2025-08-04 12:14 ` Christian Brauner
@ 2025-08-04 12:44 ` Oleg Nesterov
2025-08-05 12:43 ` Oleg Nesterov
[not found] ` <aa5272ddcec944e2a35ca7104f6a86bf@xiaomi.com>
1 sibling, 1 reply; 25+ messages in thread
From: Oleg Nesterov @ 2025-08-04 12:44 UTC (permalink / raw)
To: Christian Brauner
Cc: 高翔, Al Viro, Xiang Gao, mjguzik@gmail.com,
Liam.Howlett@oracle.com, joel.granados@kernel.org,
lorenzo.stoakes@oracle.com, linux-kernel@vger.kernel.org
On 08/04, Christian Brauner wrote:
>
> On Mon, Aug 04, 2025 at 01:49:01PM +0200, Oleg Nesterov wrote:
> >
> > You need something like
> >
> > info->pid = info->ppid = 0;
> >
> > rcu_read_lock();
> > if (pid_alive(tsk)) {
> > info->pid = task_pid_vnr(tsk);
> > info->ppid = task_tgid_vnr(tsk->real_parent);
> > }
> > rcu_read_unlock();
>
> I distinctly remember having seen a similar patch about 5 years ago that
> did the exact same fix for some out-of-tree abuse:
> 20201201024811.GA72235@ip-172-31-62-0.us-west-2.compute.internal
>
> Where in the kernel is that code supposed to live? Is this again an
> out-of-tree module?
I don't know.
But. I need to re-check, but I just realized that pid_alive() can't
really help, tsk->thread_pid is not stable even if tsk == current.
This means that, say, task_ppid_nr_ns() is buggy.
I'll return to this tomorrow.
Oleg.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] pid: Add a judgment for ns null in pid_nr_ns
2025-08-04 12:44 ` Oleg Nesterov
@ 2025-08-05 12:43 ` Oleg Nesterov
2025-08-08 14:56 ` Christian Brauner
0 siblings, 1 reply; 25+ messages in thread
From: Oleg Nesterov @ 2025-08-05 12:43 UTC (permalink / raw)
To: Christian Brauner
Cc: 高翔, Al Viro, Xiang Gao, mjguzik@gmail.com,
Liam.Howlett@oracle.com, joel.granados@kernel.org,
lorenzo.stoakes@oracle.com, linux-kernel@vger.kernel.org
On 08/04, Oleg Nesterov wrote:
>
> But. I need to re-check, but I just realized that pid_alive() can't
> really help, tsk->thread_pid is not stable even if tsk == current.
>
> This means that, say, task_ppid_nr_ns() is buggy.
After the quick grep I don't see the problematic users, but if a zombie
task T does task_ppid_nr_ns(current, NULL) the kernel can crash:
- pid_alive() succeeds, the task is not reaped yet
- the parent/debugger does wait()->release_task(T), T->thread_pid
is NULL now
- T calls task_tgid_nr_ns()-> ... pid_nr_ns(ns => NULL) because
task_active_pid_ns(T) returns NULL
Do you think this worth fixing?
In any case, I think that task_state(), sched_show_task(), bacct_add_tsk()
should be changed to use task_ppid_nr_ns(tsk).
Oleg.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 答复: [External Mail]Re: [PATCH] pid: Add a judgment for ns null in pid_nr_ns
[not found] ` <aa5272ddcec944e2a35ca7104f6a86bf@xiaomi.com>
@ 2025-08-05 19:43 ` Oleg Nesterov
2025-08-08 14:54 ` Christian Brauner
0 siblings, 1 reply; 25+ messages in thread
From: Oleg Nesterov @ 2025-08-05 19:43 UTC (permalink / raw)
To: 高翔
Cc: Christian Brauner, Al Viro, Xiang Gao, mjguzik@gmail.com,
Liam.Howlett@oracle.com, joel.granados@kernel.org,
lorenzo.stoakes@oracle.com, linux-kernel@vger.kernel.org
On 08/05, 高翔 wrote:
>
> I also think "WARN ON ONCE" is quite sensible .
I am starting to agree, but lets wait for Christian.
Oleg.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 答复: [External Mail]Re: [PATCH] pid: Add a judgment for ns null in pid_nr_ns
2025-08-05 19:43 ` 答复: [External Mail]Re: " Oleg Nesterov
@ 2025-08-08 14:54 ` Christian Brauner
0 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2025-08-08 14:54 UTC (permalink / raw)
To: Oleg Nesterov
Cc: 高翔, Al Viro, Xiang Gao, mjguzik@gmail.com,
Liam.Howlett@oracle.com, joel.granados@kernel.org,
lorenzo.stoakes@oracle.com, linux-kernel@vger.kernel.org
On Tue, Aug 05, 2025 at 09:43:02PM +0200, Oleg Nesterov wrote:
> On 08/05, 高翔 wrote:
> >
> > I also think "WARN ON ONCE" is quite sensible .
>
> I am starting to agree, but lets wait for Christian.
Yeah, fine by me.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] pid: Add a judgment for ns null in pid_nr_ns
2025-08-05 12:43 ` Oleg Nesterov
@ 2025-08-08 14:56 ` Christian Brauner
2025-08-10 15:42 ` Oleg Nesterov
0 siblings, 1 reply; 25+ messages in thread
From: Christian Brauner @ 2025-08-08 14:56 UTC (permalink / raw)
To: Oleg Nesterov
Cc: 高翔, Al Viro, Xiang Gao, mjguzik@gmail.com,
Liam.Howlett@oracle.com, joel.granados@kernel.org,
lorenzo.stoakes@oracle.com, linux-kernel@vger.kernel.org
On Tue, Aug 05, 2025 at 02:43:01PM +0200, Oleg Nesterov wrote:
> On 08/04, Oleg Nesterov wrote:
> >
> > But. I need to re-check, but I just realized that pid_alive() can't
> > really help, tsk->thread_pid is not stable even if tsk == current.
> >
> > This means that, say, task_ppid_nr_ns() is buggy.
>
> After the quick grep I don't see the problematic users, but if a zombie
> task T does task_ppid_nr_ns(current, NULL) the kernel can crash:
>
> - pid_alive() succeeds, the task is not reaped yet
>
> - the parent/debugger does wait()->release_task(T), T->thread_pid
> is NULL now
>
> - T calls task_tgid_nr_ns()-> ... pid_nr_ns(ns => NULL) because
> task_active_pid_ns(T) returns NULL
>
> Do you think this worth fixing?
If it's not too much work and it is an actual real-world concern then I
think we should fix it. But I trust your judgement here!
> In any case, I think that task_state(), sched_show_task(), bacct_add_tsk()
> should be changed to use task_ppid_nr_ns(tsk).
Sounds good!
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] pid: Add a judgment for ns null in pid_nr_ns
2025-08-08 14:56 ` Christian Brauner
@ 2025-08-10 15:42 ` Oleg Nesterov
0 siblings, 0 replies; 25+ messages in thread
From: Oleg Nesterov @ 2025-08-10 15:42 UTC (permalink / raw)
To: Christian Brauner
Cc: 高翔, Al Viro, Xiang Gao, mjguzik@gmail.com,
Liam.Howlett@oracle.com, joel.granados@kernel.org,
lorenzo.stoakes@oracle.com, linux-kernel@vger.kernel.org
On 08/08, Christian Brauner wrote:
>
> On Tue, Aug 05, 2025 at 02:43:01PM +0200, Oleg Nesterov wrote:
> >
> > After the quick grep I don't see the problematic users, but if a zombie
> > task T does task_ppid_nr_ns(current, NULL) the kernel can crash:
> >
> > - pid_alive() succeeds, the task is not reaped yet
> >
> > - the parent/debugger does wait()->release_task(T), T->thread_pid
> > is NULL now
> >
> > - T calls task_tgid_nr_ns()-> ... pid_nr_ns(ns => NULL) because
> > task_active_pid_ns(T) returns NULL
> >
> > Do you think this worth fixing?
>
> If it's not too much work and it is an actual real-world concern then I
> think we should fix it. But I trust your judgement here!
Well, I don't really know what to do ;)
Initially I was going to do something like below to "fix" task_tgid_nr_ns()
--- a/include/linux/pid.h
+++ b/include/linux/pid.h
@@ -299,7 +299,9 @@ static inline pid_t task_ppid_nr_ns(const struct task_struct *tsk, struct pid_na
pid_t pid = 0;
rcu_read_lock();
- if (pid_alive(tsk))
+ if (!ns)
+ ns = task_active_pid_ns(current);
+ if (ns && pid_alive(tsk))
pid = task_tgid_nr_ns(rcu_dereference(tsk->real_parent), ns);
rcu_read_unlock();
and then add WARN_ON() into pid_nr_ns(). But note that we should not check 'ns'
before 'pid', we need
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -491,7 +491,13 @@ pid_t pid_nr_ns(struct pid *pid, struct pid_namespace *ns)
struct upid *upid;
pid_t nr = 0;
- if (pid && ns->level <= pid->level) {
+ if (!pid)
+ return nr;
+
+ if (WARN_ON_ONCE(!ns)
+ return nr;
+
+ if (ns->level <= pid->level) {
upid = &pid->numbers[ns->level];
if (upid->ns == ns)
nr = upid->nr;
Think of task_pid_vnr(current) from a zombie, it should return 0 without warning.
But this looks overcomplicated to me. So I am going to send the patch which simply
changes __task_pid_nr_ns()
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -514,7 +514,8 @@ pid_t __task_pid_nr_ns(struct task_struct *task, enum pid_type type,
rcu_read_lock();
if (!ns)
ns = task_active_pid_ns(current);
- nr = pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns);
+ if (ns)
+ nr = pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns);
rcu_read_unlock();
and leave pid_nr_ns() alone.
People will do mistakes, it is better not to crash and just return 0 if, say,
a zombie task does task_pid_vnr(another_task). Currently it is not simple to
do this correctly because, again, the pid_alive(current) check can't help.
pid_nr_ns() is more "low-level", its users should know what they are doing.
Although the quick grep suggests many of cleanups. Say, why kvm_create_pit()
does get_pid? why can't it simply use task_tgid_vnr(current). Even do_getpgid()
and sys_getsid() look overcomplicated. Later.
Oleg.
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 1/4] pid: make __task_pid_nr_ns(ns => NULL) safe for zombie callers
2025-08-02 2:21 [PATCH] pid: Add a judgment for ns null in pid_nr_ns Xiang Gao
2025-08-02 2:25 ` Al Viro
@ 2025-08-10 17:36 ` Oleg Nesterov
2025-08-10 17:36 ` [PATCH 2/4] pid: introduce task_ppid_vnr() Oleg Nesterov
` (3 subsequent siblings)
5 siblings, 0 replies; 25+ messages in thread
From: Oleg Nesterov @ 2025-08-10 17:36 UTC (permalink / raw)
To: Andrew Morton, Christian Brauner, 高翔
Cc: Al Viro, Mateusz Guzik, Xiang Gao, linux-kernel
task_pid_vnr(another_task) will crash if the caller was already reaped.
The pid_alive(current) check can't really help, the parent/debugger can
call release_task() right after this check.
This also means that even task_ppid_nr_ns(current, NULL) is not safe,
pid_alive() only ensures that it is safe to dereference ->real_parent.
Change __task_pid_nr_ns() to ensure ns != NULL.
Originally-by: 高翔 <gaoxiang17@xiaomi.com>
Link: https://lore.kernel.org/all/20250802022123.3536934-1-gxxa03070307@gmail.com/
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
kernel/pid.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/pid.c b/kernel/pid.c
index 8317bcbc7cf7..58d97a78f07e 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -514,7 +514,8 @@ pid_t __task_pid_nr_ns(struct task_struct *task, enum pid_type type,
rcu_read_lock();
if (!ns)
ns = task_active_pid_ns(current);
- nr = pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns);
+ if (ns)
+ nr = pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns);
rcu_read_unlock();
return nr;
--
2.25.1.362.g51ebf55
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 2/4] pid: introduce task_ppid_vnr()
2025-08-02 2:21 [PATCH] pid: Add a judgment for ns null in pid_nr_ns Xiang Gao
2025-08-02 2:25 ` Al Viro
2025-08-10 17:36 ` [PATCH 1/4] pid: make __task_pid_nr_ns(ns => NULL) safe for zombie callers Oleg Nesterov
@ 2025-08-10 17:36 ` Oleg Nesterov
2025-08-10 17:36 ` [PATCH 3/4] pid: change bacct_add_tsk() to use task_ppid_nr_ns() Oleg Nesterov
` (2 subsequent siblings)
5 siblings, 0 replies; 25+ messages in thread
From: Oleg Nesterov @ 2025-08-10 17:36 UTC (permalink / raw)
To: Andrew Morton, Christian Brauner, 高翔
Cc: Al Viro, Mateusz Guzik, Xiang Gao, linux-kernel
for consistency with other task_xid_vnr() helpers.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
fs/pidfs.c | 2 +-
include/linux/pid.h | 5 +++++
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/fs/pidfs.c b/fs/pidfs.c
index 4625e097e3a0..de5e71c2b49b 100644
--- a/fs/pidfs.c
+++ b/fs/pidfs.c
@@ -361,7 +361,7 @@ static long pidfd_info(struct file *file, unsigned int cmd, unsigned long arg)
* the fields are set correctly, or return ESRCH to avoid providing
* incomplete information. */
- kinfo.ppid = task_ppid_nr_ns(task, NULL);
+ kinfo.ppid = task_ppid_vnr(task);
kinfo.tgid = task_tgid_vnr(task);
kinfo.pid = task_pid_vnr(task);
kinfo.mask |= PIDFD_INFO_PID;
diff --git a/include/linux/pid.h b/include/linux/pid.h
index 453ae6d8a68d..4a733634c556 100644
--- a/include/linux/pid.h
+++ b/include/linux/pid.h
@@ -306,6 +306,11 @@ static inline pid_t task_ppid_nr_ns(const struct task_struct *tsk, struct pid_na
return pid;
}
+static inline pid_t task_ppid_vnr(const struct task_struct *tsk)
+{
+ return task_ppid_nr_ns(tsk, NULL);
+}
+
static inline pid_t task_ppid_nr(const struct task_struct *tsk)
{
return task_ppid_nr_ns(tsk, &init_pid_ns);
--
2.25.1.362.g51ebf55
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 3/4] pid: change bacct_add_tsk() to use task_ppid_nr_ns()
2025-08-02 2:21 [PATCH] pid: Add a judgment for ns null in pid_nr_ns Xiang Gao
` (2 preceding siblings ...)
2025-08-10 17:36 ` [PATCH 2/4] pid: introduce task_ppid_vnr() Oleg Nesterov
@ 2025-08-10 17:36 ` Oleg Nesterov
2025-08-10 17:36 ` [PATCH 4/4] pid: change task_state() " Oleg Nesterov
2025-08-19 11:40 ` [PATCH] pid: Add a judgment for ns null in pid_nr_ns Christian Brauner
5 siblings, 0 replies; 25+ messages in thread
From: Oleg Nesterov @ 2025-08-10 17:36 UTC (permalink / raw)
To: Andrew Morton, Christian Brauner, 高翔
Cc: Al Viro, Mateusz Guzik, Xiang Gao, linux-kernel
to simplify the code.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
kernel/tsacct.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/kernel/tsacct.c b/kernel/tsacct.c
index 16b283f9d831..6ea2f6363b90 100644
--- a/kernel/tsacct.c
+++ b/kernel/tsacct.c
@@ -57,12 +57,11 @@ void bacct_add_tsk(struct user_namespace *user_ns,
stats->ac_sched = tsk->policy;
stats->ac_pid = task_pid_nr_ns(tsk, pid_ns);
stats->ac_tgid = task_tgid_nr_ns(tsk, pid_ns);
+ stats->ac_ppid = task_ppid_nr_ns(tsk, pid_ns);
rcu_read_lock();
tcred = __task_cred(tsk);
stats->ac_uid = from_kuid_munged(user_ns, tcred->uid);
stats->ac_gid = from_kgid_munged(user_ns, tcred->gid);
- stats->ac_ppid = pid_alive(tsk) ?
- task_tgid_nr_ns(rcu_dereference(tsk->real_parent), pid_ns) : 0;
rcu_read_unlock();
task_cputime(tsk, &utime, &stime);
--
2.25.1.362.g51ebf55
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 4/4] pid: change task_state() to use task_ppid_nr_ns()
2025-08-02 2:21 [PATCH] pid: Add a judgment for ns null in pid_nr_ns Xiang Gao
` (3 preceding siblings ...)
2025-08-10 17:36 ` [PATCH 3/4] pid: change bacct_add_tsk() to use task_ppid_nr_ns() Oleg Nesterov
@ 2025-08-10 17:36 ` Oleg Nesterov
2025-08-19 11:40 ` [PATCH] pid: Add a judgment for ns null in pid_nr_ns Christian Brauner
5 siblings, 0 replies; 25+ messages in thread
From: Oleg Nesterov @ 2025-08-10 17:36 UTC (permalink / raw)
To: Andrew Morton, Christian Brauner, 高翔
Cc: Al Viro, Mateusz Guzik, Xiang Gao, linux-kernel
to simplify the code.
Note that only tpid and max_fds really need rcu_read_lock(), we could move
task_ppid_nr_ns/task_tgid_nr_ns/task_numa_group_id/get_task_cred outside of
rcu read section.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
fs/proc/array.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/fs/proc/array.c b/fs/proc/array.c
index d6a0369caa93..69269745d73b 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -157,13 +157,11 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
unsigned int max_fds = 0;
rcu_read_lock();
- ppid = pid_alive(p) ?
- task_tgid_nr_ns(rcu_dereference(p->real_parent), ns) : 0;
-
tracer = ptrace_parent(p);
if (tracer)
tpid = task_pid_nr_ns(tracer, ns);
+ ppid = task_ppid_nr_ns(p, ns);
tgid = task_tgid_nr_ns(p, ns);
ngid = task_numa_group_id(p);
cred = get_task_cred(p);
--
2.25.1.362.g51ebf55
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH] pid: Add a judgment for ns null in pid_nr_ns
2025-08-02 2:21 [PATCH] pid: Add a judgment for ns null in pid_nr_ns Xiang Gao
` (4 preceding siblings ...)
2025-08-10 17:36 ` [PATCH 4/4] pid: change task_state() " Oleg Nesterov
@ 2025-08-19 11:40 ` Christian Brauner
2025-08-19 14:25 ` Oleg Nesterov
5 siblings, 1 reply; 25+ messages in thread
From: Christian Brauner @ 2025-08-19 11:40 UTC (permalink / raw)
To: Xiang Gao
Cc: Christian Brauner, joel.granados, lorenzo.stoakes, linux-kernel,
gaoxiang17, oleg, mjguzik, Liam.Howlett, viro
On Sat, 02 Aug 2025 10:21:23 +0800, Xiang Gao wrote:
> __task_pid_nr_ns
> ns = task_active_pid_ns(current);
> pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns);
> if (pid && ns->level <= pid->level) {
>
> Sometimes null is returned for task_active_pid_ns. Then it will trigger kernel panic in pid_nr_ns.
>
> [...]
Applied to the vfs-6.18.pidfs branch of the vfs/vfs.git tree.
Patches in the vfs-6.18.pidfs branch should appear in linux-next soon.
Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.
It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.
Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.
tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs-6.18.pidfs
[1/1] pid: Add a judgment for ns null in pid_nr_ns
https://git.kernel.org/vfs/vfs/c/006568ab4c5c
[1/4] pid: make __task_pid_nr_ns(ns => NULL) safe for zombie callers
https://git.kernel.org/vfs/vfs/c/abdfd4948e45
[3/4] pid: change bacct_add_tsk() to use task_ppid_nr_ns()
https://git.kernel.org/vfs/vfs/c/b1afcaddd6c8
[4/4] pid: change task_state() to use task_ppid_nr_ns()
https://git.kernel.org/vfs/vfs/c/d00f5232851c
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] pid: Add a judgment for ns null in pid_nr_ns
2025-08-19 11:40 ` [PATCH] pid: Add a judgment for ns null in pid_nr_ns Christian Brauner
@ 2025-08-19 14:25 ` Oleg Nesterov
2025-09-01 15:30 ` Oleg Nesterov
0 siblings, 1 reply; 25+ messages in thread
From: Oleg Nesterov @ 2025-08-19 14:25 UTC (permalink / raw)
To: Christian Brauner
Cc: Xiang Gao, joel.granados, lorenzo.stoakes, linux-kernel,
gaoxiang17, mjguzik, Liam.Howlett, viro
On 08/19, Christian Brauner wrote:
>
> On Sat, 02 Aug 2025 10:21:23 +0800, Xiang Gao wrote:
> > __task_pid_nr_ns
> > ns = task_active_pid_ns(current);
> > pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns);
> > if (pid && ns->level <= pid->level) {
> >
> > Sometimes null is returned for task_active_pid_ns. Then it will trigger kernel panic in pid_nr_ns.
> >
> > [...]
>
> Applied to the vfs-6.18.pidfs branch of the vfs/vfs.git tree.
> Patches in the vfs-6.18.pidfs branch should appear in linux-next soon.
>
> Please report any outstanding bugs that were missed during review in a
> new review to the original patch series allowing us to drop it.
>
> It's encouraged to provide Acked-bys and Reviewed-bys even though the
> patch has now been applied. If possible patch trailers will be updated.
>
> Note that commit hashes shown below are subject to change due to rebase,
> trailer updates or similar. If in doubt, please check the listed branch.
>
> tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
> branch: vfs-6.18.pidfs
>
> [1/1] pid: Add a judgment for ns null in pid_nr_ns
> https://git.kernel.org/vfs/vfs/c/006568ab4c5c
> [1/4] pid: make __task_pid_nr_ns(ns => NULL) safe for zombie callers
> https://git.kernel.org/vfs/vfs/c/abdfd4948e45
> [3/4] pid: change bacct_add_tsk() to use task_ppid_nr_ns()
> https://git.kernel.org/vfs/vfs/c/b1afcaddd6c8
> [4/4] pid: change task_state() to use task_ppid_nr_ns()
> https://git.kernel.org/vfs/vfs/c/d00f5232851c
Hmm. The 1st patch adds the ns != NULL check into pid_nr_ns().
This means that "[1/4] pid: make __task_pid_nr_ns(ns => NULL) safe for zombie callers"
(commit abdfd4948e45c51b19 in vfs-6.18.pidfs) is not needed.
OTOH... You didn't take
[PATCH 2/4] pid: introduce task_ppid_vnr()
https://lore.kernel.org/all/20250810173610.GA19995@redhat.com/
currently in -mm tree. It is purely cosmetic, but imo makes sense.
Oleg.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] pid: Add a judgment for ns null in pid_nr_ns
2025-08-19 14:25 ` Oleg Nesterov
@ 2025-09-01 15:30 ` Oleg Nesterov
2025-09-01 15:44 ` Mateusz Guzik
0 siblings, 1 reply; 25+ messages in thread
From: Oleg Nesterov @ 2025-09-01 15:30 UTC (permalink / raw)
To: Christian Brauner
Cc: Xiang Gao, joel.granados, lorenzo.stoakes, linux-kernel,
gaoxiang17, mjguzik, Liam.Howlett, viro
ping...
We need either
[1/1] pid: Add a judgment for ns null in pid_nr_ns
https://git.kernel.org/vfs/vfs/c/006568ab4c5c
or
[1/4] pid: make __task_pid_nr_ns(ns => NULL) safe for zombie callers
https://git.kernel.org/vfs/vfs/c/abdfd4948e45
in any case imo the changelog should explain why do we care
to check ns != NUll, "Sometimes null is returned for task_active_pid_ns"
doesn't look like a good explanation...
Oleg.
On 08/19, Oleg Nesterov wrote:
>
> On 08/19, Christian Brauner wrote:
> >
> > On Sat, 02 Aug 2025 10:21:23 +0800, Xiang Gao wrote:
> > > __task_pid_nr_ns
> > > ns = task_active_pid_ns(current);
> > > pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns);
> > > if (pid && ns->level <= pid->level) {
> > >
> > > Sometimes null is returned for task_active_pid_ns. Then it will trigger kernel panic in pid_nr_ns.
> > >
> > > [...]
> >
> > Applied to the vfs-6.18.pidfs branch of the vfs/vfs.git tree.
> > Patches in the vfs-6.18.pidfs branch should appear in linux-next soon.
> >
> > Please report any outstanding bugs that were missed during review in a
> > new review to the original patch series allowing us to drop it.
> >
> > It's encouraged to provide Acked-bys and Reviewed-bys even though the
> > patch has now been applied. If possible patch trailers will be updated.
> >
> > Note that commit hashes shown below are subject to change due to rebase,
> > trailer updates or similar. If in doubt, please check the listed branch.
> >
> > tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
> > branch: vfs-6.18.pidfs
> >
> > [1/1] pid: Add a judgment for ns null in pid_nr_ns
> > https://git.kernel.org/vfs/vfs/c/006568ab4c5c
> > [1/4] pid: make __task_pid_nr_ns(ns => NULL) safe for zombie callers
> > https://git.kernel.org/vfs/vfs/c/abdfd4948e45
> > [3/4] pid: change bacct_add_tsk() to use task_ppid_nr_ns()
> > https://git.kernel.org/vfs/vfs/c/b1afcaddd6c8
> > [4/4] pid: change task_state() to use task_ppid_nr_ns()
> > https://git.kernel.org/vfs/vfs/c/d00f5232851c
>
> Hmm. The 1st patch adds the ns != NULL check into pid_nr_ns().
>
> This means that "[1/4] pid: make __task_pid_nr_ns(ns => NULL) safe for zombie callers"
> (commit abdfd4948e45c51b19 in vfs-6.18.pidfs) is not needed.
>
> OTOH... You didn't take
>
> [PATCH 2/4] pid: introduce task_ppid_vnr()
> https://lore.kernel.org/all/20250810173610.GA19995@redhat.com/
>
> currently in -mm tree. It is purely cosmetic, but imo makes sense.
>
> Oleg.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] pid: Add a judgment for ns null in pid_nr_ns
2025-09-01 15:30 ` Oleg Nesterov
@ 2025-09-01 15:44 ` Mateusz Guzik
2025-09-01 15:55 ` Mateusz Guzik
0 siblings, 1 reply; 25+ messages in thread
From: Mateusz Guzik @ 2025-09-01 15:44 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Christian Brauner, Xiang Gao, joel.granados, lorenzo.stoakes,
linux-kernel, gaoxiang17, Liam.Howlett, viro
On Mon, Sep 1, 2025 at 5:32 PM Oleg Nesterov <oleg@redhat.com> wrote:
>
> ping...
>
> We need either
>
> [1/1] pid: Add a judgment for ns null in pid_nr_ns
> https://git.kernel.org/vfs/vfs/c/006568ab4c5c
>
> or
>
> [1/4] pid: make __task_pid_nr_ns(ns => NULL) safe for zombie callers
> https://git.kernel.org/vfs/vfs/c/abdfd4948e45
>
> in any case imo the changelog should explain why do we care
> to check ns != NUll, "Sometimes null is returned for task_active_pid_ns"
> doesn't look like a good explanation...
>
Since I caught this a stray patchset I'll bite: given the totally
arbitrary task struct in an irq handler, why even allow querying it
from that level? The task is literally random, and even possibly dead
as in this crash report.
To my reading the code which runs into woes here is private to a
vendor. Maybe I missed something, but I don't see a justification for
querying the task in an irq handler to begin with (and per above I
don't understand what the point is).
That is to say, if this was up to me, I would at best assert we are in
the process context and that ns is not NULL. As a result I would very
much *ban* the call as reported here, unless there is a good reason to
make it work (what is it?).
That's my side rant, feel free to ignore. :->
--
Mateusz Guzik <mjguzik gmail.com>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] pid: Add a judgment for ns null in pid_nr_ns
2025-09-01 15:44 ` Mateusz Guzik
@ 2025-09-01 15:55 ` Mateusz Guzik
2025-09-02 14:37 ` Oleg Nesterov
0 siblings, 1 reply; 25+ messages in thread
From: Mateusz Guzik @ 2025-09-01 15:55 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Christian Brauner, Xiang Gao, joel.granados, lorenzo.stoakes,
linux-kernel, gaoxiang17, Liam.Howlett, viro
On Mon, Sep 1, 2025 at 5:44 PM Mateusz Guzik <mjguzik@gmail.com> wrote:
>
> On Mon, Sep 1, 2025 at 5:32 PM Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > ping...
> >
> > We need either
> >
> > [1/1] pid: Add a judgment for ns null in pid_nr_ns
> > https://git.kernel.org/vfs/vfs/c/006568ab4c5c
> >
> > or
> >
> > [1/4] pid: make __task_pid_nr_ns(ns => NULL) safe for zombie callers
> > https://git.kernel.org/vfs/vfs/c/abdfd4948e45
> >
> > in any case imo the changelog should explain why do we care
> > to check ns != NUll, "Sometimes null is returned for task_active_pid_ns"
> > doesn't look like a good explanation...
> >
>
> Since I caught this a stray patchset I'll bite: given the totally
> arbitrary task struct in an irq handler, why even allow querying it
> from that level? The task is literally random, and even possibly dead
> as in this crash report.
>
> To my reading the code which runs into woes here is private to a
> vendor. Maybe I missed something, but I don't see a justification for
> querying the task in an irq handler to begin with (and per above I
> don't understand what the point is).
>
> That is to say, if this was up to me, I would at best assert we are in
> the process context and that ns is not NULL. As a result I would very
> much *ban* the call as reported here, unless there is a good reason to
> make it work (what is it?).
>
> That's my side rant, feel free to ignore. :->
>
Maybe even go a little further and assert that the task at hand is
fully constructed and not exiting yet.
The worry here is that normally you would expect the task to be all
there (so to speak).
For example imagine the task is mid-teardown, with some pointers
already freed and whatnot and interrupt comes in. Code might show up
which will deref parts of task_struct which are no longer legally
accessible, but this only blows up if you are unlucky enough while
racing. In practice these accesses might even never trip over anything
despite being illegal (unless someone nullifies a pointer or
something), hiding the problem.
All in all, this is a can of worms and from my cursory reading no
justification for doing allowing it was provided.
--
Mateusz Guzik <mjguzik gmail.com>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] pid: Add a judgment for ns null in pid_nr_ns
2025-09-01 15:55 ` Mateusz Guzik
@ 2025-09-02 14:37 ` Oleg Nesterov
0 siblings, 0 replies; 25+ messages in thread
From: Oleg Nesterov @ 2025-09-02 14:37 UTC (permalink / raw)
To: Mateusz Guzik
Cc: Christian Brauner, Xiang Gao, joel.granados, lorenzo.stoakes,
linux-kernel, gaoxiang17, Liam.Howlett, viro
On 09/01, Mateusz Guzik wrote:
>
> On Mon, Sep 1, 2025 at 5:44 PM Mateusz Guzik <mjguzik@gmail.com> wrote:
> >
> > On Mon, Sep 1, 2025 at 5:32 PM Oleg Nesterov <oleg@redhat.com> wrote:
> > >
> > > ping...
> > >
> > > We need either
> > >
> > > [1/1] pid: Add a judgment for ns null in pid_nr_ns
> > > https://git.kernel.org/vfs/vfs/c/006568ab4c5c
> > >
> > > or
> > >
> > > [1/4] pid: make __task_pid_nr_ns(ns => NULL) safe for zombie callers
> > > https://git.kernel.org/vfs/vfs/c/abdfd4948e45
> > >
> > > in any case imo the changelog should explain why do we care
> > > to check ns != NUll, "Sometimes null is returned for task_active_pid_ns"
> > > doesn't look like a good explanation...
> > >
> >
> > Since I caught this a stray patchset I'll bite: given the totally
> > arbitrary task struct in an irq handler, why even allow querying it
> > from that level? The task is literally random, and even possibly dead
> > as in this crash report.
I won't really argue. And initially I was going to "ignore" the original
bug report. If nothing else, the code which triggered the crash was buggy.
But then I changed my mind. People will do mistakes, I think it would be
better to make this API a bit safer. See below.
But in any case, at least one of the patches above should be removed from
vfs-6.18.pidfs.
OTOH, the trivial/cosmetic
[PATCH 2/4] pid: introduce task_ppid_vnr()
https://lore.kernel.org/all/20250810173610.GA19995@redhat.com/
makes sense imo, and it was missed.
Now. I mostly agree about IRQ, but to me it would be better to avoid the
crash if, say, task_pid_vnr(&init_task) is called from IRQ context.
But lets forget about IRQ. Please look at the changelog in
[PATCH 1/4] pid: make __task_pid_nr_ns(ns => NULL) safe for zombie callers
Even task_ppid_nr_ns(current, NULL) is not safe if it is called by the
exiting task after exit_notify() in the process context, and this is not
immediately clear. This doesn't look good to me.
> Maybe even go a little further and assert that the task at hand is
> fully constructed and not exiting yet.
So what do you suggest? Add the current->exit_state check somewhere?
Oleg.
^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2025-09-02 14:38 UTC | newest]
Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-02 2:21 [PATCH] pid: Add a judgment for ns null in pid_nr_ns Xiang Gao
2025-08-02 2:25 ` Al Viro
2025-08-02 8:43 ` Oleg Nesterov
[not found] ` <15b18541f37447dd8d5dbd8012662f67@xiaomi.com>
2025-08-02 5:52 ` 答复: [External Mail]Re: " Al Viro
2025-08-02 5:54 ` Al Viro
[not found] ` <c7968242db914979953277226fe55fc8@xiaomi.com>
2025-08-02 8:04 ` 答复: " Al Viro
2025-08-02 8:45 ` Oleg Nesterov
[not found] ` <80be47cb31d14ffc9f9a7d8d4408ab0a@xiaomi.com>
2025-08-04 11:49 ` Oleg Nesterov
2025-08-04 12:14 ` Christian Brauner
2025-08-04 12:44 ` Oleg Nesterov
2025-08-05 12:43 ` Oleg Nesterov
2025-08-08 14:56 ` Christian Brauner
2025-08-10 15:42 ` Oleg Nesterov
[not found] ` <aa5272ddcec944e2a35ca7104f6a86bf@xiaomi.com>
2025-08-05 19:43 ` 答复: [External Mail]Re: " Oleg Nesterov
2025-08-08 14:54 ` Christian Brauner
2025-08-10 17:36 ` [PATCH 1/4] pid: make __task_pid_nr_ns(ns => NULL) safe for zombie callers Oleg Nesterov
2025-08-10 17:36 ` [PATCH 2/4] pid: introduce task_ppid_vnr() Oleg Nesterov
2025-08-10 17:36 ` [PATCH 3/4] pid: change bacct_add_tsk() to use task_ppid_nr_ns() Oleg Nesterov
2025-08-10 17:36 ` [PATCH 4/4] pid: change task_state() " Oleg Nesterov
2025-08-19 11:40 ` [PATCH] pid: Add a judgment for ns null in pid_nr_ns Christian Brauner
2025-08-19 14:25 ` Oleg Nesterov
2025-09-01 15:30 ` Oleg Nesterov
2025-09-01 15:44 ` Mateusz Guzik
2025-09-01 15:55 ` Mateusz Guzik
2025-09-02 14:37 ` Oleg Nesterov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).