All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: gaoxiang17 <gaoxiang17@xiaomi.com>, Baoquan He <bhe@redhat.com>,
	Christian Brauner <brauner@kernel.org>,
	Sasha Levin <sashal@kernel.org>,
	oleg@redhat.com, mjguzik@gmail.com, Liam.Howlett@Oracle.com,
	jack@suse.cz, joel.granados@kernel.org, viro@zeniv.linux.org.uk,
	lorenzo.stoakes@oracle.com
Subject: [PATCH AUTOSEL 6.17-5.4] pid: Add a judgment for ns null in pid_nr_ns
Date: Wed,  1 Oct 2025 09:36:42 -0400	[thread overview]
Message-ID: <20251001133653.978885-8-sashal@kernel.org> (raw)
In-Reply-To: <20251001133653.978885-1-sashal@kernel.org>

From: gaoxiang17 <gaoxiang17@xiaomi.com>

[ Upstream commit 006568ab4c5ca2309ceb36fa553e390b4aa9c0c7 ]

__task_pid_nr_ns
        ns = task_active_pid_ns(current);
        pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns);
                if (pid && ns->level <= pid->level) {

Sometimes null is returned for task_active_pid_ns. Then it will trigger kernel panic in pid_nr_ns.

For example:
	Unable to handle kernel NULL pointer dereference at virtual address 0000000000000058
	Mem abort info:
	ESR = 0x0000000096000007
	EC = 0x25: DABT (current EL), IL = 32 bits
	SET = 0, FnV = 0
	EA = 0, S1PTW = 0
	FSC = 0x07: level 3 translation fault
	Data abort info:
	ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
	CM = 0, WnR = 0, TnD = 0, TagAccess = 0
	GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
	user pgtable: 4k pages, 39-bit VAs, pgdp=00000002175aa000
	[0000000000000058] pgd=08000002175ab003, p4d=08000002175ab003, pud=08000002175ab003, pmd=08000002175be003, pte=0000000000000000
	pstate: 834000c5 (Nzcv daIF +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
	pc : __task_pid_nr_ns+0x74/0xd0
	lr : __task_pid_nr_ns+0x24/0xd0
	sp : ffffffc08001bd10
	x29: ffffffc08001bd10 x28: ffffffd4422b2000 x27: 0000000000000001
	x26: ffffffd442821168 x25: ffffffd442821000 x24: 00000f89492eab31
	x23: 00000000000000c0 x22: ffffff806f5693c0 x21: ffffff806f5693c0
	x20: 0000000000000001 x19: 0000000000000000 x18: 0000000000000000
	x17: 00000000529c6ef0 x16: 00000000529c6ef0 x15: 00000000023a1adc
	x14: 0000000000000003 x13: 00000000007ef6d8 x12: 001167c391c78800
	x11: 00ffffffffffffff x10: 0000000000000000 x9 : 0000000000000001
	x8 : ffffff80816fa3c0 x7 : 0000000000000000 x6 : 49534d702d535449
	x5 : ffffffc080c4c2c0 x4 : ffffffd43ee128c8 x3 : ffffffd43ee124dc
	x2 : 0000000000000000 x1 : 0000000000000001 x0 : ffffff806f5693c0
	Call trace:
	__task_pid_nr_ns+0x74/0xd0
	...
	__handle_irq_event_percpu+0xd4/0x284
	handle_irq_event+0x48/0xb0
	handle_fasteoi_irq+0x160/0x2d8
	generic_handle_domain_irq+0x44/0x60
	gic_handle_irq+0x4c/0x114
	call_on_irq_stack+0x3c/0x74
	do_interrupt_handler+0x4c/0x84
	el1_interrupt+0x34/0x58
	el1h_64_irq_handler+0x18/0x24
	el1h_64_irq+0x68/0x6c
	account_kernel_stack+0x60/0x144
	exit_task_stack_account+0x1c/0x80
	do_exit+0x7e4/0xaf8
	...
	get_signal+0x7bc/0x8d8
	do_notify_resume+0x128/0x828
	el0_svc+0x6c/0x70
	el0t_64_sync_handler+0x68/0xbc
	el0t_64_sync+0x1a8/0x1ac
	Code: 35fffe54 911a02a8 f9400108 b4000128 (b9405a69)
	---[ end trace 0000000000000000 ]---
	Kernel panic - not syncing: Oops: Fatal exception in interrupt

Signed-off-by: gaoxiang17 <gaoxiang17@xiaomi.com>
Link: https://lore.kernel.org/20250802022123.3536934-1-gxxa03070307@gmail.com
Reviewed-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Based on my comprehensive analysis, I'll provide my backport
recommendation:

**Backport Status: YES**

## Analysis

### Code Change Review
The commit adds a single NULL check in `pid_nr_ns()` at
kernel/pid.c:494:
```c
- if (pid && ns->level <= pid->level) {
+       if (pid && ns && ns->level <= pid->level) {
```

This prevents dereferencing `ns` when it's NULL.

### Why This Fix Is Needed

1. **Real Production Crash**: The commit message includes a detailed
   crash dump showing a kernel panic at `__task_pid_nr_ns+0x74/0xd0`
   with NULL pointer dereference at address `0x0000000000000058` (offset
   of `level` field in `struct pid_namespace`).

2. **Race Condition During Task Exit**: `task_active_pid_ns()` can
   return NULL when called on a task that's being reaped. The call chain
   shows the crash occurred during interrupt handling while a task was
   exiting (`do_exit` → `exit_task_stack_account` → interrupt →
   `__task_pid_nr_ns`).

3. **Long-Standing Bug**: This isn't a regression - similar issues were
   fixed in 2015 (commit 81b1a832d7974) but this particular path was
   missed.

### Critical Context - Follow-Up Fix Required
My research found that Oleg Nesterov submitted a follow-up commit
(abdfd4948e45c) on Aug 10, 2025 titled "pid: make __task_pid_nr_ns(ns =>
NULL) safe for zombie callers" which adds an additional NULL check in
`__task_pid_nr_ns()` itself. Both commits address complementary aspects
of the same race condition and should be backported together.

### Backport Justification

✅ **Fixes important user-visible bug**: Kernel panic in production
systems
✅ **Minimal code change**: Single NULL check, no behavioral changes
✅ **Low regression risk**: Defensive check that only prevents crashes
✅ **Confined to subsystem**: Only affects PID namespace handling
✅ **Reviewed by maintainers**: Baoquan He (reviewed), Christian Brauner
(signed-off)
✅ **Stable tree criteria met**: Important bugfix, minimal risk, no
feature additions

### Recommendation
**Strongly recommend backporting to all active stable trees**,
preferably together with the follow-up commit abdfd4948e45c to ensure
complete protection against this race condition. The fix is critical for
containerized environments where PID namespaces are heavily used.

 kernel/pid.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/pid.c b/kernel/pid.c
index c45a28c16cd25..14e908f2f0cbf 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -491,7 +491,7 @@ pid_t pid_nr_ns(struct pid *pid, struct pid_namespace *ns)
 	struct upid *upid;
 	pid_t nr = 0;
 
-	if (pid && ns->level <= pid->level) {
+	if (pid && ns && ns->level <= pid->level) {
 		upid = &pid->numbers[ns->level];
 		if (upid->ns == ns)
 			nr = upid->nr;
-- 
2.51.0


  parent reply	other threads:[~2025-10-01 13:37 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-01 13:36 [PATCH AUTOSEL 6.17-5.4] minixfs: Verify inode mode when loading from disk Sasha Levin
2025-10-01 13:36 ` [PATCH AUTOSEL 6.17-6.16] mnt_ns_tree_remove(): DTRT if mnt_ns had never been added to mnt_ns_list Sasha Levin
2025-10-01 13:36 ` [PATCH AUTOSEL 6.17-5.15] writeback: Avoid softlockup when switching many inodes Sasha Levin
2025-10-01 13:36 ` [PATCH AUTOSEL 6.17-6.16] mount: handle NULL values in mnt_ns_release() Sasha Levin
2025-10-01 13:36 ` [PATCH AUTOSEL 6.17-6.12] copy_file_range: limit size if in compat mode Sasha Levin
2025-10-01 13:36 ` [PATCH AUTOSEL 6.17-5.4] fs: Add 'initramfs_options' to set initramfs mount options Sasha Levin
2025-10-01 13:36 ` [PATCH AUTOSEL 6.17-6.16] pidfs: validate extensible ioctls Sasha Levin
2025-10-01 13:36 ` Sasha Levin [this message]
2025-10-01 13:36 ` [PATCH AUTOSEL 6.17-5.4] cramfs: Verify inode mode when loading from disk Sasha Levin
2025-10-01 13:36 ` [PATCH AUTOSEL 6.17-6.16] nsfs: validate extensible ioctls Sasha Levin
2025-10-01 13:36 ` [PATCH AUTOSEL 6.17-5.15] writeback: Avoid excessively long inode switching times Sasha Levin
2025-10-01 13:36 ` [PATCH AUTOSEL 6.17] iomap: error out on file IO when there is no inline_data buffer Sasha Levin
2025-10-01 13:36 ` [PATCH AUTOSEL 6.17-5.10] pid: make __task_pid_nr_ns(ns => NULL) safe for zombie callers Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251001133653.978885-8-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=Liam.Howlett@Oracle.com \
    --cc=bhe@redhat.com \
    --cc=brauner@kernel.org \
    --cc=gaoxiang17@xiaomi.com \
    --cc=jack@suse.cz \
    --cc=joel.granados@kernel.org \
    --cc=lorenzo.stoakes@oracle.com \
    --cc=mjguzik@gmail.com \
    --cc=oleg@redhat.com \
    --cc=patches@lists.linux.dev \
    --cc=stable@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.