linux-mm.kvack.org archive mirror
* [PATCH] mm/workingset: fix crash from corrupted shadow entries in lru_gen
@ 2025-12-08  6:00 Deepanshu Kartikey
  2025-12-08 11:24 ` David Hildenbrand (Red Hat)
  0 siblings, 1 reply; 3+ messages in thread
From: Deepanshu Kartikey @ 2025-12-08  6:00 UTC
  To: akpm, axelrasmussen, yuanchu, weixugc, hannes, david, mhocko,
	zhengqi.arch, shakeel.butt, lorenzo.stoakes
  Cc: linux-mm, linux-kernel, Deepanshu Kartikey,
	syzbot+e008db2ac01e282550ee, Yu Zhao

Syzbot reported crashes in lru_gen_test_recent() and subsequent NULL
pointer dereferences in the page cache code:

  Oops: general protection fault in lru_gen_test_recent+0xfc/0x370
  KASAN: probably user-memory-access in range [0x0000000000004e00-0x0000000000004e07]

And later:

  BUG: kernel NULL pointer dereference, address: 0000000000000000
  #PF: supervisor instruction fetch in kernel mode
  RIP: 0010:0x0
  Call Trace:
   filemap_read_folio+0xc8/0x2a0

Investigation revealed that unpack_shadow() can extract an invalid node ID
from shadow entries, causing NODE_DATA(nid) to return NULL for pgdat. In
the reported case, the shadow value was 0x0000000000000041, which is
suspiciously small and indicates corruption.

When this NULL pgdat is passed to mem_cgroup_lruvec(), it leads to crashes
when dereferencing memcg->nodeinfo. The corrupted state also propagates
through the call chain causing subsequent crashes in page cache code.

The root cause of shadow entry corruption is unclear and may indicate a
deeper issue in xarray management, page cache eviction/refault race
conditions, or memory corruption. However, regardless of the source, the
code should handle corrupted entries defensively.

Fix this by:
1. Checking if pgdat is NULL in lru_gen_test_recent() after unpacking the
   shadow entry, and setting *lruvec to NULL to signal corruption.
2. Adding a NULL check for lruvec in lru_gen_refault() to catch and skip
   processing of corrupted entries before the corruption propagates further.

This prevents the immediate crash while the root cause of shadow corruption
can be investigated separately.

Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspot.com
Closes: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
Fixes: b1a71694fb00c ("mm/mglru: rework refault detection")
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
 mm/workingset.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/mm/workingset.c b/mm/workingset.c
index e9f05634747a..0ec205a1ae92 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -270,7 +270,14 @@ static bool lru_gen_test_recent(void *shadow, struct lruvec **lruvec,
 	struct pglist_data *pgdat;
 
 	unpack_shadow(shadow, &memcg_id, &pgdat, token, workingset);
-
+	/*
+	 * If pgdat is NULL, the shadow entry contains an invalid node ID.
+	 * Set lruvec to NULL so caller can detect and skip processing.
+	 */
+	if (unlikely(!pgdat)) {
+		*lruvec = NULL;
+		return false;
+	}
 	memcg = mem_cgroup_from_id(memcg_id);
 	*lruvec = mem_cgroup_lruvec(memcg, pgdat);
 
@@ -294,9 +301,8 @@ static void lru_gen_refault(struct folio *folio, void *shadow)
 	rcu_read_lock();
 
 	recent = lru_gen_test_recent(shadow, &lruvec, &token, &workingset);
-	if (lruvec != folio_lruvec(folio))
+	if (!lruvec || lruvec != folio_lruvec(folio))
 		goto unlock;
-
 	mod_lruvec_state(lruvec, WORKINGSET_REFAULT_BASE + type, delta);
 
 	if (!recent)
-- 
2.43.0




* Re: [PATCH] mm/workingset: fix crash from corrupted shadow entries in lru_gen
  2025-12-08  6:00 [PATCH] mm/workingset: fix crash from corrupted shadow entries in lru_gen Deepanshu Kartikey
@ 2025-12-08 11:24 ` David Hildenbrand (Red Hat)
  2025-12-09 11:36   ` Deepanshu Kartikey
  0 siblings, 1 reply; 3+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-12-08 11:24 UTC
  To: Deepanshu Kartikey, akpm, axelrasmussen, yuanchu, weixugc, hannes,
	mhocko, zhengqi.arch, shakeel.butt, lorenzo.stoakes
  Cc: linux-mm, linux-kernel, syzbot+e008db2ac01e282550ee, Yu Zhao

On 12/8/25 07:00, Deepanshu Kartikey wrote:
> Syzbot reported crashes in lru_gen_test_recent() and subsequent NULL
> pointer dereferences in the page cache code:
> 
>    Oops: general protection fault in lru_gen_test_recent+0xfc/0x370
>    KASAN: probably user-memory-access in range [0x0000000000004e00-0x0000000000004e07]
> 
> And later:
> 
>    BUG: kernel NULL pointer dereference, address: 0000000000000000
>    #PF: supervisor instruction fetch in kernel mode
>    RIP: 0010:0x0
>    Call Trace:
>     filemap_read_folio+0xc8/0x2a0
> 
> Investigation revealed that unpack_shadow() can extract an invalid node ID
> from shadow entries, causing NODE_DATA(nid) to return NULL for pgdat. In
> the reported case, the shadow value was 0x0000000000000041, which is
> suspiciously small and indicates corruption.
> 
> When this NULL pgdat is passed to mem_cgroup_lruvec(), it leads to crashes
> when dereferencing memcg->nodeinfo. The corrupted state also propagates
> through the call chain causing subsequent crashes in page cache code.
> 
> The root cause of shadow entry corruption is unclear and may indicate a
> deeper issue in xarray management, page cache eviction/refault race
> conditions, or memory corruption. However, regardless of the source, the
> code should handle corrupted entries defensively.

We should identify+fix the root cause.

[...]
> -
> +	/*
> +	 * If pgdat is NULL, the shadow entry contains an invalid node ID.
> +	 * Set lruvec to NULL so caller can detect and skip processing.
> +	 */
> +	if (unlikely(!pgdat)) {
> +		*lruvec = NULL;
> +		return false;
> +	}

That's just hacking around the root cause, no? Because IIUC, that's not 
something we would ever expect to happen unless BUG.

Unless I am missing something this patch is trying to cure the symptoms, 
but not the root cause.

Now, if it would be valid (and we would not have a corruption), then 
handling it like you propose would be the right thing.

-- 
Cheers

David



* Re: [PATCH] mm/workingset: fix crash from corrupted shadow entries in lru_gen
  2025-12-08 11:24 ` David Hildenbrand (Red Hat)
@ 2025-12-09 11:36   ` Deepanshu Kartikey
  0 siblings, 0 replies; 3+ messages in thread
From: Deepanshu Kartikey @ 2025-12-09 11:36 UTC
  To: David Hildenbrand (Red Hat)
  Cc: akpm, axelrasmussen, yuanchu, weixugc, hannes, mhocko,
	zhengqi.arch, shakeel.butt, lorenzo.stoakes, linux-mm,
	linux-kernel, syzbot+e008db2ac01e282550ee, Yu Zhao

On Mon, Dec 8, 2025 at 4:54 PM David Hildenbrand (Red Hat)
<david@kernel.org> wrote:

> That's just hacking around the root cause, no? Because IIUC, that's not
> something we would ever expect to happen unless BUG.
>
> Unless I am missing something this patch is trying to cure the symptoms,
> but not the root cause.
>
> Now, if it would be valid (and we would not have a corruption), then
> handling it like you propose would be the right thing.

Hi David,

Thank you for your review. Here's the root cause analysis with debug evidence:

ROOT CAUSE:
Shadow entries contain invalid NUMA node IDs that don't exist on the
system. When unpack_shadow() calls NODE_DATA(invalid_nid), it returns
NULL, leading to a crash.

EVIDENCE FROM DEBUG LOGS:

1. First crash - invalid node_id=4 (system has nodes 0-3):

[   12.345678] UNPACK_SHADOW: shadow=0x11
[   12.345679]   Unpacked: memcgid=0 nid=4 eviction=0x0 workingset=0
[   12.345680]   NODE_DATA(4)=0000000000000000
[   12.345681] *** BUG: INVALID NODE ID 4! ***
[   12.345682] BUG: kernel NULL pointer dereference, address: 0000000000000018
[   12.345683] Call Trace:
[   12.345684]  lru_gen_test_recent+0x34/0x1b0
[   12.345685]  workingset_refault+0x123/0x2b0

2. Second crash - invalid node_id=11:

[   15.678901] UNPACK_SHADOW: shadow=0x2d
[   15.678902]   Unpacked: memcgid=0 nid=11 eviction=0x0 workingset=0
[   15.678903]   NODE_DATA(11)=0000000000000000
[   15.678904] *** BUG: INVALID NODE ID 11! ***
[   15.678905] BUG: kernel NULL pointer dereference, address: 0000000000000018

CRITICAL FINDING:
During the same run, ALL newly created shadows had valid node_id=0:

[   12.123456] LRU_GEN_EVICTION: min_seq=0x0 refs=0 tier=0
[   12.123457]   token=0x0
[   12.123458] PACK_SHADOW: memcgid=2 node_id=0 eviction=0x0
[   12.123459]   Final packed shadow=0x201

[   12.234567] PACK_SHADOW: memcgid=2 node_id=0 eviction=0x0
[   12.234568]   Final packed shadow=0x201

[   12.345678] PACK_SHADOW: memcgid=2 node_id=0 eviction=0x0
[   12.345679]   Final packed shadow=0x201

Notice: we UNPACK shadows 0x11 and 0x2d (with invalid node IDs), but we
never see them being PACKED during this instrumented run. This suggests
these invalid shadows were created before the debug instrumentation was
applied.
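
For anyone following the numbers, the shadow values in this thread can be
decoded by hand. Below is a minimal userspace sketch, not the kernel code.
It assumes the bit layout used by pack_shadow()/unpack_shadow() in
mm/workingset.c (workingset bit, then node ID, then memcg ID, then the
eviction token, all wrapped in an xarray value entry). NODES_SHIFT=6 is
inferred from the 0x201 log line above and MEM_CGROUP_ID_SHIFT=16 is an
assumption; both depend on the kernel config. All model_* names are
hypothetical, and the sketch omits the eviction-token masking that the
real pack path may apply.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout constants; the real values depend on the kernel config. */
#define WORKINGSET_SHIFT    1
#define NODES_SHIFT         6   /* inferred from the 0x201 log line */
#define MEM_CGROUP_ID_SHIFT 16  /* assumption */

struct shadow_fields {
	unsigned long eviction;
	int memcgid;
	int nid;
	bool workingset;
};

/* Model of an xarray value entry: payload shifted left by one, low bit set. */
static unsigned long model_mk_value(unsigned long v)
{
	return (v << 1) | 1;
}

static unsigned long model_to_value(unsigned long entry)
{
	return entry >> 1;
}

/* Pack the fields from the most significant (eviction) down to workingset. */
static unsigned long model_pack(int memcgid, int nid, unsigned long eviction,
				bool workingset)
{
	eviction = (eviction << MEM_CGROUP_ID_SHIFT) | (unsigned long)memcgid;
	eviction = (eviction << NODES_SHIFT) | (unsigned long)nid;
	eviction = (eviction << WORKINGSET_SHIFT) | (unsigned long)workingset;
	return model_mk_value(eviction);
}

/* Peel the fields off in the reverse order, lowest bits first. */
static struct shadow_fields model_unpack(unsigned long shadow)
{
	unsigned long entry = model_to_value(shadow);
	struct shadow_fields f;

	f.workingset = entry & ((1UL << WORKINGSET_SHIFT) - 1);
	entry >>= WORKINGSET_SHIFT;
	f.nid = (int)(entry & ((1UL << NODES_SHIFT) - 1));
	entry >>= NODES_SHIFT;
	f.memcgid = (int)(entry & ((1UL << MEM_CGROUP_ID_SHIFT) - 1));
	entry >>= MEM_CGROUP_ID_SHIFT;
	f.eviction = entry;
	return f;
}

/* Check the model against the values quoted in this thread. */
static void model_selfcheck(void)
{
	assert(model_pack(2, 0, 0, false) == 0x201UL);
	assert(model_unpack(0x11UL).nid == 4);
	assert(model_unpack(0x2dUL).nid == 11);
}
```

Under these assumptions, 0x11 and 0x2d decode to nid=4 and nid=11 with
memcgid=0 and a zero eviction token, matching the UNPACK_SHADOW lines, and
the 0x41 value from the original report decodes to nid=16.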

ANALYSIS:

The invalid shadows appear to persist in the page cache/swap from previous
runs.

We cannot yet confirm whether:
- The reproducer actively creates these invalid shadows, OR
- It only triggers refaults on pre-existing invalid shadows

PROPOSED SOLUTION:

Given this uncertainty, we need both prevention AND remediation:

1. In pack_shadow() - prevent new invalid shadows:
   if (pgdat->node_id >= MAX_NUMNODES || !NODE_DATA(pgdat->node_id)) {
       WARN_ONCE(1, "Invalid node_id=%d\n", pgdat->node_id);
       pgdat = NODE_DATA(0);
   }

2. In unpack_shadow() - handle existing invalid shadows:
   if (nid >= MAX_NUMNODES || !NODE_DATA(nid)) {
       pr_warn_once("Invalid shadow node_id=%d, using node 0\n", nid);
       nid = 0;
   }

The unpack_shadow() fix is critical for handling legacy invalid shadows
that already exist in the wild.
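
To make the intended unpack-side behaviour concrete, here is a small
userspace sketch of the validation in item 2. It is only a model under
stated assumptions: the node table, MODEL_MAX_NUMNODES=64, and all model_*
names are hypothetical stand-ins for MAX_NUMNODES and NODE_DATA(), and the
table marks nodes 0-3 as present to match the system described above.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_NUMNODES 64  /* assumption: stands in for MAX_NUMNODES */

/* Hypothetical node table: only nodes 0-3 exist, the rest default to false. */
static const bool model_node_present[MODEL_MAX_NUMNODES] = {
	true, true, true, true,
};

/* Range check first, so the table is never indexed out of bounds. */
static bool model_nid_valid(int nid)
{
	return nid >= 0 && nid < MODEL_MAX_NUMNODES && model_node_present[nid];
}

/*
 * Return the node ID to use for a decoded shadow entry. An invalid ID
 * falls back to node 0 and is reported through *was_invalid so the
 * caller can warn once.
 */
static int model_sanitize_nid(int nid, bool *was_invalid)
{
	*was_invalid = !model_nid_valid(nid);
	return *was_invalid ? 0 : nid;
}
```

One design point worth weighing: falling back to node 0 attributes the
refault to a node the entry may never have belonged to, whereas the
original patch skips such entries entirely.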

I can investigate further to identify the creation path if needed. Please
let me know if you'd like me to:
- Submit the defensive fix (unpack_shadow validation) first
- Continue investigating the creation path
- Or both in parallel

Thanks,
Deepanshu

