public inbox for linux-kernel@vger.kernel.org
* [syzbot] [mm?] general protection fault in lru_gen_test_recent (2)
@ 2025-12-07  8:55 syzbot
  2025-12-07 12:44 ` Forwarded: [PATCH] mm/workingset: fix NULL pointer dereference in lru_gen_test_recent syzbot
                   ` (14 more replies)
  0 siblings, 15 replies; 31+ messages in thread
From: syzbot @ 2025-12-07  8:55 UTC
  To: akpm, axelrasmussen, david, hannes, linux-kernel, linux-mm,
	lorenzo.stoakes, mhocko, shakeel.butt, syzkaller-bugs, weixugc,
	yuanchu, zhengqi.arch

Hello,

syzbot found the following issue on:

HEAD commit:    c06c303832ec ocfs2: fix xattr array entry __counted_by error
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=14cbfc1a580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=5aef7d5187304591
dashboard link: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
compiler:       gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=127f2992580000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15cf4eb4580000

Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-c06c3038.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/1a5115eeda38/vmlinux-c06c3038.xz
kernel image: https://storage.googleapis.com/syzbot-assets/98eb17e54bb8/bzImage-c06c3038.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com

Oops: general protection fault, probably for non-canonical address 0xdffffc00000009c0: 0000 [#1] SMP KASAN NOPTI
KASAN: probably user-memory-access in range [0x0000000000004e00-0x0000000000004e07]
CPU: 2 UID: 0 PID: 6121 Comm: syz.0.27 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:mem_cgroup_lruvec include/linux/memcontrol.h:720 [inline]
RIP: 0010:lru_gen_test_recent+0xee/0x320 mm/workingset.c:275
Code: 38 80 b5 ff 48 85 db 0f 84 79 01 00 00 e8 2a 80 b5 ff 49 8d bd 00 4e 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e a3 01 00 00 4d 63 b5 00 4e 00
RSP: 0018:ffffc90003e17828 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: ffff888100068000 RCX: ffffc90003e1772c
RDX: 00000000000009c0 RSI: ffffffff82096446 RDI: 0000000000004e00
RBP: ffffc90003e178c0 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: ffff888028282ff0 R12: ffffc90003e178e0
R13: 0000000000000000 R14: ffffc90003e178b0 R15: 0000000000000000
FS:  00007f6361dfa6c0(0000) GS:ffff8880d6b0d000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000409000000 CR3: 0000000039516000 CR4: 0000000000352ef0
Call Trace:
 <TASK>
 lru_gen_refault mm/workingset.c:296 [inline]
 workingset_refault+0x251/0xca0 mm/workingset.c:546
 filemap_add_folio+0x23d/0x610 mm/filemap.c:981
 do_read_cache_folio+0x23c/0x5c0 mm/filemap.c:4063
 freader_get_folio+0x33a/0x930 lib/buildid.c:58
 freader_fetch+0xbd/0x740 lib/buildid.c:101
 __build_id_parse.isra.0+0xdd/0x6c0 lib/buildid.c:289
 do_procmap_query+0xb0e/0x1080 fs/proc/task_mmu.c:733
 procfs_procmap_ioctl+0x9d/0xe0 fs/proc/task_mmu.c:813
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:597 [inline]
 __se_sys_ioctl fs/ioctl.c:583 [inline]
 __x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f6360f8f7c9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f6361dfa038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f63611e5fa0 RCX: 00007f6360f8f7c9
RDX: 0000200000000180 RSI: 00000000c0686611 RDI: 0000000000000003
RBP: 00007f6361013f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f63611e6038 R14: 00007f63611e5fa0 R15: 00007ffe36ff8138
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:mem_cgroup_lruvec include/linux/memcontrol.h:720 [inline]
RIP: 0010:lru_gen_test_recent+0xee/0x320 mm/workingset.c:275
Code: 38 80 b5 ff 48 85 db 0f 84 79 01 00 00 e8 2a 80 b5 ff 49 8d bd 00 4e 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e a3 01 00 00 4d 63 b5 00 4e 00
RSP: 0018:ffffc90003e17828 EFLAGS: 00010206

RAX: dffffc0000000000 RBX: ffff888100068000 RCX: ffffc90003e1772c
RDX: 00000000000009c0 RSI: ffffffff82096446 RDI: 0000000000004e00
RBP: ffffc90003e178c0 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: ffff888028282ff0 R12: ffffc90003e178e0
R13: 0000000000000000 R14: ffffc90003e178b0 R15: 0000000000000000
FS:  00007f6361dfa6c0(0000) GS:ffff8880d6b0d000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000409000000 CR3: 0000000039516000 CR4: 0000000000352ef0
----------------
Code disassembly (best guess):
   0:	38 80 b5 ff 48 85    	cmp    %al,-0x7ab7004b(%rax)
   6:	db 0f                	fisttpl (%rdi)
   8:	84 79 01             	test   %bh,0x1(%rcx)
   b:	00 00                	add    %al,(%rax)
   d:	e8 2a 80 b5 ff       	call   0xffb5803c
  12:	49 8d bd 00 4e 00 00 	lea    0x4e00(%r13),%rdi
  19:	48 b8 00 00 00 00 00 	movabs $0xdffffc0000000000,%rax
  20:	fc ff df
  23:	48 89 fa             	mov    %rdi,%rdx
  26:	48 c1 ea 03          	shr    $0x3,%rdx
* 2a:	0f b6 04 02          	movzbl (%rdx,%rax,1),%eax <-- trapping instruction
  2e:	84 c0                	test   %al,%al
  30:	74 08                	je     0x3a
  32:	3c 03                	cmp    $0x3,%al
  34:	0f 8e a3 01 00 00    	jle    0x1dd
  3a:	4d                   	rex.WRB
  3b:	63                   	.byte 0x63
  3c:	b5 00                	mov    $0x0,%ch
  3e:	4e                   	rex.WRX
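
The trapping instruction is KASAN's shadow-memory check, not the data access itself. As a minimal sketch, assuming the generic x86-64 KASAN mapping (a shift of 3 and an offset equal to the `movabs` immediate in the disassembly above), the arithmetic that links the register values to the reported addresses can be reproduced in user space. The function names below are illustrative and are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the report: `shr $0x3` and the `movabs` immediate. */
#define SHADOW_SCALE_SHIFT 3
#define SHADOW_OFFSET      0xdffffc0000000000ULL

/* Shadow byte that is checked before `addr` is accessed. */
static inline uint64_t shadow_addr(uint64_t addr)
{
	return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
}

/* Inverse mapping: first address covered by a given shadow byte. */
static inline uint64_t shadow_to_addr(uint64_t shadow)
{
	return (shadow - SHADOW_OFFSET) << SHADOW_SCALE_SHIFT;
}

/* Re-checks the numbers printed in the Oops (RDI, RDX, fault address). */
static inline void check_report_values(void)
{
	uint64_t rdi = 0x4e00;                       /* lea 0x4e00(%r13),%rdi with R13 = 0 */

	assert((rdi >> SHADOW_SCALE_SHIFT) == 0x9c0); /* RDX */
	assert(shadow_addr(rdi) == 0xdffffc00000009c0ULL);
	assert(shadow_to_addr(0xdffffc00000009c0ULL) == rdi);
}
```

With RDI = 0x4e00 the shadow address is 0xdffffc00000009c0, the non-canonical address in the Oops line. One shadow byte covers eight bytes, which matches the range [0x4e00-0x4e07] that KASAN prints. The base register of the `lea` is zero, so the access is a field at offset 0x4e00 of a NULL pointer.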


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


* Forwarded: [PATCH] mm/workingset: fix NULL pointer dereference in lru_gen_test_recent
  2025-12-07  8:55 [syzbot] [mm?] general protection fault in lru_gen_test_recent (2) syzbot
@ 2025-12-07 12:44 ` syzbot
  2025-12-07 14:35 ` syzbot
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-07 12:44 UTC
  To: linux-kernel, syzkaller-bugs

For archival purposes, forwarding an incoming command email to
linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com.

***

Subject: [PATCH] mm/workingset: fix NULL pointer dereference in lru_gen_test_recent
Author: kartikey406@gmail.com

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

Add NULL check for memcg in lru_gen_test_recent() to prevent crash when
mem_cgroup_from_id() returns NULL.

The crash occurs when a folio's shadow entry contains a memcg_id that
no longer maps to a valid memory cgroup. This can happen when:

1. The memory cgroup has been deleted/freed
2. A folio was created without proper memcg association (e.g., during
   procmap_query build ID parsing via freader_get_folio)
3. The memcg_id in the shadow entry is invalid or zero

When lru_gen_test_recent() calls mem_cgroup_from_id(), it may return
NULL. The subsequent call to mem_cgroup_lruvec() with a NULL memcg
triggers a crash because the inline function's code calculates
memcg->nodeinfo offset (0x4e00) before the NULL check can execute,
causing a NULL pointer dereference that KASAN detects.

Although mem_cgroup_lruvec() has a NULL check internally, compiler
inlining and optimization cause the offset calculation to occur
first, making the internal check unreachable.

The fix adds an explicit NULL check after mem_cgroup_from_id() and
falls back to root_mem_cgroup, which is consistent with how
mem_cgroup_lruvec() itself handles NULL pointers.

Reproducer triggers this via:
  procfs_procmap_ioctl() -> do_procmap_query() -> __build_id_parse() ->
  freader_fetch() -> freader_get_folio() -> do_read_cache_folio() ->
  filemap_add_folio() -> workingset_refault() -> lru_gen_refault() ->
  lru_gen_test_recent()

KASAN report:
  general protection fault in lru_gen_test_recent
  RIP: mem_cgroup_lruvec include/linux/memcontrol.h:720 [inline]
  RIP: lru_gen_test_recent+0xee/0x320 mm/workingset.c:275
  Call Trace:
   lru_gen_refault mm/workingset.c:296 [inline]
   workingset_refault+0x251/0xca0 mm/workingset.c:546
   filemap_add_folio+0x23d/0x610 mm/filemap.c:981

Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
 mm/workingset.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/workingset.c b/mm/workingset.c
index e9f05634747a..8b6332cfb4f0 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -272,6 +272,8 @@ static bool lru_gen_test_recent(void *shadow, struct lruvec **lruvec,
 	unpack_shadow(shadow, &memcg_id, &pgdat, token, workingset);
 
 	memcg = mem_cgroup_from_id(memcg_id);
+	if (!memcg)
+		memcg = root_mem_cgroup;
 	*lruvec = mem_cgroup_lruvec(memcg, pgdat);
 
 	max_seq = READ_ONCE((*lruvec)->lrugen.max_seq);
-- 
2.43.0



* Re: [syzbot] [mm?] general protection fault in lru_gen_test_recent (2)
       [not found] <20251207124441.2614564-1-kartikey406@gmail.com>
@ 2025-12-07 13:10 ` syzbot
  0 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-07 13:10 UTC
  To: kartikey406, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
general protection fault in lru_gen_test_recent

Oops: general protection fault, probably for non-canonical address 0xdffffc00000009c0: 0000 [#1] SMP KASAN NOPTI
KASAN: probably user-memory-access in range [0x0000000000004e00-0x0000000000004e07]
CPU: 0 UID: 0 PID: 6513 Comm: syz.0.29 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:mem_cgroup_lruvec include/linux/memcontrol.h:720 [inline]
RIP: 0010:lru_gen_test_recent+0xfc/0x370 mm/workingset.c:277
Code: 2a 80 b5 ff 48 85 db 0f 84 a9 01 00 00 e8 1c 80 b5 ff 49 8d bd 00 4e 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e d3 01 00 00 4d 63 b5 00 4e 00
RSP: 0018:ffffc900035f7828 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: ffff88801d688000 RCX: ffffc900035f772c
RDX: 00000000000009c0 RSI: ffffffff82096454 RDI: 0000000000004e00
RBP: ffffc900035f78c0 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: ffff88802b10aff0 R12: ffffc900035f78e0
R13: 0000000000000000 R14: ffffc900035f78b0 R15: 0000000000000000
FS:  00007f7ae80646c0(0000) GS:ffff8880d6909000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000409000000 CR3: 000000002e141000 CR4: 0000000000352ef0
Call Trace:
 <TASK>
 lru_gen_refault mm/workingset.c:298 [inline]
 workingset_refault+0x251/0xca0 mm/workingset.c:548
 filemap_add_folio+0x23d/0x610 mm/filemap.c:981
 do_read_cache_folio+0x23c/0x5c0 mm/filemap.c:4063
 freader_get_folio+0x33a/0x930 lib/buildid.c:58
 freader_fetch+0xbd/0x740 lib/buildid.c:101
 __build_id_parse.isra.0+0xdd/0x6c0 lib/buildid.c:289
 do_procmap_query+0xb0e/0x1080 fs/proc/task_mmu.c:733
 procfs_procmap_ioctl+0x9d/0xe0 fs/proc/task_mmu.c:813
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:597 [inline]
 __se_sys_ioctl fs/ioctl.c:583 [inline]
 __x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f7ae718f7c9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f7ae8064038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f7ae73e5fa0 RCX: 00007f7ae718f7c9
RDX: 0000200000000180 RSI: 00000000c0686611 RDI: 0000000000000003
RBP: 00007f7ae7213f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f7ae73e6038 R14: 00007f7ae73e5fa0 R15: 00007ffdab92bc98
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:mem_cgroup_lruvec include/linux/memcontrol.h:720 [inline]
RIP: 0010:lru_gen_test_recent+0xfc/0x370 mm/workingset.c:277
Code: 2a 80 b5 ff 48 85 db 0f 84 a9 01 00 00 e8 1c 80 b5 ff 49 8d bd 00 4e 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e d3 01 00 00 4d 63 b5 00 4e 00
RSP: 0018:ffffc900035f7828 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: ffff88801d688000 RCX: ffffc900035f772c
RDX: 00000000000009c0 RSI: ffffffff82096454 RDI: 0000000000004e00
RBP: ffffc900035f78c0 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: ffff88802b10aff0 R12: ffffc900035f78e0
R13: 0000000000000000 R14: ffffc900035f78b0 R15: 0000000000000000
FS:  00007f7ae80646c0(0000) GS:ffff8880d6909000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000409000000 CR3: 000000002e141000 CR4: 0000000000352ef0
----------------
Code disassembly (best guess):
   0:	2a 80 b5 ff 48 85    	sub    -0x7ab7004b(%rax),%al
   6:	db 0f                	fisttpl (%rdi)
   8:	84 a9 01 00 00 e8    	test   %ch,-0x17ffffff(%rcx)
   e:	1c 80                	sbb    $0x80,%al
  10:	b5 ff                	mov    $0xff,%ch
  12:	49 8d bd 00 4e 00 00 	lea    0x4e00(%r13),%rdi
  19:	48 b8 00 00 00 00 00 	movabs $0xdffffc0000000000,%rax
  20:	fc ff df
  23:	48 89 fa             	mov    %rdi,%rdx
  26:	48 c1 ea 03          	shr    $0x3,%rdx
* 2a:	0f b6 04 02          	movzbl (%rdx,%rax,1),%eax <-- trapping instruction
  2e:	84 c0                	test   %al,%al
  30:	74 08                	je     0x3a
  32:	3c 03                	cmp    $0x3,%al
  34:	0f 8e d3 01 00 00    	jle    0x20d
  3a:	4d                   	rex.WRB
  3b:	63                   	.byte 0x63
  3c:	b5 00                	mov    $0x0,%ch
  3e:	4e                   	rex.WRX


Tested on:

commit:         37bb2e72 Merge tag 'staging-6.19-rc1' of git://git.ker..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13f6ceb4580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=dbcb767d1e1208ac
dashboard link: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
compiler:       gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=1336a992580000



* Forwarded: [PATCH] mm/workingset: fix NULL pointer dereference in lru_gen_test_recent
  2025-12-07  8:55 [syzbot] [mm?] general protection fault in lru_gen_test_recent (2) syzbot
  2025-12-07 12:44 ` Forwarded: [PATCH] mm/workingset: fix NULL pointer dereference in lru_gen_test_recent syzbot
@ 2025-12-07 14:35 ` syzbot
  2025-12-07 15:05 ` syzbot
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-07 14:35 UTC
  To: linux-kernel, syzkaller-bugs

For archival purposes, forwarding an incoming command email to
linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com.

***

Subject: [PATCH] mm/workingset: fix NULL pointer dereference in lru_gen_test_recent
Author: kartikey406@gmail.com

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

Add NULL check for memcg in lru_gen_test_recent() to prevent crash when
mem_cgroup_from_id() returns NULL.

The crash occurs when a folio's shadow entry contains a memcg_id that
no longer maps to a valid memory cgroup. This can happen when:

1. The memory cgroup has been deleted/freed
2. A folio was created without proper memcg association (e.g., during
   procmap_query build ID parsing via freader_get_folio)
3. The memcg_id in the shadow entry is invalid or zero

When lru_gen_test_recent() calls mem_cgroup_from_id(), it may return
NULL. The subsequent call to mem_cgroup_lruvec() with NULL memcg
triggers a crash.

Although mem_cgroup_lruvec() has an internal NULL check, the crash
occurs before reaching it due to compiler optimization. Since
mem_cgroup_lruvec() is an inline function, the compiler calculates
the offset memcg->nodeinfo (0x4e00) before the function's NULL check
can execute, causing a NULL pointer dereference.

Fix this by introducing an effective_memcg variable that is explicitly
set to root_mem_cgroup when memcg is NULL. This approach forces the
compiler to use a separate register/memory location, preventing the
premature offset calculation that caused the crash with a simple
in-place NULL check.

Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
Fixes: ac35a4902374 ("mm: multi-gen LRU: minimal implementation")
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
 mm/workingset.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/workingset.c b/mm/workingset.c
index e9f05634747a..dad8b16af105 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -266,13 +266,14 @@ static bool lru_gen_test_recent(void *shadow, struct lruvec **lruvec,
 {
 	int memcg_id;
 	unsigned long max_seq;
-	struct mem_cgroup *memcg;
+	struct mem_cgroup *memcg, *effective_memcg;
 	struct pglist_data *pgdat;
 
 	unpack_shadow(shadow, &memcg_id, &pgdat, token, workingset);
 
 	memcg = mem_cgroup_from_id(memcg_id);
-	*lruvec = mem_cgroup_lruvec(memcg, pgdat);
+	effective_memcg = memcg ? : root_mem_cgroup;
+	*lruvec = mem_cgroup_lruvec(effective_memcg, pgdat);
 
 	max_seq = READ_ONCE((*lruvec)->lrugen.max_seq);
 	max_seq &= EVICTION_MASK >> LRU_REFS_WIDTH;
-- 
2.43.0
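
The patch above spells the fallback with the GNU C conditional that omits its middle operand, where the earlier version used an explicit `if`. As a minimal sketch, with `struct cgroup_stub` as a hypothetical stand-in for `struct mem_cgroup`, the two spellings can be compared in user space. It assumes a compiler that accepts the GNU extension (GCC or Clang).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in; not the kernel's struct mem_cgroup. */
struct cgroup_stub {
	int id;
};

/* Fallback written as an explicit NULL check. */
static inline struct cgroup_stub *fallback_if(struct cgroup_stub *memcg,
					      struct cgroup_stub *root)
{
	if (!memcg)
		memcg = root;
	return memcg;
}

/* Fallback written with the GNU "a ?: b" form, as in the patch above. */
static inline struct cgroup_stub *fallback_elvis(struct cgroup_stub *memcg,
						 struct cgroup_stub *root)
{
	struct cgroup_stub *effective = memcg ? : root;

	return effective;
}
```

Both helpers return `root` for a NULL input and the input itself otherwise, so at the C level the two patches select the same pointer. This is consistent with the test result that follows, where the fault address is unchanged.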



* Re: [syzbot] [mm?] general protection fault in lru_gen_test_recent (2)
       [not found] <20251207143534.2719842-1-kartikey406@gmail.com>
@ 2025-12-07 14:50 ` syzbot
  0 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-07 14:50 UTC
  To: kartikey406, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
general protection fault in lru_gen_test_recent

Oops: general protection fault, probably for non-canonical address 0xdffffc00000009c0: 0000 [#1] SMP KASAN NOPTI
KASAN: probably user-memory-access in range [0x0000000000004e00-0x0000000000004e07]
CPU: 3 UID: 0 PID: 6430 Comm: syz.0.26 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:mem_cgroup_lruvec include/linux/memcontrol.h:720 [inline]
RIP: 0010:lru_gen_test_recent+0xfc/0x370 mm/workingset.c:276
Code: 2a 80 b5 ff 48 85 db 0f 84 a9 01 00 00 e8 1c 80 b5 ff 49 8d bd 00 4e 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e d3 01 00 00 4d 63 b5 00 4e 00
RSP: 0018:ffffc9000325f828 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: ffff88801cac8000 RCX: ffffc9000325f72c
RDX: 00000000000009c0 RSI: ffffffff82096454 RDI: 0000000000004e00
RBP: ffffc9000325f8c0 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: ffff88802ed4d4b0 R12: ffffc9000325f8e0
R13: 0000000000000000 R14: ffffc9000325f8b0 R15: 0000000000000000
FS:  00007f03a13f86c0(0000) GS:ffff8880d6c09000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055ec36aadde8 CR3: 0000000055323000 CR4: 0000000000352ef0
Call Trace:
 <TASK>
 lru_gen_refault mm/workingset.c:297 [inline]
 workingset_refault+0x251/0xca0 mm/workingset.c:547
 filemap_add_folio+0x23d/0x610 mm/filemap.c:981
 do_read_cache_folio+0x23c/0x5c0 mm/filemap.c:4063
 freader_get_folio+0x33a/0x930 lib/buildid.c:58
 freader_fetch+0xbd/0x740 lib/buildid.c:101
 __build_id_parse.isra.0+0xdd/0x6c0 lib/buildid.c:289
 do_procmap_query+0xb0e/0x1080 fs/proc/task_mmu.c:733
 procfs_procmap_ioctl+0x9d/0xe0 fs/proc/task_mmu.c:813
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:597 [inline]
 __se_sys_ioctl fs/ioctl.c:583 [inline]
 __x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f03a058f7c9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f03a13f8038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f03a07e5fa0 RCX: 00007f03a058f7c9
RDX: 0000200000000180 RSI: 00000000c0686611 RDI: 0000000000000003
RBP: 00007f03a0613f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f03a07e6038 R14: 00007f03a07e5fa0 R15: 00007ffc40446548
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:mem_cgroup_lruvec include/linux/memcontrol.h:720 [inline]
RIP: 0010:lru_gen_test_recent+0xfc/0x370 mm/workingset.c:276
Code: 2a 80 b5 ff 48 85 db 0f 84 a9 01 00 00 e8 1c 80 b5 ff 49 8d bd 00 4e 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e d3 01 00 00 4d 63 b5 00 4e 00
RSP: 0018:ffffc9000325f828 EFLAGS: 00010206

RAX: dffffc0000000000 RBX: ffff88801cac8000 RCX: ffffc9000325f72c
RDX: 00000000000009c0 RSI: ffffffff82096454 RDI: 0000000000004e00
RBP: ffffc9000325f8c0 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: ffff88802ed4d4b0 R12: ffffc9000325f8e0
R13: 0000000000000000 R14: ffffc9000325f8b0 R15: 0000000000000000
FS:  00007f03a13f86c0(0000) GS:ffff8880d6c09000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055ec36aadde8 CR3: 0000000055323000 CR4: 0000000000352ef0
----------------
Code disassembly (best guess):
   0:	2a 80 b5 ff 48 85    	sub    -0x7ab7004b(%rax),%al
   6:	db 0f                	fisttpl (%rdi)
   8:	84 a9 01 00 00 e8    	test   %ch,-0x17ffffff(%rcx)
   e:	1c 80                	sbb    $0x80,%al
  10:	b5 ff                	mov    $0xff,%ch
  12:	49 8d bd 00 4e 00 00 	lea    0x4e00(%r13),%rdi
  19:	48 b8 00 00 00 00 00 	movabs $0xdffffc0000000000,%rax
  20:	fc ff df
  23:	48 89 fa             	mov    %rdi,%rdx
  26:	48 c1 ea 03          	shr    $0x3,%rdx
* 2a:	0f b6 04 02          	movzbl (%rdx,%rax,1),%eax <-- trapping instruction
  2e:	84 c0                	test   %al,%al
  30:	74 08                	je     0x3a
  32:	3c 03                	cmp    $0x3,%al
  34:	0f 8e d3 01 00 00    	jle    0x20d
  3a:	4d                   	rex.WRB
  3b:	63                   	.byte 0x63
  3c:	b5 00                	mov    $0x0,%ch
  3e:	4e                   	rex.WRX


Tested on:

commit:         37bb2e72 Merge tag 'staging-6.19-rc1' of git://git.ker..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=16d4f21a580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=dbcb767d1e1208ac
dashboard link: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
compiler:       gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=1145a992580000



* Forwarded: [PATCH] mm/workingset: fix NULL pointer dereference in lru_gen_test_recent
  2025-12-07  8:55 [syzbot] [mm?] general protection fault in lru_gen_test_recent (2) syzbot
  2025-12-07 12:44 ` Forwarded: [PATCH] mm/workingset: fix NULL pointer dereference in lru_gen_test_recent syzbot
  2025-12-07 14:35 ` syzbot
@ 2025-12-07 15:05 ` syzbot
  2025-12-07 15:31 ` syzbot
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-07 15:05 UTC
  To: linux-kernel, syzkaller-bugs

For archival purposes, forwarding an incoming command email to
linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com.

***

Subject: [PATCH] mm/workingset: fix NULL pointer dereference in lru_gen_test_recent
Author: kartikey406@gmail.com

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

Add NULL check for memcg in lru_gen_test_recent() to prevent crash when
mem_cgroup_from_id() returns NULL.

The crash occurs when a folio's shadow entry contains a memcg_id that
no longer maps to a valid memory cgroup. This can happen when:

1. The memory cgroup has been deleted/freed
2. A folio was created without proper memcg association (e.g., during
   procmap_query build ID parsing via freader_get_folio)
3. The memcg_id in the shadow entry is invalid or zero

When lru_gen_test_recent() calls mem_cgroup_from_id(), it may return
NULL. The subsequent call to mem_cgroup_lruvec() with NULL memcg
triggers a crash.

Although mem_cgroup_lruvec() has an internal NULL check, the crash
occurs before reaching it due to compiler optimization. Since
mem_cgroup_lruvec() is an inline function, the compiler calculates
the offset memcg->nodeinfo (0x4e00) before the function's NULL check
can execute, causing a NULL pointer dereference.

Fix this by introducing an effective_memcg variable that is explicitly
set to root_mem_cgroup when memcg is NULL. This approach forces the
compiler to use a separate register/memory location, preventing the
premature offset calculation that caused the crash with a simple
in-place NULL check.

Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
 mm/workingset.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/mm/workingset.c b/mm/workingset.c
index e9f05634747a..847580173fb0 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -272,8 +272,15 @@ static bool lru_gen_test_recent(void *shadow, struct lruvec **lruvec,
 	unpack_shadow(shadow, &memcg_id, &pgdat, token, workingset);
 
 	memcg = mem_cgroup_from_id(memcg_id);
+	pr_warn("DEBUG: memcg_id=%d memcg=%p root_mem_cgroup=%p\n", memcg_id, memcg, root_mem_cgroup);
+	if (!memcg) {
+		pr_warn("DEBUG: memcg is NULL, using root_mem_cgroup\n");
+		memcg = root_mem_cgroup;
+		pr_warn("DEBUG: after assignment memcg=%p\n", memcg);
+	}
+	pr_warn("DEBUG: about to call mem_cgroup_lruvec with memcg=%p\n", memcg);
 	*lruvec = mem_cgroup_lruvec(memcg, pgdat);
-
+	pr_warn("DEBUG: mem_cgroup_lruvec returned successfully\n");
 	max_seq = READ_ONCE((*lruvec)->lrugen.max_seq);
 	max_seq &= EVICTION_MASK >> LRU_REFS_WIDTH;
 
-- 
2.43.0



* Re: [syzbot] [mm?] general protection fault in lru_gen_test_recent (2)
       [not found] <20251207150512.2759308-1-kartikey406@gmail.com>
@ 2025-12-07 15:22 ` syzbot
  0 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-07 15:22 UTC
  To: kartikey406, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
general protection fault in lru_gen_test_recent

DEBUG: memcg_id=0 memcg=0000000000000000 root_mem_cgroup=ffff88801cac8000
DEBUG: memcg is NULL, using root_mem_cgroup
DEBUG: after assignment memcg=ffff88801cac8000
DEBUG: about to call mem_cgroup_lruvec with memcg=ffff88801cac8000
Oops: general protection fault, probably for non-canonical address 0xdffffc00000009c0: 0000 [#1] SMP KASAN NOPTI
KASAN: probably user-memory-access in range [0x0000000000004e00-0x0000000000004e07]
CPU: 0 UID: 0 PID: 6414 Comm: syz.0.20 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:mem_cgroup_lruvec include/linux/memcontrol.h:720 [inline]
RIP: 0010:lru_gen_test_recent+0x14a/0x420 mm/workingset.c:282
Code: dc 7f b5 ff 48 85 db 0f 84 a4 01 00 00 e8 ce 7f b5 ff 49 8d be 00 4e 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 10 02 00 00 4d 63 ae 00 4e 00
RSP: 0018:ffffc9000177f828 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: ffff88801cac8000 RCX: ffffffff819c8d55
RDX: 00000000000009c0 RSI: ffffffff820964a2 RDI: 0000000000004e00
RBP: ffffc9000177f8c0 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000080000000 R11: ffff888026ad8b30 R12: ffffc9000177f8e0
R13: ffffffff90882820 R14: 0000000000000000 R15: 0000000000000000
FS:  00007fb55eb846c0(0000) GS:ffff8880d6909000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000409000000 CR3: 0000000022e91000 CR4: 0000000000352ef0
Call Trace:
 <TASK>
 lru_gen_refault mm/workingset.c:303 [inline]
 workingset_refault+0x251/0xca0 mm/workingset.c:553
 filemap_add_folio+0x23d/0x610 mm/filemap.c:981
 do_read_cache_folio+0x23c/0x5c0 mm/filemap.c:4063
 freader_get_folio+0x33a/0x930 lib/buildid.c:58
 freader_fetch+0xbd/0x740 lib/buildid.c:101
 __build_id_parse.isra.0+0xdd/0x6c0 lib/buildid.c:289
 do_procmap_query+0xb0e/0x1080 fs/proc/task_mmu.c:733
 procfs_procmap_ioctl+0x9d/0xe0 fs/proc/task_mmu.c:813
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:597 [inline]
 __se_sys_ioctl fs/ioctl.c:583 [inline]
 __x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fb55dd8f7c9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fb55eb84038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fb55dfe5fa0 RCX: 00007fb55dd8f7c9
RDX: 0000200000000180 RSI: 00000000c0686611 RDI: 0000000000000003
RBP: 00007fb55de13f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fb55dfe6038 R14: 00007fb55dfe5fa0 R15: 00007ffd3962a3c8
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:mem_cgroup_lruvec include/linux/memcontrol.h:720 [inline]
RIP: 0010:lru_gen_test_recent+0x14a/0x420 mm/workingset.c:282
Code: dc 7f b5 ff 48 85 db 0f 84 a4 01 00 00 e8 ce 7f b5 ff 49 8d be 00 4e 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 10 02 00 00 4d 63 ae 00 4e 00
RSP: 0018:ffffc9000177f828 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: ffff88801cac8000 RCX: ffffffff819c8d55
RDX: 00000000000009c0 RSI: ffffffff820964a2 RDI: 0000000000004e00
RBP: ffffc9000177f8c0 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000080000000 R11: ffff888026ad8b30 R12: ffffc9000177f8e0
R13: ffffffff90882820 R14: 0000000000000000 R15: 0000000000000000
FS:  00007fb55eb846c0(0000) GS:ffff8880d6909000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000409000000 CR3: 0000000022e91000 CR4: 0000000000352ef0
----------------
Code disassembly (best guess):
   0:	dc 7f b5             	fdivrl -0x4b(%rdi)
   3:	ff 48 85             	decl   -0x7b(%rax)
   6:	db 0f                	fisttpl (%rdi)
   8:	84 a4 01 00 00 e8 ce 	test   %ah,-0x31180000(%rcx,%rax,1)
   f:	7f b5                	jg     0xffffffc6
  11:	ff 49 8d             	decl   -0x73(%rcx)
  14:	be 00 4e 00 00       	mov    $0x4e00,%esi
  19:	48 b8 00 00 00 00 00 	movabs $0xdffffc0000000000,%rax
  20:	fc ff df
  23:	48 89 fa             	mov    %rdi,%rdx
  26:	48 c1 ea 03          	shr    $0x3,%rdx
* 2a:	0f b6 04 02          	movzbl (%rdx,%rax,1),%eax <-- trapping instruction
  2e:	84 c0                	test   %al,%al
  30:	74 08                	je     0x3a
  32:	3c 03                	cmp    $0x3,%al
  34:	0f 8e 10 02 00 00    	jle    0x24a
  3a:	4d                   	rex.WRB
  3b:	63                   	.byte 0x63
  3c:	ae                   	scas   %es:(%rdi),%al
  3d:	00 4e 00             	add    %cl,0x0(%rsi)
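
The non-canonical address in the oops line follows from the KASAN shadow lookup visible in the disassembly. A minimal sketch of that arithmetic, using only the two constants that appear in the instructions above (the helper name `shadow_byte_for` is hypothetical and not a kernel function):

```c
#include <stdint.h>

/*
 * Both constants are read off the disassembly above:
 *   movabs $0xdffffc0000000000,%rax   -> shadow offset
 *   shr    $0x3,%rdx                  -> one shadow byte per 8 bytes
 */
#define SHADOW_OFFSET      0xdffffc0000000000ULL
#define SHADOW_SCALE_SHIFT 3

/* Hypothetical helper: address of the shadow byte that covers addr. */
static uint64_t shadow_byte_for(uint64_t addr)
{
	return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
}
```

With RDI = 0x4e00 this gives 0xdffffc00000009c0, the address reported in the "general protection fault" line, and RDX = 0x9c0 is the shifted value. The faulting access is therefore the KASAN check for a load at offset 0x4e00 from a NULL base pointer.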


Tested on:

commit:         37bb2e72 Merge tag 'staging-6.19-rc1' of git://git.ker..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13dcf21a580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=dbcb767d1e1208ac
dashboard link: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
compiler:       gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=14f5a992580000


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Forwarded: [PATCH] mm/workingset: fix NULL pointer dereference in lru_gen_test_recent
  2025-12-07  8:55 [syzbot] [mm?] general protection fault in lru_gen_test_recent (2) syzbot
                   ` (2 preceding siblings ...)
  2025-12-07 15:05 ` syzbot
@ 2025-12-07 15:31 ` syzbot
  2025-12-07 15:38 ` syzbot
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-07 15:31 UTC (permalink / raw)
  To: linux-kernel, syzkaller-bugs

For archival purposes, forwarding an incoming command email to
linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com.

***

Subject: [PATCH] mm/workingset: fix NULL pointer dereference in lru_gen_test_recent
Author: kartikey406@gmail.com

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

Add NULL check for memcg in lru_gen_test_recent() to prevent crash when
mem_cgroup_from_id() returns NULL.

The crash occurs when a folio's shadow entry contains a memcg_id that
no longer maps to a valid memory cgroup. This can happen when:

1. The memory cgroup has been deleted/freed
2. A folio was created without proper memcg association (e.g., during
   procmap_query build ID parsing via freader_get_folio)
3. The memcg_id in the shadow entry is invalid or zero

When lru_gen_test_recent() calls mem_cgroup_from_id(), it may return
NULL. The subsequent call to mem_cgroup_lruvec() with NULL memcg
triggers a crash.

Although mem_cgroup_lruvec() has an internal NULL check, the crash
occurs before reaching it due to compiler optimization. Since
mem_cgroup_lruvec() is an inline function, the compiler calculates
the offset memcg->nodeinfo (0x4e00) before the function's NULL check
can execute, causing a NULL pointer dereference.

Fix this by introducing an effective_memcg variable that is explicitly
set to root_mem_cgroup when memcg is NULL. This approach forces the
compiler to use a separate register/memory location, preventing the
premature offset calculation that caused the crash with a simple
in-place NULL check.

Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
Fixes: ac35a4902374 ("mm: multi-gen LRU: minimal implementation")
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
 mm/workingset.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/workingset.c b/mm/workingset.c
index e9f05634747a..8166793b38dc 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -272,6 +272,8 @@ static bool lru_gen_test_recent(void *shadow, struct lruvec **lruvec,
 	unpack_shadow(shadow, &memcg_id, &pgdat, token, workingset);
 
 	memcg = mem_cgroup_from_id(memcg_id);
+	if (unlikely(!memcg))
+		WRITE_ONCE(memcg, root_mem_cgroup);
 	*lruvec = mem_cgroup_lruvec(memcg, pgdat);
 
 	max_seq = READ_ONCE((*lruvec)->lrugen.max_seq);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Forwarded: [PATCH] mm/workingset: fix NULL pointer dereference in lru_gen_test_recent
  2025-12-07  8:55 [syzbot] [mm?] general protection fault in lru_gen_test_recent (2) syzbot
                   ` (3 preceding siblings ...)
  2025-12-07 15:31 ` syzbot
@ 2025-12-07 15:38 ` syzbot
  2025-12-07 16:07 ` syzbot
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-07 15:38 UTC (permalink / raw)
  To: linux-kernel, syzkaller-bugs

For archival purposes, forwarding an incoming command email to
linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com.

***

Subject: [PATCH] mm/workingset: fix NULL pointer dereference in lru_gen_test_recent
Author: kartikey406@gmail.com

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

Add NULL check for memcg in lru_gen_test_recent() to prevent crash when
mem_cgroup_from_id() returns NULL.

The crash occurs when a folio's shadow entry contains a memcg_id that
no longer maps to a valid memory cgroup. This can happen when:

1. The memory cgroup has been deleted/freed
2. A folio was created without proper memcg association (e.g., during
   procmap_query build ID parsing via freader_get_folio)
3. The memcg_id in the shadow entry is invalid or zero

When lru_gen_test_recent() calls mem_cgroup_from_id(), it may return
NULL. The subsequent call to mem_cgroup_lruvec() with NULL memcg
triggers a crash.

Although mem_cgroup_lruvec() has an internal NULL check, the crash
occurs before reaching it due to compiler optimization. Since
mem_cgroup_lruvec() is an inline function, the compiler calculates
the offset memcg->nodeinfo (0x4e00) before the function's NULL check
can execute, causing a NULL pointer dereference.

Fix this by introducing an effective_memcg variable that is explicitly
set to root_mem_cgroup when memcg is NULL. This approach forces the
compiler to use a separate register/memory location, preventing the
premature offset calculation that caused the crash with a simple
in-place NULL check.

Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
Fixes: ac35a4902374 ("mm: multi-gen LRU: minimal implementation")
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
 mm/workingset.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/mm/workingset.c b/mm/workingset.c
index e9f05634747a..4fa33b57f0ca 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -272,7 +272,11 @@ static bool lru_gen_test_recent(void *shadow, struct lruvec **lruvec,
 	unpack_shadow(shadow, &memcg_id, &pgdat, token, workingset);
 
 	memcg = mem_cgroup_from_id(memcg_id);
-	*lruvec = mem_cgroup_lruvec(memcg, pgdat);
+	if (unlikely(!memcg)) {
+		*lruvec = &pgdat->__lruvec;
+	} else {
+		*lruvec = mem_cgroup_lruvec(memcg, pgdat);
+	}
 
 	max_seq = READ_ONCE((*lruvec)->lrugen.max_seq);
 	max_seq &= EVICTION_MASK >> LRU_REFS_WIDTH;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [syzbot] [mm?] general protection fault in lru_gen_test_recent (2)
       [not found] <20251207153059.2790675-1-kartikey406@gmail.com>
@ 2025-12-07 15:45 ` syzbot
  0 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-07 15:45 UTC (permalink / raw)
  To: kartikey406, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
general protection fault in lru_gen_test_recent

Oops: general protection fault, probably for non-canonical address 0xdffffc00000009c0: 0000 [#1] SMP KASAN NOPTI
KASAN: probably user-memory-access in range [0x0000000000004e00-0x0000000000004e07]
CPU: 0 UID: 0 PID: 6460 Comm: syz.0.36 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:mem_cgroup_lruvec include/linux/memcontrol.h:720 [inline]
RIP: 0010:lru_gen_test_recent+0xfc/0x370 mm/workingset.c:277
Code: 2a 80 b5 ff 48 85 db 0f 84 79 01 00 00 e8 1c 80 b5 ff 49 8d bd 00 4e 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e d3 01 00 00 4d 63 b5 00 4e 00
RSP: 0018:ffffc9000313f828 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: ffff88801d688000 RCX: ffffc9000313f72c
RDX: 00000000000009c0 RSI: ffffffff82096454 RDI: 0000000000004e00
RBP: ffffc9000313f8c0 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: ffff8880280f0b30 R12: ffffc9000313f8e0
R13: 0000000000000000 R14: ffffc9000313f8b0 R15: 0000000000000000
FS:  00007fb06f8496c0(0000) GS:ffff8880d6909000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000409000000 CR3: 0000000057f81000 CR4: 0000000000352ef0
Call Trace:
 <TASK>
 lru_gen_refault mm/workingset.c:298 [inline]
 workingset_refault+0x251/0xca0 mm/workingset.c:548
 filemap_add_folio+0x23d/0x610 mm/filemap.c:981
 do_read_cache_folio+0x23c/0x5c0 mm/filemap.c:4063
 freader_get_folio+0x33a/0x930 lib/buildid.c:58
 freader_fetch+0xbd/0x740 lib/buildid.c:101
 __build_id_parse.isra.0+0xdd/0x6c0 lib/buildid.c:289
 do_procmap_query+0xb0e/0x1080 fs/proc/task_mmu.c:733
 procfs_procmap_ioctl+0x9d/0xe0 fs/proc/task_mmu.c:813
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:597 [inline]
 __se_sys_ioctl fs/ioctl.c:583 [inline]
 __x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fb06e98f7c9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fb06f849038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fb06ebe5fa0 RCX: 00007fb06e98f7c9
RDX: 0000200000000180 RSI: 00000000c0686611 RDI: 0000000000000003
RBP: 00007fb06ea13f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fb06ebe6038 R14: 00007fb06ebe5fa0 R15: 00007ffd239d7168
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:mem_cgroup_lruvec include/linux/memcontrol.h:720 [inline]
RIP: 0010:lru_gen_test_recent+0xfc/0x370 mm/workingset.c:277
Code: 2a 80 b5 ff 48 85 db 0f 84 79 01 00 00 e8 1c 80 b5 ff 49 8d bd 00 4e 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e d3 01 00 00 4d 63 b5 00 4e 00
RSP: 0018:ffffc9000313f828 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: ffff88801d688000 RCX: ffffc9000313f72c
RDX: 00000000000009c0 RSI: ffffffff82096454 RDI: 0000000000004e00
RBP: ffffc9000313f8c0 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: ffff8880280f0b30 R12: ffffc9000313f8e0
R13: 0000000000000000 R14: ffffc9000313f8b0 R15: 0000000000000000
FS:  00007fb06f8496c0(0000) GS:ffff8880d6a09000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fb06e973460 CR3: 0000000057f81000 CR4: 0000000000352ef0
----------------
Code disassembly (best guess):
   0:	2a 80 b5 ff 48 85    	sub    -0x7ab7004b(%rax),%al
   6:	db 0f                	fisttpl (%rdi)
   8:	84 79 01             	test   %bh,0x1(%rcx)
   b:	00 00                	add    %al,(%rax)
   d:	e8 1c 80 b5 ff       	call   0xffb5802e
  12:	49 8d bd 00 4e 00 00 	lea    0x4e00(%r13),%rdi
  19:	48 b8 00 00 00 00 00 	movabs $0xdffffc0000000000,%rax
  20:	fc ff df
  23:	48 89 fa             	mov    %rdi,%rdx
  26:	48 c1 ea 03          	shr    $0x3,%rdx
* 2a:	0f b6 04 02          	movzbl (%rdx,%rax,1),%eax <-- trapping instruction
  2e:	84 c0                	test   %al,%al
  30:	74 08                	je     0x3a
  32:	3c 03                	cmp    $0x3,%al
  34:	0f 8e d3 01 00 00    	jle    0x20d
  3a:	4d                   	rex.WRB
  3b:	63                   	.byte 0x63
  3c:	b5 00                	mov    $0x0,%ch
  3e:	4e                   	rex.WRX


Tested on:

commit:         9e906a9d Merge tag 'perf-tools-for-v6.19-2025-12-06' o..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1601521a580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=dbcb767d1e1208ac
dashboard link: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
compiler:       gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=1202f21a580000


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [syzbot] [mm?] general protection fault in lru_gen_test_recent (2)
       [not found] <20251207153807.2801160-1-kartikey406@gmail.com>
@ 2025-12-07 16:00 ` syzbot
  0 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-07 16:00 UTC (permalink / raw)
  To: kartikey406, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
general protection fault in lru_gen_test_recent

Oops: general protection fault, probably for non-canonical address 0xdffffc0000000a4f: 0000 [#1] SMP KASAN NOPTI
KASAN: probably user-memory-access in range [0x0000000000005278-0x000000000000527f]
CPU: 2 UID: 0 PID: 6563 Comm: syz.0.66 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:lru_gen_test_recent+0x1a4/0x2f0 mm/workingset.c:281
Code: 48 c1 ea 03 80 3c 02 00 0f 85 17 01 00 00 48 8d bb c0 00 00 00 49 89 1c 24 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 21 01 00 00 48 8b 9b c0 00 00 00 48 89 ea 48 b8
RSP: 0018:ffffc900027ef828 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: 00000000000051b8 RCX: ffffc900027ef72c
RDX: 0000000000000a4f RSI: ffffffff820964dd RDI: 0000000000005278
RBP: ffffc900027ef8c0 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: ffff88802c77aff0 R12: ffffc900027ef8e0
R13: 0000000000000000 R14: ffffc900027ef8b0 R15: 0000000000000000
FS:  00007ff4529496c0(0000) GS:ffff8880d6b09000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000409000000 CR3: 0000000055b8f000 CR4: 0000000000352ef0
Call Trace:
 <TASK>
 lru_gen_refault mm/workingset.c:300 [inline]
 workingset_refault+0x251/0xca0 mm/workingset.c:550
 filemap_add_folio+0x23d/0x610 mm/filemap.c:981
 do_read_cache_folio+0x23c/0x5c0 mm/filemap.c:4063
 freader_get_folio+0x33a/0x930 lib/buildid.c:58
 freader_fetch+0xbd/0x740 lib/buildid.c:101
 __build_id_parse.isra.0+0xdd/0x6c0 lib/buildid.c:289
 do_procmap_query+0xb0e/0x1080 fs/proc/task_mmu.c:733
 procfs_procmap_ioctl+0x9d/0xe0 fs/proc/task_mmu.c:813
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:597 [inline]
 __se_sys_ioctl fs/ioctl.c:583 [inline]
 __x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff451b8f7c9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ff452949038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ff451de5fa0 RCX: 00007ff451b8f7c9
RDX: 0000200000000180 RSI: 00000000c0686611 RDI: 0000000000000003
RBP: 00007ff451c13f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ff451de6038 R14: 00007ff451de5fa0 R15: 00007ffda7f04458
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:lru_gen_test_recent+0x1a4/0x2f0 mm/workingset.c:281
Code: 48 c1 ea 03 80 3c 02 00 0f 85 17 01 00 00 48 8d bb c0 00 00 00 49 89 1c 24 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 21 01 00 00 48 8b 9b c0 00 00 00 48 89 ea 48 b8
RSP: 0018:ffffc900027ef828 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: 00000000000051b8 RCX: ffffc900027ef72c
RDX: 0000000000000a4f RSI: ffffffff820964dd RDI: 0000000000005278
RBP: ffffc900027ef8c0 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: ffff88802c77aff0 R12: ffffc900027ef8e0
R13: 0000000000000000 R14: ffffc900027ef8b0 R15: 0000000000000000
FS:  00007ff4529496c0(0000) GS:ffff8880d6a09000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff451b73460 CR3: 0000000055b8f000 CR4: 0000000000352ef0
----------------
Code disassembly (best guess):
   0:	48 c1 ea 03          	shr    $0x3,%rdx
   4:	80 3c 02 00          	cmpb   $0x0,(%rdx,%rax,1)
   8:	0f 85 17 01 00 00    	jne    0x125
   e:	48 8d bb c0 00 00 00 	lea    0xc0(%rbx),%rdi
  15:	49 89 1c 24          	mov    %rbx,(%r12)
  19:	48 b8 00 00 00 00 00 	movabs $0xdffffc0000000000,%rax
  20:	fc ff df
  23:	48 89 fa             	mov    %rdi,%rdx
  26:	48 c1 ea 03          	shr    $0x3,%rdx
* 2a:	80 3c 02 00          	cmpb   $0x0,(%rdx,%rax,1) <-- trapping instruction
  2e:	0f 85 21 01 00 00    	jne    0x155
  34:	48 8b 9b c0 00 00 00 	mov    0xc0(%rbx),%rbx
  3b:	48 89 ea             	mov    %rbp,%rdx
  3e:	48                   	rex.W
  3f:	b8                   	.byte 0xb8


Tested on:

commit:         9e906a9d Merge tag 'perf-tools-for-v6.19-2025-12-06' o..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1363ceb4580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=dbcb767d1e1208ac
dashboard link: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
compiler:       gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=168666c2580000


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Forwarded: [PATCH] mm/workingset: fix NULL pointer dereference in lru_gen_test_recent
  2025-12-07  8:55 [syzbot] [mm?] general protection fault in lru_gen_test_recent (2) syzbot
                   ` (4 preceding siblings ...)
  2025-12-07 15:38 ` syzbot
@ 2025-12-07 16:07 ` syzbot
  2025-12-08  2:31 ` Forwarded: [PATCH] mm/workingset: fix NULL pointer dereference in lru_gen_test_recent() syzbot
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-07 16:07 UTC (permalink / raw)
  To: linux-kernel, syzkaller-bugs

For archival purposes, forwarding an incoming command email to
linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com.

***

Subject: [PATCH] mm/workingset: fix NULL pointer dereference in lru_gen_test_recent
Author: kartikey406@gmail.com

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master


Add NULL check for memcg in lru_gen_test_recent() to prevent crash when
mem_cgroup_from_id() returns NULL.

The crash occurs when a folio's shadow entry contains a memcg_id that
no longer maps to a valid memory cgroup. This can happen when:

1. The memory cgroup has been deleted/freed
2. A folio was created without proper memcg association (e.g., during
   procmap_query build ID parsing via freader_get_folio)
3. The memcg_id in the shadow entry is invalid or zero

When lru_gen_test_recent() calls mem_cgroup_from_id(), it may return
NULL. The subsequent call to mem_cgroup_lruvec() with NULL memcg
triggers a crash.

Although mem_cgroup_lruvec() has an internal NULL check, the crash
occurs before reaching it due to compiler optimization. Since
mem_cgroup_lruvec() is an inline function, the compiler calculates
the offset memcg->nodeinfo (0x4e00) before the function's NULL check
can execute, causing a NULL pointer dereference.

Fix this by introducing an effective_memcg variable that is explicitly
set to root_mem_cgroup when memcg is NULL. This approach forces the
compiler to use a separate register/memory location, preventing the
premature offset calculation that caused the crash with a simple
in-place NULL check.

Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
Fixes: ac35a4902374 ("mm: multi-gen LRU: minimal implementation")
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
 mm/workingset.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/workingset.c b/mm/workingset.c
index e9f05634747a..6a45e98317e9 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -272,8 +272,13 @@ static bool lru_gen_test_recent(void *shadow, struct lruvec **lruvec,
 	unpack_shadow(shadow, &memcg_id, &pgdat, token, workingset);
 
 	memcg = mem_cgroup_from_id(memcg_id);
+	if (unlikely(!memcg)) {
+		pr_warn("DEBUG: memcg is NULL (memcg_id=%d), pgdat=%p, returning false\n", memcg_id, pgdat);
+		pr_warn("DEBUG: shadow=%p token=%lx workingset=%d\n", shadow, *token, *workingset);
+		return false;
+	}
 	*lruvec = mem_cgroup_lruvec(memcg, pgdat);
-
+	pr_warn("DEBUG: memcg=%p, lruvec=%p, continuing normally\n", memcg, *lruvec);
 	max_seq = READ_ONCE((*lruvec)->lrugen.max_seq);
 	max_seq &= EVICTION_MASK >> LRU_REFS_WIDTH;
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [syzbot] [mm?] general protection fault in lru_gen_test_recent (2)
       [not found] <20251207160752.2863580-1-kartikey406@gmail.com>
@ 2025-12-07 16:23 ` syzbot
  0 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-07 16:23 UTC (permalink / raw)
  To: kartikey406, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
BUG: unable to handle kernel NULL pointer dereference in filemap_read_folio

DEBUG: memcg is NULL (memcg_id=0), pgdat=0000000000000000, returning false
DEBUG: shadow=0000000000000013 token=0 workingset=1
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 36e08067 P4D 36e08067 PUD 0 
Oops: Oops: 0010 [#1] SMP KASAN NOPTI
CPU: 2 UID: 0 PID: 6417 Comm: syz.0.25 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc9000444f988 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff81f7e52e
RDX: ffff888036dfc980 RSI: ffffea00011b2b00 RDI: ffff8880329bbdc0
RBP: ffffea00011b2b00 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 1ffff92000889f32
R13: ffff8880329bbdc0 R14: 0000000000000000 R15: dffffc0000000000
FS:  00007f75eb5a26c0(0000) GS:ffff8880d6b09000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000005576a000 CR4: 0000000000352ef0
Call Trace:
 <TASK>
 filemap_read_folio+0xc8/0x2a0 mm/filemap.c:2496
 do_read_cache_folio+0x266/0x5c0 mm/filemap.c:4096
 freader_get_folio+0x33a/0x930 lib/buildid.c:58
 freader_fetch+0xbd/0x740 lib/buildid.c:101
 __build_id_parse.isra.0+0xdd/0x6c0 lib/buildid.c:289
 do_procmap_query+0xb0e/0x1080 fs/proc/task_mmu.c:733
 procfs_procmap_ioctl+0x9d/0xe0 fs/proc/task_mmu.c:813
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:597 [inline]
 __se_sys_ioctl fs/ioctl.c:583 [inline]
 __x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f75ea78f7c9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f75eb5a2038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f75ea9e5fa0 RCX: 00007f75ea78f7c9
RDX: 0000200000000180 RSI: 00000000c0686611 RDI: 0000000000000003
RBP: 00007f75ea813f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f75ea9e6038 R14: 00007f75ea9e5fa0 R15: 00007fff20402df8
 </TASK>
Modules linked in:
CR2: 0000000000000000
---[ end trace 0000000000000000 ]---
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc9000444f988 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff81f7e52e
RDX: ffff888036dfc980 RSI: ffffea00011b2b00 RDI: ffff8880329bbdc0
RBP: ffffea00011b2b00 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 1ffff92000889f32
R13: ffff8880329bbdc0 R14: 0000000000000000 R15: dffffc0000000000
FS:  00007f75eb5a26c0(0000) GS:ffff8880d6b09000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000005576a000 CR4: 0000000000352ef0
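
The DEBUG lines above print shadow=0x13 with memcg_id=0, token=0 and workingset=1. A rough sketch of how such a value could decode, assuming the packing order used by pack_shadow() in mm/workingset.c as I recall it (xarray value tag in bit 0, then one workingset bit, then the node id, then the memcg id, then the eviction token). The field widths are config dependent and are passed in as parameters here; `decode_shadow` and `struct shadow_fields` are hypothetical names, not kernel symbols.

```c
#include <stdbool.h>

struct shadow_fields {
	unsigned long token;
	unsigned long memcg_id;
	unsigned long nid;
	bool workingset;
};

/*
 * Hypothetical decoder. nodes_shift and memcg_id_shift stand in for
 * NODES_SHIFT and MEM_CGROUP_ID_SHIFT, whose values are assumptions.
 */
static struct shadow_fields decode_shadow(unsigned long shadow,
					   unsigned int nodes_shift,
					   unsigned int memcg_id_shift)
{
	struct shadow_fields f;
	unsigned long entry = shadow >> 1;	/* drop the xarray value tag */

	f.workingset = entry & 1;
	entry >>= 1;
	f.nid = entry & ((1UL << nodes_shift) - 1);
	entry >>= nodes_shift;
	f.memcg_id = entry & ((1UL << memcg_id_shift) - 1);
	entry >>= memcg_id_shift;
	f.token = entry;
	return f;
}
```

With any node-id width of three bits or more, 0x13 decodes to workingset=1, memcg_id=0 and token=0, which agrees with the DEBUG output, and to node id 4. The log does not say how many nodes the test VM has, so whether node 4 exists there (and hence whether this explains pgdat being NULL) remains an open assumption.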


Tested on:

commit:         9e906a9d Merge tag 'perf-tools-for-v6.19-2025-12-06' o..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=162e66c2580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=dbcb767d1e1208ac
dashboard link: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
compiler:       gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=14b1521a580000


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Forwarded: [PATCH] mm/workingset: fix NULL pointer dereference in lru_gen_test_recent()
  2025-12-07  8:55 [syzbot] [mm?] general protection fault in lru_gen_test_recent (2) syzbot
                   ` (5 preceding siblings ...)
  2025-12-07 16:07 ` syzbot
@ 2025-12-08  2:31 ` syzbot
  2025-12-08  2:47 ` syzbot
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-08  2:31 UTC (permalink / raw)
  To: linux-kernel, syzkaller-bugs

For archival purposes, forwarding an incoming command email to
linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com.

***

Subject: [PATCH] mm/workingset: fix NULL pointer dereference in lru_gen_test_recent()
Author: kartikey406@gmail.com

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

Syzbot reported a general protection fault in lru_gen_test_recent() when
accessing memcg->nodeinfo with a NULL memcg pointer:

  Oops: general protection fault in lru_gen_test_recent+0xfc/0x370
  KASAN: probably user-memory-access in range [0x0000000000004e00-0x0000000000004e07]
  RIP: 0010:lru_gen_test_recent+0xfc/0x370

The crash occurs when unpack_shadow() extracts parameters from a shadow
entry. There are two cases where NULL pointers can be returned:

1. pgdat can be NULL when NODE_DATA(nid) returns NULL for an invalid or
   offlined NUMA node ID stored in the shadow entry.

2. memcg can be NULL when mem_cgroup_from_id() fails to find the memory
   cgroup (e.g., if it was destroyed).

The existing code directly passes these potentially NULL pointers to
mem_cgroup_lruvec(), which dereferences memcg->nodeinfo without checking,
leading to the crash.

Fix this by:
- Checking if pgdat is NULL and returning early if so, as we cannot
  determine page recency without a valid node.
- Checking if memcg is NULL and falling back to pgdat->__lruvec (the
  root memcg's lruvec) instead of calling mem_cgroup_lruvec() which
  would dereference NULL.

Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
 mm/workingset.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/mm/workingset.c b/mm/workingset.c
index e9f05634747a..335c2d34ac94 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -270,10 +270,18 @@ static bool lru_gen_test_recent(void *shadow, struct lruvec **lruvec,
 	struct pglist_data *pgdat;
 
 	unpack_shadow(shadow, &memcg_id, &pgdat, token, workingset);
-
+	if (unlikely(!pgdat))
+		return false;
 	memcg = mem_cgroup_from_id(memcg_id);
-	*lruvec = mem_cgroup_lruvec(memcg, pgdat);
-
+	if (unlikely(!memcg)) {
+		pr_warn("DEBUG: memcg is NULL (memcg_id=%d), pgdat=%p, returning false\n", memcg_id, pgdat);
+		pr_warn("DEBUG: shadow=%p token=%lx workingset=%d\n", shadow, *token, *workingset);
+		memcg = root_mem_cgroup;
+		*lruvec = &pgdat->__lruvec;
+	} else {
+		*lruvec = mem_cgroup_lruvec(memcg, pgdat);
+	}
+	pr_warn("DEBUG: memcg=%p, lruvec=%p, continuing normally\n", memcg, *lruvec);
 	max_seq = READ_ONCE((*lruvec)->lrugen.max_seq);
 	max_seq &= EVICTION_MASK >> LRU_REFS_WIDTH;
 
-- 
2.43.0
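
The guard order described in the commit message (reject a missing node first, then fall back to the node-level lruvec when the cgroup lookup fails) can be modelled outside the kernel. This is a user-space sketch with hypothetical types and names; it is not the kernel code and does not model locking, RCU, or the real structure layouts.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for lruvec, pglist_data and mem_cgroup. */
struct lruvec_model { unsigned long max_seq; };
struct node_model   { struct lruvec_model root_lruvec; };
struct memcg_model  { struct lruvec_model per_node; };

/*
 * Pick the lruvec to test against. Returns false and clears *out when
 * no node is available, so the caller never reads through a NULL node.
 */
static bool pick_lruvec(struct node_model *node, struct memcg_model *memcg,
			struct lruvec_model **out)
{
	if (!node) {
		*out = NULL;
		return false;
	}
	/* A failed cgroup lookup falls back to the node-level lruvec. */
	*out = memcg ? &memcg->per_node : &node->root_lruvec;
	return true;
}
```

The node check has to come first because both branches of the fallback need a valid node in the real code; checking only the cgroup pointer leaves the NULL-node path open.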


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Forwarded: [PATCH] mm/workingset: fix NULL pointer dereference in lru_gen_test_recent()
  2025-12-07  8:55 [syzbot] [mm?] general protection fault in lru_gen_test_recent (2) syzbot
                   ` (6 preceding siblings ...)
  2025-12-08  2:31 ` Forwarded: [PATCH] mm/workingset: fix NULL pointer dereference in lru_gen_test_recent() syzbot
@ 2025-12-08  2:47 ` syzbot
  2025-12-08  3:56 ` Forwarded: [PATCH] mm/workingset: add debug for corrupted shadow entry investigation syzbot
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-08  2:47 UTC (permalink / raw)
  To: linux-kernel, syzkaller-bugs

For archival purposes, forwarding an incoming command email to
linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com.

***

Subject: [PATCH] mm/workingset: fix NULL pointer dereference in lru_gen_test_recent()
Author: kartikey406@gmail.com

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

Syzbot reported a general protection fault in lru_gen_test_recent() when
accessing invalid memory addresses:

  Oops: general protection fault in lru_gen_test_recent+0xfc/0x370
  KASAN: probably user-memory-access in range [0x0000000000004e00-0x0000000000004e07]
  RIP: 0010:lru_gen_test_recent+0xfc/0x370

The crash occurs when unpack_shadow() extracts a pglist_data pointer from
a shadow entry. The pgdat can be NULL when NODE_DATA(nid) returns NULL for
an invalid or offlined NUMA node ID stored in the shadow entry.

The existing code doesn't check for NULL pgdat before passing it to
mem_cgroup_lruvec(), which can lead to crashes when dereferencing the
invalid pointer.

Fix this by checking if pgdat is NULL and setting lruvec to NULL before
returning false. The caller in lru_gen_refault() will then skip processing
via the check "if (lruvec != folio_lruvec(folio)) goto unlock", preventing
use of the invalid lruvec.

Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
 mm/workingset.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/workingset.c b/mm/workingset.c
index e9f05634747a..b63948f4e91a 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -270,7 +270,10 @@ static bool lru_gen_test_recent(void *shadow, struct lruvec **lruvec,
 	struct pglist_data *pgdat;
 
 	unpack_shadow(shadow, &memcg_id, &pgdat, token, workingset);
-
+	if (unlikely(!pgdat)) {
+		*lruvec = NULL;
+		return false;
+	}
 	memcg = mem_cgroup_from_id(memcg_id);
 	*lruvec = mem_cgroup_lruvec(memcg, pgdat);
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [syzbot] [mm?] general protection fault in lru_gen_test_recent (2)
       [not found] <20251208023132.2923514-1-kartikey406@gmail.com>
@ 2025-12-08  3:00 ` syzbot
  0 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-08  3:00 UTC (permalink / raw)
  To: kartikey406, linux-kernel, syzbot, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
BUG: unable to handle kernel NULL pointer dereference in filemap_read_folio

BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 253cc067 P4D 253cc067 PUD 0 
Oops: Oops: 0010 [#1] SMP KASAN NOPTI
CPU: 3 UID: 0 PID: 6434 Comm: syz.0.22 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc90003af7988 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff81f7e52e
RDX: ffff888029640000 RSI: ffffea00010f2540 RDI: ffff88801cabb180
RBP: ffffea00010f2540 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 1ffff9200075ef32
R13: ffff88801cabb180 R14: 0000000000000000 R15: dffffc0000000000
FS:  00007f51eb3756c0(0000) GS:ffff8880d6c09000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 0000000039a0a000 CR4: 0000000000352ef0
Call Trace:
 <TASK>
 filemap_read_folio+0xc8/0x2a0 mm/filemap.c:2496
 do_read_cache_folio+0x266/0x5c0 mm/filemap.c:4096
 freader_get_folio+0x33a/0x930 lib/buildid.c:58
 freader_fetch+0xbd/0x740 lib/buildid.c:101
 __build_id_parse.isra.0+0xdd/0x6c0 lib/buildid.c:289
 do_procmap_query+0xb0e/0x1080 fs/proc/task_mmu.c:733
 procfs_procmap_ioctl+0x9d/0xe0 fs/proc/task_mmu.c:813
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:597 [inline]
 __se_sys_ioctl fs/ioctl.c:583 [inline]
 __x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f51ea58f7c9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f51eb375038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f51ea7e5fa0 RCX: 00007f51ea58f7c9
RDX: 0000200000000180 RSI: 00000000c0686611 RDI: 0000000000000003
RBP: 00007f51ea613f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f51ea7e6038 R14: 00007f51ea7e5fa0 R15: 00007ffedaa41fc8
 </TASK>
Modules linked in:
CR2: 0000000000000000
---[ end trace 0000000000000000 ]---
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc90003af7988 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff81f7e52e
RDX: ffff888029640000 RSI: ffffea00010f2540 RDI: ffff88801cabb180
RBP: ffffea00010f2540 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 1ffff9200075ef32
R13: ffff88801cabb180 R14: 0000000000000000 R15: dffffc0000000000
FS:  00007f51eb3756c0(0000) GS:ffff8880d6c09000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 0000000039a0a000 CR4: 0000000000352ef0


Tested on:

commit:         c2f2b01b Merge tag 'i3c/for-6.19' of git://git.kernel...
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13f0aeb4580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=dbcb767d1e1208ac
dashboard link: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
compiler:       gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=134f6992580000


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [syzbot] [mm?] general protection fault in lru_gen_test_recent (2)
       [not found] <20251208024712.2925251-1-kartikey406@gmail.com>
@ 2025-12-08  3:28 ` syzbot
  0 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-08  3:28 UTC (permalink / raw)
  To: kartikey406, linux-kernel, syzbot, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
BUG: unable to handle kernel NULL pointer dereference in filemap_read_folio

BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 2c3cf067 P4D 2c3cf067 PUD 0 
Oops: Oops: 0010 [#1] SMP KASAN NOPTI
CPU: 0 UID: 0 PID: 6434 Comm: syz.0.29 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc900038e7988 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff81f7e52e
RDX: ffff888028d4c980 RSI: ffffea000162df00 RDI: ffff88802ca0ba40
RBP: ffffea000162df00 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000000 R11: ffff888028d4d4b0 R12: 1ffff9200071cf32
R13: ffff88802ca0ba40 R14: 0000000000000000 R15: dffffc0000000000
FS:  00007fba957a36c0(0000) GS:ffff8880d6909000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 0000000034f91000 CR4: 0000000000352ef0
Call Trace:
 <TASK>
 filemap_read_folio+0xc8/0x2a0 mm/filemap.c:2496
 do_read_cache_folio+0x266/0x5c0 mm/filemap.c:4096
 freader_get_folio+0x33a/0x930 lib/buildid.c:58
 freader_fetch+0xbd/0x740 lib/buildid.c:101
 __build_id_parse.isra.0+0xdd/0x6c0 lib/buildid.c:289
 do_procmap_query+0xb0e/0x1080 fs/proc/task_mmu.c:733
 procfs_procmap_ioctl+0x9d/0xe0 fs/proc/task_mmu.c:813
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:597 [inline]
 __se_sys_ioctl fs/ioctl.c:583 [inline]
 __x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fba9498f7c9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fba957a3038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fba94be5fa0 RCX: 00007fba9498f7c9
RDX: 0000200000000180 RSI: 00000000c0686611 RDI: 0000000000000003
RBP: 00007fba94a13f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fba94be6038 R14: 00007fba94be5fa0 R15: 00007ffca4d48ae8
 </TASK>
Modules linked in:
CR2: 0000000000000000
---[ end trace 0000000000000000 ]---
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc900038e7988 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff81f7e52e
RDX: ffff888028d4c980 RSI: ffffea000162df00 RDI: ffff88802ca0ba40
RBP: ffffea000162df00 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000000 R11: ffff888028d4d4b0 R12: 1ffff9200071cf32
R13: ffff88802ca0ba40 R14: 0000000000000000 R15: dffffc0000000000
FS:  00007fba957a36c0(0000) GS:ffff8880d6909000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 0000000034f91000 CR4: 0000000000352ef0


Tested on:

commit:         c2f2b01b Merge tag 'i3c/for-6.19' of git://git.kernel...
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=10818ec2580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=dbcb767d1e1208ac
dashboard link: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
compiler:       gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=1410aeb4580000


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Forwarded: [PATCH] mm/workingset: add debug for corrupted shadow entry investigation
  2025-12-07  8:55 [syzbot] [mm?] general protection fault in lru_gen_test_recent (2) syzbot
                   ` (7 preceding siblings ...)
  2025-12-08  2:47 ` syzbot
@ 2025-12-08  3:56 ` syzbot
  2025-12-08  4:49 ` Forwarded: [PATCH] mm/workingset: fix crash from corrupted shadow entries in lru_gen syzbot
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-08  3:56 UTC (permalink / raw)
  To: linux-kernel, syzkaller-bugs

For archival purposes, forwarding an incoming command email to
linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com.

***

Subject: [PATCH] mm/workingset: add debug for corrupted shadow entry investigation
Author: kartikey406@gmail.com

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master


When pgdat is NULL in lru_gen_test_recent(), it indicates a corrupted
shadow entry. Currently, returning false allows execution to continue,
which leads to a subsequent crash in filemap_read_folio() with a NULL
function pointer dereference.
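
For illustration, an instruction fetch at address 0 (RIP: 0010:0x0) is what a call through a NULL function pointer looks like. The userspace sketch below shows the generic guard pattern only. The names aops_model, folio_model, fill_ok() and read_folio_checked() are hypothetical, and nothing in this thread establishes that such a check is the right fix or where it would belong.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct folio_model { int uptodate; };

typedef int (*read_folio_fn)(struct folio_model *folio);

/* Hypothetical operations table with an optional callback. */
struct aops_model { read_folio_fn read_folio; };

/* Example callback that marks the folio as filled. */
static int fill_ok(struct folio_model *folio)
{
	folio->uptodate = 1;
	return 0;
}

/* Refuse to call through a NULL pointer instead of jumping to address 0. */
static int read_folio_checked(const struct aops_model *aops,
			      struct folio_model *folio)
{
	assert(aops != NULL && folio != NULL);
	if (!aops->read_folio)
		return -EINVAL;
	return aops->read_folio(folio);
}
```

With an operations table whose read_folio member is NULL, read_folio_checked() returns -EINVAL rather than faulting.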

Add debug output and stack dump to understand:
1. When pgdat is NULL (corrupted shadow entries)
2. The full call path leading to this situation
3. Why continuing execution after return false causes crashes

This will help determine the proper place to handle corrupted shadow
entries - either stop earlier in the call chain or handle the corruption
differently in lru_gen_test_recent().

Related-to: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
 mm/workingset.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/workingset.c b/mm/workingset.c
index e9f05634747a..a848572f8c8a 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -270,7 +270,13 @@ static bool lru_gen_test_recent(void *shadow, struct lruvec **lruvec,
 	struct pglist_data *pgdat;
 
 	unpack_shadow(shadow, &memcg_id, &pgdat, token, workingset);
-
+	if (unlikely(!pgdat)) {
+		pr_warn("FATAL: Corrupted shadow entry - pgdat is NULL! shadow=%p\n", shadow);
+		pr_warn("This indicates page cache corruption - cannot proceed\n");
+		dump_stack();
+		*lruvec = NULL;
+		return false;
+	}
 	memcg = mem_cgroup_from_id(memcg_id);
 	*lruvec = mem_cgroup_lruvec(memcg, pgdat);
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [syzbot] [mm?] general protection fault in lru_gen_test_recent (2)
       [not found] <20251208035638.2927077-1-kartikey406@gmail.com>
@ 2025-12-08  4:14 ` syzbot
  0 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-08  4:14 UTC (permalink / raw)
  To: kartikey406, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
BUG: unable to handle kernel NULL pointer dereference in filemap_read_folio

RBP: 00007fb7c7c13f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fb7c7de6038 R14: 00007fb7c7de5fa0 R15: 00007ffd557726f8
 </TASK>
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 4f090067 P4D 4f090067 PUD 0 
Oops: Oops: 0010 [#1] SMP KASAN NOPTI
CPU: 3 UID: 0 PID: 6482 Comm: syz.0.48 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc90003d47988 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff81f7e52e
RDX: ffff88802283a4c0 RSI: ffffea00010cb180 RDI: ffff888031ad5500
RBP: ffffea00010cb180 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 1ffff920007a8f32
R13: ffff888031ad5500 R14: 0000000000000000 R15: dffffc0000000000
FS:  00007fb7c8ab16c0(0000) GS:ffff8880d6c09000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000004ec6a000 CR4: 0000000000352ef0
Call Trace:
 <TASK>
 filemap_read_folio+0xc8/0x2a0 mm/filemap.c:2496
 do_read_cache_folio+0x266/0x5c0 mm/filemap.c:4096
 freader_get_folio+0x33a/0x930 lib/buildid.c:58
 freader_fetch+0xbd/0x740 lib/buildid.c:101
 __build_id_parse.isra.0+0xdd/0x6c0 lib/buildid.c:289
 do_procmap_query+0xb0e/0x1080 fs/proc/task_mmu.c:733
 procfs_procmap_ioctl+0x9d/0xe0 fs/proc/task_mmu.c:813
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:597 [inline]
 __se_sys_ioctl fs/ioctl.c:583 [inline]
 __x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fb7c7b8f7c9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fb7c8ab1038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fb7c7de5fa0 RCX: 00007fb7c7b8f7c9
RDX: 0000200000000180 RSI: 00000000c0686611 RDI: 0000000000000003
RBP: 00007fb7c7c13f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fb7c7de6038 R14: 00007fb7c7de5fa0 R15: 00007ffd557726f8
 </TASK>
Modules linked in:
CR2: 0000000000000000
---[ end trace 0000000000000000 ]---
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc90003d47988 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff81f7e52e
RDX: ffff88802283a4c0 RSI: ffffea00010cb180 RDI: ffff888031ad5500
RBP: ffffea00010cb180 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 1ffff920007a8f32
R13: ffff888031ad5500 R14: 0000000000000000 R15: dffffc0000000000
FS:  00007fb7c8ab16c0(0000) GS:ffff8880d6c09000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000004ec6a000 CR4: 0000000000352ef0


Tested on:

commit:         c2f2b01b Merge tag 'i3c/for-6.19' of git://git.kernel...
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=16598ec2580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=dbcb767d1e1208ac
dashboard link: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
compiler:       gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=15098ec2580000


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Forwarded: [PATCH] mm/workingset: fix crash from corrupted shadow entries in lru_gen
  2025-12-07  8:55 [syzbot] [mm?] general protection fault in lru_gen_test_recent (2) syzbot
                   ` (8 preceding siblings ...)
  2025-12-08  3:56 ` Forwarded: [PATCH] mm/workingset: add debug for corrupted shadow entry investigation syzbot
@ 2025-12-08  4:49 ` syzbot
  2025-12-08  5:14 ` syzbot
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-08  4:49 UTC (permalink / raw)
  To: linux-kernel, syzkaller-bugs

For archival purposes, forwarding an incoming command email to
linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com.

***

Subject: [PATCH] mm/workingset: fix crash from corrupted shadow entries in lru_gen
Author: kartikey406@gmail.com

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master


Syzbot reported crashes in lru_gen_test_recent() and subsequent NULL
pointer dereferences in the page cache code:

  Oops: general protection fault in lru_gen_test_recent+0xfc/0x370
  KASAN: probably user-memory-access in range [0x0000000000004e00-0x0000000000004e07]

And later:

  BUG: kernel NULL pointer dereference, address: 0000000000000000
  #PF: supervisor instruction fetch in kernel mode
  RIP: 0010:0x0
  Call Trace:
   filemap_read_folio+0xc8/0x2a0

The root cause is that unpack_shadow() can extract an invalid node ID
from a corrupted shadow entry, causing NODE_DATA(nid) to return NULL for
pgdat. When this NULL pgdat is passed to mem_cgroup_lruvec(), it leads
to crashes when dereferencing memcg->nodeinfo.

Even if we detect and return early from lru_gen_test_recent(), the
corrupted state propagates through the call chain, eventually causing
crashes in the page cache code when trying to use the corrupted folio.

Fix this by:
1. Checking if pgdat is NULL in lru_gen_test_recent() and setting
   *lruvec to NULL to signal the corruption to the caller.
2. Adding a NULL check for lruvec in lru_gen_refault() to catch
   corrupted shadow entries and skip processing before the corruption
   can propagate further into the page cache code.

Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
 mm/workingset.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/mm/workingset.c b/mm/workingset.c
index e9f05634747a..da19ff153dc7 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -270,7 +270,14 @@ static bool lru_gen_test_recent(void *shadow, struct lruvec **lruvec,
 	struct pglist_data *pgdat;
 
 	unpack_shadow(shadow, &memcg_id, &pgdat, token, workingset);
-
+	/*
+	 * If pgdat is NULL, the shadow entry contains an invalid node ID.
+	 * Set lruvec to NULL so caller can detect and skip processing.
+	 */
+	if (unlikely(!pgdat)) {
+		*lruvec = NULL;
+		return false;
+	}
 	memcg = mem_cgroup_from_id(memcg_id);
 	*lruvec = mem_cgroup_lruvec(memcg, pgdat);
 
@@ -294,7 +301,7 @@ static void lru_gen_refault(struct folio *folio, void *shadow)
 	rcu_read_lock();
 
 	recent = lru_gen_test_recent(shadow, &lruvec, &token, &workingset);
-	if (lruvec != folio_lruvec(folio))
+	if (!lruvec || lruvec != folio_lruvec(folio))
 		goto unlock;
 
 	mod_lruvec_state(lruvec, WORKINGSET_REFAULT_BASE + type, delta);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [syzbot] [mm?] general protection fault in lru_gen_test_recent (2)
       [not found] <20251208044921.2928668-1-kartikey406@gmail.com>
@ 2025-12-08  5:07 ` syzbot
  0 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-08  5:07 UTC (permalink / raw)
  To: kartikey406, linux-kernel, syzbot, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
BUG: unable to handle kernel NULL pointer dereference in filemap_read_folio

BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 33c06067 P4D 33c06067 PUD 0 
Oops: Oops: 0010 [#1] SMP KASAN NOPTI
CPU: 1 UID: 0 PID: 6494 Comm: syz.0.46 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc9000355f988 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff81f7e52e
RDX: ffff888025660000 RSI: ffffea00013c7300 RDI: ffff88802b34c380
RBP: ffffea00013c7300 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 1ffff920006abf32
R13: ffff88802b34c380 R14: 0000000000000000 R15: dffffc0000000000
FS:  00007f612a2386c0(0000) GS:ffff8880d6a09000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000005475f000 CR4: 0000000000352ef0
Call Trace:
 <TASK>
 filemap_read_folio+0xc8/0x2a0 mm/filemap.c:2496
 do_read_cache_folio+0x266/0x5c0 mm/filemap.c:4096
 freader_get_folio+0x33a/0x930 lib/buildid.c:58
 freader_fetch+0xbd/0x740 lib/buildid.c:101
 __build_id_parse.isra.0+0xdd/0x6c0 lib/buildid.c:289
 do_procmap_query+0xb0e/0x1080 fs/proc/task_mmu.c:733
 procfs_procmap_ioctl+0x9d/0xe0 fs/proc/task_mmu.c:813
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:597 [inline]
 __se_sys_ioctl fs/ioctl.c:583 [inline]
 __x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f612938f7c9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f612a238038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f61295e5fa0 RCX: 00007f612938f7c9
RDX: 0000200000000180 RSI: 00000000c0686611 RDI: 0000000000000003
RBP: 00007f6129413f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f61295e6038 R14: 00007f61295e5fa0 R15: 00007ffc87556108
 </TASK>
Modules linked in:
CR2: 0000000000000000
---[ end trace 0000000000000000 ]---
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc9000355f988 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff81f7e52e
RDX: ffff888025660000 RSI: ffffea00013c7300 RDI: ffff88802b34c380
RBP: ffffea00013c7300 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 1ffff920006abf32
R13: ffff88802b34c380 R14: 0000000000000000 R15: dffffc0000000000
FS:  00007f612a2386c0(0000) GS:ffff8880d6a09000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000005475f000 CR4: 0000000000352ef0


Tested on:

commit:         c2f2b01b Merge tag 'i3c/for-6.19' of git://git.kernel...
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=169ce992580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=dbcb767d1e1208ac
dashboard link: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
compiler:       gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=14b58ec2580000


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Forwarded: [PATCH] mm/workingset: fix crash from corrupted shadow entries in lru_gen
  2025-12-07  8:55 [syzbot] [mm?] general protection fault in lru_gen_test_recent (2) syzbot
                   ` (9 preceding siblings ...)
  2025-12-08  4:49 ` Forwarded: [PATCH] mm/workingset: fix crash from corrupted shadow entries in lru_gen syzbot
@ 2025-12-08  5:14 ` syzbot
  2025-12-09  5:35 ` Forwarded: [PATCH] mm/workingset: add debug instrumentation for MGLRU shadow corruption syzbot
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-08  5:14 UTC (permalink / raw)
  To: linux-kernel, syzkaller-bugs

For archival purposes, forwarding an incoming command email to
linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com.

***

Subject: [PATCH] mm/workingset: fix crash from corrupted shadow entries in lru_gen
Author: kartikey406@gmail.com

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

Syzbot reported crashes in lru_gen_test_recent() and subsequent NULL
pointer dereferences in the page cache code:

  Oops: general protection fault in lru_gen_test_recent+0xfc/0x370
  KASAN: probably user-memory-access in range [0x0000000000004e00-0x0000000000004e07]

And later:

  BUG: kernel NULL pointer dereference, address: 0000000000000000
  #PF: supervisor instruction fetch in kernel mode
  RIP: 0010:0x0
  Call Trace:
   filemap_read_folio+0xc8/0x2a0

The root cause is that unpack_shadow() can extract an invalid node ID
from a corrupted shadow entry, causing NODE_DATA(nid) to return NULL for
pgdat. When this NULL pgdat is passed to mem_cgroup_lruvec(), it leads
to crashes when dereferencing memcg->nodeinfo.

Even if we detect and return early from lru_gen_test_recent(), the
corrupted state propagates through the call chain, eventually causing
crashes in the page cache code when trying to use the corrupted folio.

Fix this by:
1. Checking if pgdat is NULL in lru_gen_test_recent() and setting
   *lruvec to NULL to signal the corruption to the caller.
2. Adding a NULL check for lruvec in lru_gen_refault() to catch
   corrupted shadow entries and skip processing before the corruption
   can propagate further into the page cache code.

Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
 mm/workingset.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/mm/workingset.c b/mm/workingset.c
index e9f05634747a..364434168b4c 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -270,7 +270,15 @@ static bool lru_gen_test_recent(void *shadow, struct lruvec **lruvec,
 	struct pglist_data *pgdat;
 
 	unpack_shadow(shadow, &memcg_id, &pgdat, token, workingset);
-
+	/*
+	 * If pgdat is NULL, the shadow entry contains an invalid node ID.
+	 * Set lruvec to NULL so caller can detect and skip processing.
+	 */
+	if (unlikely(!pgdat)) {
+		*lruvec = NULL;
+		pr_warn("lru_gen_test_recent: Detected corrupted shadow (NULL pgdat), setting lruvec=NULL\n");
+		return false;
+	}
 	memcg = mem_cgroup_from_id(memcg_id);
 	*lruvec = mem_cgroup_lruvec(memcg, pgdat);
 
@@ -294,9 +302,11 @@ static void lru_gen_refault(struct folio *folio, void *shadow)
 	rcu_read_lock();
 
 	recent = lru_gen_test_recent(shadow, &lruvec, &token, &workingset);
-	if (lruvec != folio_lruvec(folio))
+	if (!lruvec || lruvec != folio_lruvec(folio)) {
+		if (!lruvec)
+			pr_warn("lru_gen_refault: Skipping corrupted entry (lruvec=NULL)\n");
 		goto unlock;
-
+	}
 	mod_lruvec_state(lruvec, WORKINGSET_REFAULT_BASE + type, delta);
 
 	if (!recent)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [syzbot] [mm?] general protection fault in lru_gen_test_recent (2)
       [not found] <20251208051440.2931546-1-kartikey406@gmail.com>
@ 2025-12-08  5:31 ` syzbot
  0 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-08  5:31 UTC (permalink / raw)
  To: kartikey406, linux-kernel, syzbot, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
BUG: unable to handle kernel NULL pointer dereference in filemap_read_folio

lru_gen_test_recent: Detected corrupted shadow (NULL pgdat), setting lruvec=NULL
lru_gen_refault: Skipping corrupted entry (lruvec=NULL)
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 58fec067 P4D 58fec067 PUD 0 
Oops: Oops: 0010 [#1] SMP KASAN NOPTI
CPU: 1 UID: 0 PID: 6433 Comm: syz.0.26 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc900038af988 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff81f7e52e
RDX: ffff88802978c980 RSI: ffffea00010f92c0 RDI: ffff8880374eddc0
RBP: ffffea00010f92c0 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 1ffff92000715f32
R13: ffff8880374eddc0 R14: 0000000000000000 R15: dffffc0000000000
FS:  00007f1fc2e686c0(0000) GS:ffff8880d6a09000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 0000000026060000 CR4: 0000000000352ef0
Call Trace:
 <TASK>
 filemap_read_folio+0xc8/0x2a0 mm/filemap.c:2496
 do_read_cache_folio+0x266/0x5c0 mm/filemap.c:4096
 freader_get_folio+0x33a/0x930 lib/buildid.c:58
 freader_fetch+0xbd/0x740 lib/buildid.c:101
 __build_id_parse.isra.0+0xdd/0x6c0 lib/buildid.c:289
 do_procmap_query+0xb0e/0x1080 fs/proc/task_mmu.c:733
 procfs_procmap_ioctl+0x9d/0xe0 fs/proc/task_mmu.c:813
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:597 [inline]
 __se_sys_ioctl fs/ioctl.c:583 [inline]
 __x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f1fc1f8f7c9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f1fc2e68038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f1fc21e5fa0 RCX: 00007f1fc1f8f7c9
RDX: 0000200000000180 RSI: 00000000c0686611 RDI: 0000000000000003
RBP: 00007f1fc2013f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f1fc21e6038 R14: 00007f1fc21e5fa0 R15: 00007fff8a945218
 </TASK>
Modules linked in:
CR2: 0000000000000000
---[ end trace 0000000000000000 ]---
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc900038af988 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff81f7e52e
RDX: ffff88802978c980 RSI: ffffea00010f92c0 RDI: ffff8880374eddc0
RBP: ffffea00010f92c0 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 1ffff92000715f32
R13: ffff8880374eddc0 R14: 0000000000000000 R15: dffffc0000000000
FS:  00007f1fc2e686c0(0000) GS:ffff8880d6a09000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 0000000026060000 CR4: 0000000000352ef0


Tested on:

commit:         c2f2b01b Merge tag 'i3c/for-6.19' of git://git.kernel...
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=179aaeb4580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=dbcb767d1e1208ac
dashboard link: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
compiler:       gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=1188321a580000


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Forwarded: [PATCH] mm/workingset: add debug instrumentation for MGLRU shadow corruption
  2025-12-07  8:55 [syzbot] [mm?] general protection fault in lru_gen_test_recent (2) syzbot
                   ` (10 preceding siblings ...)
  2025-12-08  5:14 ` syzbot
@ 2025-12-09  5:35 ` syzbot
  2025-12-09  5:44 ` Forwarded: [PATCH] mm/workingset: debug MGLRU shadow corruption leading to NULL deref syzbot
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-09  5:35 UTC (permalink / raw)
  To: linux-kernel, syzkaller-bugs

For archival purposes, forwarding an incoming command email to
linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com.

***

Subject: [PATCH] mm/workingset: add debug instrumentation for MGLRU shadow corruption
Author: kartikey406@gmail.com

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

Add comprehensive debug logging to track down a NULL pointer dereference
in lru_gen_test_recent() when unpacking shadow entries with value 0x41.

The crash occurs when:
1. A shadow entry with value 0x41 is created during page eviction
2. The page later refaults and tries to unpack this shadow
3. unpack_shadow() extracts an invalid node ID from 0x41
4. NODE_DATA() returns NULL for the invalid node
5. Dereferencing the NULL pgdat triggers the crash

This debug patch instruments the complete shadow entry lifecycle:

1. pack_shadow() - Log shadow creation and detect 0x41 creation
2. lru_gen_eviction() - Log MGLRU eviction path with min_seq/token
3. unpack_shadow() - Log shadow unpacking and detect 0x41 unpacking
4. lru_gen_test_recent() - Log entry and detect NULL pgdat
5. workingset_refault() - Log refault entry point
6. lru_gen_refault() - Log MGLRU refault handler

Each function dumps a stack trace when the 0x41 shadow is detected, to
capture the full call chain.

The goal is to identify why pack_shadow() creates 0x41, which likely
indicates that the MGLRU generation counters (min_seq) are zero when
they shouldn't be.
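
As a rough cross-check of that hypothesis, the packing arithmetic can be
modelled in userspace. This is only a sketch: model_pack_shadow() and
mk_value() are hypothetical names, the field widths (NODES_SHIFT = 6,
MEM_CGROUP_ID_SHIFT = 16) are assumptions because the real values depend
on the kernel config, and the EVICTION_MASK step is left out.

```c
#include <stdint.h>

/* Assumed field widths; the real values are config-dependent. */
#define WORKINGSET_SHIFT	1
#define NODES_SHIFT		6	/* assumption */
#define MEM_CGROUP_ID_SHIFT	16	/* assumption */

/* Same arithmetic as xa_mk_value(): shift left by one, set the tag bit. */
static uint64_t mk_value(uint64_t v)
{
	return (v << 1) | 1;
}

/* Userspace model of pack_shadow(); EVICTION_MASK is omitted. */
static uint64_t model_pack_shadow(unsigned int memcgid, unsigned int nid,
				  uint64_t eviction, unsigned int workingset)
{
	eviction = (eviction << MEM_CGROUP_ID_SHIFT) | memcgid;
	eviction = (eviction << NODES_SHIFT) | nid;
	eviction = (eviction << WORKINGSET_SHIFT) | workingset;
	return mk_value(eviction);
}
```

Under these assumptions, a zero token on node 0 with memcgid 0 packs to
0x1, whereas 0x41 is what model_pack_shadow(0, 16, 0, 0) returns, i.e.
node_id 16 with every other field zero. The pr_err() output of
pack_shadow() is what would confirm or rule this out on a real kernel.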

Link: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
 mm/workingset.c | 64 +++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 57 insertions(+), 7 deletions(-)

diff --git a/mm/workingset.c b/mm/workingset.c
index 0ec205a1ae92..d64490cd987d 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -199,28 +199,49 @@ static unsigned int bucket_order __read_mostly;
 static void *pack_shadow(int memcgid, pg_data_t *pgdat, unsigned long eviction,
 			 bool workingset)
 {
+	pr_err("PACK_SHADOW: CREATING SHADOW\n");
+	pr_err("  memcgid=%d node_id=%d eviction=0x%lx workingset=%d\n",
+	       memcgid, pgdat->node_id, eviction, workingset);
 	eviction &= EVICTION_MASK;
 	eviction = (eviction << MEM_CGROUP_ID_SHIFT) | memcgid;
 	eviction = (eviction << NODES_SHIFT) | pgdat->node_id;
 	eviction = (eviction << WORKINGSET_SHIFT) | workingset;
-
-	return xa_mk_value(eviction);
+	void *shadow = xa_mk_value(eviction);
+	pr_err("  Final packed shadow=0x%lx (raw eviction=0x%lx)\n",
+	       (unsigned long)shadow, eviction);
+	if ((unsigned long)shadow == 0x41) {
+		pr_err("*** BUG: CREATED SHADOW 0x41! ***\n");
+		dump_stack();
+	}
+	return shadow;
 }
 
 static void unpack_shadow(void *shadow, int *memcgidp, pg_data_t **pgdat,
 			  unsigned long *evictionp, bool *workingsetp)
 {
+	pr_err("UNPACK_SHADOW: READING SHADOW\n");
+	pr_err("  shadow=0x%lx\n", (unsigned long)shadow);
 	unsigned long entry = xa_to_value(shadow);
 	int memcgid, nid;
 	bool workingset;
-
+	// CRITICAL: Detect if we're reading the bad 0x41 shadow!
+	if ((unsigned long)shadow == 0x41) {
+		pr_err("*** BUG: UNPACKING CORRUPTED SHADOW 0x41! ***\n");
+		dump_stack();
+	}
 	workingset = entry & ((1UL << WORKINGSET_SHIFT) - 1);
 	entry >>= WORKINGSET_SHIFT;
 	nid = entry & ((1UL << NODES_SHIFT) - 1);
 	entry >>= NODES_SHIFT;
 	memcgid = entry & ((1UL << MEM_CGROUP_ID_SHIFT) - 1);
 	entry >>= MEM_CGROUP_ID_SHIFT;
-
+	pr_err("  Unpacked: memcgid=%d nid=%d eviction=0x%lx workingset=%d\n",
+	       memcgid, nid, entry, workingset);
+	pr_err("  NODE_DATA(%d)=%px\n", nid, NODE_DATA(nid));
+	if (nid >= MAX_NUMNODES || !NODE_DATA(nid)) {
+		pr_err("*** BUG: INVALID NODE ID %d! ***\n", nid);
+		dump_stack();
+	}
 	*memcgidp = memcgid;
 	*pgdat = NODE_DATA(nid);
 	*evictionp = entry;
@@ -231,6 +252,8 @@ static void unpack_shadow(void *shadow, int *memcgidp, pg_data_t **pgdat,
 
 static void *lru_gen_eviction(struct folio *folio)
 {
+	pr_err("LRU_GEN_EVICTION: ENTERED\n");
+	pr_err("  folio=%px node=%d\n", folio, folio_nid(folio));
 	int hist;
 	unsigned long token;
 	unsigned long min_seq;
@@ -250,11 +273,15 @@ static void *lru_gen_eviction(struct folio *folio)
 	lrugen = &lruvec->lrugen;
 	min_seq = READ_ONCE(lrugen->min_seq[type]);
 	token = (min_seq << LRU_REFS_WIDTH) | max(refs - 1, 0);
-
+	pr_err("LRU_GEN_EVICTION: min_seq=0x%lx refs=%d tier=%d\n",
+	       min_seq, refs, tier);
+	pr_err("  token=0x%lx (will be eviction parameter)\n", token);
 	hist = lru_hist_from_seq(min_seq);
 	atomic_long_add(delta, &lrugen->evicted[hist][type][tier]);
-
-	return pack_shadow(mem_cgroup_id(memcg), pgdat, token, workingset);
+	void *shadow = pack_shadow(mem_cgroup_id(memcg), pgdat, token, workingset);
+	pr_err("LRU_GEN_EVICTION: Returning shadow=0x%lx\n", (unsigned long)shadow);
+	return shadow;
+	//return pack_shadow(mem_cgroup_id(memcg), pgdat, token, workingset);
 }
 
 /*
@@ -289,6 +316,13 @@ static bool lru_gen_test_recent(void *shadow, struct lruvec **lruvec,
 
 static void lru_gen_refault(struct folio *folio, void *shadow)
 {
+	pr_err("LRU_GEN_REFAULT: ENTERED\n");
+	pr_err("  folio=%px shadow=0x%lx\n", folio, (unsigned long)shadow);
+
+	if ((unsigned long)shadow == 0x41) {
+		pr_err("*** BUG: LRU_GEN_REFAULT received corrupted shadow 0x41! ***\n");
+		//dump_stack();
+	}
 	bool recent;
 	int hist, tier, refs;
 	bool workingset;
@@ -299,8 +333,11 @@ static void lru_gen_refault(struct folio *folio, void *shadow)
 	int delta = folio_nr_pages(folio);
 
 	rcu_read_lock();
+	pr_err("LRU_GEN_REFAULT: Calling lru_gen_test_recent\n");
 
 	recent = lru_gen_test_recent(shadow, &lruvec, &token, &workingset);
+	pr_err("LRU_GEN_REFAULT: lru_gen_test_recent returned %d\n", recent);
+	pr_err("  lruvec=%px token=0x%lx workingset=%d\n", lruvec, token, workingset);
 	if (!lruvec || lruvec != folio_lruvec(folio))
 		goto unlock;
 	mod_lruvec_state(lruvec, WORKINGSET_REFAULT_BASE + type, delta);
@@ -539,6 +576,12 @@ bool workingset_test_recent(void *shadow, bool file, bool *workingset,
  */
 void workingset_refault(struct folio *folio, void *shadow)
 {
+	pr_err("WORKINGSET_REFAULT: ENTERED\n");
+	pr_err("  folio=%px shadow=0x%lx\n", folio, (unsigned long)shadow);
+	if ((unsigned long)shadow == 0x41) {
+		pr_err("*** BUG: WORKINGSET_REFAULT received corrupted shadow 0x41! ***\n");
+		dump_stack();
+	}
 	bool file = folio_is_file_lru(folio);
 	struct pglist_data *pgdat;
 	struct mem_cgroup *memcg;
@@ -549,9 +592,13 @@ void workingset_refault(struct folio *folio, void *shadow)
 	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 
 	if (lru_gen_enabled()) {
+		pr_err("WORKINGSET_REFAULT: LRU_GEN enabled, calling lru_gen_refault\n");
 		lru_gen_refault(folio, shadow);
+		pr_err("WORKINGSET_REFAULT: lru_gen_refault returned\n");
+
 		return;
 	}
+	pr_err("WORKINGSET_REFAULT: Using regular (non-LRU_GEN) path\n");
 
 	/*
 	 * The activation decision for this folio is made at the level
@@ -568,6 +615,7 @@ void workingset_refault(struct folio *folio, void *shadow)
 	lruvec = mem_cgroup_lruvec(memcg, pgdat);
 
 	mod_lruvec_state(lruvec, WORKINGSET_REFAULT_BASE + file, nr);
+	pr_err("WORKINGSET_REFAULT: Calling workingset_test_recent\n");
 
 	if (!workingset_test_recent(shadow, file, &workingset, true))
 		return;
@@ -578,6 +626,7 @@ void workingset_refault(struct folio *folio, void *shadow)
 
 	/* Folio was active prior to eviction */
 	if (workingset) {
+		pr_err("WORKINGSET_REFAULT: Folio was workingset, restoring\n");
 		folio_set_workingset(folio);
 		/*
 		 * XXX: Move to folio_add_lru() when it supports new vs
@@ -586,6 +635,7 @@ void workingset_refault(struct folio *folio, void *shadow)
 		lru_note_cost_refault(folio);
 		mod_lruvec_state(lruvec, WORKINGSET_RESTORE_BASE + file, nr);
 	}
+	pr_err("WORKINGSET_REFAULT: EXITING\n");
 }
 
 /**
-- 
2.43.0



* Re: [syzbot] [mm?] general protection fault in lru_gen_test_recent (2)
       [not found] <20251209053549.3243990-1-kartikey406@gmail.com>
@ 2025-12-09  5:37 ` syzbot
  0 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-09  5:37 UTC (permalink / raw)
  To: kartikey406, linux-kernel, syzkaller-bugs

Hello,

syzbot tried to test the proposed patch but the build/boot failed:

failed to apply patch:
checking file mm/workingset.c
Hunk #4 succeeded at 309 (offset -7 lines).
Hunk #5 FAILED at 333.
Hunk #6 succeeded at 567 (offset -6 lines).
Hunk #7 succeeded at 583 (offset -6 lines).
Hunk #8 succeeded at 606 (offset -6 lines).
Hunk #9 succeeded at 617 (offset -6 lines).
Hunk #10 succeeded at 626 (offset -6 lines).
1 out of 10 hunks FAILED



Tested on:

commit:         cb015814 Merge tag 'f2fs-for-6.19-rc1' of git://git.ke..
git tree:       upstream
kernel config:  https://syzkaller.appspot.com/x/.config?x=5aef7d5187304591
dashboard link: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
compiler:       
patch:          https://syzkaller.appspot.com/x/patch.diff?x=17f5ca1a580000



* Forwarded: [PATCH] mm/workingset: debug MGLRU shadow corruption leading to NULL deref
  2025-12-07  8:55 [syzbot] [mm?] general protection fault in lru_gen_test_recent (2) syzbot
                   ` (11 preceding siblings ...)
  2025-12-09  5:35 ` Forwarded: [PATCH] mm/workingset: add debug instrumentation for MGLRU shadow corruption syzbot
@ 2025-12-09  5:44 ` syzbot
  2025-12-09  6:28 ` Forwarded: [PATCH] mm/workingset: fix NULL deref from invalid node ID in shadow syzbot
  2025-12-23  9:38 ` Forwarded: [PATCH] for test syzbot
  14 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-09  5:44 UTC (permalink / raw)
  To: linux-kernel, syzkaller-bugs

For archival purposes, forwarding an incoming command email to
linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com.

***

Subject: [PATCH] mm/workingset: debug MGLRU shadow corruption leading to NULL deref
Author: kartikey406@gmail.com

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

Add debug logging to trace the shadow entry 0x41 that causes a NULL
pointer dereference in lru_gen_test_recent().

Instruments:
- pack_shadow(): Detect when 0x41 is created
- lru_gen_eviction(): Show min_seq and token values
- unpack_shadow(): Detect when 0x41 is unpacked
- lru_gen_test_recent(): Detect NULL pgdat
- workingset_refault/lru_gen_refault(): Trace refault path

This should show whether the MGLRU generation counters are uninitialized
(min_seq=0), which would cause corrupted shadow entries.

Link: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
 mm/workingset.c | 69 ++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 57 insertions(+), 12 deletions(-)

diff --git a/mm/workingset.c b/mm/workingset.c
index e9f05634747a..cebcf5e63f3b 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -199,28 +199,49 @@ static unsigned int bucket_order __read_mostly;
 static void *pack_shadow(int memcgid, pg_data_t *pgdat, unsigned long eviction,
 			 bool workingset)
 {
+	pr_err("PACK_SHADOW: CREATING SHADOW\n");
+	pr_err("  memcgid=%d node_id=%d eviction=0x%lx workingset=%d\n",
+	       memcgid, pgdat->node_id, eviction, workingset);
 	eviction &= EVICTION_MASK;
 	eviction = (eviction << MEM_CGROUP_ID_SHIFT) | memcgid;
 	eviction = (eviction << NODES_SHIFT) | pgdat->node_id;
 	eviction = (eviction << WORKINGSET_SHIFT) | workingset;
-
-	return xa_mk_value(eviction);
+	void *shadow = xa_mk_value(eviction);
+	pr_err("  Final packed shadow=0x%lx (raw eviction=0x%lx)\n",
+	       (unsigned long)shadow, eviction);
+	if ((unsigned long)shadow == 0x41) {
+		pr_err("*** BUG: CREATED SHADOW 0x41! ***\n");
+		dump_stack();
+	}
+	return shadow;
 }
 
 static void unpack_shadow(void *shadow, int *memcgidp, pg_data_t **pgdat,
 			  unsigned long *evictionp, bool *workingsetp)
 {
+	pr_err("UNPACK_SHADOW: READING SHADOW\n");
+	pr_err("  shadow=0x%lx\n", (unsigned long)shadow);
 	unsigned long entry = xa_to_value(shadow);
 	int memcgid, nid;
 	bool workingset;
-
+	// CRITICAL: Detect if we're reading the bad 0x41 shadow!
+	if ((unsigned long)shadow == 0x41) {
+		pr_err("*** BUG: UNPACKING CORRUPTED SHADOW 0x41! ***\n");
+		dump_stack();
+	}
 	workingset = entry & ((1UL << WORKINGSET_SHIFT) - 1);
 	entry >>= WORKINGSET_SHIFT;
 	nid = entry & ((1UL << NODES_SHIFT) - 1);
 	entry >>= NODES_SHIFT;
 	memcgid = entry & ((1UL << MEM_CGROUP_ID_SHIFT) - 1);
 	entry >>= MEM_CGROUP_ID_SHIFT;
-
+	pr_err("  Unpacked: memcgid=%d nid=%d eviction=0x%lx workingset=%d\n",
+	       memcgid, nid, entry, workingset);
+	pr_err("  NODE_DATA(%d)=%px\n", nid, NODE_DATA(nid));
+	if (nid >= MAX_NUMNODES || !NODE_DATA(nid)) {
+		pr_err("*** BUG: INVALID NODE ID %d! ***\n", nid);
+		dump_stack();
+	}
 	*memcgidp = memcgid;
 	*pgdat = NODE_DATA(nid);
 	*evictionp = entry;
@@ -231,6 +252,8 @@ static void unpack_shadow(void *shadow, int *memcgidp, pg_data_t **pgdat,
 
 static void *lru_gen_eviction(struct folio *folio)
 {
+	pr_err("LRU_GEN_EVICTION: ENTERED\n");
+	pr_err("  folio=%px node=%d\n", folio, folio_nid(folio));
 	int hist;
 	unsigned long token;
 	unsigned long min_seq;
@@ -250,11 +273,15 @@ static void *lru_gen_eviction(struct folio *folio)
 	lrugen = &lruvec->lrugen;
 	min_seq = READ_ONCE(lrugen->min_seq[type]);
 	token = (min_seq << LRU_REFS_WIDTH) | max(refs - 1, 0);
-
+	pr_err("LRU_GEN_EVICTION: min_seq=0x%lx refs=%d tier=%d\n",
+	       min_seq, refs, tier);
+	pr_err("  token=0x%lx (will be eviction parameter)\n", token);
 	hist = lru_hist_from_seq(min_seq);
 	atomic_long_add(delta, &lrugen->evicted[hist][type][tier]);
-
-	return pack_shadow(mem_cgroup_id(memcg), pgdat, token, workingset);
+	void *shadow = pack_shadow(mem_cgroup_id(memcg), pgdat, token, workingset);
+	pr_err("LRU_GEN_EVICTION: Returning shadow=0x%lx\n", (unsigned long)shadow);
+	return shadow;
+	//return pack_shadow(mem_cgroup_id(memcg), pgdat, token, workingset);
 }
 
 /*
@@ -270,7 +297,14 @@ static bool lru_gen_test_recent(void *shadow, struct lruvec **lruvec,
 	struct pglist_data *pgdat;
 
 	unpack_shadow(shadow, &memcg_id, &pgdat, token, workingset);
-
+	/*
+	 * If pgdat is NULL, the shadow entry contains an invalid node ID.
+	 * Set lruvec to NULL so caller can detect and skip processing.
+	 */
+	if (unlikely(!pgdat)) {
+		*lruvec = NULL;
+		return false;
+	}
 	memcg = mem_cgroup_from_id(memcg_id);
 	*lruvec = mem_cgroup_lruvec(memcg, pgdat);
 
@@ -280,7 +314,7 @@ static bool lru_gen_test_recent(void *shadow, struct lruvec **lruvec,
 	return abs_diff(max_seq, *token >> LRU_REFS_WIDTH) < MAX_NR_GENS;
 }
 
-static void lru_gen_refault(struct folio *folio, void *shadow)
+static void lru_gen_refault(struct folio *folio, void *shadow) 
 {
 	bool recent;
 	int hist, tier, refs;
@@ -292,11 +326,9 @@ static void lru_gen_refault(struct folio *folio, void *shadow)
 	int delta = folio_nr_pages(folio);
 
 	rcu_read_lock();
-
 	recent = lru_gen_test_recent(shadow, &lruvec, &token, &workingset);
-	if (lruvec != folio_lruvec(folio))
+	if (!lruvec || lruvec != folio_lruvec(folio))
 		goto unlock;
-
 	mod_lruvec_state(lruvec, WORKINGSET_REFAULT_BASE + type, delta);
 
 	if (!recent)
@@ -533,6 +565,12 @@ bool workingset_test_recent(void *shadow, bool file, bool *workingset,
  */
 void workingset_refault(struct folio *folio, void *shadow)
 {
+	pr_err("WORKINGSET_REFAULT: ENTERED\n");
+	pr_err("  folio=%px shadow=0x%lx\n", folio, (unsigned long)shadow);
+	if ((unsigned long)shadow == 0x41) {
+		pr_err("*** BUG: WORKINGSET_REFAULT received corrupted shadow 0x41! ***\n");
+		dump_stack();
+	}
 	bool file = folio_is_file_lru(folio);
 	struct pglist_data *pgdat;
 	struct mem_cgroup *memcg;
@@ -543,9 +581,13 @@ void workingset_refault(struct folio *folio, void *shadow)
 	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 
 	if (lru_gen_enabled()) {
+		pr_err("WORKINGSET_REFAULT: LRU_GEN enabled, calling lru_gen_refault\n");
 		lru_gen_refault(folio, shadow);
+		pr_err("WORKINGSET_REFAULT: lru_gen_refault returned\n");
+
 		return;
 	}
+	pr_err("WORKINGSET_REFAULT: Using regular (non-LRU_GEN) path\n");
 
 	/*
 	 * The activation decision for this folio is made at the level
@@ -562,6 +604,7 @@ void workingset_refault(struct folio *folio, void *shadow)
 	lruvec = mem_cgroup_lruvec(memcg, pgdat);
 
 	mod_lruvec_state(lruvec, WORKINGSET_REFAULT_BASE + file, nr);
+	pr_err("WORKINGSET_REFAULT: Calling workingset_test_recent\n");
 
 	if (!workingset_test_recent(shadow, file, &workingset, true))
 		return;
@@ -572,6 +615,7 @@ void workingset_refault(struct folio *folio, void *shadow)
 
 	/* Folio was active prior to eviction */
 	if (workingset) {
+		pr_err("WORKINGSET_REFAULT: Folio was workingset, restoring\n");
 		folio_set_workingset(folio);
 		/*
 		 * XXX: Move to folio_add_lru() when it supports new vs
@@ -580,6 +624,7 @@ void workingset_refault(struct folio *folio, void *shadow)
 		lru_note_cost_refault(folio);
 		mod_lruvec_state(lruvec, WORKINGSET_RESTORE_BASE + file, nr);
 	}
+	pr_err("WORKINGSET_REFAULT: EXITING\n");
 }
 
 /**
-- 
2.43.0



* Re: [syzbot] [mm?] general protection fault in lru_gen_test_recent (2)
       [not found] <20251209054447.3244819-1-kartikey406@gmail.com>
@ 2025-12-09  5:59 ` syzbot
  0 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-09  5:59 UTC (permalink / raw)
  To: kartikey406, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
BUG: unable to handle kernel NULL pointer dereference in filemap_read_folio

  shadow=0x11
  Unpacked: memcgid=0 nid=4 eviction=0x0 workingset=0
  NODE_DATA(4)=0000000000000000
*** BUG: INVALID NODE ID 4! ***
CPU: 1 UID: 0 PID: 6408 Comm: syz.0.24 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x16c/0x1f0 lib/dump_stack.c:120
 unpack_shadow+0x1f3/0x270 mm/workingset.c:243
 lru_gen_test_recent+0x9a/0x350 mm/workingset.c:299
 lru_gen_refault mm/workingset.c:329 [inline]
 workingset_refault+0x290/0xd50 mm/workingset.c:585
 filemap_add_folio+0x23d/0x610 mm/filemap.c:981
 do_read_cache_folio+0x23c/0x5c0 mm/filemap.c:4063
 freader_get_folio+0x33a/0x930 lib/buildid.c:58
 freader_fetch+0xbd/0x740 lib/buildid.c:101
 __build_id_parse.isra.0+0xdd/0x6c0 lib/buildid.c:289
 do_procmap_query+0xb0e/0x1080 fs/proc/task_mmu.c:733
 procfs_procmap_ioctl+0x9d/0xe0 fs/proc/task_mmu.c:813
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:597 [inline]
 __se_sys_ioctl fs/ioctl.c:583 [inline]
 __x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fcba4f8f7c9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fcba5ee8038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fcba51e5fa0 RCX: 00007fcba4f8f7c9
RDX: 0000200000000180 RSI: 00000000c0686611 RDI: 0000000000000003
RBP: 00007fcba5013f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fcba51e6038 R14: 00007fcba51e5fa0 R15: 00007ffc01f89a28
 </TASK>
WORKINGSET_REFAULT: lru_gen_refault returned
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 32990067 P4D 32990067 PUD 0 
Oops: Oops: 0010 [#1] SMP KASAN NOPTI
CPU: 0 UID: 0 PID: 6408 Comm: syz.0.24 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc90003d1f988 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff81f7e52e
RDX: ffff88802e4e4980 RSI: ffffea000167d2c0 RDI: ffff888039482000
RBP: ffffea000167d2c0 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 1ffff920007a3f32
R13: ffff888039482000 R14: 0000000000000000 R15: dffffc0000000000
FS:  00007fcba5ee86c0(0000) GS:ffff8880d6907000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000002bb78000 CR4: 0000000000352ef0
Call Trace:
 <TASK>
 filemap_read_folio+0xc8/0x2a0 mm/filemap.c:2496
 do_read_cache_folio+0x266/0x5c0 mm/filemap.c:4096
 freader_get_folio+0x33a/0x930 lib/buildid.c:58
 freader_fetch+0xbd/0x740 lib/buildid.c:101
 __build_id_parse.isra.0+0xdd/0x6c0 lib/buildid.c:289
 do_procmap_query+0xb0e/0x1080 fs/proc/task_mmu.c:733
 procfs_procmap_ioctl+0x9d/0xe0 fs/proc/task_mmu.c:813
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:597 [inline]
 __se_sys_ioctl fs/ioctl.c:583 [inline]
 __x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fcba4f8f7c9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fcba5ee8038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fcba51e5fa0 RCX: 00007fcba4f8f7c9
RDX: 0000200000000180 RSI: 00000000c0686611 RDI: 0000000000000003
RBP: 00007fcba5013f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fcba51e6038 R14: 00007fcba51e5fa0 R15: 00007ffc01f89a28
 </TASK>
Modules linked in:
CR2: 0000000000000000
---[ end trace 0000000000000000 ]---
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc90003d1f988 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff81f7e52e
RDX: ffff88802e4e4980 RSI: ffffea000167d2c0 RDI: ffff888039482000
RBP: ffffea000167d2c0 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 1ffff920007a3f32
R13: ffff888039482000 R14: 0000000000000000 R15: dffffc0000000000
FS:  00007fcba5ee86c0(0000) GS:ffff8880d6907000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000002bb78000 CR4: 0000000000352ef0


Tested on:

commit:         cb015814 Merge tag 'f2fs-for-6.19-rc1' of git://git.ke..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=17d01eb4580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=dbcb767d1e1208ac
dashboard link: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
compiler:       gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=112dca1a580000



* Forwarded: [PATCH] mm/workingset: fix NULL deref from invalid node ID in shadow
  2025-12-07  8:55 [syzbot] [mm?] general protection fault in lru_gen_test_recent (2) syzbot
                   ` (12 preceding siblings ...)
  2025-12-09  5:44 ` Forwarded: [PATCH] mm/workingset: debug MGLRU shadow corruption leading to NULL deref syzbot
@ 2025-12-09  6:28 ` syzbot
  2025-12-23  9:38 ` Forwarded: [PATCH] for test syzbot
  14 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-09  6:28 UTC (permalink / raw)
  To: linux-kernel, syzkaller-bugs

For archival purposes, forwarding an incoming command email to
linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com.

***

Subject: [PATCH] mm/workingset: fix NULL deref from invalid node ID in shadow
Author: kartikey406@gmail.com

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master


Fix a NULL pointer dereference in lru_gen_test_recent() caused by
shadow entries containing invalid NUMA node IDs.

The crash occurs when:
1. A page is evicted and its folio incorrectly reports node_id >= MAX_NUMNODES
2. pack_shadow() stores this invalid node ID in the shadow entry
3. On page refault, unpack_shadow() extracts the invalid node ID
4. NODE_DATA(invalid_nid) returns NULL
5. The subsequent dereference of the NULL pgdat causes the crash

Example from crash log:
  shadow=0x11 unpacks to: nid=4, but system only has nodes 0-3
  NODE_DATA(4) returns NULL → crash
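
The decoding in this example can be reproduced with a small userspace
model of unpack_shadow(). This is only a sketch: the field widths
(NODES_SHIFT = 6, MEM_CGROUP_ID_SHIFT = 16) are assumptions, since the
real values depend on the kernel config, and model_unpack_shadow() and
struct shadow_fields are hypothetical names.

```c
#include <stdint.h>

/* Assumed field widths; the real values are config-dependent. */
#define WORKINGSET_SHIFT	1
#define NODES_SHIFT		6	/* assumption */
#define MEM_CGROUP_ID_SHIFT	16	/* assumption */

/* Hypothetical container for the fields unpack_shadow() extracts. */
struct shadow_fields {
	unsigned int workingset;
	unsigned int nid;
	unsigned int memcgid;
	uint64_t eviction;
};

/* Userspace model of unpack_shadow(). */
static struct shadow_fields model_unpack_shadow(uint64_t shadow)
{
	struct shadow_fields f;
	uint64_t entry = shadow >> 1;	/* same arithmetic as xa_to_value() */

	f.workingset = entry & (((uint64_t)1 << WORKINGSET_SHIFT) - 1);
	entry >>= WORKINGSET_SHIFT;
	f.nid = entry & (((uint64_t)1 << NODES_SHIFT) - 1);
	entry >>= NODES_SHIFT;
	f.memcgid = entry & (((uint64_t)1 << MEM_CGROUP_ID_SHIFT) - 1);
	entry >>= MEM_CGROUP_ID_SHIFT;
	f.eviction = entry;
	return f;
}
```

With these widths, model_unpack_shadow(0x11) yields nid=4 with memcgid,
eviction and workingset all zero, which matches the "Unpacked:" line in
the crash log.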

Root cause: Pages can be tracked on non-existent NUMA nodes due to:
- Incorrect node assignment during page allocation
- Corrupted page->flags NODES bits
- NUMA policy bugs

Fix: Tolerate a NULL pgdat on the MGLRU refault path:
1. In lru_gen_test_recent(): if unpack_shadow() yields a NULL pgdat,
   set *lruvec to NULL and return false
2. In lru_gen_refault(): skip processing when lruvec is NULL
3. In pack_shadow() and unpack_shadow(): log invalid node IDs
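
The NULL-pgdat guard that this patch adds to lru_gen_test_recent() can
be modelled in userspace roughly as follows. This is a sketch under
stated assumptions: every model_* name is hypothetical, the node table
hard-codes nodes 0-3 as in the example above, and the boolean result
only says whether a pgdat was found, not whether the shadow is recent.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for pg_data_t. */
struct model_pgdat {
	int node_id;
};

/* Assumption: only nodes 0-3 exist, as in the crash-log example. */
#define MODEL_NR_NODES	4
static struct model_pgdat model_nodes[MODEL_NR_NODES] = {
	{ 0 }, { 1 }, { 2 }, { 3 },
};

/* Models NODE_DATA(): NULL for a node ID that does not exist. */
static struct model_pgdat *model_node_data(int nid)
{
	if (nid < 0 || nid >= MODEL_NR_NODES)
		return NULL;
	return &model_nodes[nid];
}

/*
 * Models the guard: on a NULL pgdat, report NULL through *out and
 * return false so the caller can skip further processing.
 */
static bool model_test_recent(int nid, struct model_pgdat **out)
{
	struct model_pgdat *pgdat = model_node_data(nid);

	if (!pgdat) {
		*out = NULL;
		return false;
	}
	*out = pgdat;
	return true;
}
```

A caller would check the output pointer before touching it, the same
way lru_gen_refault() tests !lruvec before using the lruvec.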

Link: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com
Debugged-by: Deepanshu Kartikey <kartikey406@gmail.com>
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
 mm/workingset.c | 74 +++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 62 insertions(+), 12 deletions(-)

diff --git a/mm/workingset.c b/mm/workingset.c
index e9f05634747a..23a2d00fb582 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -199,28 +199,55 @@ static unsigned int bucket_order __read_mostly;
 static void *pack_shadow(int memcgid, pg_data_t *pgdat, unsigned long eviction,
 			 bool workingset)
 {
+	pr_err("PACK_SHADOW: CREATING SHADOW\n");
+	pr_err("  memcgid=%d node_id=%d eviction=0x%lx workingset=%d\n",
+	       memcgid, pgdat->node_id, eviction, workingset);
+	if (pgdat->node_id >= MAX_NUMNODES || !NODE_DATA(pgdat->node_id)) {
+		pr_err("*** BUG: pack_shadow called with INVALID node_id=%d! ***\n",
+		       pgdat->node_id);
+		pr_err("  pgdat=%px pgdat->node_id=%d MAX_NUMNODES=%d\n",
+		       pgdat, pgdat->node_id, MAX_NUMNODES);
+		dump_stack();
+
+		// This will show WHERE the bad pgdat came from
+	}
 	eviction &= EVICTION_MASK;
 	eviction = (eviction << MEM_CGROUP_ID_SHIFT) | memcgid;
 	eviction = (eviction << NODES_SHIFT) | pgdat->node_id;
 	eviction = (eviction << WORKINGSET_SHIFT) | workingset;
-
-	return xa_mk_value(eviction);
+	void *shadow = xa_mk_value(eviction);
+	pr_err("  Final packed shadow=0x%lx (raw eviction=0x%lx)\n",
+	       (unsigned long)shadow, eviction);
+	if ((unsigned long)shadow == 0x41) {
+		pr_err("*** BUG: CREATED SHADOW 0x41! ***\n");
+	}
+	return shadow;
 }
 
 static void unpack_shadow(void *shadow, int *memcgidp, pg_data_t **pgdat,
 			  unsigned long *evictionp, bool *workingsetp)
 {
+	pr_err("UNPACK_SHADOW: READING SHADOW\n");
+	pr_err("  shadow=0x%lx\n", (unsigned long)shadow);
 	unsigned long entry = xa_to_value(shadow);
 	int memcgid, nid;
 	bool workingset;
-
+	// CRITICAL: Detect if we're reading the bad 0x41 shadow!
+	if ((unsigned long)shadow == 0x41) {
+		pr_err("*** BUG: UNPACKING CORRUPTED SHADOW 0x41! ***\n");
+	}
 	workingset = entry & ((1UL << WORKINGSET_SHIFT) - 1);
 	entry >>= WORKINGSET_SHIFT;
 	nid = entry & ((1UL << NODES_SHIFT) - 1);
 	entry >>= NODES_SHIFT;
 	memcgid = entry & ((1UL << MEM_CGROUP_ID_SHIFT) - 1);
 	entry >>= MEM_CGROUP_ID_SHIFT;
-
+	pr_err("  Unpacked: memcgid=%d nid=%d eviction=0x%lx workingset=%d\n",
+	       memcgid, nid, entry, workingset);
+	pr_err("  NODE_DATA(%d)=%px\n", nid, NODE_DATA(nid));
+	if (nid >= MAX_NUMNODES || !NODE_DATA(nid)) {
+		pr_err("*** BUG: INVALID NODE ID %d! ***\n", nid);
+	}
 	*memcgidp = memcgid;
 	*pgdat = NODE_DATA(nid);
 	*evictionp = entry;
@@ -231,6 +258,8 @@ static void unpack_shadow(void *shadow, int *memcgidp, pg_data_t **pgdat,
 
 static void *lru_gen_eviction(struct folio *folio)
 {
+	pr_err("LRU_GEN_EVICTION: ENTERED\n");
+	pr_err("  folio=%px node=%d\n", folio, folio_nid(folio));
 	int hist;
 	unsigned long token;
 	unsigned long min_seq;
@@ -250,11 +279,15 @@ static void *lru_gen_eviction(struct folio *folio)
 	lrugen = &lruvec->lrugen;
 	min_seq = READ_ONCE(lrugen->min_seq[type]);
 	token = (min_seq << LRU_REFS_WIDTH) | max(refs - 1, 0);
-
+	pr_err("LRU_GEN_EVICTION: min_seq=0x%lx refs=%d tier=%d\n",
+	       min_seq, refs, tier);
+	pr_err("  token=0x%lx (will be eviction parameter)\n", token);
 	hist = lru_hist_from_seq(min_seq);
 	atomic_long_add(delta, &lrugen->evicted[hist][type][tier]);
-
-	return pack_shadow(mem_cgroup_id(memcg), pgdat, token, workingset);
+	void *shadow = pack_shadow(mem_cgroup_id(memcg), pgdat, token, workingset);
+	pr_err("LRU_GEN_EVICTION: Returning shadow=0x%lx\n", (unsigned long)shadow);
+	return shadow;
+	//return pack_shadow(mem_cgroup_id(memcg), pgdat, token, workingset);
 }
 
 /*
@@ -270,7 +303,14 @@ static bool lru_gen_test_recent(void *shadow, struct lruvec **lruvec,
 	struct pglist_data *pgdat;
 
 	unpack_shadow(shadow, &memcg_id, &pgdat, token, workingset);
-
+	/*
+	 * If pgdat is NULL, the shadow entry contains an invalid node ID.
+	 * Set lruvec to NULL so caller can detect and skip processing.
+	 */
+	if (unlikely(!pgdat)) {
+		*lruvec = NULL;
+		return false;
+	}
 	memcg = mem_cgroup_from_id(memcg_id);
 	*lruvec = mem_cgroup_lruvec(memcg, pgdat);
 
@@ -280,7 +320,7 @@ static bool lru_gen_test_recent(void *shadow, struct lruvec **lruvec,
 	return abs_diff(max_seq, *token >> LRU_REFS_WIDTH) < MAX_NR_GENS;
 }
 
-static void lru_gen_refault(struct folio *folio, void *shadow)
+static void lru_gen_refault(struct folio *folio, void *shadow) 
 {
 	bool recent;
 	int hist, tier, refs;
@@ -292,11 +332,9 @@ static void lru_gen_refault(struct folio *folio, void *shadow)
 	int delta = folio_nr_pages(folio);
 
 	rcu_read_lock();
-
 	recent = lru_gen_test_recent(shadow, &lruvec, &token, &workingset);
-	if (lruvec != folio_lruvec(folio))
+	if (!lruvec || lruvec != folio_lruvec(folio))
 		goto unlock;
-
 	mod_lruvec_state(lruvec, WORKINGSET_REFAULT_BASE + type, delta);
 
 	if (!recent)
@@ -533,6 +571,11 @@ bool workingset_test_recent(void *shadow, bool file, bool *workingset,
  */
 void workingset_refault(struct folio *folio, void *shadow)
 {
+	pr_err("WORKINGSET_REFAULT: ENTERED\n");
+	pr_err("  folio=%px shadow=0x%lx\n", folio, (unsigned long)shadow);
+	if ((unsigned long)shadow == 0x41) {
+		pr_err("*** BUG: WORKINGSET_REFAULT received corrupted shadow 0x41! ***\n");
+	}
 	bool file = folio_is_file_lru(folio);
 	struct pglist_data *pgdat;
 	struct mem_cgroup *memcg;
@@ -543,9 +586,13 @@ void workingset_refault(struct folio *folio, void *shadow)
 	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 
 	if (lru_gen_enabled()) {
+		pr_err("WORKINGSET_REFAULT: LRU_GEN enabled, calling lru_gen_refault\n");
 		lru_gen_refault(folio, shadow);
+		pr_err("WORKINGSET_REFAULT: lru_gen_refault returned\n");
+
 		return;
 	}
+	pr_err("WORKINGSET_REFAULT: Using regular (non-LRU_GEN) path\n");
 
 	/*
 	 * The activation decision for this folio is made at the level
@@ -562,6 +609,7 @@ void workingset_refault(struct folio *folio, void *shadow)
 	lruvec = mem_cgroup_lruvec(memcg, pgdat);
 
 	mod_lruvec_state(lruvec, WORKINGSET_REFAULT_BASE + file, nr);
+	pr_err("WORKINGSET_REFAULT: Calling workingset_test_recent\n");
 
 	if (!workingset_test_recent(shadow, file, &workingset, true))
 		return;
@@ -572,6 +620,7 @@ void workingset_refault(struct folio *folio, void *shadow)
 
 	/* Folio was active prior to eviction */
 	if (workingset) {
+		pr_err("WORKINGSET_REFAULT: Folio was workingset, restoring\n");
 		folio_set_workingset(folio);
 		/*
 		 * XXX: Move to folio_add_lru() when it supports new vs
@@ -580,6 +629,7 @@ void workingset_refault(struct folio *folio, void *shadow)
 		lru_note_cost_refault(folio);
 		mod_lruvec_state(lruvec, WORKINGSET_RESTORE_BASE + file, nr);
 	}
+	pr_err("WORKINGSET_REFAULT: EXITING\n");
 }
 
 /**
-- 
2.43.0



* Re: [syzbot] [mm?] general protection fault in lru_gen_test_recent (2)
       [not found] <20251209062801.3246869-1-kartikey406@gmail.com>
@ 2025-12-09  6:43 ` syzbot
  0 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-09  6:43 UTC (permalink / raw)
  To: kartikey406, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
BUG: unable to handle kernel NULL pointer dereference in filemap_read_folio

WORKINGSET_REFAULT: LRU_GEN enabled, calling lru_gen_refault
UNPACK_SHADOW: READING SHADOW
  shadow=0x2d
  Unpacked: memcgid=0 nid=11 eviction=0x0 workingset=0
  NODE_DATA(11)=0000000000000000
*** BUG: INVALID NODE ID 11! ***
WORKINGSET_REFAULT: lru_gen_refault returned
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 243bc067 P4D 243bc067 PUD 0 
Oops: Oops: 0010 [#1] SMP KASAN NOPTI
CPU: 2 UID: 0 PID: 6459 Comm: syz.0.38 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc9000326f988 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff81f7e52e
RDX: ffff88802a27a4c0 RSI: ffffea00013ccc40 RDI: ffff8880333a1880
RBP: ffffea00013ccc40 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 1ffff9200064df32
R13: ffff8880333a1880 R14: 0000000000000000 R15: dffffc0000000000
FS:  00007fa2d58f86c0(0000) GS:ffff8880d6b07000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000002826f000 CR4: 0000000000352ef0
Call Trace:
 <TASK>
 filemap_read_folio+0xc8/0x2a0 mm/filemap.c:2496
 do_read_cache_folio+0x266/0x5c0 mm/filemap.c:4096
 freader_get_folio+0x33a/0x930 lib/buildid.c:58
 freader_fetch+0xbd/0x740 lib/buildid.c:101
 __build_id_parse.isra.0+0xdd/0x6c0 lib/buildid.c:289
 do_procmap_query+0xb0e/0x1080 fs/proc/task_mmu.c:733
 procfs_procmap_ioctl+0x9d/0xe0 fs/proc/task_mmu.c:813
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:597 [inline]
 __se_sys_ioctl fs/ioctl.c:583 [inline]
 __x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fa2d498f7c9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fa2d58f8038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fa2d4be5fa0 RCX: 00007fa2d498f7c9
RDX: 0000200000000180 RSI: 00000000c0686611 RDI: 0000000000000003
RBP: 00007fa2d4a13f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fa2d4be6038 R14: 00007fa2d4be5fa0 R15: 00007fffd9de8348
 </TASK>
Modules linked in:
CR2: 0000000000000000
---[ end trace 0000000000000000 ]---
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc9000326f988 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff81f7e52e
RDX: ffff88802a27a4c0 RSI: ffffea00013ccc40 RDI: ffff8880333a1880
RBP: ffffea00013ccc40 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 1ffff9200064df32
R13: ffff8880333a1880 R14: 0000000000000000 R15: dffffc0000000000
FS:  00007fa2d58f86c0(0000) GS:ffff8880d6b07000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000002826f000 CR4: 0000000000352ef0


Tested on:

commit:         cb015814 Merge tag 'f2fs-for-6.19-rc1' of git://git.ke..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=11841eb4580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=dbcb767d1e1208ac
dashboard link: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
compiler:       gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=17105992580000



* Forwarded: [PATCH] for test
  2025-12-07  8:55 [syzbot] [mm?] general protection fault in lru_gen_test_recent (2) syzbot
                   ` (13 preceding siblings ...)
  2025-12-09  6:28 ` Forwarded: [PATCH] mm/workingset: fix NULL deref from invalid node ID in shadow syzbot
@ 2025-12-23  9:38 ` syzbot
  14 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-23  9:38 UTC (permalink / raw)
  To: linux-kernel, syzkaller-bugs

For archival purposes, forwarding an incoming command email to
linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com.

***

Subject: [PATCH] for test
Author: wangjinchao600@gmail.com

#syz test

Signed-off-by: Jinchao Wang <wangjinchao600@gmail.com>
---
 lib/buildid.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/buildid.c b/lib/buildid.c
index aaf61dfc0919..7131594cb071 100644
--- a/lib/buildid.c
+++ b/lib/buildid.c
@@ -280,7 +280,10 @@ static int __build_id_parse(struct vm_area_struct *vma, unsigned char *build_id,
 	int ret;
 
 	/* only works for page backed storage  */
-	if (!vma->vm_file)
+	if (!vma->vm_file ||
+	    !S_ISREG(file_inode(vma->vm_file)->i_mode) ||
+	    !vma->vm_file->f_mapping->a_ops ||
+	    !vma->vm_file->f_mapping->a_ops->read_folio)
 		return -EINVAL;
 
 	freader_init_from_file(&r, buf, sizeof(buf), vma->vm_file, may_fault);
-- 
2.43.0



* Re: [syzbot] [mm?] general protection fault in lru_gen_test_recent (2)
       [not found] <20251223093744.2399424-2-wangjinchao600@gmail.com>
@ 2025-12-23  9:58 ` syzbot
  0 siblings, 0 replies; 31+ messages in thread
From: syzbot @ 2025-12-23  9:58 UTC (permalink / raw)
  To: linux-kernel, syzkaller-bugs, wangjinchao600

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com
Tested-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com

Tested on:

commit:         b9275466 Merge tag 'dma-mapping-6.19-2025-12-22' of gi..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=12406f1a580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=a11e0f726bfb6765
dashboard link: https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
compiler:       gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=17706fc2580000

Note: testing is done by a robot and is best-effort only.


end of thread, other threads:[~2025-12-23  9:58 UTC | newest]

Thread overview: 31+ messages
2025-12-07  8:55 [syzbot] [mm?] general protection fault in lru_gen_test_recent (2) syzbot
2025-12-07 12:44 ` Forwarded: [PATCH] mm/workingset: fix NULL pointer dereference in lru_gen_test_recent syzbot
2025-12-07 14:35 ` syzbot
2025-12-07 15:05 ` syzbot
2025-12-07 15:31 ` syzbot
2025-12-07 15:38 ` syzbot
2025-12-07 16:07 ` syzbot
2025-12-08  2:31 ` Forwarded: [PATCH] mm/workingset: fix NULL pointer dereference in lru_gen_test_recent() syzbot
2025-12-08  2:47 ` syzbot
2025-12-08  3:56 ` Forwarded: [PATCH] mm/workingset: add debug for corrupted shadow entry investigation syzbot
2025-12-08  4:49 ` Forwarded: [PATCH] mm/workingset: fix crash from corrupted shadow entries in lru_gen syzbot
2025-12-08  5:14 ` syzbot
2025-12-09  5:35 ` Forwarded: [PATCH] mm/workingset: add debug instrumentation for MGLRU shadow corruption syzbot
2025-12-09  5:44 ` Forwarded: [PATCH] mm/workingset: debug MGLRU shadow corruption leading to NULL deref syzbot
2025-12-09  6:28 ` Forwarded: [PATCH] mm/workingset: fix NULL deref from invalid node ID in shadow syzbot
2025-12-23  9:38 ` Forwarded: [PATCH] for test syzbot
     [not found] <20251207124441.2614564-1-kartikey406@gmail.com>
2025-12-07 13:10 ` [syzbot] [mm?] general protection fault in lru_gen_test_recent (2) syzbot
     [not found] <20251207143534.2719842-1-kartikey406@gmail.com>
2025-12-07 14:50 ` syzbot
     [not found] <20251207150512.2759308-1-kartikey406@gmail.com>
2025-12-07 15:22 ` syzbot
     [not found] <20251207153059.2790675-1-kartikey406@gmail.com>
2025-12-07 15:45 ` syzbot
     [not found] <20251207153807.2801160-1-kartikey406@gmail.com>
2025-12-07 16:00 ` syzbot
     [not found] <20251207160752.2863580-1-kartikey406@gmail.com>
2025-12-07 16:23 ` syzbot
     [not found] <20251208023132.2923514-1-kartikey406@gmail.com>
2025-12-08  3:00 ` syzbot
     [not found] <20251208024712.2925251-1-kartikey406@gmail.com>
2025-12-08  3:28 ` syzbot
     [not found] <20251208035638.2927077-1-kartikey406@gmail.com>
2025-12-08  4:14 ` syzbot
     [not found] <20251208044921.2928668-1-kartikey406@gmail.com>
2025-12-08  5:07 ` syzbot
     [not found] <20251208051440.2931546-1-kartikey406@gmail.com>
2025-12-08  5:31 ` syzbot
     [not found] <20251209053549.3243990-1-kartikey406@gmail.com>
2025-12-09  5:37 ` syzbot
     [not found] <20251209054447.3244819-1-kartikey406@gmail.com>
2025-12-09  5:59 ` syzbot
     [not found] <20251209062801.3246869-1-kartikey406@gmail.com>
2025-12-09  6:43 ` syzbot
     [not found] <20251223093744.2399424-2-wangjinchao600@gmail.com>
2025-12-23  9:58 ` syzbot
