[syzbot] [cgroups?] general protection fault in rebuild_sched_domains_locked

* [syzbot] [cgroups?] general protection fault in rebuild_sched_domains_locked
@ 2026-02-15 21:05 syzbot
  2026-02-16  5:57 ` Waiman Long
  0 siblings, 1 reply; 6+ messages in thread
From: syzbot @ 2026-02-15 21:05 UTC
  To: cgroups, chenridong, hannes, linux-kernel, longman, mkoutny,
	syzkaller-bugs, tj

Hello,

syzbot found the following issue on:

HEAD commit:    37a93dd5c49b Merge tag 'net-next-7.0' of git://git.kernel...
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1649d073980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=a512b4a06724b76a
dashboard link: https://syzkaller.appspot.com/bug?extid=460792609a79c085f79f
compiler:       gcc (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for Debian) 2.44
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=152086e6580000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=139c2eef980000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/0dedaafff2ad/disk-37a93dd5.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/aa7fae081497/vmlinux-37a93dd5.xz
kernel image: https://storage.googleapis.com/syzbot-assets/9096b39b53e1/bzImage-37a93dd5.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+460792609a79c085f79f@syzkaller.appspotmail.com

R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00007fe00de15fac R14: 00007fe00de15fa0 R15: 00007fe00de15fa0
 </TASK>
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 1 UID: 0 PID: 5994 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/24/2026
RIP: 0010:bitmap_subset include/linux/bitmap.h:433 [inline]
RIP: 0010:cpumask_subset include/linux/cpumask.h:836 [inline]
RIP: 0010:rebuild_sched_domains_locked+0x2aa/0x980 kernel/cgroup/cpuset.c:967
Code: 7d 05 00 41 83 c4 01 89 de 48 83 c5 08 44 89 e7 e8 fb 76 05 00 41 39 dc 0f 8d 4c 04 00 00 e8 fd 7c 05 00 48 89 e8 48 c1 e8 03 <42> 80 3c 30 00 0f 85 1d 06 00 00 48 8b 04 24 48 23 45 00 31 ff 44
RSP: 0018:ffffc90003ecfbc0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000020
RDX: ffff888028de0000 RSI: ffffffff8200f003 RDI: ffffffff8df14f28
RBP: 0000000000000000 R08: 0000000000000cc0 R09: 00000000ffffffff
R10: ffffffff8e7d95b3 R11: 0000000000000001 R12: 0000000000000000
R13: 00000000000f4240 R14: dffffc0000000000 R15: 0000000000000000
FS:  000055555c694500(0000) GS:ffff8881246a5000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b2f463fff CR3: 000000003704c000 CR4: 00000000003526f0
Call Trace:
 <TASK>
 rebuild_sched_domains_cpuslocked kernel/cgroup/cpuset.c:983 [inline]
 rebuild_sched_domains+0x21/0x40 kernel/cgroup/cpuset.c:990
 sched_rt_handler+0xb5/0xe0 kernel/sched/rt.c:2911
 proc_sys_call_handler+0x47f/0x5a0 fs/proc/proc_sysctl.c:600
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x6ac/0x1070 fs/read_write.c:688
 ksys_write+0x12a/0x250 fs/read_write.c:740
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fe00db9bf79
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fff27bcda88 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007fe00de15fa0 RCX: 00007fe00db9bf79
RDX: 00000000000000f6 RSI: 0000200000000000 RDI: 0000000000000003
RBP: 00007fff27bcdaf0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00007fe00de15fac R14: 00007fe00de15fa0 R15: 00007fe00de15fa0
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:bitmap_subset include/linux/bitmap.h:433 [inline]
RIP: 0010:cpumask_subset include/linux/cpumask.h:836 [inline]
RIP: 0010:rebuild_sched_domains_locked+0x2aa/0x980 kernel/cgroup/cpuset.c:967
Code: 7d 05 00 41 83 c4 01 89 de 48 83 c5 08 44 89 e7 e8 fb 76 05 00 41 39 dc 0f 8d 4c 04 00 00 e8 fd 7c 05 00 48 89 e8 48 c1 e8 03 <42> 80 3c 30 00 0f 85 1d 06 00 00 48 8b 04 24 48 23 45 00 31 ff 44
RSP: 0018:ffffc90003ecfbc0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000020
RDX: ffff888028de0000 RSI: ffffffff8200f003 RDI: ffffffff8df14f28
RBP: 0000000000000000 R08: 0000000000000cc0 R09: 00000000ffffffff
R10: ffffffff8e7d95b3 R11: 0000000000000001 R12: 0000000000000000
R13: 00000000000f4240 R14: dffffc0000000000 R15: 0000000000000000
FS:  000055555c694500(0000) GS:ffff8881246a5000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b2f463fff CR3: 000000003704c000 CR4: 00000000003526f0
----------------
Code disassembly (best guess), 1 bytes skipped:
   0:	05 00 41 83 c4       	add    $0xc4834100,%eax
   5:	01 89 de 48 83 c5    	add    %ecx,-0x3a7cb722(%rcx)
   b:	08 44 89 e7          	or     %al,-0x19(%rcx,%rcx,4)
   f:	e8 fb 76 05 00       	call   0x5770f
  14:	41 39 dc             	cmp    %ebx,%r12d
  17:	0f 8d 4c 04 00 00    	jge    0x469
  1d:	e8 fd 7c 05 00       	call   0x57d1f
  22:	48 89 e8             	mov    %rbp,%rax
  25:	48 c1 e8 03          	shr    $0x3,%rax
* 29:	42 80 3c 30 00       	cmpb   $0x0,(%rax,%r14,1) <-- trapping instruction
  2e:	0f 85 1d 06 00 00    	jne    0x651
  34:	48 8b 04 24          	mov    (%rsp),%rax
  38:	48 23 45 00          	and    0x0(%rbp),%rax
  3c:	31 ff                	xor    %edi,%edi
  3e:	44                   	rex.R
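
The trapping instruction is KASAN's shadow-memory check, not the load itself: `shr $0x3,%rax` followed by `cmpb $0x0,(%rax,%r14,1)` reads the shadow byte for the address held in RBP, with R14 holding the shadow offset. A minimal sketch of that arithmetic, assuming the generic x86-64 KASAN layout (8-byte granules, offset 0xdffffc0000000000); the helper names are illustrative and are not kernel functions:

```c
#include <stdint.h>

/* Assumed x86-64 generic KASAN constants; the offset equals R14 in the dump. */
#define KASAN_SHADOW_SCALE_SHIFT 3
#define KASAN_SHADOW_OFFSET 0xdffffc0000000000ULL

/* Shadow byte address that is checked before `addr` is accessed. */
static inline uint64_t kasan_shadow_addr(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* Inverse mapping: first address of the 8-byte granule a shadow byte covers. */
static inline uint64_t kasan_granule_start(uint64_t shadow)
{
	return (shadow - KASAN_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT;
}
```

With RBP = 0, `kasan_shadow_addr(0)` gives 0xdffffc0000000000, the non-canonical address in the Oops line, and `kasan_granule_start()` maps it back to the range 0x0 to 0x7 that KASAN reports as a null-ptr-deref.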


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


* Re: [syzbot] [cgroups?] general protection fault in rebuild_sched_domains_locked
  2026-02-15 21:05 [syzbot] [cgroups?] general protection fault in rebuild_sched_domains_locked syzbot
@ 2026-02-16  5:57 ` Waiman Long
  2026-02-24  4:03   ` Chen Ridong
  0 siblings, 1 reply; 6+ messages in thread
From: Waiman Long @ 2026-02-16  5:57 UTC
  To: syzbot, cgroups, chenridong, hannes, linux-kernel, mkoutny,
	syzkaller-bugs, tj

On 2/15/26 4:05 PM, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    37a93dd5c49b Merge tag 'net-next-7.0' of git://git.kernel...
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1649d073980000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=a512b4a06724b76a
> dashboard link: https://syzkaller.appspot.com/bug?extid=460792609a79c085f79f
> compiler:       gcc (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for Debian) 2.44
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=152086e6580000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=139c2eef980000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/0dedaafff2ad/disk-37a93dd5.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/aa7fae081497/vmlinux-37a93dd5.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/9096b39b53e1/bzImage-37a93dd5.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+460792609a79c085f79f@syzkaller.appspotmail.com
>
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
> R13: 00007fe00de15fac R14: 00007fe00de15fa0 R15: 00007fe00de15fa0
>   </TASK>
> Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN PTI
> KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
> CPU: 1 UID: 0 PID: 5994 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/24/2026
> RIP: 0010:bitmap_subset include/linux/bitmap.h:433 [inline]
> RIP: 0010:cpumask_subset include/linux/cpumask.h:836 [inline]
> RIP: 0010:rebuild_sched_domains_locked+0x2aa/0x980 kernel/cgroup/cpuset.c:967
> Code: 7d 05 00 41 83 c4 01 89 de 48 83 c5 08 44 89 e7 e8 fb 76 05 00 41 39 dc 0f 8d 4c 04 00 00 e8 fd 7c 05 00 48 89 e8 48 c1 e8 03 <42> 80 3c 30 00 0f 85 1d 06 00 00 48 8b 04 24 48 23 45 00 31 ff 44
> RSP: 0018:ffffc90003ecfbc0 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000020
> RDX: ffff888028de0000 RSI: ffffffff8200f003 RDI: ffffffff8df14f28
> RBP: 0000000000000000 R08: 0000000000000cc0 R09: 00000000ffffffff
> R10: ffffffff8e7d95b3 R11: 0000000000000001 R12: 0000000000000000
> R13: 00000000000f4240 R14: dffffc0000000000 R15: 0000000000000000
> FS:  000055555c694500(0000) GS:ffff8881246a5000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000001b2f463fff CR3: 000000003704c000 CR4: 00000000003526f0
> Call Trace:
>   <TASK>
>   rebuild_sched_domains_cpuslocked kernel/cgroup/cpuset.c:983 [inline]
>   rebuild_sched_domains+0x21/0x40 kernel/cgroup/cpuset.c:990
>   sched_rt_handler+0xb5/0xe0 kernel/sched/rt.c:2911
>   proc_sys_call_handler+0x47f/0x5a0 fs/proc/proc_sysctl.c:600
>   new_sync_write fs/read_write.c:595 [inline]
>   vfs_write+0x6ac/0x1070 fs/read_write.c:688
>   ksys_write+0x12a/0x250 fs/read_write.c:740
>   do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>   do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
>   entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x7fe00db9bf79
> Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007fff27bcda88 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> RAX: ffffffffffffffda RBX: 00007fe00de15fa0 RCX: 00007fe00db9bf79
> RDX: 00000000000000f6 RSI: 0000200000000000 RDI: 0000000000000003
> RBP: 00007fff27bcdaf0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
> R13: 00007fe00de15fac R14: 00007fe00de15fa0 R15: 00007fe00de15fa0
>   </TASK>
> Modules linked in:
> ---[ end trace 0000000000000000 ]---
> RIP: 0010:bitmap_subset include/linux/bitmap.h:433 [inline]
> RIP: 0010:cpumask_subset include/linux/cpumask.h:836 [inline]
> RIP: 0010:rebuild_sched_domains_locked+0x2aa/0x980 kernel/cgroup/cpuset.c:967
> Code: 7d 05 00 41 83 c4 01 89 de 48 83 c5 08 44 89 e7 e8 fb 76 05 00 41 39 dc 0f 8d 4c 04 00 00 e8 fd 7c 05 00 48 89 e8 48 c1 e8 03 <42> 80 3c 30 00 0f 85 1d 06 00 00 48 8b 04 24 48 23 45 00 31 ff 44
> RSP: 0018:ffffc90003ecfbc0 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000020
> RDX: ffff888028de0000 RSI: ffffffff8200f003 RDI: ffffffff8df14f28
> RBP: 0000000000000000 R08: 0000000000000cc0 R09: 00000000ffffffff
> R10: ffffffff8e7d95b3 R11: 0000000000000001 R12: 0000000000000000
> R13: 00000000000f4240 R14: dffffc0000000000 R15: 0000000000000000
> FS:  000055555c694500(0000) GS:ffff8881246a5000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000001b2f463fff CR3: 000000003704c000 CR4: 00000000003526f0
> ----------------
> Code disassembly (best guess), 1 bytes skipped:
>     0:	05 00 41 83 c4       	add    $0xc4834100,%eax
>     5:	01 89 de 48 83 c5    	add    %ecx,-0x3a7cb722(%rcx)
>     b:	08 44 89 e7          	or     %al,-0x19(%rcx,%rcx,4)
>     f:	e8 fb 76 05 00       	call   0x5770f
>    14:	41 39 dc             	cmp    %ebx,%r12d
>    17:	0f 8d 4c 04 00 00    	jge    0x469
>    1d:	e8 fd 7c 05 00       	call   0x57d1f
>    22:	48 89 e8             	mov    %rbp,%rax
>    25:	48 c1 e8 03          	shr    $0x3,%rax
> * 29:	42 80 3c 30 00       	cmpb   $0x0,(%rax,%r14,1) <-- trapping instruction
>    2e:	0f 85 1d 06 00 00    	jne    0x651
>    34:	48 8b 04 24          	mov    (%rsp),%rax
>    38:	48 23 45 00          	and    0x0(%rbp),%rax
>    3c:	31 ff                	xor    %edi,%edi
>    3e:	44                   	rex.R

The cpuset.c:967 is:

     966         for (i = 0; i < ndoms; ++i) {
     967                 if (WARN_ON_ONCE(!cpumask_subset(doms[i], cpu_active_mask)))
     968                         return;

The oops was caused by accessing doms[i], which is kmalloc'ed in
generate_sched_domains() by calling alloc_sched_domains() in
kernel/sched/topology.c. Looking at the console log just before the
oops, I saw:

[  124.398850][ T5994] FAULT_INJECTION: forcing a failure.
[  124.398850][ T5994] name failslab, interval 1, probability 0, space 0, times 1
[  124.434865][ T5994] CPU: 1 UID: 0 PID: 5994 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
[  124.434909][ T5994] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/24/2026
[  124.434936][ T5994] Call Trace:
[  124.434947][ T5994]  <TASK>
[  124.434959][ T5994]  dump_stack_lvl+0x100/0x190
[  124.435026][ T5994]  should_fail_ex.cold+0x5/0xa
[  124.435062][ T5994]  ? rebuild_sched_domains_locked+0x51/0x980
[  124.435113][ T5994]  should_failslab+0xc2/0x120
[  124.435153][ T5994]  __kmalloc_noprof+0xe0/0x850
[  124.435195][ T5994]  rebuild_sched_domains_locked+0x51/0x980
[  124.435266][ T5994]  rebuild_sched_domains+0x21/0x40
[  124.435314][ T5994]  sched_rt_handler+0xb5/0xe0
[  124.435359][ T5994]  proc_sys_call_handler+0x47f/0x5a0
[  124.435413][ T5994]  ? __pfx_proc_sys_call_handler+0x10/0x10
[  124.435475][ T5994]  vfs_write+0x6ac/0x1070
[  124.435511][ T5994]  ? __pfx_proc_sys_write+0x10/0x10
[  124.435562][ T5994]  ? __pfx_vfs_write+0x10/0x10
[  124.435597][ T5994]  ? __pfx_do_sys_openat2+0x10/0x10
[  124.435664][ T5994]  ksys_write+0x12a/0x250
[  124.435696][ T5994]  ? __pfx_ksys_write+0x10/0x10
[  124.435730][ T5994]  ? do_user_addr_fault+0x8d6/0x12f0
[  124.435787][ T5994]  do_syscall_64+0x106/0xf80
[  124.435834][ T5994]  ? clear_bhb_loop+0x40/0x90
[  124.435875][ T5994]  entry_SYSCALL_64_after_hwframe+0x77/0x7f

So the oops looks like an expected consequence of the injected
allocation failure. AFAICS, it may not be a bug in cpuset.

Cheers,
Longman



* Re: [syzbot] [cgroups?] general protection fault in rebuild_sched_domains_locked
  2026-02-16  5:57 ` Waiman Long
@ 2026-02-24  4:03   ` Chen Ridong
  2026-02-24  5:37     ` Waiman Long
  0 siblings, 1 reply; 6+ messages in thread
From: Chen Ridong @ 2026-02-24  4:03 UTC
  To: Waiman Long, syzbot, cgroups, hannes, linux-kernel, mkoutny,
	syzkaller-bugs, tj



On 2026/2/16 13:57, Waiman Long wrote:
> On 2/15/26 4:05 PM, syzbot wrote:
>> Hello,
>>
>> syzbot found the following issue on:
>>
>> HEAD commit:    37a93dd5c49b Merge tag 'net-next-7.0' of git://git.kernel...
>> git tree:       upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=1649d073980000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=a512b4a06724b76a
>> dashboard link: https://syzkaller.appspot.com/bug?extid=460792609a79c085f79f
>> compiler:       gcc (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for
>> Debian) 2.44
>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=152086e6580000
>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=139c2eef980000
>>
>> Downloadable assets:
>> disk image:
>> https://storage.googleapis.com/syzbot-assets/0dedaafff2ad/disk-37a93dd5.raw.xz
>> vmlinux:
>> https://storage.googleapis.com/syzbot-assets/aa7fae081497/vmlinux-37a93dd5.xz
>> kernel image:
>> https://storage.googleapis.com/syzbot-assets/9096b39b53e1/bzImage-37a93dd5.xz
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: syzbot+460792609a79c085f79f@syzkaller.appspotmail.com
>>
>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
>> R13: 00007fe00de15fac R14: 00007fe00de15fa0 R15: 00007fe00de15fa0
>>   </TASK>
>> Oops: general protection fault, probably for non-canonical address
>> 0xdffffc0000000000: 0000 [#1] SMP KASAN PTI
>> KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
>> CPU: 1 UID: 0 PID: 5994 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google
>> 01/24/2026
>> RIP: 0010:bitmap_subset include/linux/bitmap.h:433 [inline]
>> RIP: 0010:cpumask_subset include/linux/cpumask.h:836 [inline]
>> RIP: 0010:rebuild_sched_domains_locked+0x2aa/0x980 kernel/cgroup/cpuset.c:967
>> Code: 7d 05 00 41 83 c4 01 89 de 48 83 c5 08 44 89 e7 e8 fb 76 05 00 41 39 dc
>> 0f 8d 4c 04 00 00 e8 fd 7c 05 00 48 89 e8 48 c1 e8 03 <42> 80 3c 30 00 0f 85
>> 1d 06 00 00 48 8b 04 24 48 23 45 00 31 ff 44
>> RSP: 0018:ffffc90003ecfbc0 EFLAGS: 00010246
>> RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000020
>> RDX: ffff888028de0000 RSI: ffffffff8200f003 RDI: ffffffff8df14f28
>> RBP: 0000000000000000 R08: 0000000000000cc0 R09: 00000000ffffffff
>> R10: ffffffff8e7d95b3 R11: 0000000000000001 R12: 0000000000000000
>> R13: 00000000000f4240 R14: dffffc0000000000 R15: 0000000000000000
>> FS:  000055555c694500(0000) GS:ffff8881246a5000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000001b2f463fff CR3: 000000003704c000 CR4: 00000000003526f0
>> Call Trace:
>>   <TASK>
>>   rebuild_sched_domains_cpuslocked kernel/cgroup/cpuset.c:983 [inline]
>>   rebuild_sched_domains+0x21/0x40 kernel/cgroup/cpuset.c:990
>>   sched_rt_handler+0xb5/0xe0 kernel/sched/rt.c:2911
>>   proc_sys_call_handler+0x47f/0x5a0 fs/proc/proc_sysctl.c:600
>>   new_sync_write fs/read_write.c:595 [inline]
>>   vfs_write+0x6ac/0x1070 fs/read_write.c:688
>>   ksys_write+0x12a/0x250 fs/read_write.c:740
>>   do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>>   do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
>>   entry_SYSCALL_64_after_hwframe+0x77/0x7f
>> RIP: 0033:0x7fe00db9bf79
>> Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48
>> 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73
>> 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
>> RSP: 002b:00007fff27bcda88 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
>> RAX: ffffffffffffffda RBX: 00007fe00de15fa0 RCX: 00007fe00db9bf79
>> RDX: 00000000000000f6 RSI: 0000200000000000 RDI: 0000000000000003
>> RBP: 00007fff27bcdaf0 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
>> R13: 00007fe00de15fac R14: 00007fe00de15fa0 R15: 00007fe00de15fa0
>>   </TASK>
>> Modules linked in:
>> ---[ end trace 0000000000000000 ]---
>> RIP: 0010:bitmap_subset include/linux/bitmap.h:433 [inline]
>> RIP: 0010:cpumask_subset include/linux/cpumask.h:836 [inline]
>> RIP: 0010:rebuild_sched_domains_locked+0x2aa/0x980 kernel/cgroup/cpuset.c:967
>> Code: 7d 05 00 41 83 c4 01 89 de 48 83 c5 08 44 89 e7 e8 fb 76 05 00 41 39 dc
>> 0f 8d 4c 04 00 00 e8 fd 7c 05 00 48 89 e8 48 c1 e8 03 <42> 80 3c 30 00 0f 85
>> 1d 06 00 00 48 8b 04 24 48 23 45 00 31 ff 44
>> RSP: 0018:ffffc90003ecfbc0 EFLAGS: 00010246
>> RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000020
>> RDX: ffff888028de0000 RSI: ffffffff8200f003 RDI: ffffffff8df14f28
>> RBP: 0000000000000000 R08: 0000000000000cc0 R09: 00000000ffffffff
>> R10: ffffffff8e7d95b3 R11: 0000000000000001 R12: 0000000000000000
>> R13: 00000000000f4240 R14: dffffc0000000000 R15: 0000000000000000
>> FS:  000055555c694500(0000) GS:ffff8881246a5000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000001b2f463fff CR3: 000000003704c000 CR4: 00000000003526f0
>> ----------------
>> Code disassembly (best guess), 1 bytes skipped:
>>     0:    05 00 41 83 c4           add    $0xc4834100,%eax
>>     5:    01 89 de 48 83 c5        add    %ecx,-0x3a7cb722(%rcx)
>>     b:    08 44 89 e7              or     %al,-0x19(%rcx,%rcx,4)
>>     f:    e8 fb 76 05 00           call   0x5770f
>>    14:    41 39 dc                 cmp    %ebx,%r12d
>>    17:    0f 8d 4c 04 00 00        jge    0x469
>>    1d:    e8 fd 7c 05 00           call   0x57d1f
>>    22:    48 89 e8                 mov    %rbp,%rax
>>    25:    48 c1 e8 03              shr    $0x3,%rax
>> * 29:    42 80 3c 30 00           cmpb   $0x0,(%rax,%r14,1) <-- trapping
>> instruction
>>    2e:    0f 85 1d 06 00 00        jne    0x651
>>    34:    48 8b 04 24              mov    (%rsp),%rax
>>    38:    48 23 45 00              and    0x0(%rbp),%rax
>>    3c:    31 ff                    xor    %edi,%edi
>>    3e:    44                       rex.R
> 
> The cpuset.c:967 is:
> 
>     966         for (i = 0; i < ndoms; ++i) {
>     967                 if (WARN_ON_ONCE(!cpumask_subset(doms[i],
> cpu_active_mask)))
>     968                         return;
> 
> The oops was caused by accessing doms[i] which was kmalloc'ed in
> generate_sched_domains() by calling alloc_sched_domains() in
> kernel/sched/topology.c. Looking at the console log just before the oops, I saw
> 
> [  124.398850][ T5994] FAULT_INJECTION: forcing a failure.
> [  124.398850][ T5994] name failslab, interval 1, probability 0, space 0, times 1
> [  124.434865][ T5994] CPU: 1 UID: 0 PID: 5994 Comm: syz.0.17 Not tainted
> syzkaller #0 PREEMPT(full)
> [  124.434909][ T5994] Hardware name: Google Google Compute Engine/Google
> Compute Engine, BIOS Google 01/24/2026
> [  124.434936][ T5994] Call Trace:
> [  124.434947][ T5994]  <TASK>
> [  124.434959][ T5994]  dump_stack_lvl+0x100/0x190
> [  124.435026][ T5994]  should_fail_ex.cold+0x5/0xa
> [  124.435062][ T5994]  ? rebuild_sched_domains_locked+0x51/0x980
> [  124.435113][ T5994]  should_failslab+0xc2/0x120
> [  124.435153][ T5994]  __kmalloc_noprof+0xe0/0x850
> [  124.435195][ T5994]  rebuild_sched_domains_locked+0x51/0x980
> [  124.435266][ T5994]  rebuild_sched_domains+0x21/0x40
> [  124.435314][ T5994]  sched_rt_handler+0xb5/0xe0
> [  124.435359][ T5994]  proc_sys_call_handler+0x47f/0x5a0
> [  124.435413][ T5994]  ? __pfx_proc_sys_call_handler+0x10/0x10
> [  124.435475][ T5994]  vfs_write+0x6ac/0x1070
> [  124.435511][ T5994]  ? __pfx_proc_sys_write+0x10/0x10
> [  124.435562][ T5994]  ? __pfx_vfs_write+0x10/0x10
> [  124.435597][ T5994]  ? __pfx_do_sys_openat2+0x10/0x10
> [  124.435664][ T5994]  ksys_write+0x12a/0x250
> [  124.435696][ T5994]  ? __pfx_ksys_write+0x10/0x10
> [  124.435730][ T5994]  ? do_user_addr_fault+0x8d6/0x12f0
> [  124.435787][ T5994]  do_syscall_64+0x106/0xf80
> [  124.435834][ T5994]  ? clear_bhb_loop+0x40/0x90
> [  124.435875][ T5994]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> 
> So it looks like the oops may be expected. It may not be a bug in the cpuset
> AFAICS.
> 

Hi Longman,

Thank you for looking into this issue.

Since partition_sched_domains_locked() can handle a NULL 'doms', I think
the check in cpuset should be made robust against it as well.

The fix can be implemented as follows:

In cpuset.c at line 964:

        for (i = 0; i < ndoms; ++i) {
-               if (WARN_ON_ONCE(!cpumask_subset(doms[i], cpu_active_mask)))
+               if (doms && WARN_ON_ONCE(!cpumask_subset(doms[i],
+                                         cpu_active_mask)))
                        return;
        }
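
A user-space sketch of the guarded loop in the diff above, to make the intended behaviour concrete. It assumes the allocation-failure fallback hands back doms == NULL together with ndoms == 1; the `toy_*` names and the 64-bit mask type are hypothetical stand-ins, not kernel APIs:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a cpumask on a machine with at most 64 CPUs. */
typedef uint64_t toy_cpumask;

/* True when every bit set in `sub` is also set in `super`. */
static inline bool toy_cpumask_subset(toy_cpumask sub, toy_cpumask super)
{
	return (sub & ~super) == 0;
}

/*
 * Returns true when the domains may be passed on to the scheduler.
 * A NULL `doms` models the assumed allocation-failure fallback: it is
 * passed through and doms[i] is never dereferenced.
 */
static inline bool toy_doms_valid(const toy_cpumask *doms, int ndoms,
				  toy_cpumask active)
{
	for (int i = 0; i < ndoms; ++i) {
		if (doms && !toy_cpumask_subset(doms[i], active))
			return false;
	}
	return true;
}
```

Calling `toy_doms_valid(NULL, 1, active)` returns true without touching memory, whereas the unguarded loop would read doms[0] through a NULL pointer. Testing `doms` once before the loop would be equivalent, since the pointer does not change between iterations.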

-- 
Best regards,
Ridong



* Re: [syzbot] [cgroups?] general protection fault in rebuild_sched_domains_locked
  2026-02-24  4:03   ` Chen Ridong
@ 2026-02-24  5:37     ` Waiman Long
  2026-02-24  6:56       ` Chen Ridong
  0 siblings, 1 reply; 6+ messages in thread
From: Waiman Long @ 2026-02-24  5:37 UTC
  To: Chen Ridong, Waiman Long, syzbot, cgroups, hannes, linux-kernel,
	mkoutny, syzkaller-bugs, tj

On 2/23/26 11:03 PM, Chen Ridong wrote:
>
> On 2026/2/16 13:57, Waiman Long wrote:
>> On 2/15/26 4:05 PM, syzbot wrote:
>>> Hello,
>>>
>>> syzbot found the following issue on:
>>>
>>> HEAD commit:    37a93dd5c49b Merge tag 'net-next-7.0' of git://git.kernel...
>>> git tree:       upstream
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1649d073980000
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=a512b4a06724b76a
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=460792609a79c085f79f
>>> compiler:       gcc (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for
>>> Debian) 2.44
>>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=152086e6580000
>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=139c2eef980000
>>>
>>> Downloadable assets:
>>> disk image:
>>> https://storage.googleapis.com/syzbot-assets/0dedaafff2ad/disk-37a93dd5.raw.xz
>>> vmlinux:
>>> https://storage.googleapis.com/syzbot-assets/aa7fae081497/vmlinux-37a93dd5.xz
>>> kernel image:
>>> https://storage.googleapis.com/syzbot-assets/9096b39b53e1/bzImage-37a93dd5.xz
>>>
>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>> Reported-by: syzbot+460792609a79c085f79f@syzkaller.appspotmail.com
>>>
>>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
>>> R13: 00007fe00de15fac R14: 00007fe00de15fa0 R15: 00007fe00de15fa0
>>>    </TASK>
>>> Oops: general protection fault, probably for non-canonical address
>>> 0xdffffc0000000000: 0000 [#1] SMP KASAN PTI
>>> KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
>>> CPU: 1 UID: 0 PID: 5994 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google
>>> 01/24/2026
>>> RIP: 0010:bitmap_subset include/linux/bitmap.h:433 [inline]
>>> RIP: 0010:cpumask_subset include/linux/cpumask.h:836 [inline]
>>> RIP: 0010:rebuild_sched_domains_locked+0x2aa/0x980 kernel/cgroup/cpuset.c:967
>>> Code: 7d 05 00 41 83 c4 01 89 de 48 83 c5 08 44 89 e7 e8 fb 76 05 00 41 39 dc
>>> 0f 8d 4c 04 00 00 e8 fd 7c 05 00 48 89 e8 48 c1 e8 03 <42> 80 3c 30 00 0f 85
>>> 1d 06 00 00 48 8b 04 24 48 23 45 00 31 ff 44
>>> RSP: 0018:ffffc90003ecfbc0 EFLAGS: 00010246
>>> RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000020
>>> RDX: ffff888028de0000 RSI: ffffffff8200f003 RDI: ffffffff8df14f28
>>> RBP: 0000000000000000 R08: 0000000000000cc0 R09: 00000000ffffffff
>>> R10: ffffffff8e7d95b3 R11: 0000000000000001 R12: 0000000000000000
>>> R13: 00000000000f4240 R14: dffffc0000000000 R15: 0000000000000000
>>> FS:  000055555c694500(0000) GS:ffff8881246a5000(0000) knlGS:0000000000000000
>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: 0000001b2f463fff CR3: 000000003704c000 CR4: 00000000003526f0
>>> Call Trace:
>>>    <TASK>
>>>    rebuild_sched_domains_cpuslocked kernel/cgroup/cpuset.c:983 [inline]
>>>    rebuild_sched_domains+0x21/0x40 kernel/cgroup/cpuset.c:990
>>>    sched_rt_handler+0xb5/0xe0 kernel/sched/rt.c:2911
>>>    proc_sys_call_handler+0x47f/0x5a0 fs/proc/proc_sysctl.c:600
>>>    new_sync_write fs/read_write.c:595 [inline]
>>>    vfs_write+0x6ac/0x1070 fs/read_write.c:688
>>>    ksys_write+0x12a/0x250 fs/read_write.c:740
>>>    do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>>>    do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
>>>    entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>> RIP: 0033:0x7fe00db9bf79
>>> Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48
>>> 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73
>>> 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
>>> RSP: 002b:00007fff27bcda88 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
>>> RAX: ffffffffffffffda RBX: 00007fe00de15fa0 RCX: 00007fe00db9bf79
>>> RDX: 00000000000000f6 RSI: 0000200000000000 RDI: 0000000000000003
>>> RBP: 00007fff27bcdaf0 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
>>> R13: 00007fe00de15fac R14: 00007fe00de15fa0 R15: 00007fe00de15fa0
>>>    </TASK>
>>> Modules linked in:
>>> ---[ end trace 0000000000000000 ]---
>>> RIP: 0010:bitmap_subset include/linux/bitmap.h:433 [inline]
>>> RIP: 0010:cpumask_subset include/linux/cpumask.h:836 [inline]
>>> RIP: 0010:rebuild_sched_domains_locked+0x2aa/0x980 kernel/cgroup/cpuset.c:967
>>> Code: 7d 05 00 41 83 c4 01 89 de 48 83 c5 08 44 89 e7 e8 fb 76 05 00 41 39 dc
>>> 0f 8d 4c 04 00 00 e8 fd 7c 05 00 48 89 e8 48 c1 e8 03 <42> 80 3c 30 00 0f 85
>>> 1d 06 00 00 48 8b 04 24 48 23 45 00 31 ff 44
>>> RSP: 0018:ffffc90003ecfbc0 EFLAGS: 00010246
>>> RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000020
>>> RDX: ffff888028de0000 RSI: ffffffff8200f003 RDI: ffffffff8df14f28
>>> RBP: 0000000000000000 R08: 0000000000000cc0 R09: 00000000ffffffff
>>> R10: ffffffff8e7d95b3 R11: 0000000000000001 R12: 0000000000000000
>>> R13: 00000000000f4240 R14: dffffc0000000000 R15: 0000000000000000
>>> FS:  000055555c694500(0000) GS:ffff8881246a5000(0000) knlGS:0000000000000000
>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: 0000001b2f463fff CR3: 000000003704c000 CR4: 00000000003526f0
>>> ----------------
>>> Code disassembly (best guess), 1 bytes skipped:
>>>      0:    05 00 41 83 c4           add    $0xc4834100,%eax
>>>      5:    01 89 de 48 83 c5        add    %ecx,-0x3a7cb722(%rcx)
>>>      b:    08 44 89 e7              or     %al,-0x19(%rcx,%rcx,4)
>>>      f:    e8 fb 76 05 00           call   0x5770f
>>>     14:    41 39 dc                 cmp    %ebx,%r12d
>>>     17:    0f 8d 4c 04 00 00        jge    0x469
>>>     1d:    e8 fd 7c 05 00           call   0x57d1f
>>>     22:    48 89 e8                 mov    %rbp,%rax
>>>     25:    48 c1 e8 03              shr    $0x3,%rax
>>> * 29:    42 80 3c 30 00           cmpb   $0x0,(%rax,%r14,1) <-- trapping
>>> instruction
>>>     2e:    0f 85 1d 06 00 00        jne    0x651
>>>     34:    48 8b 04 24              mov    (%rsp),%rax
>>>     38:    48 23 45 00              and    0x0(%rbp),%rax
>>>     3c:    31 ff                    xor    %edi,%edi
>>>     3e:    44                       rex.R
>> The cpuset.c:967 is:
>>
>>      966         for (i = 0; i < ndoms; ++i) {
>>      967                 if (WARN_ON_ONCE(!cpumask_subset(doms[i],
>> cpu_active_mask)))
>>      968                         return;
>>
>> The oops was caused by accessing doms[i] which was kmalloc'ed in
>> generate_sched_domains() by calling alloc_sched_domains() in
>> kernel/sched/topology.c. Looking at the console log just before the oops, I saw
>>
>> [  124.398850][ T5994] FAULT_INJECTION: forcing a failure.
>> [  124.398850][ T5994] name failslab, interval 1, probability 0, space 0, times 1
>> [  124.434865][ T5994] CPU: 1 UID: 0 PID: 5994 Comm: syz.0.17 Not tainted
>> syzkaller #0 PREEMPT(full)
>> [  124.434909][ T5994] Hardware name: Google Google Compute Engine/Google
>> Compute Engine, BIOS Google 01/24/2026
>> [  124.434936][ T5994] Call Trace:
>> [  124.434947][ T5994]  <TASK>
>> [  124.434959][ T5994]  dump_stack_lvl+0x100/0x190
>> [  124.435026][ T5994]  should_fail_ex.cold+0x5/0xa
>> [  124.435062][ T5994]  ? rebuild_sched_domains_locked+0x51/0x980
>> [  124.435113][ T5994]  should_failslab+0xc2/0x120
>> [  124.435153][ T5994]  __kmalloc_noprof+0xe0/0x850
>> [  124.435195][ T5994]  rebuild_sched_domains_locked+0x51/0x980
>> [  124.435266][ T5994]  rebuild_sched_domains+0x21/0x40
>> [  124.435314][ T5994]  sched_rt_handler+0xb5/0xe0
>> [  124.435359][ T5994]  proc_sys_call_handler+0x47f/0x5a0
>> [  124.435413][ T5994]  ? __pfx_proc_sys_call_handler+0x10/0x10
>> [  124.435475][ T5994]  vfs_write+0x6ac/0x1070
>> [  124.435511][ T5994]  ? __pfx_proc_sys_write+0x10/0x10
>> [  124.435562][ T5994]  ? __pfx_vfs_write+0x10/0x10
>> [  124.435597][ T5994]  ? __pfx_do_sys_openat2+0x10/0x10
>> [  124.435664][ T5994]  ksys_write+0x12a/0x250
>> [  124.435696][ T5994]  ? __pfx_ksys_write+0x10/0x10
>> [  124.435730][ T5994]  ? do_user_addr_fault+0x8d6/0x12f0
>> [  124.435787][ T5994]  do_syscall_64+0x106/0xf80
>> [  124.435834][ T5994]  ? clear_bhb_loop+0x40/0x90
>> [  124.435875][ T5994]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>
>> So it looks like the oops may be expected. It may not be a bug in the cpuset
>> AFAICS.
>>
> Hi Longman,
>
> Thank you for looking into this issue.
>
> Since partition_sched_domains_locked can handle the situation where 'doms' is
> NULL, I think we should make it robust and fix it.
>
> The fix can be implemented as follows:
>
> In cpuset.c at line 964:
>
>          for (i = 0; i < ndoms; ++i) {
> -               if (WARN_ON_ONCE(!cpumask_subset(doms[i], cpu_active_mask)))
> +               if (doms && WARN_ON_ONCE(!cpumask_subset(doms[i],
> +                                         cpu_active_mask)))
>                          return;
>          }
>
The problem is that doms is not NULL. It is 0xdffffc0000000000 as shown 
in the dmesg log. So the null check here won't do any good in this 
particular case. In fact, there is already a null check right after 
alloc_sched_domains() above.

Cheers, Longman



* Re: [syzbot] [cgroups?] general protection fault in rebuild_sched_domains_locked
  2026-02-24  5:37     ` Waiman Long
@ 2026-02-24  6:56       ` Chen Ridong
  2026-02-24 14:30         ` Waiman Long
  0 siblings, 1 reply; 6+ messages in thread
From: Chen Ridong @ 2026-02-24  6:56 UTC (permalink / raw)
  To: Waiman Long, syzbot, cgroups, hannes, linux-kernel, mkoutny,
	syzkaller-bugs, tj



On 2026/2/24 13:37, Waiman Long wrote:
> On 2/23/26 11:03 PM, Chen Ridong wrote:
>>
>> On 2026/2/16 13:57, Waiman Long wrote:
>>> On 2/15/26 4:05 PM, syzbot wrote:
>>>> Hello,
>>>>
>>>> syzbot found the following issue on:
>>>>
>>>> HEAD commit:    37a93dd5c49b Merge tag 'net-next-7.0' of git://git.kernel...
>>>> git tree:       upstream
>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1649d073980000
>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=a512b4a06724b76a
>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=460792609a79c085f79f
>>>> compiler:       gcc (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for
>>>> Debian) 2.44
>>>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=152086e6580000
>>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=139c2eef980000
>>>>
>>>> Downloadable assets:
>>>> disk image:
>>>> https://storage.googleapis.com/syzbot-assets/0dedaafff2ad/disk-37a93dd5.raw.xz
>>>> vmlinux:
>>>> https://storage.googleapis.com/syzbot-assets/aa7fae081497/vmlinux-37a93dd5.xz
>>>> kernel image:
>>>> https://storage.googleapis.com/syzbot-assets/9096b39b53e1/bzImage-37a93dd5.xz
>>>>
>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>> Reported-by: syzbot+460792609a79c085f79f@syzkaller.appspotmail.com
>>>>
>>>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
>>>> R13: 00007fe00de15fac R14: 00007fe00de15fa0 R15: 00007fe00de15fa0
>>>>    </TASK>
>>>> Oops: general protection fault, probably for non-canonical address
>>>> 0xdffffc0000000000: 0000 [#1] SMP KASAN PTI
>>>> KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
>>>> CPU: 1 UID: 0 PID: 5994 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
>>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google
>>>> 01/24/2026
>>>> RIP: 0010:bitmap_subset include/linux/bitmap.h:433 [inline]
>>>> RIP: 0010:cpumask_subset include/linux/cpumask.h:836 [inline]
>>>> RIP: 0010:rebuild_sched_domains_locked+0x2aa/0x980 kernel/cgroup/cpuset.c:967
>>>> Code: 7d 05 00 41 83 c4 01 89 de 48 83 c5 08 44 89 e7 e8 fb 76 05 00 41 39 dc
>>>> 0f 8d 4c 04 00 00 e8 fd 7c 05 00 48 89 e8 48 c1 e8 03 <42> 80 3c 30 00 0f 85
>>>> 1d 06 00 00 48 8b 04 24 48 23 45 00 31 ff 44
>>>> RSP: 0018:ffffc90003ecfbc0 EFLAGS: 00010246
>>>> RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000020
>>>> RDX: ffff888028de0000 RSI: ffffffff8200f003 RDI: ffffffff8df14f28
>>>> RBP: 0000000000000000 R08: 0000000000000cc0 R09: 00000000ffffffff
>>>> R10: ffffffff8e7d95b3 R11: 0000000000000001 R12: 0000000000000000
>>>> R13: 00000000000f4240 R14: dffffc0000000000 R15: 0000000000000000
>>>> FS:  000055555c694500(0000) GS:ffff8881246a5000(0000) knlGS:0000000000000000
>>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> CR2: 0000001b2f463fff CR3: 000000003704c000 CR4: 00000000003526f0
>>>> Call Trace:
>>>>    <TASK>
>>>>    rebuild_sched_domains_cpuslocked kernel/cgroup/cpuset.c:983 [inline]
>>>>    rebuild_sched_domains+0x21/0x40 kernel/cgroup/cpuset.c:990
>>>>    sched_rt_handler+0xb5/0xe0 kernel/sched/rt.c:2911
>>>>    proc_sys_call_handler+0x47f/0x5a0 fs/proc/proc_sysctl.c:600
>>>>    new_sync_write fs/read_write.c:595 [inline]
>>>>    vfs_write+0x6ac/0x1070 fs/read_write.c:688
>>>>    ksys_write+0x12a/0x250 fs/read_write.c:740
>>>>    do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>>>>    do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
>>>>    entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>>> RIP: 0033:0x7fe00db9bf79
>>>> Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48
>>>> 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73
>>>> 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
>>>> RSP: 002b:00007fff27bcda88 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
>>>> RAX: ffffffffffffffda RBX: 00007fe00de15fa0 RCX: 00007fe00db9bf79
>>>> RDX: 00000000000000f6 RSI: 0000200000000000 RDI: 0000000000000003
>>>> RBP: 00007fff27bcdaf0 R08: 0000000000000000 R09: 0000000000000000
>>>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
>>>> R13: 00007fe00de15fac R14: 00007fe00de15fa0 R15: 00007fe00de15fa0
>>>>    </TASK>
>>>> Modules linked in:
>>>> ---[ end trace 0000000000000000 ]---
>>>> RIP: 0010:bitmap_subset include/linux/bitmap.h:433 [inline]
>>>> RIP: 0010:cpumask_subset include/linux/cpumask.h:836 [inline]
>>>> RIP: 0010:rebuild_sched_domains_locked+0x2aa/0x980 kernel/cgroup/cpuset.c:967
>>>> Code: 7d 05 00 41 83 c4 01 89 de 48 83 c5 08 44 89 e7 e8 fb 76 05 00 41 39 dc
>>>> 0f 8d 4c 04 00 00 e8 fd 7c 05 00 48 89 e8 48 c1 e8 03 <42> 80 3c 30 00 0f 85
>>>> 1d 06 00 00 48 8b 04 24 48 23 45 00 31 ff 44
>>>> RSP: 0018:ffffc90003ecfbc0 EFLAGS: 00010246
>>>> RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000020
>>>> RDX: ffff888028de0000 RSI: ffffffff8200f003 RDI: ffffffff8df14f28
>>>> RBP: 0000000000000000 R08: 0000000000000cc0 R09: 00000000ffffffff
>>>> R10: ffffffff8e7d95b3 R11: 0000000000000001 R12: 0000000000000000
>>>> R13: 00000000000f4240 R14: dffffc0000000000 R15: 0000000000000000
>>>> FS:  000055555c694500(0000) GS:ffff8881246a5000(0000) knlGS:0000000000000000
>>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> CR2: 0000001b2f463fff CR3: 000000003704c000 CR4: 00000000003526f0
>>>> ----------------
>>>> Code disassembly (best guess), 1 bytes skipped:
>>>>      0:    05 00 41 83 c4           add    $0xc4834100,%eax
>>>>      5:    01 89 de 48 83 c5        add    %ecx,-0x3a7cb722(%rcx)
>>>>      b:    08 44 89 e7              or     %al,-0x19(%rcx,%rcx,4)
>>>>      f:    e8 fb 76 05 00           call   0x5770f
>>>>     14:    41 39 dc                 cmp    %ebx,%r12d
>>>>     17:    0f 8d 4c 04 00 00        jge    0x469
>>>>     1d:    e8 fd 7c 05 00           call   0x57d1f
>>>>     22:    48 89 e8                 mov    %rbp,%rax
>>>>     25:    48 c1 e8 03              shr    $0x3,%rax
>>>> * 29:    42 80 3c 30 00           cmpb   $0x0,(%rax,%r14,1) <-- trapping
>>>> instruction
>>>>     2e:    0f 85 1d 06 00 00        jne    0x651
>>>>     34:    48 8b 04 24              mov    (%rsp),%rax
>>>>     38:    48 23 45 00              and    0x0(%rbp),%rax
>>>>     3c:    31 ff                    xor    %edi,%edi
>>>>     3e:    44                       rex.R
>>> The cpuset.c:967 is:
>>>
>>>      966         for (i = 0; i < ndoms; ++i) {
>>>      967                 if (WARN_ON_ONCE(!cpumask_subset(doms[i],
>>> cpu_active_mask)))
>>>      968                         return;
>>>
>>> The oops was caused by accessing doms[i] which was kmalloc'ed in
>>> generate_sched_domains() by calling alloc_sched_domains() in
>>> kernel/sched/topology.c. Looking at the console log just before the oops, I saw
>>>
>>> [  124.398850][ T5994] FAULT_INJECTION: forcing a failure.
>>> [  124.398850][ T5994] name failslab, interval 1, probability 0, space 0,
>>> times 1
>>> [  124.434865][ T5994] CPU: 1 UID: 0 PID: 5994 Comm: syz.0.17 Not tainted
>>> syzkaller #0 PREEMPT(full)
>>> [  124.434909][ T5994] Hardware name: Google Google Compute Engine/Google
>>> Compute Engine, BIOS Google 01/24/2026
>>> [  124.434936][ T5994] Call Trace:
>>> [  124.434947][ T5994]  <TASK>
>>> [  124.434959][ T5994]  dump_stack_lvl+0x100/0x190
>>> [  124.435026][ T5994]  should_fail_ex.cold+0x5/0xa
>>> [  124.435062][ T5994]  ? rebuild_sched_domains_locked+0x51/0x980
>>> [  124.435113][ T5994]  should_failslab+0xc2/0x120
>>> [  124.435153][ T5994]  __kmalloc_noprof+0xe0/0x850
>>> [  124.435195][ T5994]  rebuild_sched_domains_locked+0x51/0x980
>>> [  124.435266][ T5994]  rebuild_sched_domains+0x21/0x40
>>> [  124.435314][ T5994]  sched_rt_handler+0xb5/0xe0
>>> [  124.435359][ T5994]  proc_sys_call_handler+0x47f/0x5a0
>>> [  124.435413][ T5994]  ? __pfx_proc_sys_call_handler+0x10/0x10
>>> [  124.435475][ T5994]  vfs_write+0x6ac/0x1070
>>> [  124.435511][ T5994]  ? __pfx_proc_sys_write+0x10/0x10
>>> [  124.435562][ T5994]  ? __pfx_vfs_write+0x10/0x10
>>> [  124.435597][ T5994]  ? __pfx_do_sys_openat2+0x10/0x10
>>> [  124.435664][ T5994]  ksys_write+0x12a/0x250
>>> [  124.435696][ T5994]  ? __pfx_ksys_write+0x10/0x10
>>> [  124.435730][ T5994]  ? do_user_addr_fault+0x8d6/0x12f0
>>> [  124.435787][ T5994]  do_syscall_64+0x106/0xf80
>>> [  124.435834][ T5994]  ? clear_bhb_loop+0x40/0x90
>>> [  124.435875][ T5994]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>>
>>> So it looks like the oops may be expected. It may not be a bug in the cpuset
>>> AFAICS.
>>>
>> Hi Longman,
>>
>> Thank you for looking into this issue.
>>
>> Since partition_sched_domains_locked can handle the situation where 'doms' is
>> NULL, I think we should make it robust and fix it.
>>
>> The fix can be implemented as follows:
>>
>> In cpuset.c at line 964:
>>
>>          for (i = 0; i < ndoms; ++i) {
>> -               if (WARN_ON_ONCE(!cpumask_subset(doms[i], cpu_active_mask)))
>> +               if (doms && WARN_ON_ONCE(!cpumask_subset(doms[i],
>> +                                         cpu_active_mask)))
>>                          return;
>>          }
>>
> The problem is that doms is not NULL. It is 0xdffffc0000000000 as shown in the
> dmesg log. So the null check here won't do any good in this particular case. In
> fact, there is already a null check right after alloc_sched_domains() above.
> 
Looking at the dmesg log:

[  124.660383][ T5994] Oops: general protection fault, probably for
non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN PTI
[  124.672366][ T5994] KASAN: null-ptr-deref in range
[0x0000000000000000-0x0000000000000007]

The address 0xdffffc0000000000 appears to be the KASAN shadow offset, which is
configured in:

arch/x86/Kconfig:413
config KASAN_SHADOW_OFFSET
	hex
	depends on KASAN
	default 0xdffffc0000000000

This indicates that doms is actually NULL. In generate_sched_domains(), doms is
first assigned to NULL.

Indeed, there is already a NULL check right after alloc_sched_domains(): when the
allocation fails, generate_sched_domains() returns ndoms = 1 and doms = NULL.
Therefore, I believe we need to add a 'doms' check in
rebuild_sched_domains_locked().

-- 
Best regards,
Ridong



* Re: [syzbot] [cgroups?] general protection fault in rebuild_sched_domains_locked
  2026-02-24  6:56       ` Chen Ridong
@ 2026-02-24 14:30         ` Waiman Long
  0 siblings, 0 replies; 6+ messages in thread
From: Waiman Long @ 2026-02-24 14:30 UTC (permalink / raw)
  To: Chen Ridong, Waiman Long, syzbot, cgroups, hannes, linux-kernel,
	mkoutny, syzkaller-bugs, tj

On 2/24/26 1:56 AM, Chen Ridong wrote:
>
> On 2026/2/24 13:37, Waiman Long wrote:
>> On 2/23/26 11:03 PM, Chen Ridong wrote:
>>> On 2026/2/16 13:57, Waiman Long wrote:
>>>> On 2/15/26 4:05 PM, syzbot wrote:
>>>>> Hello,
>>>>>
>>>>> syzbot found the following issue on:
>>>>>
>>>>> HEAD commit:    37a93dd5c49b Merge tag 'net-next-7.0' of git://git.kernel...
>>>>> git tree:       upstream
>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1649d073980000
>>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=a512b4a06724b76a
>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=460792609a79c085f79f
>>>>> compiler:       gcc (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for
>>>>> Debian) 2.44
>>>>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=152086e6580000
>>>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=139c2eef980000
>>>>>
>>>>> Downloadable assets:
>>>>> disk image:
>>>>> https://storage.googleapis.com/syzbot-assets/0dedaafff2ad/disk-37a93dd5.raw.xz
>>>>> vmlinux:
>>>>> https://storage.googleapis.com/syzbot-assets/aa7fae081497/vmlinux-37a93dd5.xz
>>>>> kernel image:
>>>>> https://storage.googleapis.com/syzbot-assets/9096b39b53e1/bzImage-37a93dd5.xz
>>>>>
>>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>>> Reported-by: syzbot+460792609a79c085f79f@syzkaller.appspotmail.com
>>>>>
>>>>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
>>>>> R13: 00007fe00de15fac R14: 00007fe00de15fa0 R15: 00007fe00de15fa0
>>>>>     </TASK>
>>>>> Oops: general protection fault, probably for non-canonical address
>>>>> 0xdffffc0000000000: 0000 [#1] SMP KASAN PTI
>>>>> KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
>>>>> CPU: 1 UID: 0 PID: 5994 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
>>>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google
>>>>> 01/24/2026
>>>>> RIP: 0010:bitmap_subset include/linux/bitmap.h:433 [inline]
>>>>> RIP: 0010:cpumask_subset include/linux/cpumask.h:836 [inline]
>>>>> RIP: 0010:rebuild_sched_domains_locked+0x2aa/0x980 kernel/cgroup/cpuset.c:967
>>>>> Code: 7d 05 00 41 83 c4 01 89 de 48 83 c5 08 44 89 e7 e8 fb 76 05 00 41 39 dc
>>>>> 0f 8d 4c 04 00 00 e8 fd 7c 05 00 48 89 e8 48 c1 e8 03 <42> 80 3c 30 00 0f 85
>>>>> 1d 06 00 00 48 8b 04 24 48 23 45 00 31 ff 44
>>>>> RSP: 0018:ffffc90003ecfbc0 EFLAGS: 00010246
>>>>> RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000020
>>>>> RDX: ffff888028de0000 RSI: ffffffff8200f003 RDI: ffffffff8df14f28
>>>>> RBP: 0000000000000000 R08: 0000000000000cc0 R09: 00000000ffffffff
>>>>> R10: ffffffff8e7d95b3 R11: 0000000000000001 R12: 0000000000000000
>>>>> R13: 00000000000f4240 R14: dffffc0000000000 R15: 0000000000000000
>>>>> FS:  000055555c694500(0000) GS:ffff8881246a5000(0000) knlGS:0000000000000000
>>>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> CR2: 0000001b2f463fff CR3: 000000003704c000 CR4: 00000000003526f0
>>>>> Call Trace:
>>>>>     <TASK>
>>>>>     rebuild_sched_domains_cpuslocked kernel/cgroup/cpuset.c:983 [inline]
>>>>>     rebuild_sched_domains+0x21/0x40 kernel/cgroup/cpuset.c:990
>>>>>     sched_rt_handler+0xb5/0xe0 kernel/sched/rt.c:2911
>>>>>     proc_sys_call_handler+0x47f/0x5a0 fs/proc/proc_sysctl.c:600
>>>>>     new_sync_write fs/read_write.c:595 [inline]
>>>>>     vfs_write+0x6ac/0x1070 fs/read_write.c:688
>>>>>     ksys_write+0x12a/0x250 fs/read_write.c:740
>>>>>     do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>>>>>     do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
>>>>>     entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>>>> RIP: 0033:0x7fe00db9bf79
>>>>> Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48
>>>>> 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73
>>>>> 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
>>>>> RSP: 002b:00007fff27bcda88 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
>>>>> RAX: ffffffffffffffda RBX: 00007fe00de15fa0 RCX: 00007fe00db9bf79
>>>>> RDX: 00000000000000f6 RSI: 0000200000000000 RDI: 0000000000000003
>>>>> RBP: 00007fff27bcdaf0 R08: 0000000000000000 R09: 0000000000000000
>>>>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
>>>>> R13: 00007fe00de15fac R14: 00007fe00de15fa0 R15: 00007fe00de15fa0
>>>>>     </TASK>
>>>>> Modules linked in:
>>>>> ---[ end trace 0000000000000000 ]---
>>>>> RIP: 0010:bitmap_subset include/linux/bitmap.h:433 [inline]
>>>>> RIP: 0010:cpumask_subset include/linux/cpumask.h:836 [inline]
>>>>> RIP: 0010:rebuild_sched_domains_locked+0x2aa/0x980 kernel/cgroup/cpuset.c:967
>>>>> Code: 7d 05 00 41 83 c4 01 89 de 48 83 c5 08 44 89 e7 e8 fb 76 05 00 41 39 dc
>>>>> 0f 8d 4c 04 00 00 e8 fd 7c 05 00 48 89 e8 48 c1 e8 03 <42> 80 3c 30 00 0f 85
>>>>> 1d 06 00 00 48 8b 04 24 48 23 45 00 31 ff 44
>>>>> RSP: 0018:ffffc90003ecfbc0 EFLAGS: 00010246
>>>>> RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000020
>>>>> RDX: ffff888028de0000 RSI: ffffffff8200f003 RDI: ffffffff8df14f28
>>>>> RBP: 0000000000000000 R08: 0000000000000cc0 R09: 00000000ffffffff
>>>>> R10: ffffffff8e7d95b3 R11: 0000000000000001 R12: 0000000000000000
>>>>> R13: 00000000000f4240 R14: dffffc0000000000 R15: 0000000000000000
>>>>> FS:  000055555c694500(0000) GS:ffff8881246a5000(0000) knlGS:0000000000000000
>>>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> CR2: 0000001b2f463fff CR3: 000000003704c000 CR4: 00000000003526f0
>>>>> ----------------
>>>>> Code disassembly (best guess), 1 bytes skipped:
>>>>>       0:    05 00 41 83 c4           add    $0xc4834100,%eax
>>>>>       5:    01 89 de 48 83 c5        add    %ecx,-0x3a7cb722(%rcx)
>>>>>       b:    08 44 89 e7              or     %al,-0x19(%rcx,%rcx,4)
>>>>>       f:    e8 fb 76 05 00           call   0x5770f
>>>>>      14:    41 39 dc                 cmp    %ebx,%r12d
>>>>>      17:    0f 8d 4c 04 00 00        jge    0x469
>>>>>      1d:    e8 fd 7c 05 00           call   0x57d1f
>>>>>      22:    48 89 e8                 mov    %rbp,%rax
>>>>>      25:    48 c1 e8 03              shr    $0x3,%rax
>>>>> * 29:    42 80 3c 30 00           cmpb   $0x0,(%rax,%r14,1) <-- trapping
>>>>> instruction
>>>>>      2e:    0f 85 1d 06 00 00        jne    0x651
>>>>>      34:    48 8b 04 24              mov    (%rsp),%rax
>>>>>      38:    48 23 45 00              and    0x0(%rbp),%rax
>>>>>      3c:    31 ff                    xor    %edi,%edi
>>>>>      3e:    44                       rex.R
>>>> The cpuset.c:967 is:
>>>>
>>>>       966         for (i = 0; i < ndoms; ++i) {
>>>>       967                 if (WARN_ON_ONCE(!cpumask_subset(doms[i],
>>>> cpu_active_mask)))
>>>>       968                         return;
>>>>
>>>> The oops was caused by accessing doms[i] which was kmalloc'ed in
>>>> generate_sched_domains() by calling alloc_sched_domains() in
>>>> kernel/sched/topology.c. Looking at the console log just before the oops, I saw
>>>>
>>>> [  124.398850][ T5994] FAULT_INJECTION: forcing a failure.
>>>> [  124.398850][ T5994] name failslab, interval 1, probability 0, space 0,
>>>> times 1
>>>> [  124.434865][ T5994] CPU: 1 UID: 0 PID: 5994 Comm: syz.0.17 Not tainted
>>>> syzkaller #0 PREEMPT(full)
>>>> [  124.434909][ T5994] Hardware name: Google Google Compute Engine/Google
>>>> Compute Engine, BIOS Google 01/24/2026
>>>> [  124.434936][ T5994] Call Trace:
>>>> [  124.434947][ T5994]  <TASK>
>>>> [  124.434959][ T5994]  dump_stack_lvl+0x100/0x190
>>>> [  124.435026][ T5994]  should_fail_ex.cold+0x5/0xa
>>>> [  124.435062][ T5994]  ? rebuild_sched_domains_locked+0x51/0x980
>>>> [  124.435113][ T5994]  should_failslab+0xc2/0x120
>>>> [  124.435153][ T5994]  __kmalloc_noprof+0xe0/0x850
>>>> [  124.435195][ T5994]  rebuild_sched_domains_locked+0x51/0x980
>>>> [  124.435266][ T5994]  rebuild_sched_domains+0x21/0x40
>>>> [  124.435314][ T5994]  sched_rt_handler+0xb5/0xe0
>>>> [  124.435359][ T5994]  proc_sys_call_handler+0x47f/0x5a0
>>>> [  124.435413][ T5994]  ? __pfx_proc_sys_call_handler+0x10/0x10
>>>> [  124.435475][ T5994]  vfs_write+0x6ac/0x1070
>>>> [  124.435511][ T5994]  ? __pfx_proc_sys_write+0x10/0x10
>>>> [  124.435562][ T5994]  ? __pfx_vfs_write+0x10/0x10
>>>> [  124.435597][ T5994]  ? __pfx_do_sys_openat2+0x10/0x10
>>>> [  124.435664][ T5994]  ksys_write+0x12a/0x250
>>>> [  124.435696][ T5994]  ? __pfx_ksys_write+0x10/0x10
>>>> [  124.435730][ T5994]  ? do_user_addr_fault+0x8d6/0x12f0
>>>> [  124.435787][ T5994]  do_syscall_64+0x106/0xf80
>>>> [  124.435834][ T5994]  ? clear_bhb_loop+0x40/0x90
>>>> [  124.435875][ T5994]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>>>
>>>> So it looks like the oops may be expected. It may not be a bug in the cpuset
>>>> AFAICS.
>>>>
>>> Hi Longman,
>>>
>>> Thank you for looking into this issue.
>>>
>>> Since partition_sched_domains_locked can handle the situation where 'doms' is
>>> NULL, I think we should make it robust and fix it.
>>>
>>> The fix can be implemented as follows:
>>>
>>> In cpuset.c at line 964:
>>>
>>>           for (i = 0; i < ndoms; ++i) {
>>> -               if (WARN_ON_ONCE(!cpumask_subset(doms[i], cpu_active_mask)))
>>> +               if (doms && WARN_ON_ONCE(!cpumask_subset(doms[i],
>>> +                                         cpu_active_mask)))
>>>                           return;
>>>           }
>>>
>> The problem is that doms is not NULL. It is 0xdffffc0000000000 as shown in the
>> dmesg log. So the null check here won't do any good in this particular case. In
>> fact, there is already a null check right after alloc_sched_domains() above.
>>
> Looking at the dmesg log:
>
> [  124.660383][ T5994] Oops: general protection fault, probably for
> non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN PTI
> [  124.672366][ T5994] KASAN: null-ptr-deref in range
> [0x0000000000000000-0x0000000000000007]
>
> The address 0xdffffc0000000000 appears to be the KASAN shadow offset, which is
> configured in:
>
> arch/x86/Kconfig:413
> config KASAN_SHADOW_OFFSET
> 	hex
> 	depends on KASAN
> 	default 0xdffffc0000000000
>
> This indicates that doms is actually NULL. In generate_sched_domains(), doms is
> first assigned to NULL.
>
> Indeed, there is already a NULL check right after alloc_sched_domains(), when
> doms is NULL, it returns ndoms = 1 and doms = NULL. Therefore, I believe we need
> to add 'doms' check in the rebuild_sched_domains_locked.
>
Right. The for loop shouldn't be run if doms is NULL.

Sure. Please send a patch to make this change.

Cheers,
Longman



Thread overview: 6+ messages
2026-02-15 21:05 [syzbot] [cgroups?] general protection fault in rebuild_sched_domains_locked syzbot
2026-02-16  5:57 ` Waiman Long
2026-02-24  4:03   ` Chen Ridong
2026-02-24  5:37     ` Waiman Long
2026-02-24  6:56       ` Chen Ridong
2026-02-24 14:30         ` Waiman Long
