* [BUG] sched/cache: "Make LLC id continuous" causes NULL cpumask dereference in build_sched_domains on POWER9
From: Venkat Rao Bagalkote @ 2026-05-25 14:07 UTC
  To: Peter Zijlstra, K Prateek Nayak, Chen, Yu C, tim.c.chen
  Cc: Madhavan Srinivasan, Shrikanth Hegde, Ritesh Harjani,
	Christophe Leroy (CS GROUP), LKML, linuxppc-dev, linux-sched

Greetings!!!

I am seeing an early-boot kernel panic caused by a NULL pointer dereference
on a POWER9 (pSeries) system when testing linux-next (next-20260522).


Traces:

[    0.038567] Big cores detected but using small core scheduling
[    0.038796] BUG: Kernel NULL pointer dereference at 0x00000000
[    0.038804] Faulting instruction address: 0xc000000000e58504
[    0.038812] Oops: Kernel access of bad area, sig: 11 [#1]
[    0.038819] LE PAGE_SIZE=64K MMU=Hash  SMP NR_CPUS=8192 NUMA pSeries
[    0.038830] Modules linked in:
[    0.038840] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 7.0.0-rc6+ #14 PREEMPTLAZY
[    0.038851] Hardware name: IBM,8375-42A POWER9 (architected) 0x4e0202 0xf000005 of:IBM,FW950.80 (VL950_131) hv:phyp pSeries
[    0.038860] NIP:  c000000000e58504 LR: c000000000e58500 CTR: 0000000000000000
[    0.038869] REGS: c0000000090e78e0 TRAP: 0380   Not tainted (7.0.0-rc6+)
[    0.038878] MSR:  8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>  CR: 44002242  XER: 20040003
[    0.038907] CFAR: c00000000093f3f0 IRQMASK: 0
[    0.038907] GPR00: c00000000038b3b8 c0000000090e7b80 c00000000259a800 0000000000000000
[    0.038907] GPR04: 0000000000000038 0000000000000038 c00000000c6e2560 0000000000000000
[    0.038907] GPR08: 0000000000000000 0000000000000037 0000ffffffffffff 0000000000000000
[    0.038907] GPR12: c000000000072730 c0000000051b0000 c00000000c6ee560 00000000ffffffff
[    0.038907] GPR16: 0000000000000000 0000000000000038 c0000000032c6b08 fffffffffffffff6
[    0.038907] GPR20: 0000000000000000 c000000004d1a6e0 0000000000000000 0000000000000000
[    0.038907] GPR24: 0000000000000000 0000000000000000 00000000ffffffff c00000000a3bf940
[    0.038907] GPR28: 0000000000000038 0000000000000000 0000000000000000 0000000000000000
[    0.039029] NIP [c000000000e58504] _find_first_bit+0x44/0x130
[    0.039043] LR [c000000000e58500] _find_first_bit+0x40/0x130
[    0.039054] Call Trace:
[    0.039060] [c0000000090e7b80] [c00000000416af20] schedutil_gov+0x0/0xa0 (unreliable)
[    0.039076] [c0000000090e7bc0] [c00000000038b3b8] build_sched_domains+0xad8/0xe50
[    0.039089] [c0000000090e7ce0] [c000000003045d78] sched_init_smp+0xa8/0x164
[    0.039102] [c0000000090e7d30] [c00000000300f374] kernel_init_freeable+0x250/0x370
[    0.039117] [c0000000090e7de0] [c000000000011f90] kernel_init+0x34/0x1e4
[    0.039129] [c0000000090e7e50] [c00000000000debc] ret_from_kernel_user_thread+0x14/0x1c
[    0.039142] ---- interrupt: 0 at 0x0
[    0.039150] Code: 41820090 7c0802a6 393cffff fbe10038 7c7f1b78 fba10028 fbc10030 3bc00000 793dd7e2 f8010050 4bae6e9d 60000000 <e93f0000> 2c290000 408200bc 283c0040
[    0.039196] ---[ end trace 0000000000000000 ]---
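
As far as I can decode the Code: line, the faulting word <e93f0000> is a
load through r31, and GPR31 reads as 0 in the register dump, so the bitmap
pointer handed to _find_first_bit() looks to have been NULL. The userspace
sketch below only illustrates that failure mode. The names
sketch_find_first_bit() and sketch_first_cpu() are made up for this mail,
and the NULL guard is an assumption for illustration, not a proposed fix
for build_sched_domains().

```c
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Simplified stand-in for a find-first-bit helper: returns the index of
 * the first set bit, or nbits if no bit is set. It reads through addr
 * without checking it, so a NULL addr faults on the first load.
 */
static size_t sketch_find_first_bit(const unsigned long *addr, size_t nbits)
{
	for (size_t i = 0; i < nbits; i++) {
		if (addr[i / SKETCH_BITS_PER_WORD] &
		    (1UL << (i % SKETCH_BITS_PER_WORD)))
			return i;
	}
	return nbits;
}

/* Hypothetical defensive wrapper: treat a missing mask as an empty one. */
static size_t sketch_first_cpu(const unsigned long *mask, size_t nr_cpus)
{
	if (!mask)
		return nr_cpus;
	return sketch_find_first_bit(mask, nr_cpus);
}
```

With this wrapper, a NULL mask yields nr_cpus (the "no bit found" value)
instead of faulting. Whether a NULL mask is legitimate at this point in
boot, or is itself the bug, is exactly what needs to be determined.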


Git bisect points to b5ea300a17e3 ("sched/cache: Make LLC id continuous")
as the first bad commit.


Git Bisect Logs:


# git bisect log
git bisect start
# status: waiting for both good and bad commits
# bad: [c1ecb239fa3456529a32255359fc78b69eb9d847] Add linux-next specific files for 20260522
git bisect bad c1ecb239fa3456529a32255359fc78b69eb9d847
# status: waiting for good commit(s), bad commit known
# good: [5200f5f493f79f14bbdc349e402a40dfb32f23c8] Linux 7.1-rc4
git bisect good 5200f5f493f79f14bbdc349e402a40dfb32f23c8
# good: [7cd27a0d57b8539366c98bb04fe48d1aff779ea9] Merge branch 'main' of https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
git bisect good 7cd27a0d57b8539366c98bb04fe48d1aff779ea9
# good: [efb3dd6031ec9858c7285fd673970320c86c01f3] Merge branch 'next' of https://git.kernel.org/pub/scm/linux/kernel/git/dtor/input.git
git bisect good efb3dd6031ec9858c7285fd673970320c86c01f3
# bad: [1a6066d1c1243fdc5ed464032bbdf12e6710c027] Merge branch 'driver-core-next' of https://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core.git
git bisect bad 1a6066d1c1243fdc5ed464032bbdf12e6710c027
# good: [409a99cbc316d912c999fd75b9df042b25900e50] Merge branch 'for-next' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git
git bisect good 409a99cbc316d912c999fd75b9df042b25900e50
# bad: [af73f6b022c8c09a3234176892a18216be4cd984] Merge branch 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm.git
git bisect bad af73f6b022c8c09a3234176892a18216be4cd984
# bad: [6a459eb254e4bff61546587eccd3091955123d24] Merge branch into tip/master: 'sched/core'
git bisect bad 6a459eb254e4bff61546587eccd3091955123d24
# good: [71ba4bb66c3a9287245d0f5fcfb27d4b951ba402] Merge branch into tip/master: 'locking/core'
git bisect good 71ba4bb66c3a9287245d0f5fcfb27d4b951ba402
# good: [f3b45696a160a2230d846de8f706e835984ae65b] Merge branch into tip/master: 'objtool/core'
git bisect good f3b45696a160a2230d846de8f706e835984ae65b
# bad: [c99b8593b060931c5a0a4b701689f8d6a2c00dbf] sched/cache: Fix stale preferred_llc for a new task
git bisect bad c99b8593b060931c5a0a4b701689f8d6a2c00dbf
# bad: [5b1d5e6db20a6c64ffb95d04578db8c4b0228eea] sched/cache: Respect LLC preference in task migration and detach
git bisect bad 5b1d5e6db20a6c64ffb95d04578db8c4b0228eea
# bad: [46afe3af7ead57190b6d362e214814ec804e3b7b] sched/cache: Track LLC-preferred tasks per runqueue
git bisect bad 46afe3af7ead57190b6d362e214814ec804e3b7b
# good: [f025ef275388742643a2c33f00a0d9c0af3112ee] sched/cache: Record per LLC utilization to guide cache aware scheduling decisions
git bisect good f025ef275388742643a2c33f00a0d9c0af3112ee
# bad: [b5ea300a17e37eada7a98561fbd34a3054578713] sched/cache: Make LLC id continuous
git bisect bad b5ea300a17e37eada7a98561fbd34a3054578713
# good: [23b2b5ccc45ce2a38b9336a916088fffdc4cdfb1] sched/cache: Introduce helper functions to enforce LLC migration policy
git bisect good 23b2b5ccc45ce2a38b9336a916088fffdc4cdfb1
# first bad commit: [b5ea300a17e37eada7a98561fbd34a3054578713] sched/cache: Make LLC id continuous


b5ea300a17e37eada7a98561fbd34a3054578713 is the first bad commit
commit b5ea300a17e37eada7a98561fbd34a3054578713
Author: Tim Chen <tim.c.chen@linux.intel.com>
Date:   Wed Apr 1 14:52:17 2026 -0700

     sched/cache: Make LLC id continuous

     Introduce an index mapping between CPUs and their LLCs. This provides
     a roughly continuous per LLC index needed for cache-aware load
     balancing in later patches.

     The existing per_cpu llc_id usually points to the first CPU of the
     LLC domain, which is sparse and unsuitable as an array index. Using
     llc_id directly would waste memory.

     With the new mapping, CPUs in the same LLC share an approximate
     continuous id:

       per_cpu(llc_id, CPU=0...15)  = 0
       per_cpu(llc_id, CPU=16...31) = 1
       per_cpu(llc_id, CPU=32...47) = 2
       ...

     Note that the LLC IDs are allocated via bitmask, so the IDs may be
     reused during CPU offline->online transitions.

     Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
     Originally-by: K Prateek Nayak <kprateek.nayak@amd.com>
     Co-developed-by: Chen Yu <yu.c.chen@intel.com>
     Signed-off-by: Chen Yu <yu.c.chen@intel.com>
     Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
     Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
     Link: https://patch.msgid.link/047ef46339e4db497b54a89940a7ebedf27fcf28.1775065312.git.tim.c.chen@linux.intel.com

  kernel/sched/core.c     |  2 ++
  kernel/sched/sched.h    |  3 ++
  kernel/sched/topology.c | 90 +++++++++++++++++++++++++++++++++++++++++++++++--
  3 files changed, 93 insertions(+), 2 deletions(-)
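
For anyone not familiar with the change, the commit message describes dense
LLC ids handed out from a bitmask, with ids becoming reusable across CPU
offline->online transitions. A rough userspace picture of that scheme is
below. This is only a sketch based on the commit message and is not the code
from the patch: SKETCH_MAX_LLCS, llc_id_bitmap, llc_id_alloc() and
llc_id_free() are names invented here for illustration.

```c
#include <assert.h>

/* Hypothetical limit for this sketch; not taken from the patch. */
#define SKETCH_MAX_LLCS 64

/* One bit per dense LLC id: bit i set means id i is currently in use. */
static unsigned long long llc_id_bitmap;

/* Return the lowest free id and mark it used, or -1 if none is free. */
static int llc_id_alloc(void)
{
	for (int id = 0; id < SKETCH_MAX_LLCS; id++) {
		unsigned long long bit = 1ULL << id;

		if (!(llc_id_bitmap & bit)) {
			llc_id_bitmap |= bit;
			return id;
		}
	}
	return -1;
}

/* Release an id, e.g. when the last CPU of an LLC goes offline. */
static void llc_id_free(int id)
{
	assert(id >= 0 && id < SKETCH_MAX_LLCS);
	llc_id_bitmap &= ~(1ULL << id);
}
```

Allocating three ids in a row gives 0, 1 and 2. If id 1 is then freed, the
next allocation returns 1 again, which is the reuse the commit message
mentions. Any per-LLC state indexed by such an id has to be valid again by
the time the id is handed back out.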


If you happen to fix this, please add the tag below.


Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>


Regards,

Venkat.





