* [bisected][mainline]Kernel warnings at kernel/sched/cpudeadline.c:219
From: Venkat Rao Bagalkote @ 2025-10-08 2:11 UTC (permalink / raw)
To: LKML, linuxppc-dev, Madhavan Srinivasan, Shrikanth Hegde,
Peter Zijlstra, jstultz, stultz
Greetings!!!
IBM CI has reported kernel warnings while running a CPU hotplug
operation on an IBM Power9 system.
Command to reproduce the issue:
drmgr -c cpu -r -q 1
Git bisect points to the commit below as the first bad commit.
4ae8d9aa9f9dc7137ea5e564d79c5aa5af1bc45c
Traces:
[ 464.306613] ------------[ cut here ]------------
[ 464.306628] WARNING: CPU: 0 PID: 0 at kernel/sched/cpudeadline.c:219
cpudl_set+0x58/0x170
[ 464.306641] Modules linked in: rpadlpar_io(E) rpaphp(E)
nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E)
nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E)
bonding(E) nft_ct(E) tls(E) rfkill(E) nft_chain_nat(E) ip_set(E) hvcs(E)
ibmveth(E) pseries_rng(E) hvcserver(E) vmx_crypto(E) sg(E)
dm_multipath(E) drm(E) dm_mod(E) fuse(E) drm_panel_orientation_quirks(E)
ext4(E) crc16(E) mbcache(E) jbd2(E) sr_mod(E) sd_mod(E) cdrom(E)
ibmvscsi(E) scsi_transport_srp(E)
[ 464.306703] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Tainted: G E
6.17.0-gfd94619c4336 #1 VOLUNTARY
[ 464.306711] Tainted: [E]=UNSIGNED_MODULE
[ 464.306714] Hardware name: IBM,8375-42A POWER9 (architected) 0x4e0202
0xf000005 of:IBM,FW950.80 (VL950_131) hv:phyp pSeries
[ 464.306720] NIP: c0000000002b6ed8 LR: c0000000002b7cb8 CTR:
c0000000002b7df0
[ 464.306725] REGS: c000000002c2f5d0 TRAP: 0700 Tainted: G E
(6.17.0-gfd94619c4336)
[ 464.306730] MSR: 8000000000021033 <SF,ME,IR,DR,RI,LE> CR: 22000228
XER: 00000000
[ 464.306743] CFAR: c0000000002b726c IRQMASK: 3
[ 464.306743] GPR00: c0000000002b7cb8 c000000002c2f870 c000000001df8100
c000000002d6a710
[ 464.306743] GPR04: 000000000000001e 0000006c566f51e0 0000000000000000
c000000002d6adb0
[ 464.306743] GPR08: 00000000ffffffff 0000000000000001 c000000002cac488
0000000000000000
[ 464.306743] GPR12: c0000000030a7000 c000000002fa0000 0000000000000000
0000000000000000
[ 464.306743] GPR16: 0000000000000000 0000000000000000 0000000000000000
0000000000000000
[ 464.306743] GPR20: c0000009e940ac20 0000006c1aa50360 0000000000000001
0000000000000002
[ 464.306743] GPR24: 0000000000000000 0000000000000000 0000000000000003
c0000009e940ab80
[ 464.306743] GPR28: 000000000000001e 0000006c566f51e0 c000000002d6a710
000000000000001e
[ 464.306804] NIP [c0000000002b6ed8] cpudl_set+0x58/0x170
[ 464.306809] LR [c0000000002b7cb8] dl_server_timer+0x168/0x2a0
[ 464.306815] Call Trace:
[ 464.306818] [c000000002c2f870] [c000000002c2f8c0]
init_stack+0x78c0/0x8000 (unreliable)
[ 464.306828] [c000000002c2f8c0] [c0000000002b7cb8]
dl_server_timer+0x168/0x2a0
[ 464.306835] [c000000002c2f920] [c00000000034df84]
__hrtimer_run_queues+0x1a4/0x390
[ 464.306842] [c000000002c2f9b0] [c00000000034f624]
hrtimer_interrupt+0x124/0x300
[ 464.306849] [c000000002c2fa60] [c00000000002a230]
timer_interrupt+0x140/0x320
[ 464.306856] [c000000002c2fac0] [c000000000009ffc]
decrementer_common_virt+0x28c/0x290
[ 464.306865] ---- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
[ 464.306872] NIP: c0000000001b75d8 LR: c0000000001bf274 CTR:
0000000000000000
[ 464.306877] REGS: c000000002c2faf0 TRAP: 0900 Tainted: G E
(6.17.0-gfd94619c4336)
[ 464.306882] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>
CR: 24000228 XER: 20040000
[ 464.306897] CFAR: 0000000000000000 IRQMASK: 0
[ 464.306897] GPR00: 0000000000000000 c000000002c2fd90 c000000001df8100
0000000000000000
[ 464.306897] GPR04: 0000000000000010 000000002c000040 0000000000000002
0000000000000040
[ 464.306897] GPR08: 0000000000000000 0000000000000310 0000000000000031
0000000000000000
[ 464.306897] GPR12: 00000000d02f71f1 c000000002fa0000 0000000000000000
0000000000000000
[ 464.306897] GPR16: 0000000000000000 0000000000000000 0000000000000000
0000000000000000
[ 464.306897] GPR20: 0000000000c00000 0000000000000008 0000000000000000
0000000000000000
[ 464.306897] GPR24: 0000000000000000 c000000000000000 c00000000a6e0000
c000000002cad0c0
[ 464.306897] GPR28: 0000000000000001 c0000000022418e0 c0000000022418e8
c0000000022418e0
[ 464.306956] NIP [c0000000001b75d8] plpar_hcall_norets_notrace+0x18/0x2c
[ 464.306962] LR [c0000000001bf274] pseries_lpar_idle.part.0+0x74/0x160
[ 464.306967] ---- interrupt: 900
[ 464.306970] [c000000002c2fd90] [c0000009e940b3b0] 0xc0000009e940b3b0
(unreliable)
[ 464.306984] [c000000002c2fe10] [c0000000000212fc]
arch_cpu_idle+0x4c/0x110
[ 464.306993] [c000000002c2fe30] [c00000000134ddd0]
default_idle_call+0x50/0x140
[ 464.307001] [c000000002c2fe50] [c0000000002b4fdc]
cpuidle_idle_call+0x1ac/0x240
[ 464.307007] [c000000002c2fea0] [c0000000002b5164] do_idle+0xf4/0x1a0
[ 464.307013] [c000000002c2fef0] [c0000000002b5498]
cpu_startup_entry+0x48/0x50
[ 464.307020] [c000000002c2ff20] [c0000000000113cc] rest_init+0xec/0xf0
[ 464.307026] [c000000002c2ff50] [c0000000020052e0] do_initcalls+0x0/0x18c
[ 464.307034] [c000000002c2ffe0] [c00000000000ea9c]
start_here_common+0x1c/0x20
[ 464.307040] Code: 549c06be 7c9f2378 7cbd2b78 7c7e1b78 39494388
5489e8f8 f8010010 f821ffb1 7d2a482a 7d29e436 552907fe 69290001
<0b090000> 490a428d 60000000 e93e0010
[ 464.307060] ---[ end trace 0000000000000000 ]---
[ 464.736380] ------------[ cut here ]------------
[ 464.736397] WARNING: CPU: 0 PID: 0 at kernel/sched/cpudeadline.c:219
cpudl_set+0x58/0x170
[ 464.736408] Modules linked in: rpadlpar_io(E) rpaphp(E)
nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E)
nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E)
bonding(E) nft_ct(E) tls(E) rfkill(E) nft_chain_nat(E) ip_set(E) hvcs(E)
ibmveth(E) pseries_rng(E) hvcserver(E) vmx_crypto(E) sg(E)
dm_multipath(E) drm(E) dm_mod(E) fuse(E) drm_panel_orientation_quirks(E)
ext4(E) crc16(E) mbcache(E) jbd2(E) sr_mod(E) sd_mod(E) cdrom(E)
ibmvscsi(E) scsi_transport_srp(E)
[ 464.736468] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Tainted: G W E
6.17.0-gfd94619c4336 #1 VOLUNTARY
[ 464.736476] Tainted: [W]=WARN, [E]=UNSIGNED_MODULE
[ 464.736480] Hardware name: IBM,8375-42A POWER9 (architected) 0x4e0202
0xf000005 of:IBM,FW950.80 (VL950_131) hv:phyp pSeries
[ 464.736486] NIP: c0000000002b6ed8 LR: c0000000002b7cb8 CTR:
c0000000002b7df0
[ 464.736491] REGS: c000000002c2f4f0 TRAP: 0700 Tainted: G W E
(6.17.0-gfd94619c4336)
[ 464.736497] MSR: 8000000000021033 <SF,ME,IR,DR,RI,LE> CR: 22000424
XER: 00000000
[ 464.736509] CFAR: c0000000002b726c IRQMASK: 3
[ 464.736509] GPR00: c0000000002b7cb8 c000000002c2f790 c000000001df8100
c000000002d6a710
[ 464.736509] GPR04: 000000000000001f 0000006c700d1304 0000000000000000
c000000002d6adb0
[ 464.736509] GPR08: 00000000ffffffff 0000000000000001 c000000002cac488
0000000000000000
[ 464.736509] GPR12: c0000000030a7000 c000000002fa0000 0000000000000000
0000000000000000
[ 464.736509] GPR16: 0000000000000000 0000000000000000 0000000000000000
0000000000000000
[ 464.736509] GPR20: c0000009e940ac20 0000006c3442c73b 0000000000000001
0000000000000002
[ 464.736509] GPR24: 0000000000000000 0000000000000000 0000000000000003
c0000009e940ab80
[ 464.736509] GPR28: 000000000000001f 0000006c700d1304 c000000002d6a710
000000000000001f
[ 464.736569] NIP [c0000000002b6ed8] cpudl_set+0x58/0x170
[ 464.736574] LR [c0000000002b7cb8] dl_server_timer+0x168/0x2a0
[ 464.736580] Call Trace:
[ 464.736582] [c000000002c2f790] [c000000002c2f7e0]
init_stack+0x77e0/0x8000 (unreliable)
[ 464.736592] [c000000002c2f7e0] [c0000000002b7cb8]
dl_server_timer+0x168/0x2a0
[ 464.736599] [c000000002c2f840] [c00000000034df84]
__hrtimer_run_queues+0x1a4/0x390
[ 464.736606] [c000000002c2f8d0] [c00000000034f624]
hrtimer_interrupt+0x124/0x300
[ 464.736613] [c000000002c2f980] [c00000000002a230]
timer_interrupt+0x140/0x320
[ 464.736620] [c000000002c2f9e0] [c000000000009ffc]
decrementer_common_virt+0x28c/0x290
[ 464.736627] ---- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
[ 464.736634] NIP: c0000000001b75d8 LR: c00000000134dfe8 CTR:
0000000000000000
[ 464.736638] REGS: c000000002c2fa10 TRAP: 0900 Tainted: G W E
(6.17.0-gfd94619c4336)
[ 464.736644] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>
CR: 22000424 XER: 20040000
[ 464.736659] CFAR: 0000000000000000 IRQMASK: 0
[ 464.736659] GPR00: 0000000000000000 c000000002c2fcb0 c000000001df8100
0000000000000000
[ 464.736659] GPR04: 0000000000000010 000000002c000040 0000000000000002
0000000000000040
[ 464.736659] GPR08: 0000000000000000 0000000000000290 0000000000000029
0000000000000000
[ 464.736659] GPR12: 00000000d02f74a9 c000000002fa0000 0000000000000000
0000000000000000
[ 464.736659] GPR16: 0000000000000000 0000000000000000 0000000000000000
0000000000000000
[ 464.736659] GPR20: 0000000000c00000 0000000000000008 0000000000000000
0000000000000000
[ 464.736659] GPR24: 0000000000000000 0000000000000000 0000006c3469239a
0000000000000001
[ 464.736659] GPR28: c0000009e9419cc0 0000000000000001 c0000000022418e0
c0000000022418e8
[ 464.736717] NIP [c0000000001b75d8] plpar_hcall_norets_notrace+0x18/0x2c
[ 464.736723] LR [c00000000134dfe8] check_and_cede_processor+0x48/0x60
[ 464.736730] ---- interrupt: 900
[ 464.736733] [c000000002c2fcb0] [c0000000026a1080]
init_task+0x0/0x1d00 (unreliable)
[ 464.736741] [c000000002c2fd10] [c00000000134e210]
shared_cede_loop+0x70/0x170
[ 464.736748] [c000000002c2fd50] [c00000000134d830]
cpuidle_enter_state+0x2b0/0x648
[ 464.736756] [c000000002c2fdf0] [c000000000e09f70] cpuidle_enter+0x50/0x80
[ 464.736764] [c000000002c2fe30] [c0000000002ad868] call_cpuidle+0x48/0x90
[ 464.736772] [c000000002c2fe50] [c0000000002b4f94]
cpuidle_idle_call+0x164/0x240
[ 464.736779] [c000000002c2fea0] [c0000000002b5164] do_idle+0xf4/0x1a0
[ 464.736785] [c000000002c2fef0] [c0000000002b549c]
cpu_startup_entry+0x4c/0x50
[ 464.736791] [c000000002c2ff20] [c0000000000113cc] rest_init+0xec/0xf0
[ 464.736797] [c000000002c2ff50] [c0000000020052e0] do_initcalls+0x0/0x18c
[ 464.736804] [c000000002c2ffe0] [c00000000000ea9c]
start_here_common+0x1c/0x20
[ 464.736810] Code: 549c06be 7c9f2378 7cbd2b78 7c7e1b78 39494388
5489e8f8 f8010010 f821ffb1 7d2a482a 7d29e436 552907fe 69290001
<0b090000> 490a428d 60000000 e93e0010
[ 464.736831] ---[ end trace 0000000000000000 ]---
[ 493.843328] Non-volatile memory driver v1.3
Git Bisect logs:
git bisect bad
4ae8d9aa9f9dc7137ea5e564d79c5aa5af1bc45c is the first bad commit
commit 4ae8d9aa9f9dc7137ea5e564d79c5aa5af1bc45c (HEAD)
Author: Peter Zijlstra <peterz@infradead.org>
Date: Tue Sep 16 23:02:41 2025 +0200
sched/deadline: Fix dl_server getting stuck
John found it was easy to hit lockup warnings when running locktorture
on a 2 CPU VM, which he bisected down to: commit cccb45d7c429
("sched/deadline: Less agressive dl_server handling").
While debugging it seems there is a chance where we end up with the
dl_server dequeued, with dl_se->dl_server_active. This causes
dl_server_start() to return without enqueueing the dl_server, thus it
fails to run when RT tasks starve the cpu.
When this happens, dl_server_timer() catches the
'!dl_se->server_has_tasks(dl_se)' case, which then calls
replenish_dl_entity() and dl_server_stopped() and finally return
HRTIMER_NO_RESTART.
This ends in no new timer and also no enqueue, leaving the dl_server
'dead', allowing starvation.
What should have happened is for the bandwidth timer to start the
zero-laxity timer, which in turn would enqueue the dl_server and cause
dl_se->server_pick_task() to be called -- which will stop the
dl_server if no fair tasks are observed for a whole period.
IOW, it is totally irrelevant if there are fair tasks at the moment of
bandwidth refresh.
This removes all dl_se->server_has_tasks() users, so remove the whole
thing.
Fixes: cccb45d7c4295 ("sched/deadline: Less agressive dl_server
handling")
Reported-by: John Stultz <jstultz@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: John Stultz <jstultz@google.com>
include/linux/sched.h | 1 -
kernel/sched/deadline.c | 12 +-----------
kernel/sched/fair.c | 7 +------
kernel/sched/sched.h | 4 ----
4 files changed, 2 insertions(+), 22 deletions(-)
# git bisect log
git bisect start
# status: waiting for both good and bad commits
# good: [038d61fd642278bab63ee8ef722c50d10ab01e8f] Linux 6.16
git bisect good 038d61fd642278bab63ee8ef722c50d10ab01e8f
# status: waiting for bad commit, 1 good commit known
# bad: [c746c3b5169831d7fb032a1051d8b45592ae8d78] Merge tag
'for-6.18-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
git bisect bad c746c3b5169831d7fb032a1051d8b45592ae8d78
# good: [e25079858627916b22c4a789005a90a9fae808d8] Merge branch
'net-better-drop-accounting'
git bisect good e25079858627916b22c4a789005a90a9fae808d8
# bad: [05a54fa773284d1a7923cdfdd8f0c8dabb98bd26] Merge tag
'sound-6.18-rc1' of
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect bad 05a54fa773284d1a7923cdfdd8f0c8dabb98bd26
# bad: [ae28ed4578e6d5a481e39c5a9827f27048661fdd] Merge tag
'bpf-next-6.18' of
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
git bisect bad ae28ed4578e6d5a481e39c5a9827f27048661fdd
# bad: [6855f06042ae8d134f96c63feb5dfb3943c6d789] Merge tag
'i2c-for-6.17-rc8' of
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
git bisect bad 6855f06042ae8d134f96c63feb5dfb3943c6d789
# good: [3d1e36499e02457f8de0edc9d87783cce97e8677] Merge tag
'gpio-fixes-for-v6.17-rc5' of
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
git bisect good 3d1e36499e02457f8de0edc9d87783cce97e8677
# good: [86cc796e5e9bff0c3993607f4301b8188095516c] Merge tag 'for-linus'
of git://git.kernel.org/pub/scm/virt/kvm/kvm
git bisect good 86cc796e5e9bff0c3993607f4301b8188095516c
# good: [f975f08c2e899ae2484407d7bba6bb7f8b6d9d40] Merge tag
'for-6.17-rc6-tag' of
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
git bisect good f975f08c2e899ae2484407d7bba6bb7f8b6d9d40
# good: [4ff71af020ae59ae2d83b174646fc2ad9fcd4dc4] Merge tag
'net-6.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
git bisect good 4ff71af020ae59ae2d83b174646fc2ad9fcd4dc4
# good: [f26a24662cd2875f82029e28879a20cea212214c] Merge tag
'v6.17rc7-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
git bisect good f26a24662cd2875f82029e28879a20cea212214c
# bad: [51a24b7deaae5c3561965f5b4b27bb9d686add1c] Merge tag
'trace-tools-v6.17-rc5' of
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
git bisect bad 51a24b7deaae5c3561965f5b4b27bb9d686add1c
# bad: [083fc6d7fa0d974a3663b97c8b0466737a544236] Merge tag
'sched-urgent-2025-09-26' of
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 083fc6d7fa0d974a3663b97c8b0466737a544236
# good: [2cea0ed9796381b142f46bd8de97bb6b54b1df61] Merge tag
'locking-urgent-2025-09-26' of
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 2cea0ed9796381b142f46bd8de97bb6b54b1df61
# bad: [a3a70caf7906708bf9bbc80018752a6b36543808] sched/deadline: Fix
dl_server behaviour
git bisect bad a3a70caf7906708bf9bbc80018752a6b36543808
# bad: [4ae8d9aa9f9dc7137ea5e564d79c5aa5af1bc45c] sched/deadline: Fix
dl_server getting stuck
git bisect bad 4ae8d9aa9f9dc7137ea5e564d79c5aa5af1bc45c
# first bad commit: [4ae8d9aa9f9dc7137ea5e564d79c5aa5af1bc45c]
sched/deadline: Fix dl_server getting stuck
If you happen to fix this, please add the tag below.
Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Regards,
Venkat.
* Re: [bisected][mainline]Kernel warnings at kernel/sched/cpudeadline.c:219
From: Peter Zijlstra @ 2025-10-08 9:50 UTC (permalink / raw)
To: Venkat Rao Bagalkote
Cc: LKML, linuxppc-dev, Madhavan Srinivasan, Shrikanth Hegde, jstultz,
stultz
On Wed, Oct 08, 2025 at 07:41:10AM +0530, Venkat Rao Bagalkote wrote:
> Greetings!!!
>
>
> IBM CI has reported a kernel warnings while running CPU hot plug operation
> on IBM Power9 system.
>
>
> Command to reproduce the issue:
>
> drmgr -c cpu -r -q 1
>
>
> Git Bisect is pointing to below commit as the first bad commit.
Does something like this help?
(also, for future reference, please don't line wrap logs, it makes them
very hard to read)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 198d2dd45f59..65f37bfcd661 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8328,6 +8328,7 @@ static inline void sched_set_rq_offline(struct rq *rq, int cpu)
BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
set_rq_offline(rq);
}
+ dl_server_stop(&rq->fair_server);
rq_unlock_irqrestore(rq, &rf);
}
* Re: [bisected][mainline]Kernel warnings at kernel/sched/cpudeadline.c:219
From: Shrikanth Hegde @ 2025-10-08 10:17 UTC (permalink / raw)
To: Peter Zijlstra, Venkat Rao Bagalkote
Cc: LKML, linuxppc-dev, Madhavan Srinivasan, jstultz, stultz
On 10/8/25 3:20 PM, Peter Zijlstra wrote:
> On Wed, Oct 08, 2025 at 07:41:10AM +0530, Venkat Rao Bagalkote wrote:
>> Greetings!!!
>>
>>
>> IBM CI has reported a kernel warnings while running CPU hot plug operation
>> on IBM Power9 system.
>>
>>
>> Command to reproduce the issue:
>>
>> drmgr -c cpu -r -q 1
>>
>>
>> Git Bisect is pointing to below commit as the first bad commit.
>
> Does something like this help?
>
> (also, for future reference, please don't line wrap logs, it makes them
> very hard to read)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 198d2dd45f59..65f37bfcd661 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -8328,6 +8328,7 @@ static inline void sched_set_rq_offline(struct rq *rq, int cpu)
> BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
> set_rq_offline(rq);
> }
> + dl_server_stop(&rq->fair_server);
> rq_unlock_irqrestore(rq, &rf);
> }
>
Hi Peter, thanks for looking into it.
I was able to reproduce this issue on my system. The above diff didn't help; I still see the warning.
I still have to understand this dl_server stuff,
so I'm not sure my understanding is completely correct.
It looks like the hrtimer is firing after the CPU was removed. The warning is hit only with
drmgr; regular hotplug with chcpu doesn't hit it. That's because drmgr changes the cpu_present mask,
and the warning is hit with it.
Maybe the dl_server gets started again during drmgr? Maybe that's why the above patch didn't work.
I will look into this and try to understand it a bit more.
Also, I tried the diff below, which fixes it: just ignore the hrtimer if the CPU is offline.
Does this make sense?
---
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 615411a0a881..a342cf5e4624 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1160,6 +1160,9 @@ static enum hrtimer_restart dl_server_timer(struct hrtimer *timer, struct sched_
scoped_guard (rq_lock, rq) {
struct rq_flags *rf = &scope.rf;
+ if (!cpu_online(rq->cpu))
+ return HRTIMER_NORESTART;
+
if (!dl_se->dl_throttled || !dl_se->dl_runtime)
return HRTIMER_NORESTART;
* Re: [bisected][mainline]Kernel warnings at kernel/sched/cpudeadline.c:219
From: Peter Zijlstra @ 2025-10-08 11:13 UTC (permalink / raw)
To: Shrikanth Hegde
Cc: Venkat Rao Bagalkote, LKML, linuxppc-dev, Madhavan Srinivasan,
jstultz, stultz
On Wed, Oct 08, 2025 at 03:47:16PM +0530, Shrikanth Hegde wrote:
>
>
> On 10/8/25 3:20 PM, Peter Zijlstra wrote:
> > On Wed, Oct 08, 2025 at 07:41:10AM +0530, Venkat Rao Bagalkote wrote:
> > > Greetings!!!
> > >
> > >
> > > IBM CI has reported a kernel warnings while running CPU hot plug operation
> > > on IBM Power9 system.
> > >
> > >
> > > Command to reproduce the issue:
> > >
> > > drmgr -c cpu -r -q 1
> > >
> > >
> > > Git Bisect is pointing to below commit as the first bad commit.
> >
> > Does something like this help?
> >
> > (also, for future reference, please don't line wrap logs, it makes them
> > very hard to read)
> >
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 198d2dd45f59..65f37bfcd661 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -8328,6 +8328,7 @@ static inline void sched_set_rq_offline(struct rq *rq, int cpu)
> > BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
> > set_rq_offline(rq);
> > }
> > + dl_server_stop(&rq->fair_server);
> > rq_unlock_irqrestore(rq, &rf);
> > }
>
>
> Hi Peter. Thanks for looking into it.
>
> I was able to repro this issue on my system. This above diff didn't help. I still see the warning.
>
> I have to understand this dl server stuff still.
> So not sure if my understanding is completely correct.
>
> Looks like the hrtimer is firing after the cpu was removed. The warn on hit only with
> drmgr. Regular hotplug with chcpu doesn;t hit. That's because drmgr changes the cpu_present mask.
> and warning is hit with it.
I do not know what drmgr is. I am not familiar with PowerPC tools.
AFAICT x86 never modifies cpu_present_mask after boot.
> maybe during drmgr, the dl server gets started again? Maybe that's why above patch it didn't work.
> Will see and understand this bit more.
dl_server is per cpu and is started on enqueue of a fair task when:
- the runqueue was empty; and
- the dl_server wasn't already active
Once the dl_server is active it has this timer (you already found this).
The timer is set for the 0-laxity moment (the last possible moment in
time where it can still run its budget and not be late). During this
time any fair runtime is accounted against its budget (subtracted from it).
Once the timer fires and it still has budget left, it will enqueue the
deadline entity. However, the more common case is that its budget will be
depleted, in which case the timer is reset to its period end for
replenish (where it gets new runtime budget), after which it's back to
the 0-laxity timer.
If the deadline entity gets scheduled, it will try and pick a fair task
and run that. In the case where there is no fair task, it will
deactivate itself.
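As a rough illustration of the timer behaviour described above, here is a
minimal userspace sketch. It is not kernel code: struct toy_server and all
toy_* names are hypothetical, and it assumes the deadline coincides with the
period end. The 0-laxity moment is taken as the deadline minus the remaining
budget.

```c
#include <stdint.h>

/* Hypothetical, simplified server state; all times in nanoseconds. */
struct toy_server {
	uint64_t period;   /* length of one period */
	uint64_t budget;   /* runtime granted per period */
	uint64_t deadline; /* absolute end of the current period (assumption) */
	uint64_t runtime;  /* budget left in the current period */
};

enum toy_action { TOY_ENQUEUE, TOY_WAIT_REPLENISH };

/* Latest start time that still lets the remaining budget finish in time. */
static uint64_t toy_zero_laxity(const struct toy_server *s)
{
	return s->deadline - s->runtime;
}

/* Fair tasks that ran while the timer was pending consume the budget. */
static void toy_charge(struct toy_server *s, uint64_t ran)
{
	s->runtime = ran >= s->runtime ? 0 : s->runtime - ran;
}

/* Timer fired at the 0-laxity moment: decide what happens next. */
static enum toy_action toy_timer_fired(const struct toy_server *s,
				       uint64_t *next_timer)
{
	if (s->runtime > 0)
		return TOY_ENQUEUE;     /* budget left: enqueue the entity */
	*next_timer = s->deadline;      /* depleted: wait for the period end */
	return TOY_WAIT_REPLENISH;
}

/* Period end: grant a fresh budget, return the new 0-laxity moment. */
static uint64_t toy_replenish(struct toy_server *s)
{
	s->deadline += s->period;
	s->runtime = s->budget;
	return toy_zero_laxity(s);
}
```

With a 1 s period and a 50 ms budget, the first timer would land at 950 ms.
If fair tasks have used up the whole budget by then, the timer moves to the
period end, and after replenish it is armed for 1950 ms.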
The patch I sent earlier would force stop the deadline timer on CPU
offline.
> Also, i tried this below diff which fixes it. Just ignore the hrtimer if the cpu is offline.
> Does this makes sense?
> ---
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 615411a0a881..a342cf5e4624 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -1160,6 +1160,9 @@ static enum hrtimer_restart dl_server_timer(struct hrtimer *timer, struct sched_
> scoped_guard (rq_lock, rq) {
> struct rq_flags *rf = &scope.rf;
> + if (!cpu_online(rq->cpu))
> + return HRTIMER_NORESTART;
> +
> if (!dl_se->dl_throttled || !dl_se->dl_runtime)
> return HRTIMER_NORESTART;
This could leave the dl_server in inconsistent state. It would have to
call dl_server_stop() or something along those lines.
Also, this really should not happen; per my previous patch we should be
stopping the timer when we go offline.
Since you can readily reproduce this; perhaps you could stick something
like this in dl_server_start():
WARN_ON_ONCE(!cpu_online(rq->cpu))
See if anybody is (re)starting the thing ?
* Re: [bisected][mainline]Kernel warnings at kernel/sched/cpudeadline.c:219
From: Shrikanth Hegde @ 2025-10-08 18:09 UTC (permalink / raw)
To: Peter Zijlstra, Venkat Rao Bagalkote
Cc: LKML, linuxppc-dev, Madhavan Srinivasan, jstultz, stultz
On 10/8/25 4:43 PM, Peter Zijlstra wrote:
> On Wed, Oct 08, 2025 at 03:47:16PM +0530, Shrikanth Hegde wrote:
>>
>>
>> On 10/8/25 3:20 PM, Peter Zijlstra wrote:
>>> On Wed, Oct 08, 2025 at 07:41:10AM +0530, Venkat Rao Bagalkote wrote:
>>>> Greetings!!!
>>>>
>>>>
>>>> IBM CI has reported a kernel warnings while running CPU hot plug operation
>>>> on IBM Power9 system.
>>>>
>>>>
>>>> Command to reproduce the issue:
>>>>
>>>> drmgr -c cpu -r -q 1
>>>>
>
> I do not know what drmgr is. I am not familiar with PowerPC tools.
> AFAICT x86 never modifies cpu_present_mask after boot.
>
It is a tool which allows dynamic addition of CPU/memory. It does indeed change the present CPUs.
I am not very familiar with it either :)
>> maybe during drmgr, the dl server gets started again? Maybe that's why above patch it didn't work.
>> Will see and understand this bit more.
>
> dl_server is per cpu and is started on enqueue of a fair task when:
>
> - the runqueue was empty; and
> - the dl_server wasn't already active
>
> Once the dl_server is active it has this timer (you already found this),
> this timer is set for the 0-laxity moment (the last possible moment in
> time where it can still run its budget and not be late), during this
> time any fair runtime is accounted against it budget (subtracted from).
>
> Once the timer fires and it still has budget left; it will enqueue the
> deadline entity. However the more common case is that its budget will be
> depleted, in which case the timer is reset to its period end for
> replenish (where it gets new runtime budget), after which its back to
> the 0-laxity.
>
> If the deadline entity gets scheduled, it will try and pick a fair task
> and run that. In the case where there is no fair task, it will
> deactivate itself.
ok cool.
>
> The patch I sent earlier would force stop the deadline timer on CPU
> offline.
>
>
>> Also, i tried this below diff which fixes it. Just ignore the hrtimer if the cpu is offline.
>> Does this makes sense?
>> ---
>>
>> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
>> index 615411a0a881..a342cf5e4624 100644
>> --- a/kernel/sched/deadline.c
>> +++ b/kernel/sched/deadline.c
>> @@ -1160,6 +1160,9 @@ static enum hrtimer_restart dl_server_timer(struct hrtimer *timer, struct sched_
>> scoped_guard (rq_lock, rq) {
>> struct rq_flags *rf = &scope.rf;
>> + if (!cpu_online(rq->cpu))
>> + return HRTIMER_NORESTART;
>> +
>> if (!dl_se->dl_throttled || !dl_se->dl_runtime)
>> return HRTIMER_NORESTART;
>
> This could leave the dl_server in inconsistent state. It would have to
> call dl_server_stop() or something along those lines.
>
> Also, this really should not happen; per my previous patch we should be
> stopping the timer when we go offline.
>
> Since you can readily reproduce this; perhaps you could stick something
> like this in dl_server_start():
>
> WARN_ON_ONCE(!cpu_online(rq->cpu))
>
> See if anybody is (re)starting the thing ?
So I used this diff to find out who is enabling it again after it was stopped in offline.
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 198d2dd45f59..83e77bbbb6b4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8328,6 +8328,8 @@ static inline void sched_set_rq_offline(struct rq *rq, int cpu)
BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
set_rq_offline(rq);
}
+ dl_server_stop(&rq->fair_server);
+
rq_unlock_irqrestore(rq, &rf);
}
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 615411a0a881..5847540bdc18 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1582,6 +1582,8 @@ void dl_server_start(struct sched_dl_entity *dl_se)
if (!dl_server(dl_se) || dl_se->dl_server_active)
return;
+ WARN_ON(!rq->online);
+
dl_se->dl_server_active = 1;
enqueue_dl_entity(dl_se, ENQUEUE_WAKEUP);
if (!dl_task(dl_se->rq->curr) || dl_entity_preempt(dl_se, &rq->curr->dl))
*It pointed to this*
NIP [c0000000001fd798] dl_server_start+0x50/0xd8
LR [c0000000001d9534] enqueue_task_fair+0x228/0x8ec
Call Trace:
[c000006684a579c0] [0000000000000001] 0x1 (unreliable)
[c000006684a579f0] [c0000000001d9534] enqueue_task_fair+0x228/0x8ec
[c000006684a57a60] [c0000000001bb344] enqueue_task+0x5c/0x1c8
[c000006684a57aa0] [c0000000001c5fc0] ttwu_do_activate+0x98/0x2fc
[c000006684a57af0] [c0000000001c671c] try_to_wake_up+0x2e0/0xa60
[c000006684a57b80] [c00000000019fb48] kthread_park+0x7c/0xf0
[c000006684a57bb0] [c00000000015fefc] takedown_cpu+0x60/0x194
[c000006684a57c00] [c000000000161924] cpuhp_invoke_callback+0x1f4/0x9a4
[c000006684a57c90] [c0000000001621a4] __cpuhp_invoke_callback_range+0xd0/0x188
[c000006684a57d30] [c000000000165aec] _cpu_down+0x19c/0x560
[c000006684a57df0] [c0000000001637c0] __cpu_down_maps_locked+0x2c/0x3c
[c000006684a57e10] [c00000000018a100] work_for_cpu_fn+0x38/0x54
[c000006684a57e40] [c00000000019075c] process_one_work+0x1d8/0x554
[c000006684a57ef0] [c00000000019165c] worker_thread+0x308/0x46c
[c000006684a57f90] [c00000000019e474] kthread+0x16c/0x19c
[c000006684a57fe0] [c00000000000dd58] start_kernel_thread+0x14/0x18
takedown_cpu is called from CPU0 (the boot CPU) and it wakes up a kthread, which is CPU bound, I guess.
Since this happens after the rq was marked offline, it ends up starting the deadline server again.
So I think it is a sensible idea to stop the deadline server if the CPU is going down.
Once we stop the server we will return HRTIMER_NORESTART.
This does fix the warning. Does this look any good?
---
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 615411a0a881..831797b9ec0f 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1160,11 +1160,14 @@ static enum hrtimer_restart dl_server_timer(struct hrtimer *timer, struct sched_
scoped_guard (rq_lock, rq) {
struct rq_flags *rf = &scope.rf;
+ update_rq_clock(rq);
+ if (!cpu_online(rq->cpu))
+ dl_server_stop(dl_se);
+
if (!dl_se->dl_throttled || !dl_se->dl_runtime)
return HRTIMER_NORESTART;
sched_clock_tick();
- update_rq_clock(rq);
if (!dl_se->dl_runtime)
return HRTIMER_NORESTART;
----
Also, the check below is a duplicate; we do the same check above.
if (!dl_se->dl_runtime)
return HRTIMER_NORESTART;
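
The ordering found with the WARN_ON() probe in this message can be replayed
in a small userspace model. This is only a sketch under simplifying
assumptions: struct toy_rq and the toy_* functions are hypothetical stand-ins,
not the kernel's struct rq or scheduler API. It shows why stopping the server
at rq-offline time is not enough when a wakeup arrives afterwards.

```c
#include <stdbool.h>

/* Hypothetical miniature of a per-CPU runqueue. */
struct toy_rq {
	bool online;        /* cleared when the runqueue is marked offline */
	bool server_active; /* stands in for dl_se->dl_server_active */
	int nr_fair;        /* number of queued fair tasks */
};

static void toy_server_stop(struct toy_rq *rq)
{
	rq->server_active = false;
}

/* First attempt in the thread: stop the server when the rq goes offline. */
static void toy_set_rq_offline(struct toy_rq *rq)
{
	rq->online = false;
	toy_server_stop(rq);
}

/* Enqueueing a fair task on an empty rq starts an inactive server. */
static void toy_enqueue_fair(struct toy_rq *rq)
{
	if (rq->nr_fair++ == 0 && !rq->server_active)
		rq->server_active = true;
}

/* True when the server timer would later fire for an offline runqueue. */
static bool toy_timer_hits_offline_rq(const struct toy_rq *rq)
{
	return rq->server_active && !rq->online;
}
```

Calling toy_set_rq_offline() and then toy_enqueue_fair(), which models the
wakeup issued from takedown_cpu(), leaves the server active on an offline
runqueue, so toy_timer_hits_offline_rq() reports true.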
* Re: [bisected][mainline]Kernel warnings at kernel/sched/cpudeadline.c:219
From: Peter Zijlstra @ 2025-10-09 8:00 UTC (permalink / raw)
To: Shrikanth Hegde
Cc: Venkat Rao Bagalkote, LKML, linuxppc-dev, Madhavan Srinivasan,
jstultz, stultz
On Wed, Oct 08, 2025 at 11:39:11PM +0530, Shrikanth Hegde wrote:
> *It pointed to this*
>
> NIP [c0000000001fd798] dl_server_start+0x50/0xd8
> LR [c0000000001d9534] enqueue_task_fair+0x228/0x8ec
> Call Trace:
> [c000006684a579c0] [0000000000000001] 0x1 (unreliable)
> [c000006684a579f0] [c0000000001d9534] enqueue_task_fair+0x228/0x8ec
> [c000006684a57a60] [c0000000001bb344] enqueue_task+0x5c/0x1c8
> [c000006684a57aa0] [c0000000001c5fc0] ttwu_do_activate+0x98/0x2fc
> [c000006684a57af0] [c0000000001c671c] try_to_wake_up+0x2e0/0xa60
> [c000006684a57b80] [c00000000019fb48] kthread_park+0x7c/0xf0
> [c000006684a57bb0] [c00000000015fefc] takedown_cpu+0x60/0x194
> [c000006684a57c00] [c000000000161924] cpuhp_invoke_callback+0x1f4/0x9a4
> [c000006684a57c90] [c0000000001621a4] __cpuhp_invoke_callback_range+0xd0/0x188
> [c000006684a57d30] [c000000000165aec] _cpu_down+0x19c/0x560
> [c000006684a57df0] [c0000000001637c0] __cpu_down_maps_locked+0x2c/0x3c
> [c000006684a57e10] [c00000000018a100] work_for_cpu_fn+0x38/0x54
> [c000006684a57e40] [c00000000019075c] process_one_work+0x1d8/0x554
> [c000006684a57ef0] [c00000000019165c] worker_thread+0x308/0x46c
> [c000006684a57f90] [c00000000019e474] kthread+0x16c/0x19c
> [c000006684a57fe0] [c00000000000dd58] start_kernel_thread+0x14/0x18
>
> It is takedown_cpu called from CPU0(boot CPU) and it wakes up kthread
> which is CPU Bound I guess. Since happens after rq was marked
> offline, it ends up starting the deadline server again.
>
> So i think it is sensible idea to stop the deadline server if the cpu
> is going down. Once we stop the server we will return
> HRTIMER_NORESTART.
D'0h.. that stop was far too early.
How about moving that dl_server_stop() into sched_cpu_dying() like so.
This seems to survive a few hotplugs for me.
---
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 198d2dd45f59..f1ebf67b48e2 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8571,10 +8571,12 @@ int sched_cpu_dying(unsigned int cpu)
 	sched_tick_stop(cpu);

 	rq_lock_irqsave(rq, &rf);
+	update_rq_clock(rq);
 	if (rq->nr_running != 1 || rq_has_pinned_tasks(rq)) {
 		WARN(true, "Dying CPU not properly vacated!");
 		dump_rq_tasks(rq, KERN_WARNING);
 	}
+	dl_server_stop(&rq->fair_server);
 	rq_unlock_irqrestore(rq, &rf);

 	calc_load_migrate(rq);
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 615411a0a881..7b7671060bf9 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1582,6 +1582,9 @@ void dl_server_start(struct sched_dl_entity *dl_se)
 	if (!dl_server(dl_se) || dl_se->dl_server_active)
 		return;

+	if (WARN_ON_ONCE(!cpu_online(cpu_of(rq))))
+		return;
+
 	dl_se->dl_server_active = 1;
 	enqueue_dl_entity(dl_se, ENQUEUE_WAKEUP);
 	if (!dl_task(dl_se->rq->curr) || dl_entity_preempt(dl_se, &rq->curr->dl))
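
For readers following along, the ordering problem can be sketched as a small userspace toy model. This is not kernel code: struct toy_rq and every toy_* function are hypothetical names invented for illustration, and the model assumes only what the thread describes, namely that a wakeup issued from takedown_cpu() can arrive after an early stop of the fair server, and that the start path refuses to run once the CPU is offline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a runqueue; NOT the kernel's struct rq. */
struct toy_rq {
	bool cpu_online;    /* models cpu_online(cpu_of(rq)) */
	bool server_active; /* models dl_se->dl_server_active */
	int warnings;       /* counts every guard hit, unlike WARN_ON_ONCE(),
	                     * which reports only the first one */
};

/* Models the guarded start path. Returns true if the server was started. */
bool toy_server_start(struct toy_rq *rq)
{
	assert(rq != NULL);
	if (rq->server_active)
		return false;       /* already running, nothing to do */
	if (!rq->cpu_online) {
		rq->warnings++;     /* refuse to start on an offline CPU */
		return false;
	}
	rq->server_active = true;
	return true;
}

/* Models stopping the server. */
void toy_server_stop(struct toy_rq *rq)
{
	assert(rq != NULL);
	rq->server_active = false;
}

/* Models a fair-task wakeup: the enqueue path tries to start the server. */
void toy_wakeup(struct toy_rq *rq)
{
	toy_server_start(rq);
}

/* Stop early, then a late wakeup arrives before the CPU goes offline. */
bool toy_early_stop_leaves_server_active(void)
{
	struct toy_rq rq = { .cpu_online = true, .server_active = true };

	toy_server_stop(&rq);  /* stop issued too early */
	toy_wakeup(&rq);       /* late wakeup restarts the server */
	rq.cpu_online = false; /* CPU goes away with the server still active */
	return rq.server_active;
}

/* Same wakeup, but the stop is the last step before the CPU goes offline. */
bool toy_late_stop_leaves_server_active(void)
{
	struct toy_rq rq = { .cpu_online = true, .server_active = true };

	toy_wakeup(&rq);       /* server already active: no-op */
	toy_server_stop(&rq);  /* stop as part of the dying step */
	rq.cpu_online = false;
	return rq.server_active;
}
```

In this model the early-stop ordering leaves the server active on a CPU that is about to disappear, while the late-stop ordering does not. The online check only catches starts attempted after the CPU is already offline; the real code paths involve locking and hotplug states that the sketch ignores.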
* Re: [bisected][mainline]Kernel warnings at kernel/sched/cpudeadline.c:219
2025-10-09 8:00 ` Peter Zijlstra
@ 2025-10-09 9:47 ` Shrikanth Hegde
2025-10-09 11:49 ` Peter Zijlstra
2025-10-09 11:54 ` Marek Szyprowski
1 sibling, 1 reply; 9+ messages in thread
From: Shrikanth Hegde @ 2025-10-09 9:47 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Venkat Rao Bagalkote, LKML, linuxppc-dev, Madhavan Srinivasan,
jstultz, stultz
On 10/9/25 1:30 PM, Peter Zijlstra wrote:
> On Wed, Oct 08, 2025 at 11:39:11PM +0530, Shrikanth Hegde wrote:
>
>> *It pointed to this*
>>
>> NIP [c0000000001fd798] dl_server_start+0x50/0xd8
>> LR [c0000000001d9534] enqueue_task_fair+0x228/0x8ec
>> Call Trace:
>> [c000006684a579c0] [0000000000000001] 0x1 (unreliable)
>> [c000006684a579f0] [c0000000001d9534] enqueue_task_fair+0x228/0x8ec
>> [c000006684a57a60] [c0000000001bb344] enqueue_task+0x5c/0x1c8
>> [c000006684a57aa0] [c0000000001c5fc0] ttwu_do_activate+0x98/0x2fc
>> [c000006684a57af0] [c0000000001c671c] try_to_wake_up+0x2e0/0xa60
>> [c000006684a57b80] [c00000000019fb48] kthread_park+0x7c/0xf0
>> [c000006684a57bb0] [c00000000015fefc] takedown_cpu+0x60/0x194
>> [c000006684a57c00] [c000000000161924] cpuhp_invoke_callback+0x1f4/0x9a4
>> [c000006684a57c90] [c0000000001621a4] __cpuhp_invoke_callback_range+0xd0/0x188
>> [c000006684a57d30] [c000000000165aec] _cpu_down+0x19c/0x560
>> [c000006684a57df0] [c0000000001637c0] __cpu_down_maps_locked+0x2c/0x3c
>> [c000006684a57e10] [c00000000018a100] work_for_cpu_fn+0x38/0x54
>> [c000006684a57e40] [c00000000019075c] process_one_work+0x1d8/0x554
>> [c000006684a57ef0] [c00000000019165c] worker_thread+0x308/0x46c
>> [c000006684a57f90] [c00000000019e474] kthread+0x16c/0x19c
>> [c000006684a57fe0] [c00000000000dd58] start_kernel_thread+0x14/0x18
>>
>> It is takedown_cpu called from CPU0(boot CPU) and it wakes up kthread
>> which is CPU Bound I guess. Since happens after rq was marked
>> offline, it ends up starting the deadline server again.
>>
>> So i think it is sensible idea to stop the deadline server if the cpu
>> is going down. Once we stop the server we will return
>> HRTIMER_NORESTART.
>
> D'0h.. that stop was far too early.
>
> How about moving that dl_server_stop() into sched_cpu_dying() like so.
>
> This seems to survive a few hotplugs for me.
>
> ---
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 198d2dd45f59..f1ebf67b48e2 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -8571,10 +8571,12 @@ int sched_cpu_dying(unsigned int cpu)
> sched_tick_stop(cpu);
>
> rq_lock_irqsave(rq, &rf);
> + update_rq_clock(rq);
> if (rq->nr_running != 1 || rq_has_pinned_tasks(rq)) {
> WARN(true, "Dying CPU not properly vacated!");
> dump_rq_tasks(rq, KERN_WARNING);
> }
> + dl_server_stop(&rq->fair_server);
> rq_unlock_irqrestore(rq, &rf);
>
> calc_load_migrate(rq);
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 615411a0a881..7b7671060bf9 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -1582,6 +1582,9 @@ void dl_server_start(struct sched_dl_entity *dl_se)
> if (!dl_server(dl_se) || dl_se->dl_server_active)
> return;
>
> + if (WARN_ON_ONCE(!cpu_online(cpu_of(rq))))
> + return;
> +
> dl_se->dl_server_active = 1;
> enqueue_dl_entity(dl_se, ENQUEUE_WAKEUP);
> if (!dl_task(dl_se->rq->curr) || dl_entity_preempt(dl_se, &rq->curr->dl))
Yes, this works: no warnings with drmgr or chcpu.
Shall I write a changelog and send it as a patch?
* Re: [bisected][mainline]Kernel warnings at kernel/sched/cpudeadline.c:219
2025-10-09 9:47 ` Shrikanth Hegde
@ 2025-10-09 11:49 ` Peter Zijlstra
0 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2025-10-09 11:49 UTC (permalink / raw)
To: Shrikanth Hegde
Cc: Venkat Rao Bagalkote, LKML, linuxppc-dev, Madhavan Srinivasan,
jstultz, stultz
On Thu, Oct 09, 2025 at 03:17:40PM +0530, Shrikanth Hegde wrote:
>
>
> On 10/9/25 1:30 PM, Peter Zijlstra wrote:
> > On Wed, Oct 08, 2025 at 11:39:11PM +0530, Shrikanth Hegde wrote:
> >
> > > *It pointed to this*
> > >
> > > NIP [c0000000001fd798] dl_server_start+0x50/0xd8
> > > LR [c0000000001d9534] enqueue_task_fair+0x228/0x8ec
> > > Call Trace:
> > > [c000006684a579c0] [0000000000000001] 0x1 (unreliable)
> > > [c000006684a579f0] [c0000000001d9534] enqueue_task_fair+0x228/0x8ec
> > > [c000006684a57a60] [c0000000001bb344] enqueue_task+0x5c/0x1c8
> > > [c000006684a57aa0] [c0000000001c5fc0] ttwu_do_activate+0x98/0x2fc
> > > [c000006684a57af0] [c0000000001c671c] try_to_wake_up+0x2e0/0xa60
> > > [c000006684a57b80] [c00000000019fb48] kthread_park+0x7c/0xf0
> > > [c000006684a57bb0] [c00000000015fefc] takedown_cpu+0x60/0x194
> > > [c000006684a57c00] [c000000000161924] cpuhp_invoke_callback+0x1f4/0x9a4
> > > [c000006684a57c90] [c0000000001621a4] __cpuhp_invoke_callback_range+0xd0/0x188
> > > [c000006684a57d30] [c000000000165aec] _cpu_down+0x19c/0x560
> > > [c000006684a57df0] [c0000000001637c0] __cpu_down_maps_locked+0x2c/0x3c
> > > [c000006684a57e10] [c00000000018a100] work_for_cpu_fn+0x38/0x54
> > > [c000006684a57e40] [c00000000019075c] process_one_work+0x1d8/0x554
> > > [c000006684a57ef0] [c00000000019165c] worker_thread+0x308/0x46c
> > > [c000006684a57f90] [c00000000019e474] kthread+0x16c/0x19c
> > > [c000006684a57fe0] [c00000000000dd58] start_kernel_thread+0x14/0x18
> > >
> > > It is takedown_cpu called from CPU0(boot CPU) and it wakes up kthread
> > > which is CPU Bound I guess. Since happens after rq was marked
> > > offline, it ends up starting the deadline server again.
> > >
> > > So i think it is sensible idea to stop the deadline server if the cpu
> > > is going down. Once we stop the server we will return
> > > HRTIMER_NORESTART.
> >
> > D'0h.. that stop was far too early.
> >
> > How about moving that dl_server_stop() into sched_cpu_dying() like so.
> >
> > This seems to survive a few hotplugs for me.
> >
> > ---
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 198d2dd45f59..f1ebf67b48e2 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -8571,10 +8571,12 @@ int sched_cpu_dying(unsigned int cpu)
> > sched_tick_stop(cpu);
> > rq_lock_irqsave(rq, &rf);
> > + update_rq_clock(rq);
> > if (rq->nr_running != 1 || rq_has_pinned_tasks(rq)) {
> > WARN(true, "Dying CPU not properly vacated!");
> > dump_rq_tasks(rq, KERN_WARNING);
> > }
> > + dl_server_stop(&rq->fair_server);
> > rq_unlock_irqrestore(rq, &rf);
> > calc_load_migrate(rq);
> > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> > index 615411a0a881..7b7671060bf9 100644
> > --- a/kernel/sched/deadline.c
> > +++ b/kernel/sched/deadline.c
> > @@ -1582,6 +1582,9 @@ void dl_server_start(struct sched_dl_entity *dl_se)
> > if (!dl_server(dl_se) || dl_se->dl_server_active)
> > return;
> > + if (WARN_ON_ONCE(!cpu_online(cpu_of(rq))))
> > + return;
> > +
> > dl_se->dl_server_active = 1;
> > enqueue_dl_entity(dl_se, ENQUEUE_WAKEUP);
> > if (!dl_task(dl_se->rq->curr) || dl_entity_preempt(dl_se, &rq->curr->dl))
>
> Yes. This works. no warning with drmgr or chcpu.
>
> shall i write changelog and send it as patch?
If you would. Thanks!
* Re: [bisected][mainline]Kernel warnings at kernel/sched/cpudeadline.c:219
2025-10-09 8:00 ` Peter Zijlstra
2025-10-09 9:47 ` Shrikanth Hegde
@ 2025-10-09 11:54 ` Marek Szyprowski
1 sibling, 0 replies; 9+ messages in thread
From: Marek Szyprowski @ 2025-10-09 11:54 UTC (permalink / raw)
To: Peter Zijlstra, Shrikanth Hegde
Cc: Venkat Rao Bagalkote, LKML, linuxppc-dev, Madhavan Srinivasan,
jstultz, stultz
On 09.10.2025 10:00, Peter Zijlstra wrote:
> On Wed, Oct 08, 2025 at 11:39:11PM +0530, Shrikanth Hegde wrote:
>> *It pointed to this*
>>
>> NIP [c0000000001fd798] dl_server_start+0x50/0xd8
>> LR [c0000000001d9534] enqueue_task_fair+0x228/0x8ec
>> Call Trace:
>> [c000006684a579c0] [0000000000000001] 0x1 (unreliable)
>> [c000006684a579f0] [c0000000001d9534] enqueue_task_fair+0x228/0x8ec
>> [c000006684a57a60] [c0000000001bb344] enqueue_task+0x5c/0x1c8
>> [c000006684a57aa0] [c0000000001c5fc0] ttwu_do_activate+0x98/0x2fc
>> [c000006684a57af0] [c0000000001c671c] try_to_wake_up+0x2e0/0xa60
>> [c000006684a57b80] [c00000000019fb48] kthread_park+0x7c/0xf0
>> [c000006684a57bb0] [c00000000015fefc] takedown_cpu+0x60/0x194
>> [c000006684a57c00] [c000000000161924] cpuhp_invoke_callback+0x1f4/0x9a4
>> [c000006684a57c90] [c0000000001621a4] __cpuhp_invoke_callback_range+0xd0/0x188
>> [c000006684a57d30] [c000000000165aec] _cpu_down+0x19c/0x560
>> [c000006684a57df0] [c0000000001637c0] __cpu_down_maps_locked+0x2c/0x3c
>> [c000006684a57e10] [c00000000018a100] work_for_cpu_fn+0x38/0x54
>> [c000006684a57e40] [c00000000019075c] process_one_work+0x1d8/0x554
>> [c000006684a57ef0] [c00000000019165c] worker_thread+0x308/0x46c
>> [c000006684a57f90] [c00000000019e474] kthread+0x16c/0x19c
>> [c000006684a57fe0] [c00000000000dd58] start_kernel_thread+0x14/0x18
>>
>> It is takedown_cpu called from CPU0(boot CPU) and it wakes up kthread
>> which is CPU Bound I guess. Since happens after rq was marked
>> offline, it ends up starting the deadline server again.
>>
>> So i think it is sensible idea to stop the deadline server if the cpu
>> is going down. Once we stop the server we will return
>> HRTIMER_NORESTART.
> D'0h.. that stop was far too early.
>
> How about moving that dl_server_stop() into sched_cpu_dying() like so.
>
> This seems to survive a few hotplugs for me.
>
> ---
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 198d2dd45f59..f1ebf67b48e2 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -8571,10 +8571,12 @@ int sched_cpu_dying(unsigned int cpu)
> sched_tick_stop(cpu);
>
> rq_lock_irqsave(rq, &rf);
> + update_rq_clock(rq);
> if (rq->nr_running != 1 || rq_has_pinned_tasks(rq)) {
> WARN(true, "Dying CPU not properly vacated!");
> dump_rq_tasks(rq, KERN_WARNING);
> }
> + dl_server_stop(&rq->fair_server);
> rq_unlock_irqrestore(rq, &rf);
>
> calc_load_migrate(rq);
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 615411a0a881..7b7671060bf9 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -1582,6 +1582,9 @@ void dl_server_start(struct sched_dl_entity *dl_se)
> if (!dl_server(dl_se) || dl_se->dl_server_active)
> return;
>
> + if (WARN_ON_ONCE(!cpu_online(cpu_of(rq))))
> + return;
> +
> dl_se->dl_server_active = 1;
> enqueue_dl_entity(dl_se, ENQUEUE_WAKEUP);
> if (!dl_task(dl_se->rq->curr) || dl_entity_preempt(dl_se, &rq->curr->dl))
This fixes a similar issue observed on Samsung Exynos SoC based boards
(ARM 32bit and 64bit) that I've reported in the following thread:
https://lore.kernel.org/all/e56310b5-f7a9-4fad-b79a-dcbcdd3d3883@samsung.com/
Thanks for the fix! Feel free to add:
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
Thread overview: 9+ messages
2025-10-08 2:11 [bisected][mainline]Kernel warnings at kernel/sched/cpudeadline.c:219 Venkat Rao Bagalkote
2025-10-08 9:50 ` Peter Zijlstra
2025-10-08 10:17 ` Shrikanth Hegde
2025-10-08 11:13 ` Peter Zijlstra
2025-10-08 18:09 ` Shrikanth Hegde
2025-10-09 8:00 ` Peter Zijlstra
2025-10-09 9:47 ` Shrikanth Hegde
2025-10-09 11:49 ` Peter Zijlstra
2025-10-09 11:54 ` Marek Szyprowski