* [peterz-queue:sched/hrtick] [entry,hrtimer,x86] 4a683282cd: INFO:rcu_sched_detected_stalls_on_CPUs/tasks
@ 2025-04-24  6:05 kernel test robot
From: kernel test robot @ 2025-04-24  6:05 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: oe-lkp, lkp, oliver.sang



Hello,

kernel test robot noticed "INFO:rcu_sched_detected_stalls_on_CPUs/tasks" on:

commit: 4a683282cde4cb1f2e346544a0b1f84f36389df3 ("entry,hrtimer,x86: Push reprogramming timers into the interrupt return path")
https://git.kernel.org/cgit/linux/kernel/git/peterz/queue.git sched/hrtick

in testcase: vm-scalability
version: vm-scalability-x86_64-6f4ef16-0_20241103
with the following parameters:

	runtime: 300s
	test: mremap-xread-rand-mt
	cpufreq_governor: performance



config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 96 threads 2 sockets Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz (Cascade Lake) with 128G memory

(please refer to attached dmesg/kmsg for entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202504241308.26441226-lkp@intel.com


[  302.926021][   C83] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  302.933066][   C83] rcu: 	13-...!: (98 ticks this GP) idle=6674/1/0x4000000000000002 softirq=20508/20512 fqs=117
[  302.943609][   C83] rcu: 	46-...!: (96 ticks this GP) idle=7c8c/1/0x4000000000000002 softirq=21010/21015 fqs=117
[  302.954075][   C83] rcu: 	90-...!: (96 ticks this GP) idle=7734/1/0x4000000000000002 softirq=21308/21313 fqs=117
[  302.964538][   C83] rcu: 	(detected by 83, t=100038 jiffies, g=22093, q=970 ncpus=96)
[  302.972646][   C83] Sending NMI from CPU 83 to CPUs 13:
[  312.996955][   C83] Sending NMI from CPU 83 to CPUs 46:
[  323.021594][   C83] Sending NMI from CPU 83 to CPUs 90:
[  333.046096][   C83] rcu: rcu_sched kthread timer wakeup didn't happen for 129584 jiffies! g22093 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[  333.075287][   C83] rcu: 	Possible timer handling issue on cpu=45 timer-softirq=795
[  333.083244][   C83] rcu: rcu_sched kthread starved for 129622 jiffies! g22093 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=45
[  333.094756][   C83] rcu: 	Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[  333.104710][   C83] rcu: RCU grace-period kthread stack dump:
[  333.110761][   C83] task:rcu_sched       state:I stack:0     pid:15    tgid:15    ppid:2      task_flags:0x208040 flags:0x00004000
[  333.123135][   C83] Call Trace:
[  333.126603][   C83]  <TASK>
[ 333.129713][ C83] __schedule (kernel/sched/core.c:5388 kernel/sched/core.c:6773) 
[ 333.134221][ C83] schedule (arch/x86/include/asm/bitops.h:206 arch/x86/include/asm/bitops.h:238 include/linux/thread_info.h:192 include/linux/thread_info.h:208 include/linux/sched.h:2176 kernel/sched/core.c:6853 kernel/sched/core.c:6866) 
[ 333.138380][ C83] schedule_timeout (include/linux/timer.h:185 kernel/time/sleep_timeout.c:100) 
[ 333.143309][ C83] ? __pfx_process_timeout (kernel/time/sleep_timeout.c:24) 
[ 333.148752][ C83] rcu_gp_fqs_loop (kernel/rcu/tree.c:2046 (discriminator 13)) 
[ 333.153666][ C83] ? __pfx_rcu_gp_kthread (kernel/rcu/tree.c:2223) 
[ 333.159014][ C83] rcu_gp_kthread (kernel/rcu/tree.c:2251) 
[ 333.163751][ C83] kthread (kernel/kthread.c:464) 
[ 333.167882][ C83] ? __pfx_kthread (kernel/kthread.c:413) 
[ 333.172620][ C83] ret_from_fork (arch/x86/kernel/process.c:153) 
[ 333.177173][ C83] ? __pfx_kthread (kernel/kthread.c:413) 
[ 333.181889][ C83] ret_from_fork_asm (arch/x86/entry/entry_64.S:258) 
[  333.186784][   C83]  </TASK>
[  333.189935][   C83] rcu: Stack dump where RCU GP kthread last ran:
[  333.196379][   C83] Sending NMI from CPU 83 to CPUs 45:
input_data: 0x0000005f7ef4d2cc
input_len: 0x0000000000e59c10
output: 0x0000005f7c000000
output_len: 0x0000000003d48cac
kernel_total_size: 0x0000000003430000
needed_size: 0x0000000003e00000
trampoline_32bit: 0x0000000000000000


KASLR disabled: 'nokaslr' on cmdline.
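
For triage, the per-CPU stall lines and the "detected by" summary in the excerpt above can be pulled out mechanically. The following is only a minimal sketch, not part of the robot's tooling: the function name parse_rcu_stall and both regular expressions are hypothetical and assume the line layout shown in this report.

```python
import re

# Hypothetical patterns, written against the line layout seen in this report.
# Per-CPU stall line, e.g. "rcu: \t13-...!: (98 ticks this GP) idle=..."
STALL_RE = re.compile(r"rcu:\s+(\d+)-\S+:\s+\((\d+) ticks this GP\)")
# Summary line, e.g. "(detected by 83, t=100038 jiffies, g=22093, ...)"
SUMMARY_RE = re.compile(r"\(detected by (\d+), t=(\d+) jiffies")


def parse_rcu_stall(log_text):
    """Return stalled CPUs (with tick counts), the detecting CPU, and t in jiffies."""
    stalled = [(int(cpu), int(ticks)) for cpu, ticks in STALL_RE.findall(log_text)]
    match = SUMMARY_RE.search(log_text)
    detector = int(match.group(1)) if match else None
    jiffies = int(match.group(2)) if match else None
    return {"stalled": stalled, "detector": detector, "jiffies": jiffies}


# Lines copied from the dmesg excerpt in this report.
SAMPLE = (
    "[  302.926021][   C83] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:\n"
    "[  302.933066][   C83] rcu: \t13-...!: (98 ticks this GP) idle=6674/1/0x4000000000000002 softirq=20508/20512 fqs=117\n"
    "[  302.943609][   C83] rcu: \t46-...!: (96 ticks this GP) idle=7c8c/1/0x4000000000000002 softirq=21010/21015 fqs=117\n"
    "[  302.954075][   C83] rcu: \t90-...!: (96 ticks this GP) idle=7734/1/0x4000000000000002 softirq=21308/21313 fqs=117\n"
    "[  302.964538][   C83] rcu: \t(detected by 83, t=100038 jiffies, g=22093, q=970 ncpus=96)\n"
)

if __name__ == "__main__":
    print(parse_rcu_stall(SAMPLE))
```

Run against the full attached dmesg, the same function would report only the first "detected by" summary it finds, so a log with several stall reports would need to be split per report first.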




The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250424/202504241308.26441226-lkp@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

