From: kernel test robot <oliver.sang@intel.com>
To: Pingfan Liu <piliu@redhat.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
	<linux-kernel@vger.kernel.org>, Tejun Heo <tj@kernel.org>,
	Waiman Long <longman@redhat.com>,
	Chen Ridong <chenridong@huaweicloud.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"Juri Lelli" <juri.lelli@redhat.com>,
	Pierre Gondois <pierre.gondois@arm.com>,
	"Ingo Molnar" <mingo@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Valentin Schneider <vschneid@redhat.com>,
	<aubrey.li@linux.intel.com>, <yu.c.chen@intel.com>,
	<oliver.sang@intel.com>
Subject: [linus:master] [sched/deadline]  318e18ed22: BUG:soft_lockup-CPU##stuck_for#s![swapper:#]
Date: Tue, 16 Dec 2025 15:43:59 +0800
Message-ID: <202512161547.cd3a9187-lkp@intel.com>



Hello,

kernel test robot noticed "BUG:soft_lockup-CPU##stuck_for#s![swapper:#]" on:

commit: 318e18ed22e89397635e15095c014accaf47ed30 ("sched/deadline: Walk up cpuset hierarchy to decide root domain when hot-unplug")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[test failed on linus/master      d358e5254674b70f34c847715ca509e46eb81e6f]
[test failed on linux-next/master 5ce74bc1b7cb2732b22f9c93082545bc655d6547]

in testcase: trinity
version: trinity-static-i386-x86_64-f93256fb_2019-08-28
with the following parameters:

	runtime: 300s
	group: group-03
	nr_groups: 5


config: i386-randconfig-r071-20250410
compiler: gcc-14
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 32G

(please refer to the attached dmesg/kmsg for the entire log/backtrace)


We don't have enough knowledge to analyze the relation between the change
and the issue, so we ran the tests up to 1000 times. The issue was reproduced
65 times out of 1000 runs, while the parent commit stayed clean.

=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/runtime/group/nr_groups:
  vm-snb/trinity/openwrt-i386-generic-20190428.cgz/i386-randconfig-r071-20250410/gcc-14/300s/group-03/5


1f382215119a0bc1 318e18ed22e89397635e15095c0
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           :1000         8%          82:1000  dmesg.BUG:kernel_hang_in_boot_stage
           :1000         7%          69:1000  dmesg.BUG:soft_lockup-CPU##stuck_for#s![swapper:#]   <----
           :1000         8%          82:1000  dmesg.BUG:workqueue_lockup-pool
           :1000         7%          69:1000  dmesg.EIP:tick_clock_notify
           :1000         2%          15:1000  dmesg.INFO:rcu_preempt_detected_stalls_on_CPUs/tasks
           :1000         5%          53:1000  dmesg.INFO:task_blocked_for_more_than#seconds
           :1000         7%          69:1000  dmesg.Kernel_panic-not_syncing:softlockup:hung_tasks



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202512161547.cd3a9187-lkp@intel.com


[  699.774873][    C0] watchdog: BUG: soft lockup - CPU#0 stuck for 626s! [swapper/0:1]
[  699.775553][    C0] CPU#0 Utilization every 96000ms during lockup:
[  699.775553][    C0] 	#1:  26% system,	  0% softirq,	  0% hardirq,	  0% idle
[  699.775553][    C0] 	#2:  25% system,	  0% softirq,	  0% hardirq,	  0% idle
[  699.775553][    C0] 	#3:  25% system,	  0% softirq,	  0% hardirq,	  0% idle
[  699.775553][    C0] 	#4:  34% system,	  0% softirq,	  0% hardirq,	  0% idle
[  699.775553][    C0] 	#5: 100% system,	  0% softirq,	  0% hardirq,	  0% idle
[  699.775553][    C0] Modules linked in:
[  699.775553][    C0] irq event stamp: 201566
[  699.775553][    C0] hardirqs last  enabled at (201565): timekeeping_notify (arch/x86/include/asm/irqflags.h:42 arch/x86/include/asm/irqflags.h:119 arch/x86/include/asm/irqflags.h:159 include/linux/stop_machine.h:172 include/linux/stop_machine.h:179 kernel/time/timekeeping.c:1634)
[  699.775553][    C0] hardirqs last disabled at (201566): sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1052)
[  699.775553][    C0] softirqs last  enabled at (200324): handle_softirqs (kernel/softirq.c:469 (discriminator 2) kernel/softirq.c:650 (discriminator 2))
[  699.775553][    C0] softirqs last disabled at (200309): __do_softirq (kernel/softirq.c:657)
[  699.775553][    C0] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.18.0-rc2-00020-g318e18ed22e8 #1 PREEMPT(full)
[  699.775553][    C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[  699.775553][    C0] EIP: tick_clock_notify (arch/x86/include/asm/bitops.h:55 include/asm-generic/bitops/instrumented-atomic.h:29 kernel/time/tick-sched.c:1633)
[  699.775553][    C0] Code: 8b 45 e4 89 1d 24 d5 6a 83 a3 38 d5 6a 83 89 15 3c d5 6a 83 83 c4 10 5b 5e 5f 5d c3 2e 8d b4 26 00 00 00 00 8d b6 00 00 00 00 <80> 0d 44 d5 6a 83 01 c3 2e 8d b4 26 00 00 00 00 80 0d 44 d5 6a 83
All code
========
   0:	8b 45 e4             	mov    -0x1c(%rbp),%eax
   3:	89 1d 24 d5 6a 83    	mov    %ebx,-0x7c952adc(%rip)        # 0xffffffff836ad52d
   9:	a3 38 d5 6a 83 89 15 	movabs %eax,0xd53c1589836ad538
  10:	3c d5 
  12:	6a 83                	push   $0xffffffffffffff83
  14:	83 c4 10             	add    $0x10,%esp
  17:	5b                   	pop    %rbx
  18:	5e                   	pop    %rsi
  19:	5f                   	pop    %rdi
  1a:	5d                   	pop    %rbp
  1b:	c3                   	ret
  1c:	2e 8d b4 26 00 00 00 	cs lea 0x0(%rsi,%riz,1),%esi
  23:	00 
  24:	8d b6 00 00 00 00    	lea    0x0(%rsi),%esi
  2a:*	80 0d 44 d5 6a 83 01 	orb    $0x1,-0x7c952abc(%rip)        # 0xffffffff836ad575		<-- trapping instruction
  31:	c3                   	ret
  32:	2e 8d b4 26 00 00 00 	cs lea 0x0(%rsi,%riz,1),%esi
  39:	00 
  3a:	80                   	.byte 0x80
  3b:	0d 44 d5 6a 83       	or     $0x836ad544,%eax

Code starting with the faulting instruction
===========================================
   0:	80 0d 44 d5 6a 83 01 	orb    $0x1,-0x7c952abc(%rip)        # 0xffffffff836ad54b
   7:	c3                   	ret
   8:	2e 8d b4 26 00 00 00 	cs lea 0x0(%rsi,%riz,1),%esi
   f:	00 
  10:	80                   	.byte 0x80
  11:	0d 44 d5 6a 83       	or     $0x836ad544,%eax
[  699.775553][    C0] EAX: 0003135d EBX: 8322ef00 ECX: 00000006 EDX: 82f6bcac
[  699.775553][    C0] ESI: 00000200 EDI: 836ac3e0 EBP: 84c97ed8 ESP: 84c97ebc
[  699.775553][    C0] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00000202
[  699.775553][    C0] CR0: 80050033 CR2: ffdaa000 CR3: 03aeb000 CR4: 000406d0
[  699.775553][    C0] Call Trace:
[  699.775553][    C0]  ? timekeeping_notify (kernel/time/timekeeping.c:1636)
[  699.775553][    C0]  __clocksource_select (kernel/time/clocksource.c:1069 (discriminator 1))
[  699.775553][    C0]  ? boot_override_clock (kernel/time/clocksource.c:1101)
[  699.775553][    C0]  clocksource_select (kernel/time/clocksource.c:1086)
[  699.775553][    C0]  clocksource_done_booting (kernel/time/clocksource.c:1110)
[  699.775553][    C0]  do_one_initcall (init/main.c:1283)
[  699.775553][    C0]  ? rdinit_setup (init/main.c:1331)
[  699.775553][    C0]  do_initcalls (init/main.c:1344 (discriminator 3) init/main.c:1361 (discriminator 3))
[  699.775553][    C0]  kernel_init_freeable (init/main.c:1597)
[  699.775553][    C0]  ? rest_init (init/main.c:1475)
[  699.775553][    C0]  kernel_init (init/main.c:1485)
[  699.775553][    C0]  ret_from_fork (arch/x86/kernel/process.c:164)
[  699.775553][    C0]  ? rest_init (init/main.c:1475)
[  699.775553][    C0]  ret_from_fork_asm (arch/x86/entry/entry_32.S:737)
[  699.775553][    C0]  entry_INT80_32 (arch/x86/entry/entry_32.S:945)
[  699.775553][    C0] Kernel panic - not syncing: softlockup: hung tasks
[  699.775553][    C0] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Tainted: G             L      6.18.0-rc2-00020-g318e18ed22e8 #1 PREEMPT(full)
[  699.775553][    C0] Tainted: [L]=SOFTLOCKUP
[  699.775553][    C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[  699.775553][    C0] Call Trace:
[  699.775553][    C0]  dump_stack_lvl (lib/dump_stack.c:122)
[  699.775553][    C0]  dump_stack (lib/dump_stack.c:130)
[  699.775553][    C0]  vpanic (kernel/panic.c:487)
[  699.775553][    C0]  panic (kernel/panic.c:626)
[  699.775553][    C0]  watchdog_timer_fn (kernel/watchdog.c:753)
[  699.775553][    C0]  __hrtimer_run_queues+0x125/0x1e0
[  699.775553][    C0]  ? schedule_work (drivers/usb/core/hub.c:925)
[  699.775553][    C0]  hrtimer_run_queues (kernel/time/hrtimer.c:1999)
[  699.775553][    C0]  update_process_times (kernel/time/timer.c:2416 kernel/time/timer.c:2472)
[  699.775553][    C0]  tick_periodic (kernel/time/tick-common.c:103)
[  699.775553][    C0]  tick_handle_periodic (kernel/time/tick-common.c:144)
[  699.775553][    C0]  ? vmware_sched_clock (arch/x86/kernel/apic/apic.c:1052)
[  699.775553][    C0]  __sysvec_apic_timer_interrupt (arch/x86/include/asm/trace/irq_vectors.h:40 (discriminator 4) arch/x86/include/asm/trace/irq_vectors.h:40 (discriminator 4) arch/x86/kernel/apic/apic.c:1059 (discriminator 4))
[  699.775553][    C0]  sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1052 (discriminator 2) arch/x86/kernel/apic/apic.c:1052 (discriminator 2))
[  699.775553][    C0]  handle_exception (arch/x86/entry/entry_32.S:1055)
[  699.775553][    C0] EIP: tick_clock_notify (arch/x86/include/asm/bitops.h:55 include/asm-generic/bitops/instrumented-atomic.h:29 kernel/time/tick-sched.c:1633)
[  699.775553][    C0] Code: 8b 45 e4 89 1d 24 d5 6a 83 a3 38 d5 6a 83 89 15 3c d5 6a 83 83 c4 10 5b 5e 5f 5d c3 2e 8d b4 26 00 00 00 00 8d b6 00 00 00 00 <80> 0d 44 d5 6a 83 01 c3 2e 8d b4 26 00 00 00 00 80 0d 44 d5 6a 83
All code
========
   0:	8b 45 e4             	mov    -0x1c(%rbp),%eax
   3:	89 1d 24 d5 6a 83    	mov    %ebx,-0x7c952adc(%rip)        # 0xffffffff836ad52d
   9:	a3 38 d5 6a 83 89 15 	movabs %eax,0xd53c1589836ad538
  10:	3c d5 
  12:	6a 83                	push   $0xffffffffffffff83
  14:	83 c4 10             	add    $0x10,%esp
  17:	5b                   	pop    %rbx
  18:	5e                   	pop    %rsi
  19:	5f                   	pop    %rdi
  1a:	5d                   	pop    %rbp
  1b:	c3                   	ret
  1c:	2e 8d b4 26 00 00 00 	cs lea 0x0(%rsi,%riz,1),%esi
  23:	00 
  24:	8d b6 00 00 00 00    	lea    0x0(%rsi),%esi
  2a:*	80 0d 44 d5 6a 83 01 	orb    $0x1,-0x7c952abc(%rip)        # 0xffffffff836ad575		<-- trapping instruction
  31:	c3                   	ret
  32:	2e 8d b4 26 00 00 00 	cs lea 0x0(%rsi,%riz,1),%esi
  39:	00 
  3a:	80                   	.byte 0x80
  3b:	0d 44 d5 6a 83       	or     $0x836ad544,%eax

Code starting with the faulting instruction
===========================================
   0:	80 0d 44 d5 6a 83 01 	orb    $0x1,-0x7c952abc(%rip)        # 0xffffffff836ad54b
   7:	c3                   	ret
   8:	2e 8d b4 26 00 00 00 	cs lea 0x0(%rsi,%riz,1),%esi
   f:	00 
  10:	80                   	.byte 0x80
  11:	0d 44 d5 6a 83       	or     $0x836ad544,%eax
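
A note on reading the "Code:" dump above: the listing appears to have been
decoded in 64-bit mode (it shows %rip-relative operands and 64-bit register
names), while the kernel under test is an i386 build. Under 32-bit decoding
rules, a ModRM byte with mod=00 and rm=101 selects an absolute 32-bit
displacement, not a %rip-relative one. The sketch below, with hypothetical
helper names, locates the byte marked with angle brackets and extracts that
absolute address from the trapping instruction. It is a hand-rolled decode
of this one instruction shape, not a general disassembler.

```python
import re

# "Code:" bytes copied from the soft lockup report above.
CODE_LINE = (
    "8b 45 e4 89 1d 24 d5 6a 83 a3 38 d5 6a 83 89 15 3c d5 6a 83 "
    "83 c4 10 5b 5e 5f 5d c3 2e 8d b4 26 00 00 00 00 8d b6 00 00 "
    "00 00 <80> 0d 44 d5 6a 83 01 c3 2e 8d b4 26 00 00 00 00 80 0d "
    "44 d5 6a 83"
)


def parse_code_line(text):
    """Return (code bytes, offset of the byte wrapped in <..>)."""
    data = bytearray()
    trap_offset = None
    for token in text.split():
        marked = re.fullmatch(r"<([0-9a-f]{2})>", token)
        if marked:
            trap_offset = len(data)
            token = marked.group(1)
        data.append(int(token, 16))
    return bytes(data), trap_offset


def abs_disp32_operand(code, offset):
    """Decode the memory operand of `opcode, ModRM, disp32` in 32-bit mode.

    Only handles ModRM with mod=00 and rm=101 (disp32, no base register).
    """
    modrm = code[offset + 1]
    if modrm >> 6 != 0 or modrm & 0x7 != 5:
        raise ValueError("ModRM byte does not encode a bare disp32 operand")
    return int.from_bytes(code[offset + 2:offset + 6], "little")


if __name__ == "__main__":
    code, trap = parse_code_line(CODE_LINE)
    addr = abs_disp32_operand(code, trap)
    print(f"trapping byte at offset {trap:#x}, memory operand at {addr:#010x}")
```

Read this way, the interrupted instruction is an `orb $0x1` on the absolute
address 0x836ad544, which fits the set_bit() call that the EIP line
attributes to tick_clock_notify().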


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20251216/202512161547.cd3a9187-lkp@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

