* [cxl:for-7.0/cxl-init] [dax/hmem, e820, resource] bc62f5b308: BUG:soft_lockup-CPU##stuck_for#s![kworker:#:#]
From: kernel test robot @ 2026-01-21 5:11 UTC
To: Dan Williams
Cc: oe-lkp, lkp, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams, linux-cxl, Dave Jiang, Smita Koralahalli,
linux-kernel, nvdimm, oliver.sang

Hello,

FYI, we don't have enough knowledge to understand how the issues we found
in the tests are related to the code. We ran the tests up to 200 times
for both this commit and its parent and noticed various random issues on
this commit, while the parent was always clean.

=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/sleep:
vm-snb/boot/debian-11.1-i386-20220923.cgz/i386-randconfig-141-20260117/gcc-14/1
29317f8dc6ed601e bc62f5b308cbdedf29132fe96e9
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
            :200           2%          5:200  dmesg.BUG:soft_lockup-CPU##stuck_for#s![kworker##:#]
            :200           2%          5:200  dmesg.BUG:soft_lockup-CPU##stuck_for#s![kworker:#:#]
            :200           8%         17:200  dmesg.BUG:soft_lockup-CPU##stuck_for#s![swapper:#]
            :200           2%          4:200  dmesg.BUG:workqueue_lockup-pool
            :200           0%          1:200  dmesg.EIP:__schedule
            :200           0%          1:200  dmesg.EIP:_raw_spin_unlock_irq
            :200           2%          4:200  dmesg.EIP:_raw_spin_unlock_irqrestore
            :200           6%         11:200  dmesg.EIP:console_emit_next_record
            :200           0%          1:200  dmesg.EIP:finish_task_switch
            :200           3%          6:200  dmesg.EIP:lock_acquire
            :200           1%          2:200  dmesg.EIP:lock_release
            :200           1%          2:200  dmesg.EIP:queue_work_on
            :200           0%          1:200  dmesg.EIP:rcu_preempt_deferred_qs_irqrestore
            :200           1%          2:200  dmesg.EIP:timekeeping_notify
            :200           0%          1:200  dmesg.INFO:rcu_preempt_detected_stalls_on_CPUs/tasks
            :200           0%          1:200  dmesg.INFO:task_blocked_for_more_than#seconds
            :200          14%         27:200  dmesg.Kernel_panic-not_syncing:softlockup:hung_tasks
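
For readers checking the table, the %reproduction column can be recomputed from the fail:runs counts. The sketch below is not taken from the robot's tooling; the helper name `reproduction_pct` and the shortened row labels are made up for illustration. The round-half-to-even rule is only inferred from the numbers above (for example 5:200 shows 2% while 27:200 shows 14%).

```python
from fractions import Fraction

RUNS = 200  # runs per commit, as stated in the report

# Fail counts copied from the table; labels are shortened for illustration.
fails = {
    "soft_lockup [kworker]": 5,
    "soft_lockup [swapper]": 17,
    "EIP:console_emit_next_record": 11,
    "EIP:__schedule": 1,
    "Kernel_panic softlockup": 27,
}

def reproduction_pct(fail, runs):
    """Percentage of failing runs, rounded half-to-even (assumed rule)."""
    # Fraction keeps the ratio exact, and round() on a Fraction
    # rounds exact halves to the nearest even integer.
    return round(Fraction(fail * 100, runs))

for name, fail in fails.items():
    print(f"{fail:>3}:{RUNS}  {reproduction_pct(fail, RUNS):>3}%  {name}")
```

Under that assumed rule the computed values agree with every row listed in the table, including the rows that sit exactly on a half percent.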

The full report is below.

kernel test robot noticed "BUG:soft_lockup-CPU##stuck_for#s![kworker:#:#]" on:
commit: bc62f5b308cbdedf29132fe96e9d591e526527e1 ("dax/hmem, e820, resource: Defer Soft Reserved insertion until hmem is ready")
https://git.kernel.org/cgit/linux/kernel/git/cxl/cxl.git for-7.0/cxl-init
in testcase: boot
config: i386-randconfig-141-20260117
compiler: gcc-14
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 32G
(please refer to attached dmesg/kmsg for entire log/backtrace)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202601211001.82fe0f1b-lkp@intel.com
[ 674.140379][ C0] watchdog: BUG: soft lockup - CPU#0 stuck for 626s! [kworker/0:2:18]
[ 674.140379][ C0] Modules linked in:
[ 674.140379][ C0] irq event stamp: 192928
[ 674.140379][ C0] hardirqs last enabled at (192927): rcu_preempt_deferred_qs_irqrestore (arch/x86/include/asm/irqflags.h:26 arch/x86/include/asm/irqflags.h:109 arch/x86/include/asm/irqflags.h:151 kernel/rcu/tree_plugin.h:587)
[ 674.140379][ C0] hardirqs last disabled at (192928): sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1056)
[ 674.140379][ C0] softirqs last enabled at (192850): handle_softirqs (kernel/softirq.c:469 (discriminator 2) kernel/softirq.c:650 (discriminator 2))
[ 674.140379][ C0] softirqs last disabled at (192839): __do_softirq (kernel/softirq.c:657)
[ 674.140379][ C0] CPU: 0 UID: 0 PID: 18 Comm: kworker/0:2 Not tainted 6.19.0-rc4-00007-gbc62f5b308cb #1 PREEMPT(lazy) 9b7ba6dd04fa63ebf0e343a2cc1c803e2e6231bd
[ 674.140379][ C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 674.140379][ C0] Workqueue: rcu_gp strict_work_handler
[ 674.140379][ C0] EIP: lock_release (kernel/locking/lockdep.c:5893)
[ 674.140379][ C0] Code: b8 ff ff ff ff 0f c1 05 48 c2 ff c3 48 0f 85 95 00 00 00 9c 58 f6 c4 02 0f 85 aa 00 00 00 81 e7 00 02 00 00 74 01 fb 8d 65 f4 <5b> 5e 5f 5d c3 2e 8d b4 26 00 00 00 00 90 ff 05 14 e0 e7 c3 a1 5c
All code
========
0: b8 ff ff ff ff mov $0xffffffff,%eax
5: 0f c1 05 48 c2 ff c3 xadd %eax,-0x3c003db8(%rip) # 0xffffffffc3ffc254
c: 48 0f 85 95 00 00 00 rex.W jne 0xa8
13: 9c pushf
14: 58 pop %rax
15: f6 c4 02 test $0x2,%ah
18: 0f 85 aa 00 00 00 jne 0xc8
1e: 81 e7 00 02 00 00 and $0x200,%edi
24: 74 01 je 0x27
26: fb sti
27: 8d 65 f4 lea -0xc(%rbp),%esp
2a:* 5b pop %rbx <-- trapping instruction
2b: 5e pop %rsi
2c: 5f pop %rdi
2d: 5d pop %rbp
2e: c3 ret
2f: 2e 8d b4 26 00 00 00 cs lea 0x0(%rsi,%riz,1),%esi
36: 00
37: 90 nop
38: ff 05 14 e0 e7 c3 incl -0x3c181fec(%rip) # 0xffffffffc3e7e052
3e: a1 .byte 0xa1
3f: 5c pop %rsp
Code starting with the faulting instruction
===========================================
0: 5b pop %rbx
1: 5e pop %rsi
2: 5f pop %rdi
3: 5d pop %rbp
4: c3 ret
5: 2e 8d b4 26 00 00 00 cs lea 0x0(%rsi,%riz,1),%esi
c: 00
d: 90 nop
e: ff 05 14 e0 e7 c3 incl -0x3c181fec(%rip) # 0xffffffffc3e7e028
14: a1 .byte 0xa1
15: 5c pop %rsp
[ 674.140379][ C0] EAX: 00000047 EBX: c54814c0 ECX: c5622508 EDX: ffffffff
[ 674.140379][ C0] ESI: c122e710 EDI: 00000200 EBP: c562def4 ESP: c562dee8
[ 674.140379][ C0] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00000206
[ 674.140379][ C0] CR0: 80050033 CR2: ffda9000 CR3: 047db000 CR4: 00040690
[ 674.140379][ C0] Call Trace:
[ 674.140379][ C0] process_one_work (kernel/workqueue.c:3268)
[ 674.140379][ C0] worker_thread (kernel/workqueue.c:3334 (discriminator 2) kernel/workqueue.c:3421 (discriminator 2))
[ 674.140379][ C0] kthread (kernel/kthread.c:463)
[ 674.140379][ C0] ? rescuer_thread (kernel/workqueue.c:3367)
[ 674.140379][ C0] ? kthread_unpark (kernel/kthread.c:412)
[ 674.140379][ C0] ret_from_fork (arch/x86/kernel/process.c:164)
[ 674.140379][ C0] ? kthread_unpark (kernel/kthread.c:412)
[ 674.140379][ C0] ret_from_fork_asm (arch/x86/entry/entry_32.S:737)
[ 674.140379][ C0] entry_INT80_32 (arch/x86/entry/entry_32.S:945)
[ 674.140379][ C0] Kernel panic - not syncing: softlockup: hung tasks
[ 674.140379][ C0] CPU: 0 UID: 0 PID: 18 Comm: kworker/0:2 Tainted: G L 6.19.0-rc4-00007-gbc62f5b308cb #1 PREEMPT(lazy) 9b7ba6dd04fa63ebf0e343a2cc1c803e2e6231bd
[ 674.140379][ C0] Tainted: [L]=SOFTLOCKUP
[ 674.140379][ C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 674.140379][ C0] Workqueue: rcu_gp strict_work_handler
[ 674.140379][ C0] Call Trace:
[ 674.140379][ C0] dump_stack_lvl (lib/dump_stack.c:122)
[ 674.140379][ C0] dump_stack (lib/dump_stack.c:130)
[ 674.140379][ C0] vpanic (kernel/panic.c:487)
[ 674.140379][ C0] panic (kernel/panic.c:365)
[ 674.140379][ C0] watchdog_timer_fn.cold (kernel/watchdog.c:869)
[ 674.140379][ C0] ? softlockup_fn (kernel/watchdog.c:781)
[ 674.140379][ C0] __hrtimer_run_queues+0xa4/0x380
[ 674.140379][ C0] hrtimer_run_queues (kernel/time/hrtimer.c:1999)
[ 674.140379][ C0] update_process_times (kernel/time/timer.c:2455 (discriminator 3) kernel/time/timer.c:2473 (discriminator 3))
[ 674.140379][ C0] tick_periodic+0x33/0x100
[ 674.140379][ C0] tick_handle_periodic (kernel/time/tick-common.c:130)
[ 674.140379][ C0] ? vmware_sched_clock (arch/x86/kernel/apic/apic.c:1056)
[ 674.140379][ C0] __sysvec_apic_timer_interrupt (arch/x86/include/asm/trace/irq_vectors.h:40 (discriminator 4) arch/x86/include/asm/trace/irq_vectors.h:40 (discriminator 4) arch/x86/kernel/apic/apic.c:1063 (discriminator 4))
[ 674.140379][ C0] sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1056 (discriminator 2) arch/x86/kernel/apic/apic.c:1056 (discriminator 2))
[ 674.140379][ C0] ? process_one_work (kernel/workqueue.c:3266)
[ 674.140379][ C0] handle_exception (arch/x86/entry/entry_32.S:1048)
[ 674.140379][ C0] EIP: lock_release (kernel/locking/lockdep.c:5893)
[ 674.140379][ C0] Code: b8 ff ff ff ff 0f c1 05 48 c2 ff c3 48 0f 85 95 00 00 00 9c 58 f6 c4 02 0f 85 aa 00 00 00 81 e7 00 02 00 00 74 01 fb 8d 65 f4 <5b> 5e 5f 5d c3 2e 8d b4 26 00 00 00 00 90 ff 05 14 e0 e7 c3 a1 5c
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260121/202601211001.82fe0f1b-lkp@intel.com
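
In the `Code:` lines above, the byte wrapped in `<...>` is the byte at the reported EIP. For a soft lockup that is where the CPU was when the watchdog fired, not necessarily a faulting instruction. The sketch below shows how that marker maps to the `2a:*` offset in the decoded listing. The helper name `marked_offset` is hypothetical, and the byte string is copied from the trace.

```python
# Bytes copied from the "Code:" line of the first trace in this report.
CODE = (
    "b8 ff ff ff ff 0f c1 05 48 c2 ff c3 48 0f 85 95 00 00 00 9c 58 "
    "f6 c4 02 0f 85 aa 00 00 00 81 e7 00 02 00 00 74 01 fb 8d 65 f4 "
    "<5b> 5e 5f 5d c3 2e 8d b4 26 00 00 00 00 90 ff 05 14 e0 e7 c3 a1 5c"
)

def marked_offset(code_line):
    """Return (offset, byte value) of the token wrapped in <...>."""
    for index, token in enumerate(code_line.split()):
        if token.startswith("<") and token.endswith(">"):
            return index, int(token[1:-1], 16)
    raise ValueError("no marked byte in Code: line")

offset, byte = marked_offset(CODE)
print(f"marked byte 0x{byte:02x} at offset 0x{offset:x}")
```

The offset found this way matches the line flagged with `*` in the "All code" listing.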
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [cxl:for-7.0/cxl-init] [dax/hmem, e820, resource] bc62f5b308: BUG:soft_lockup-CPU##stuck_for#s![kworker:#:#]
From: dan.j.williams @ 2026-01-22 20:18 UTC
To: kernel test robot, Dan Williams
Cc: oe-lkp, lkp, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams, linux-cxl, Dave Jiang, Smita Koralahalli,
linux-kernel, nvdimm, oliver.sang
kernel test robot wrote:
>
>
> Hello,
>
> FYI. we don't have enough knowledge to understand how the issues we found
> in the tests are related with the code. we just run the tests up to 200 times
> for both this commit and parent, noticed there are various random issues on
> this commit, but always clean on parent.
>
>
> =========================================================================================
> tbox_group/testcase/rootfs/kconfig/compiler/sleep:
> vm-snb/boot/debian-11.1-i386-20220923.cgz/i386-randconfig-141-20260117/gcc-14/1
>
> 29317f8dc6ed601e bc62f5b308cbdedf29132fe96e9
> ---------------- ---------------------------
> fail:runs %reproduction fail:runs
> | | |
> :200 2% 5:200 dmesg.BUG:soft_lockup-CPU##stuck_for#s![kworker##:#]
> :200 2% 5:200 dmesg.BUG:soft_lockup-CPU##stuck_for#s![kworker:#:#]
> :200 8% 17:200 dmesg.BUG:soft_lockup-CPU##stuck_for#s![swapper:#]
> :200 2% 4:200 dmesg.BUG:workqueue_lockup-pool
> :200 0% 1:200 dmesg.EIP:__schedule
> :200 0% 1:200 dmesg.EIP:_raw_spin_unlock_irq
> :200 2% 4:200 dmesg.EIP:_raw_spin_unlock_irqrestore
> :200 6% 11:200 dmesg.EIP:console_emit_next_record
> :200 0% 1:200 dmesg.EIP:finish_task_switch
> :200 3% 6:200 dmesg.EIP:lock_acquire
> :200 1% 2:200 dmesg.EIP:lock_release
> :200 1% 2:200 dmesg.EIP:queue_work_on
> :200 0% 1:200 dmesg.EIP:rcu_preempt_deferred_qs_irqrestore
> :200 1% 2:200 dmesg.EIP:timekeeping_notify
> :200 0% 1:200 dmesg.INFO:rcu_preempt_detected_stalls_on_CPUs/tasks
> :200 0% 1:200 dmesg.INFO:task_blocked_for_more_than#seconds
> :200 14% 27:200 dmesg.Kernel_panic-not_syncing:softlockup:hung_tasks
>
> below is full report.

So this is good data, but I do not know what to do with it. The
RCU_STRICT_GRACE_PERIOD feature seems to want to make RCU usage bugs
more detectable, but at the risk of false positives. My concern is that
this patch disturbs 32-bit x86 builds just enough to make the softlockup
detector start getting upset about this rcu_gp::strict_work_handler
workqueue.

So unless this causes actual boot failures, all I can assume is that this
is a false positive report. Nothing in this patch touches workqueues
or object lifetimes, so I can only assume this is a side effect of
instruction cache layout, or similar.
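
One way to sanity-check the hypothesis in this reply would be to confirm that the randconfig from the reproduction materials really enables the strict grace period option. The sketch below is only an illustration: the helper `config_enabled` is hypothetical, and the sample text is made up, not the actual i386-randconfig-141-20260117 contents.

```python
def config_enabled(config_text, symbol):
    """True if `symbol` is set to y in kernel .config text."""
    wanted = f"{symbol}=y"
    return any(line.strip() == wanted for line in config_text.splitlines())

# Made-up sample for illustration; replace with the real config file contents.
sample = """\
CONFIG_RCU_EXPERT=y
CONFIG_RCU_STRICT_GRACE_PERIOD=y
CONFIG_SOFTLOCKUP_DETECTOR=y
# CONFIG_SMP is not set
"""

for symbol in ("CONFIG_RCU_STRICT_GRACE_PERIOD", "CONFIG_SOFTLOCKUP_DETECTOR"):
    state = "enabled" if config_enabled(sample, symbol) else "not enabled"
    print(f"{symbol}: {state}")
```

If the option turned out to be disabled in the real config, the false-positive explanation above would need another look.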