From: kernel test robot <oliver.sang@intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
<linux-kernel@vger.kernel.org>, <oliver.sang@intel.com>
Subject: [peterz-queue:sched/hrtick] [entry,hrtimer,x86] c07c4e0c01: BUG:soft_lockup-CPU##stuck_for#s![schbench:#]
Date: Fri, 28 Mar 2025 09:24:05 +0800
Message-ID: <202503280925.27fefb28-lkp@intel.com>
Hello,
kernel test robot noticed "BUG:soft_lockup-CPU##stuck_for#s![schbench:#]" on:
commit: c07c4e0c013dc11dd466fa63a4af12ef8282b27b ("entry,hrtimer,x86: Push reprogramming timers into the interrupt return path")
https://git.kernel.org/cgit/linux/kernel/git/peterz/queue.git sched/hrtick
in testcase: schbench
version: schbench-x86_64-48aed1d-1_20241103
with the following parameters:
	iterations: 3x
	message_threads: 10%
	worker_threads: 128
	runtime: 300s
	cpufreq_governor: performance
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
(please refer to attached dmesg/kmsg for entire log/backtrace)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202503280925.27fefb28-lkp@intel.com
[ 120.056174][ C17] watchdog: BUG: soft lockup - CPU#17 stuck for 22s! [schbench:4939]
[ 120.056179][ C17] Modules linked in: kmem intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common device_dax nd_pmem nd_btt dax_pmem i10nm_edac skx_edac_common x86_pkg_temp_thermal intel_powerclamp coretemp btrfs blake2b_generic xor raid6_pq sd_mod kvm_intel sg kvm snd_pcm ast snd_timer dax_hmem ghash_clmulni_intel rapl drm_client_lib ahci cxl_acpi snd ipmi_ssif drm_shmem_helper intel_cstate isst_if_mmio isst_if_mbox_pci acpi_power_meter cxl_port libahci binfmt_misc intel_th_gth cxl_core mei_me soundcore ipmi_si ioatdma i2c_i801 intel_th_pci intel_uncore einj acpi_ipmi pcspkr libata mei isst_if_common drm_kms_helper i2c_smbus intel_pch_thermal intel_vsec intel_th dca wmi nfit ipmi_devintf libnvdimm ipmi_msghandler acpi_pad joydev drm fuse dm_mod loop ip_tables
[ 120.056218][ C17] CPU: 17 UID: 0 PID: 4939 Comm: schbench Tainted: G S 6.14.0-01502-gc07c4e0c013d #1 VOLUNTARY
[ 120.056221][ C17] Tainted: [S]=CPU_OUT_OF_SPEC
[ 120.056222][ C17] Hardware name: Intel Corporation M50CYP2SB1U/M50CYP2SB1U, BIOS SE5C620.86B.01.01.0003.2104260124 04/26/2021
[ 120.056223][ C17] RIP: 0010:native_queued_spin_lock_slowpath (kernel/locking/qspinlock.c:474)
[ 120.056234][ C17] Code: c1 e9 12 83 e0 03 83 e9 01 48 c1 e0 05 48 63 c9 48 05 80 2b e5 83 48 03 04 cd e0 cc bc 82 48 89 10 8b 42 08 85 c0 75 09 f3 90 <8b> 42 08 85 c0 74 f7 48 8b 0a 48 85 c9 74 90 0f 0d 09 eb 91 8b 03
All code
========
0: c1 e9 12 shr $0x12,%ecx
3: 83 e0 03 and $0x3,%eax
6: 83 e9 01 sub $0x1,%ecx
9: 48 c1 e0 05 shl $0x5,%rax
d: 48 63 c9 movslq %ecx,%rcx
10: 48 05 80 2b e5 83 add $0xffffffff83e52b80,%rax
16: 48 03 04 cd e0 cc bc add -0x7d433320(,%rcx,8),%rax
1d: 82
1e: 48 89 10 mov %rdx,(%rax)
21: 8b 42 08 mov 0x8(%rdx),%eax
24: 85 c0 test %eax,%eax
26: 75 09 jne 0x31
28: f3 90 pause
2a:* 8b 42 08 mov 0x8(%rdx),%eax <-- trapping instruction
2d: 85 c0 test %eax,%eax
2f: 74 f7 je 0x28
31: 48 8b 0a mov (%rdx),%rcx
34: 48 85 c9 test %rcx,%rcx
37: 74 90 je 0xffffffffffffffc9
39: 0f 0d 09 prefetchw (%rcx)
3c: eb 91 jmp 0xffffffffffffffcf
3e: 8b 03 mov (%rbx),%eax
Code starting with the faulting instruction
===========================================
0: 8b 42 08 mov 0x8(%rdx),%eax
3: 85 c0 test %eax,%eax
5: 74 f7 je 0xfffffffffffffffe
7: 48 8b 0a mov (%rdx),%rcx
a: 48 85 c9 test %rcx,%rcx
d: 74 90 je 0xffffffffffffff9f
f: 0f 0d 09 prefetchw (%rcx)
12: eb 91 jmp 0xffffffffffffffa5
14: 8b 03 mov (%rbx),%eax
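For readers who want to check the decode by hand: in the "Code:" line the byte at the reported RIP is wrapped in angle brackets, and its position gives the offset that the listing above marks with "2a:*". The following is only a minimal sketch of that bookkeeping; the helper name trap_offset is hypothetical and this is not the kernel's own decode script.

```python
def trap_offset(code_line: str) -> int:
    """Return the byte offset of the <..>-marked byte in an oops 'Code:' line."""
    # Keep only the hex bytes that follow the "Code:" label (if present).
    tokens = code_line.split("Code:", 1)[-1].split()
    for offset, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            return offset
    raise ValueError("no marked byte in Code: line")


# The Code: line reported for CPU 17 above.
code = (
    "Code: c1 e9 12 83 e0 03 83 e9 01 48 c1 e0 05 48 63 c9 48 05 80 2b e5 83 "
    "48 03 04 cd e0 cc bc 82 48 89 10 8b 42 08 85 c0 75 09 f3 90 <8b> 42 08 "
    "85 c0 74 f7 48 8b 0a 48 85 c9 74 90 0f 0d 09 eb 91 8b 03"
)

print(hex(trap_offset(code)))  # 0x2a
```

Offset 0x2a is the load inside the pause/load/test/je loop at 0x28-0x2f, so the CPU was interrupted while polling the word at 0x8(%rdx).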
[ 120.056236][ C17] RSP: 0000:ffa00000222dfd68 EFLAGS: 00000246
[ 120.056238][ C17] RAX: 0000000000000000 RBX: ffd40000055f6568 RCX: 000000000000002a
[ 120.056239][ C17] RDX: ff1100103f671b80 RSI: 0000000000ac0101 RDI: ffd40000055f6568
[ 120.056241][ C17] RBP: ff1100103f671b80 R08: 0000000000000000 R09: 0000000000000000
[ 120.056242][ C17] R10: 0000000055555554 R11: ff11000240ff850c R12: 0000000000480000
[ 120.056242][ C17] R13: 0000000000480000 R14: 0200000000000000 R15: 0000000000000000
[ 120.056243][ C17] FS: 00007f75844266c0(0000) GS:ff110010bb81f000(0000) knlGS:0000000000000000
[ 120.056245][ C17] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 120.056246][ C17] CR2: 00007f76e0415c70 CR3: 00000001f83fc002 CR4: 0000000000773ef0
[ 120.056247][ C17] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 120.056247][ C17] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 120.056248][ C17] PKRU: 55555554
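The register dump can be read against the qspinlock word layout. The sketch below assumes the layout used when NR_CPUS is below 16K (locked byte in bits 0-7, pending in bit 8, tail index in bits 16-17, tail CPU plus one in bits 18-31), which is consistent with the shr $0x12 / and $0x3 / sub $0x1 sequence in the disassembly. It also assumes that RSI still holds the lock value passed to the slowpath, which the log alone does not prove. The function name decode_qspinlock is hypothetical.

```python
def decode_qspinlock(val: int) -> dict:
    """Split a 32-bit qspinlock word into its fields (NR_CPUS < 16K layout assumed)."""
    tail_cpu_plus_one = (val >> 18) & 0x3FFF
    return {
        "locked": val & 0xFF,            # bits 0-7
        "pending": (val >> 8) & 0x1,     # bit 8
        "tail_idx": (val >> 16) & 0x3,   # bits 16-17
        # The tail CPU is stored as cpu + 1 so that 0 can mean "no tail".
        "tail_cpu": tail_cpu_plus_one - 1 if tail_cpu_plus_one else None,
    }


# RSI from the register dump above (assumed to be the lock value).
print(decode_qspinlock(0x00AC0101))
# {'locked': 1, 'pending': 1, 'tail_idx': 0, 'tail_cpu': 42}
```

Under those assumptions the previous queue tail was CPU 42, which agrees with RCX: 000000000000002a in the same dump.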
[ 120.056249][ C17] Call Trace:
[ 120.056250][ C17] <TASK>
[ 120.056252][ C17] _raw_spin_lock (arch/x86/include/asm/paravirt.h:572 arch/x86/include/asm/qspinlock.h:51 include/asm-generic/qspinlock.h:114 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154)
[ 120.056254][ C17] do_huge_pmd_numa_page (mm/huge_memory.c:1976)
[ 120.056259][ C17] __handle_mm_fault (mm/memory.c:6014)
[ 120.056264][ C17] handle_mm_fault (mm/memory.c:6197)
[ 120.056266][ C17] do_user_addr_fault (arch/x86/mm/fault.c:1338)
[ 120.056272][ C17] exc_page_fault (arch/x86/include/asm/irqflags.h:37 arch/x86/include/asm/irqflags.h:92 arch/x86/mm/fault.c:1488 arch/x86/mm/fault.c:1538)
[ 120.056275][ C17] asm_exc_page_fault (arch/x86/include/asm/idtentry.h:623)
[ 120.056278][ C17] RIP: 0033:0x55f6cc692d8b
[ 120.056280][ C17] Code: e3 ff ff 8b 05 86 82 00 00 85 c0 0f 84 f7 02 00 00 48 8b 15 b7 33 00 00 31 db 48 85 d2 0f 84 30 01 00 00 4c 8b 15 55 82 00 00 <4d> 8b b7 70 98 10 00 4d 89 d5 4e 8d 1c d5 00 00 00 00 4d 0f af ea
All code
========
0: e3 ff jrcxz 0x1
2: ff 8b 05 86 82 00 decl 0x828605(%rbx)
8: 00 85 c0 0f 84 f7 add %al,-0x87bf040(%rbp)
e: 02 00 add (%rax),%al
10: 00 48 8b add %cl,-0x75(%rax)
13: 15 b7 33 00 00 adc $0x33b7,%eax
18: 31 db xor %ebx,%ebx
1a: 48 85 d2 test %rdx,%rdx
1d: 0f 84 30 01 00 00 je 0x153
23: 4c 8b 15 55 82 00 00 mov 0x8255(%rip),%r10 # 0x827f
2a:* 4d 8b b7 70 98 10 00 mov 0x109870(%r15),%r14 <-- trapping instruction
31: 4d 89 d5 mov %r10,%r13
34: 4e 8d 1c d5 00 00 00 lea 0x0(,%r10,8),%r11
3b: 00
3c: 4d 0f af ea imul %r10,%r13
Code starting with the faulting instruction
===========================================
0: 4d 8b b7 70 98 10 00 mov 0x109870(%r15),%r14
7: 4d 89 d5 mov %r10,%r13
a: 4e 8d 1c d5 00 00 00 lea 0x0(,%r10,8),%r11
11: 00
12: 4d 0f af ea imul %r10,%r13
[ 120.056281][ C17] RSP: 002b:00007f7584425df0 EFLAGS: 00010206
[ 120.056282][ C17] RAX: 0000000000000000 RBX: 000055f6e61615d0 RCX: 0000000000000000
[ 120.056283][ C17] RDX: 0000000000000005 RSI: 0000000000000000 RDI: 000055f6e61615d0
[ 120.056284][ C17] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 120.056284][ C17] R10: 0000000000000068 R11: 0000000000000293 R12: 00007f76e0315c70
[ 120.056285][ C17] R13: 0000000000000011 R14: 00007f76e030c420 R15: 00007f76e030c400
[ 120.056287][ C17] </TASK>
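The two EFLAGS values in this trace (00000246 in the kernel, 00010206 in user space) can be decoded with a small table of the architectural flag bits. This is a minimal sketch with a hypothetical helper name, decode_eflags; only the commonly inspected bits are listed.

```python
# Architectural x86 EFLAGS bit positions (subset).
EFLAGS_BITS = {
    0: "CF", 2: "PF", 4: "AF", 6: "ZF", 7: "SF",
    8: "TF", 9: "IF", 10: "DF", 11: "OF", 16: "RF",
}


def decode_eflags(value: int) -> list:
    """Return the names of the flags set in an EFLAGS value, lowest bit first."""
    return [name for bit, name in sorted(EFLAGS_BITS.items()) if (value >> bit) & 1]


print(decode_eflags(0x00000246))  # ['PF', 'ZF', 'IF']
print(decode_eflags(0x00010206))  # ['PF', 'IF', 'RF']
```

IF is set in the kernel-side value, so interrupts were enabled while CPU 17 was spinning on the lock. That fits a soft lockup report, since the watchdog timer could still fire on the stuck CPU.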
[ 120.056288][ C17] Kernel panic - not syncing: softlockup: hung tasks
[ 120.410327][ C17] CPU: 17 UID: 0 PID: 4939 Comm: schbench Tainted: G S L 6.14.0-01502-gc07c4e0c013d #1 VOLUNTARY
[ 120.422640][ C17] Tainted: [S]=CPU_OUT_OF_SPEC, [L]=SOFTLOCKUP
[ 120.428974][ C17] Hardware name: Intel Corporation M50CYP2SB1U/M50CYP2SB1U, BIOS SE5C620.86B.01.01.0003.2104260124 04/26/2021
[ 120.441111][ C17] Call Trace:
[ 120.444577][ C17] <IRQ>
[ 120.447593][ C17] panic (kernel/panic.c:354)
[ 120.451654][ C17] watchdog_timer_fn (kernel/watchdog.c:733)
[ 120.456739][ C17] ? __pfx_watchdog_timer_fn (kernel/watchdog.c:683)
[ 120.462344][ C17] __hrtimer_run_queues (kernel/time/hrtimer.c:1799 kernel/time/hrtimer.c:1863)
[ 120.467684][ C17] hrtimer_interrupt (kernel/time/hrtimer.c:1960)
[ 120.472753][ C17] __sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1038 arch/x86/kernel/apic/apic.c:1055)
[ 120.478688][ C17] sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1049 arch/x86/kernel/apic/apic.c:1049)
[ 120.484437][ C17] </IRQ>
[ 120.487494][ C17] <TASK>
[ 120.490535][ C17] asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:702)
[ 120.496622][ C17] RIP: 0010:native_queued_spin_lock_slowpath (kernel/locking/qspinlock.c:474)
[ 120.503754][ C17] Code: c1 e9 12 83 e0 03 83 e9 01 48 c1 e0 05 48 63 c9 48 05 80 2b e5 83 48 03 04 cd e0 cc bc 82 48 89 10 8b 42 08 85 c0 75 09 f3 90 <8b> 42 08 85 c0 74 f7 48 8b 0a 48 85 c9 74 90 0f 0d 09 eb 91 8b 03
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250328/202503280925.27fefb28-lkp@intel.com
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
2025-03-28 1:24 kernel test robot [this message]
2025-04-02 12:41 ` [peterz-queue:sched/hrtick] [entry,hrtimer,x86] c07c4e0c01: BUG:soft_lockup-CPU##stuck_for#s![schbench:#] Peter Zijlstra