public inbox for kdevops@lists.linux.dev
* [linus:master] [x86/module]  5185e7f9f3: BUG:soft_lockup-CPU##stuck_for#s![perf:#]
From: kernel test robot @ 2024-12-12 14:36 UTC
  To: Mike Rapoport
  Cc: oe-lkp, lkp, linux-kernel, Andrew Morton, Luis Chamberlain,
	kdevops, Andreas Larsson, Andy Lutomirski, Ard Biesheuvel,
	Arnd Bergmann, Borislav Petkov, Brian Cain, Catalin Marinas,
	Christophe Leroy, Christoph Hellwig, Dave Hansen, Dinh Nguyen,
	Geert Uytterhoeven, Guo Ren, Helge Deller, Huacai Chen,
	Ingo Molnar, Johannes Berg, John Paul Adrian Glaubitz,
	Kent Overstreet, Liam R. Howlett, Mark Rutland, Masami Hiramatsu,
	Matt Turner, Max Filippov, Michael Ellerman, Michal Simek,
	Oleg Nesterov, Palmer Dabbelt, Peter Zijlstra, Richard Weinberger,
	Russell King, Song Liu, Stafford Horne, Steven Rostedt,
	Suren Baghdasaryan, Thomas Bogendoerfer, Thomas Gleixner,
	Uladzislau Rezki, Vineet Gupta, Will Deacon, oliver.sang



Hello,

kernel test robot noticed "BUG:soft_lockup-CPU##stuck_for#s![perf:#]" on:

commit: 5185e7f9f3bd754ab60680814afd714e2673ef88 ("x86/module: enable ROX caches for module text on 64 bit")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[test failed on linus/master      7503345ac5f5e82fd9a36d6e6b447c016376403a]
[test failed on linux-next/master ebe1b11614e079c5e366ce9bd3c8f44ca0fbcc1b]

in testcase: lkvs
version: lkvs-x86_64-2187c57-1_20241102
with the following parameters:

	test: pt



config: x86_64-dcg_x86_64_defconfig-func
compiler: gcc-12
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480+ (Sapphire Rapids) with 256G memory

(please refer to attached dmesg/kmsg for entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202412122201.67d21c2b-lkp@intel.com


[  737.450753][   C63] watchdog: BUG: soft lockup - CPU#63 stuck for 26s! [perf:95490]
[  737.460225][   C63] Modules linked in: intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common intel_ifs i10nm_edac skx_edac_common nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp dax_hmem ofpart iTCO_wdt cxl_acpi qat_4xxx intel_pmc_bxt kvm_intel spi_nor cxl_port pmt_telemetry iTCO_vendor_support mtd ipmi_ssif kvm isst_if_mbox_pci intel_th_gth isst_if_mmio intel_sdsi pmt_class intel_qat i2c_i801 cxl_core mei_me spi_intel_pci pinctrl_emmitsburg ast crc32c_intel einj pinctrl_intel intel_th_pci dh_generic drm_shmem_helper cdc_ether isst_if_common idxd mei i2c_smbus crc8 intel_vsec intel_th i2c_ismt spi_intel ipmi_si joydev pwm_lpss acpi_power_meter btrfs binfmt_misc fuse ip_tables
[  737.536652][   C63] CPU: 63 UID: 0 PID: 95490 Comm: perf Tainted: G S                 6.12.0-rc6-00142-g5185e7f9f3bd #1
[  737.549630][   C63] Tainted: [S]=CPU_OUT_OF_SPEC
[  737.555668][   C63] Hardware name: Intel Corporation D50DNP1SBB/D50DNP1SBB, BIOS SE5C7411.86B.8118.D04.2206151341 06/15/2022
[ 737.569144][ C63] RIP: 0010:find_vmap_area_exceed_addr_lock (mm/vmalloc.c:1034 mm/vmalloc.c:1066)
[ 737.577511][ C63] Code: 89 f8 48 c1 e8 03 42 80 3c 38 00 0f 85 62 02 00 00 48 8b 5b 10 48 85 db 74 3b 48 8d 7b f8 48 89 f8 48 c1 e8 03 42 80 3c 38 00 <0f> 85 f1 01 00 00 4c 3b 73 f8 72 a9 48 8d 7b 08 48 89 f8 48 c1 e8
All code
========
   0:	89 f8                	mov    %edi,%eax
   2:	48 c1 e8 03          	shr    $0x3,%rax
   6:	42 80 3c 38 00       	cmpb   $0x0,(%rax,%r15,1)
   b:	0f 85 62 02 00 00    	jne    0x273
  11:	48 8b 5b 10          	mov    0x10(%rbx),%rbx
  15:	48 85 db             	test   %rbx,%rbx
  18:	74 3b                	je     0x55
  1a:	48 8d 7b f8          	lea    -0x8(%rbx),%rdi
  1e:	48 89 f8             	mov    %rdi,%rax
  21:	48 c1 e8 03          	shr    $0x3,%rax
  25:	42 80 3c 38 00       	cmpb   $0x0,(%rax,%r15,1)
  2a:*	0f 85 f1 01 00 00    	jne    0x221		<-- trapping instruction
  30:	4c 3b 73 f8          	cmp    -0x8(%rbx),%r14
  34:	72 a9                	jb     0xffffffffffffffdf
  36:	48 8d 7b 08          	lea    0x8(%rbx),%rdi
  3a:	48 89 f8             	mov    %rdi,%rax
  3d:	48                   	rex.W
  3e:	c1                   	.byte 0xc1
  3f:	e8                   	.byte 0xe8

Code starting with the faulting instruction
===========================================
   0:	0f 85 f1 01 00 00    	jne    0x1f7
   6:	4c 3b 73 f8          	cmp    -0x8(%rbx),%r14
   a:	72 a9                	jb     0xffffffffffffffb5
   c:	48 8d 7b 08          	lea    0x8(%rbx),%rdi
  10:	48 89 f8             	mov    %rdi,%rax
  13:	48                   	rex.W
  14:	c1                   	.byte 0xc1
  15:	e8                   	.byte 0xe8
[  737.600980][   C63] RSP: 0018:ffa0000035d3f7c0 EFLAGS: 00000246
[  737.608491][   C63] RAX: 1fe220043da2917a RBX: ff110021ed148bd8 RCX: ffffffff813cb361
[  737.618177][   C63] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ff110021ed148bd0
[  737.627846][   C63] RBP: 000000000000005e R08: 0000000000000001 R09: fff3fc0006ba7eea
[  737.637506][   C63] R10: 0000000000000003 R11: 00007fece9080fff R12: ffffffffc0801000
[  737.647177][   C63] R13: ff1100010c992bc8 R14: ffffffffc0600000 R15: dffffc0000000000
[  737.656846][   C63] FS:  00007fed2a754840(0000) GS:ff11003fcc180000(0000) knlGS:0000000000000000
[  737.667580][   C63] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  737.675680][   C63] CR2: ff1100013ca00000 CR3: 0000002169956003 CR4: 0000000000f73ef0
[  737.685338][   C63] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  737.694977][   C63] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[  737.704624][   C63] PKRU: 55555554
[  737.709273][   C63] Call Trace:
[  737.713624][   C63]  <IRQ>
[ 737.717447][ C63] ? watchdog_timer_fn (kernel/watchdog.c:762)
[ 737.723792][ C63] ? __pfx_watchdog_timer_fn (kernel/watchdog.c:677)
[ 737.730545][ C63] ? __hrtimer_run_queues (kernel/time/hrtimer.c:1691 kernel/time/hrtimer.c:1755)
[ 737.737187][ C63] ? __pfx___hrtimer_run_queues (kernel/time/hrtimer.c:1725)
[ 737.744234][ C63] ? ktime_get_update_offsets_now (kernel/time/timekeeping.c:195 (discriminator 3) kernel/time/timekeeping.c:395 (discriminator 3) kernel/time/timekeeping.c:403 (discriminator 3) kernel/time/timekeeping.c:2449 (discriminator 3))
[ 737.751558][ C63] ? hrtimer_interrupt (kernel/time/hrtimer.c:1820)
[ 737.757924][ C63] ? __sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1038 arch/x86/kernel/apic/apic.c:1055)
[ 737.765354][ C63] ? sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1049 arch/x86/kernel/apic/apic.c:1049)
[  737.772485][   C63]  </IRQ>
[  737.776390][   C63]  <TASK>
[ 737.780307][ C63] ? asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:702)
[  737.787823][   C63]  ? 0xffffffffc0600000
[ 737.793079][ C63] ? do_raw_spin_lock (arch/x86/include/asm/atomic.h:107 include/linux/atomic/atomic-arch-fallback.h:2170 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 kernel/locking/spinlock_debug.c:116)
[ 737.799294][ C63] ? find_vmap_area_exceed_addr_lock (mm/vmalloc.c:1034 mm/vmalloc.c:1066)
[  737.806842][   C63]  ? 0xffffffffc0600000
[  737.812040][   C63]  ? 0xffffffffc0600000
[ 737.817228][ C63] vread_iter (mm/vmalloc.c:4354)
[ 737.822483][ C63] ? __pfx_fault_in_safe_writeable (mm/gup.c:2185)
[ 737.829690][ C63] ? __pfx_vread_iter (mm/vmalloc.c:4337)
[  737.835622][   C63]  ? 0xffffffffc0600000
[  737.840761][   C63]  ? 0xffffffffc0600000
[ 737.845890][ C63] read_kcore_iter (fs/proc/kcore.c:534)
[  737.851801][   C63]  ? 0xffffffffc0600000
[ 737.856897][ C63] ? __pfx_read_kcore_iter (fs/proc/kcore.c:325)
[ 737.863261][ C63] ? __filemap_add_folio (mm/filemap.c:943)
[ 737.869597][ C63] ? __pfx___filemap_add_folio (mm/filemap.c:852)
[ 737.876300][ C63] ? __pfx_workingset_update_node (mm/workingset.c:617)
[ 737.883288][ C63] ? preempt_count_add (include/linux/ftrace.h:976 kernel/sched/core.c:5777 kernel/sched/core.c:5774 kernel/sched/core.c:5802)
[ 737.889298][ C63] ? __folio_batch_add_and_move (arch/x86/include/asm/preempt.h:103 mm/swap.c:246)
[ 737.896253][ C63] ? preempt_count_add (include/linux/ftrace.h:976 kernel/sched/core.c:5777 kernel/sched/core.c:5774 kernel/sched/core.c:5802)
[ 737.902233][ C63] ? copy_page_from_iter_atomic (include/linux/highmem-internal.h:234 lib/iov_iter.c:484)
[ 737.909273][ C63] ? __vfs_getxattr (fs/xattr.c:419)
[ 737.914937][ C63] ? __pfx_copy_page_from_iter_atomic (lib/iov_iter.c:462)
[ 737.922254][ C63] ? simple_write_end (arch/x86/include/asm/atomic.h:67 include/linux/atomic/atomic-arch-fallback.h:2278 include/linux/atomic/atomic-instrumented.h:1384 include/linux/page_ref.h:205 include/linux/mm.h:1141 include/linux/mm.h:1146 include/linux/mm.h:1477 fs/libfs.c:985)
[ 737.928198][ C63] ? generic_perform_write (mm/filemap.c:4077)
[ 737.934609][ C63] ? __pfx___fsnotify_parent (fs/notify/fsnotify.c:216)
[ 737.941037][ C63] ? file_update_time (fs/inode.c:2272)
[ 737.946882][ C63] ? preempt_count_add (include/linux/ftrace.h:976 kernel/sched/core.c:5777 kernel/sched/core.c:5774 kernel/sched/core.c:5802)
[ 737.952802][ C63] proc_reg_read_iter (fs/proc/inode.c:299)
[ 737.958732][ C63] vfs_read (fs/read_write.c:488 fs/read_write.c:569)
[ 737.963662][ C63] ? __pfx_vfs_read (fs/read_write.c:550)
[ 737.969169][ C63] ? __asan_memset (mm/kasan/shadow.c:84)
[ 737.974591][ C63] ? preempt_count_add (include/linux/ftrace.h:976 kernel/sched/core.c:5777 kernel/sched/core.c:5774 kernel/sched/core.c:5802)
[ 737.980499][ C63] ? fdget_pos (arch/x86/include/asm/atomic64_64.h:15 include/linux/atomic/atomic-arch-fallback.h:2583 include/linux/atomic/atomic-long.h:38 include/linux/atomic/atomic-instrumented.h:3189 fs/file.c:1150 fs/file.c:1158)
[ 737.985704][ C63] ksys_read (fs/read_write.c:713)
[ 737.990625][ C63] ? __pfx_ksys_read (fs/read_write.c:702)
[ 737.996234][ C63] ? fpregs_assert_state_consistent (arch/x86/kernel/fpu/context.h:38 arch/x86/kernel/fpu/core.c:822)
[ 738.003307][ C63] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
[ 738.008632][ C63] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
[  738.015535][   C63] RIP: 0033:0x7fed2b67b19d
[ 738.020760][ C63] Code: 31 c0 e9 c6 fe ff ff 50 48 8d 3d 66 54 0a 00 e8 49 ff 01 00 66 0f 1f 84 00 00 00 00 00 80 3d 41 24 0e 00 00 74 17 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 5b c3 66 2e 0f 1f 84 00 00 00 00 00 48 83 ec
All code
========
   0:	31 c0                	xor    %eax,%eax
   2:	e9 c6 fe ff ff       	jmp    0xfffffffffffffecd
   7:	50                   	push   %rax
   8:	48 8d 3d 66 54 0a 00 	lea    0xa5466(%rip),%rdi        # 0xa5475
   f:	e8 49 ff 01 00       	call   0x1ff5d
  14:	66 0f 1f 84 00 00 00 	nopw   0x0(%rax,%rax,1)
  1b:	00 00 
  1d:	80 3d 41 24 0e 00 00 	cmpb   $0x0,0xe2441(%rip)        # 0xe2465
  24:	74 17                	je     0x3d
  26:	31 c0                	xor    %eax,%eax
  28:	0f 05                	syscall
  2a:*	48 3d 00 f0 ff ff    	cmp    $0xfffffffffffff000,%rax		<-- trapping instruction
  30:	77 5b                	ja     0x8d
  32:	c3                   	ret
  33:	66 2e 0f 1f 84 00 00 	cs nopw 0x0(%rax,%rax,1)
  3a:	00 00 00 
  3d:	48                   	rex.W
  3e:	83                   	.byte 0x83
  3f:	ec                   	in     (%dx),%al

Code starting with the faulting instruction
===========================================
   0:	48 3d 00 f0 ff ff    	cmp    $0xfffffffffffff000,%rax
   6:	77 5b                	ja     0x63
   8:	c3                   	ret
   9:	66 2e 0f 1f 84 00 00 	cs nopw 0x0(%rax,%rax,1)
  10:	00 00 00 
  13:	48                   	rex.W
  14:	83                   	.byte 0x83
  15:	ec                   	in     (%dx),%al
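
A note on reading the first disassembly: the `lea -0x8(%rbx),%rdi` / `shr $0x3,%rax` / `cmpb $0x0,(%rax,%r15,1)` sequence around the trapping instruction looks like a compiler-inserted KASAN shadow-memory check rather than logic of the vmalloc lookup itself (`__asan_memset` in the call trace suggests a KASAN-enabled build). The sketch below is not part of the original report. It assumes generic KASAN on x86_64, with one shadow byte per 8 bytes of memory and the shadow offset held in R15, and only re-checks the arithmetic against the register dump above. The helper name `kasan_shadow_addr` is hypothetical.

```python
# Minimal sketch, assuming generic KASAN on x86_64:
# shadow_addr = (addr >> 3) + KASAN_SHADOW_OFFSET
KASAN_SHADOW_SCALE_SHIFT = 3                # one shadow byte covers 8 bytes
KASAN_SHADOW_OFFSET = 0xDFFFFC0000000000    # value of R15 in the dump
MASK64 = (1 << 64) - 1


def kasan_shadow_addr(addr: int) -> int:
    """Return the shadow byte address for addr (hypothetical helper)."""
    return ((addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET) & MASK64


# Register values copied from the dump above.
rbx = 0xFF110021ED148BD8
rdi = 0xFF110021ED148BD0
rax = 0x1FE220043DA2917A

# lea -0x8(%rbx),%rdi : RDI is the address about to be loaded.
assert rdi == rbx - 8
# mov %rdi,%rax ; shr $0x3,%rax : RAX is that address scaled down by 8.
assert rax == rdi >> KASAN_SHADOW_SCALE_SHIFT

# cmpb $0x0,(%rax,%r15,1) reads the shadow byte at this address.
print(hex(kasan_shadow_addr(rdi)))
```

Under those assumptions the registers are self-consistent: the CPU was interrupted in the middle of an ordinary instrumented load inside the walk in `find_vmap_area_exceed_addr_lock`, which fits a long-running or repeated traversal rather than a fault on that particular access.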


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241212/202412122201.67d21c2b-lkp@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

