All of lore.kernel.org
 help / color / mirror / Atom feed
From: kernel test robot <oliver.sang@intel.com>
To: Kairui Song <kasong@tencent.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
	<linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	Baoquan He <bhe@redhat.com>, Barry Song <v-songbaohua@oppo.com>,
	Chris Li <chrisl@kernel.org>,
	"Huang, Ying" <ying.huang@linux.alibaba.com>,
	Hugh Dickins <hughd@google.com>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	Kalesh Singh <kaleshsingh@google.com>,
	"Matthew Wilcow" <willy@infradead.org>,
	Nhat Pham <nphamcs@gmail.com>,
	Yosry Ahmed <yosryahmed@google.com>, <linux-mm@kvack.org>,
	<oliver.sang@intel.com>
Subject: [linus:master] [mm, swap]  b487a2da35: BUG:soft_lockup-CPU##stuck_for#s![stress-ng-swap:#]
Date: Fri, 16 May 2025 14:36:39 +0800	[thread overview]
Message-ID: <202505161438.9009cf47-lkp@intel.com> (raw)



Hello,

kernel test robot noticed "BUG:soft_lockup-CPU##stuck_for#s![stress-ng-swap:#]" on:

commit: b487a2da3575b6cdfb6d6559311830c8fea70bb9 ("mm, swap: simplify folio swap allocation")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[test failed on linus/master      fee3e843b309444f48157e2188efa6818bae85cf]
[test failed on linux-next/master 484803582c77061b470ac64a634f25f89715be3f]

in testcase: stress-ng
version: stress-ng-x86_64-f76f86ffb-1_20250412
with following parameters:

	nr_threads: 100%
	disk: 1HDD
	testtime: 60s
	fs: xfs
	test: swap
	cpufreq_governor: performance



config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory

(please refer to attached dmesg/kmsg for entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202505161438.9009cf47-lkp@intel.com


[   76.825919][   C17] watchdog: BUG: soft lockup - CPU#17 stuck for 26s! [stress-ng-swap:3017]
[   76.825921][   C17] Modules linked in: xfs ipmi_ssif intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common i10nm_edac skx_edac_common nfit libnvdimm x86_pkg_temp_thermal coretemp kvm_intel btrfs kvm sd_mod sg snd_pcm blake2b_generic ghash_clmulni_intel xor dax_hmem ast raid6_pq snd_timer rapl cxl_acpi drm_client_lib ahci intel_cstate cxl_port drm_shmem_helper snd libahci cxl_core intel_th_gth nvme acpi_power_meter isst_if_mbox_pci isst_if_mmio soundcore intel_th_pci mei_me intel_uncore ioatdma i2c_i801 megaraid_sas ipmi_si einj acpi_ipmi pcspkr libata drm_kms_helper nvme_core mei isst_if_common intel_th intel_pch_thermal i2c_smbus intel_vsec dca wmi ipmi_devintf ipmi_msghandler joydev binfmt_misc drm fuse dm_mod loop ip_tables
[   76.825949][   C17] CPU: 17 UID: 0 PID: 3017 Comm: stress-ng-swap Not tainted 6.14.0-rc6-00228-gb487a2da3575 #1
[   76.825951][   C17] Hardware name: Inspur NF5180M6/NF5180M6, BIOS 06.00.04 04/12/2022
[ 76.825952][ C17] RIP: 0010:shmem_unuse (include/linux/list.h:119 include/linux/list.h:215 include/linux/list.h:287 mm/shmem.c:1497) 
[ 76.825959][ C17] Code: 30 48 8d 6b 30 4c 8d 60 d0 48 81 fd 20 e3 26 83 0f 84 8a 00 00 00 48 83 7b 18 00 75 89 48 8b 43 38 48 8b 53 30 48 39 28 75 6c <48> 39 6a 08 75 66 48 89 42 08 48 89 10 48 89 6b 30 48 89 6b 38 4c
All code
========
   0:	30 48 8d             	xor    %cl,-0x73(%rax)
   3:	6b 30 4c             	imul   $0x4c,(%rax),%esi
   6:	8d 60 d0             	lea    -0x30(%rax),%esp
   9:	48 81 fd 20 e3 26 83 	cmp    $0xffffffff8326e320,%rbp
  10:	0f 84 8a 00 00 00    	je     0xa0
  16:	48 83 7b 18 00       	cmpq   $0x0,0x18(%rbx)
  1b:	75 89                	jne    0xffffffffffffffa6
  1d:	48 8b 43 38          	mov    0x38(%rbx),%rax
  21:	48 8b 53 30          	mov    0x30(%rbx),%rdx
  25:	48 39 28             	cmp    %rbp,(%rax)
  28:	75 6c                	jne    0x96
  2a:*	48 39 6a 08          	cmp    %rbp,0x8(%rdx)		<-- trapping instruction
  2e:	75 66                	jne    0x96
  30:	48 89 42 08          	mov    %rax,0x8(%rdx)
  34:	48 89 10             	mov    %rdx,(%rax)
  37:	48 89 6b 30          	mov    %rbp,0x30(%rbx)
  3b:	48 89 6b 38          	mov    %rbp,0x38(%rbx)
  3f:	4c                   	rex.WR

Code starting with the faulting instruction
===========================================
   0:	48 39 6a 08          	cmp    %rbp,0x8(%rdx)
   4:	75 66                	jne    0x6c
   6:	48 89 42 08          	mov    %rax,0x8(%rdx)
   a:	48 89 10             	mov    %rdx,(%rax)
   d:	48 89 6b 30          	mov    %rbp,0x30(%rbx)
  11:	48 89 6b 38          	mov    %rbp,0x38(%rbx)
  15:	4c                   	rex.WR
[   76.825961][   C17] RSP: 0018:ffa00000232ffce0 EFLAGS: 00000246
[   76.825964][   C17] RAX: ff110020c063bde0 RBX: ff110020c063bdb0 RCX: 0000000000000006
[   76.825965][   C17] RDX: ff110020c063bde0 RSI: 0000000000000000 RDI: ff110020c063be2c
[   76.825966][   C17] RBP: ff110020c063bde0 R08: ffffffffffffffc0 R09: 0000000000000000
[   76.825967][   C17] R10: 000000000000ffff R11: ffa000000cafa000 R12: ff110020c063bdb0
[   76.825968][   C17] R13: ff110020c063be2c R14: 0000000000000014 R15: ff110020c063bdb0
[   76.825969][   C17] FS:  00007fcb691d2580(0000) GS:ff11003fbac80000(0000) knlGS:0000000000000000
[   76.825970][   C17] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   76.825971][   C17] CR2: 00007fcb6490f000 CR3: 00000001897e4006 CR4: 0000000000773ef0
[   76.825972][   C17] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   76.825973][   C17] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   76.825973][   C17] PKRU: 55555554
[   76.825974][   C17] Call Trace:
[   76.825975][   C17]  <IRQ>
[ 76.825976][ C17] ? watchdog_timer_fn (kernel/watchdog.c:770) 
[ 76.825980][ C17] ? __pfx_watchdog_timer_fn (kernel/watchdog.c:685) 
[ 76.825981][ C17] ? __hrtimer_run_queues (kernel/time/hrtimer.c:1801 kernel/time/hrtimer.c:1865) 
[ 76.825985][ C17] ? hrtimer_interrupt (kernel/time/hrtimer.c:1930) 
[ 76.825986][ C17] ? __sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1038 arch/x86/kernel/apic/apic.c:1055) 
[ 76.825989][ C17] ? sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1049 arch/x86/kernel/apic/apic.c:1049) 
[   76.825992][   C17]  </IRQ>
[   76.825992][   C17]  <TASK>
[ 76.825993][ C17] ? asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:702) 
[ 76.825997][ C17] ? shmem_unuse (include/linux/list.h:119 include/linux/list.h:215 include/linux/list.h:287 mm/shmem.c:1497) 
[ 76.825999][ C17] try_to_unuse (mm/swapfile.c:2295) 
[ 76.826003][ C17] __do_sys_swapoff (mm/swapfile.c:2736) 
[ 76.826005][ C17] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83) 
[ 76.826008][ C17] ? __rseq_handle_notify_resume (kernel/rseq.c:420) 
[ 76.826012][ C17] ? syscall_exit_to_user_mode (include/linux/rseq.h:38 include/linux/resume_user_mode.h:62 kernel/entry/common.c:114 include/linux/entry-common.h:329 kernel/entry/common.c:207 kernel/entry/common.c:218) 
[ 76.826013][ C17] ? do_syscall_64 (arch/x86/entry/common.c:102) 
[ 76.826015][ C17] ? __perf_sw_event (kernel/events/internal.h:229 kernel/events/core.c:10210 kernel/events/core.c:10235) 
[ 76.826016][ C17] ? handle_mm_fault (include/linux/perf_event.h:1503 mm/memory.c:6106 mm/memory.c:6233) 
[ 76.826020][ C17] ? do_user_addr_fault (arch/x86/mm/fault.c:1338) 
[ 76.826022][ C17] ? clear_bhb_loop (arch/x86/entry/entry_64.S:1538) 
[ 76.826024][ C17] ? clear_bhb_loop (arch/x86/entry/entry_64.S:1538) 
[ 76.826026][ C17] ? clear_bhb_loop (arch/x86/entry/entry_64.S:1538) 
[ 76.826028][ C17] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) 
[   76.826030][   C17] RIP: 0033:0x7fcb69976fc7
[ 76.826031][ C17] Code: 73 01 c3 48 8b 0d 39 2e 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 a8 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 09 2e 0d 00 f7 d8 64 89 01 48
All code
========
   0:	73 01                	jae    0x3
   2:	c3                   	ret
   3:	48 8b 0d 39 2e 0d 00 	mov    0xd2e39(%rip),%rcx        # 0xd2e43
   a:	f7 d8                	neg    %eax
   c:	64 89 01             	mov    %eax,%fs:(%rcx)
   f:	48 83 c8 ff          	or     $0xffffffffffffffff,%rax
  13:	c3                   	ret
  14:	66 2e 0f 1f 84 00 00 	cs nopw 0x0(%rax,%rax,1)
  1b:	00 00 00 
  1e:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  23:	b8 a8 00 00 00       	mov    $0xa8,%eax
  28:	0f 05                	syscall
  2a:*	48 3d 01 f0 ff ff    	cmp    $0xfffffffffffff001,%rax		<-- trapping instruction
  30:	73 01                	jae    0x33
  32:	c3                   	ret
  33:	48 8b 0d 09 2e 0d 00 	mov    0xd2e09(%rip),%rcx        # 0xd2e43
  3a:	f7 d8                	neg    %eax
  3c:	64 89 01             	mov    %eax,%fs:(%rcx)
  3f:	48                   	rex.W

Code starting with the faulting instruction
===========================================
   0:	48 3d 01 f0 ff ff    	cmp    $0xfffffffffffff001,%rax
   6:	73 01                	jae    0x9
   8:	c3                   	ret
   9:	48 8b 0d 09 2e 0d 00 	mov    0xd2e09(%rip),%rcx        # 0xd2e19
  10:	f7 d8                	neg    %eax
  12:	64 89 01             	mov    %eax,%fs:(%rcx)
  15:	48                   	rex.W
[   76.826032][   C17] RSP: 002b:00007ffd6d92dbd8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a8
[   76.826033][   C17] RAX: ffffffffffffffda RBX: 0000000000000019 RCX: 00007fcb69976fc7
[   76.826034][   C17] RDX: fffffffffffffccc RSI: 000000000003d000 RDI: 00007ffd6d92e690
[   76.826035][   C17] RBP: 00007fcb64910000 R08: 0000000000000007 R09: 0000559500e1f530
[   76.826036][   C17] R10: c7c9ff097bf742d3 R11: 0000000000000206 R12: 00007fcb691d2180
[   76.826036][   C17] R13: 00007ffd6d92e690 R14: 0000000000000001 R15: 0000000000000000
[   76.826037][   C17]  </TASK>
[   76.826038][   C17] Kernel panic - not syncing: softlockup: hung tasks
[   77.239533][   C17] CPU: 17 UID: 0 PID: 3017 Comm: stress-ng-swap Tainted: G             L     6.14.0-rc6-00228-gb487a2da3575 #1
[   77.251353][   C17] Tainted: [L]=SOFTLOCKUP
[   77.255672][   C17] Hardware name: Inspur NF5180M6/NF5180M6, BIOS 06.00.04 04/12/2022
[   77.263635][   C17] Call Trace:
[   77.266915][   C17]  <IRQ>
[ 77.269756][ C17] panic (kernel/panic.c:354) 
[ 77.273643][ C17] watchdog_timer_fn (kernel/watchdog.c:735) 
[ 77.278568][ C17] ? __pfx_watchdog_timer_fn (kernel/watchdog.c:685) 
[ 77.284015][ C17] __hrtimer_run_queues (kernel/time/hrtimer.c:1801 kernel/time/hrtimer.c:1865) 
[ 77.289201][ C17] hrtimer_interrupt (kernel/time/hrtimer.c:1930) 
[ 77.294040][ C17] __sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1038 arch/x86/kernel/apic/apic.c:1055) 
[ 77.299835][ C17] sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1049 arch/x86/kernel/apic/apic.c:1049) 
[   77.305454][   C17]  </IRQ>
[   77.308382][   C17]  <TASK>
[ 77.311313][ C17] asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:702) 
[ 77.317275][ C17] RIP: 0010:shmem_unuse (include/linux/list.h:119 include/linux/list.h:215 include/linux/list.h:287 mm/shmem.c:1497) 
[ 77.322373][ C17] Code: 30 48 8d 6b 30 4c 8d 60 d0 48 81 fd 20 e3 26 83 0f 84 8a 00 00 00 48 83 7b 18 00 75 89 48 8b 43 38 48 8b 53 30 48 39 28 75 6c <48> 39 6a 08 75 66 48 89 42 08 48 89 10 48 89 6b 30 48 89 6b 38 4c
All code
========
   0:	30 48 8d             	xor    %cl,-0x73(%rax)
   3:	6b 30 4c             	imul   $0x4c,(%rax),%esi
   6:	8d 60 d0             	lea    -0x30(%rax),%esp
   9:	48 81 fd 20 e3 26 83 	cmp    $0xffffffff8326e320,%rbp
  10:	0f 84 8a 00 00 00    	je     0xa0
  16:	48 83 7b 18 00       	cmpq   $0x0,0x18(%rbx)
  1b:	75 89                	jne    0xffffffffffffffa6
  1d:	48 8b 43 38          	mov    0x38(%rbx),%rax
  21:	48 8b 53 30          	mov    0x30(%rbx),%rdx
  25:	48 39 28             	cmp    %rbp,(%rax)
  28:	75 6c                	jne    0x96
  2a:*	48 39 6a 08          	cmp    %rbp,0x8(%rdx)		<-- trapping instruction
  2e:	75 66                	jne    0x96
  30:	48 89 42 08          	mov    %rax,0x8(%rdx)
  34:	48 89 10             	mov    %rdx,(%rax)
  37:	48 89 6b 30          	mov    %rbp,0x30(%rbx)
  3b:	48 89 6b 38          	mov    %rbp,0x38(%rbx)
  3f:	4c                   	rex.WR


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250516/202505161438.9009cf47-lkp@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


             reply	other threads:[~2025-05-16  6:36 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-16  6:36 kernel test robot [this message]
2025-05-16  7:19 ` [linus:master] [mm, swap] b487a2da35: BUG:soft_lockup-CPU##stuck_for#s![stress-ng-swap:#] Kairui Song
2025-05-16  7:39   ` Kemeng Shi
2025-05-16  7:45     ` Kairui Song
2025-05-16  8:13       ` Kemeng Shi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=202505161438.9009cf47-lkp@intel.com \
    --to=oliver.sang@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=baolin.wang@linux.alibaba.com \
    --cc=bhe@redhat.com \
    --cc=chrisl@kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=kaleshsingh@google.com \
    --cc=kasong@tencent.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lkp@intel.com \
    --cc=nphamcs@gmail.com \
    --cc=oe-lkp@lists.linux.dev \
    --cc=v-songbaohua@oppo.com \
    --cc=willy@infradead.org \
    --cc=ying.huang@linux.alibaba.com \
    --cc=yosryahmed@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.