From: kernel test robot <oliver.sang@intel.com>
To: Qi Zheng <qi.zheng@linux.dev>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
	Qi Zheng <zhengqi.arch@bytedance.com>, Zi Yan <ziy@nvidia.com>,
	David Hildenbrand <david@redhat.com>, <linux-mm@kvack.org>,
	<hannes@cmpxchg.org>, <hughd@google.com>, <mhocko@suse.com>,
	<roman.gushchin@linux.dev>, <shakeel.butt@linux.dev>,
	<muchun.song@linux.dev>, <lorenzo.stoakes@oracle.com>,
	<harry.yoo@oracle.com>, <baolin.wang@linux.alibaba.com>,
	<Liam.Howlett@oracle.com>, <npache@redhat.com>,
	<ryan.roberts@arm.com>, <dev.jain@arm.com>, <baohua@kernel.org>,
	<lance.yang@linux.dev>, <akpm@linux-foundation.org>,
	<linux-kernel@vger.kernel.org>, <cgroups@vger.kernel.org>,
	Muchun Song <songmuchun@bytedance.com>, <oliver.sang@intel.com>
Subject: Re: [PATCH v4 3/4] mm: thp: use folio_batch to handle THP splitting in deferred_split_scan()
Date: Tue, 14 Oct 2025 22:25:11 +0800
Message-ID: <202510142245.b857a693-lkp@intel.com>
In-Reply-To: <304df1ad1e8180e102c4d6931733bcc77774eb9e.1759510072.git.zhengqi.arch@bytedance.com>



Hello,

kernel test robot noticed "BUG:soft_lockup-CPU##stuck_for#s![#:#]" on:

commit: 9d46e7734bc76dcb83ab0591de7f7ba94234140a ("[PATCH v4 3/4] mm: thp: use folio_batch to handle THP splitting in deferred_split_scan()")
url: https://github.com/intel-lab-lkp/linux/commits/Qi-Zheng/mm-thp-replace-folio_memcg-with-folio_memcg_charged/20251004-005605
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git e406d57be7bd2a4e73ea512c1ae36a40a44e499e
patch link: https://lore.kernel.org/all/304df1ad1e8180e102c4d6931733bcc77774eb9e.1759510072.git.zhengqi.arch@bytedance.com/
patch subject: [PATCH v4 3/4] mm: thp: use folio_batch to handle THP splitting in deferred_split_scan()

in testcase: xfstests
version: xfstests-x86_64-5a9cd3ef-1_20250910
with the following parameters:

	disk: 4HDD
	fs: f2fs
	test: generic-group-39



config: x86_64-rhel-9.4-func
compiler: gcc-14
test machine: 8 threads Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz (Skylake) with 16G memory

(please refer to attached dmesg/kmsg for entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202510142245.b857a693-lkp@intel.com


[  102.427367][    C6] watchdog: BUG: soft lockup - CPU#6 stuck for 26s! [391:4643]
[  102.427373][    C6] Modules linked in: dm_mod f2fs binfmt_misc btrfs snd_hda_codec_intelhdmi blake2b_generic snd_hda_codec_hdmi xor zstd_compress intel_rapl_msr intel_rapl_common snd_ctl_led raid6_pq snd_hda_codec_alc269 snd_hda_scodec_component x86_pkg_temp_thermal snd_hda_codec_realtek_lib intel_powerclamp snd_hda_codec_generic coretemp sd_mod snd_hda_intel kvm_intel sg snd_soc_avs snd_soc_hda_codec snd_hda_ext_core i915 snd_hda_codec kvm intel_gtt snd_hda_core drm_buddy snd_intel_dspcfg ttm snd_intel_sdw_acpi snd_hwdep drm_display_helper snd_soc_core irqbypass cec ghash_clmulni_intel drm_client_lib snd_compress mei_wdt drm_kms_helper snd_pcm ahci rapl wmi_bmof mei_me libahci snd_timer intel_cstate i2c_i801 video snd libata soundcore pcspkr intel_uncore mei serio_raw i2c_smbus intel_pmc_core intel_pch_thermal pmt_telemetry pmt_discovery pmt_class wmi intel_pmc_ssram_telemetry acpi_pad intel_vsec drm fuse nfnetlink
[  102.427441][    C6] CPU: 6 UID: 0 PID: 4643 Comm: 391 Tainted: G S                  6.17.0-09102-g9d46e7734bc7 #1 PREEMPT(voluntary)
[  102.427446][    C6] Tainted: [S]=CPU_OUT_OF_SPEC
[  102.427448][    C6] Hardware name: HP HP Z240 SFF Workstation/802E, BIOS N51 Ver. 01.63 10/05/2017
[  102.427449][    C6] RIP: 0010:_raw_spin_unlock_irqrestore (include/linux/spinlock_api_smp.h:152 (discriminator 2) kernel/locking/spinlock.c:194 (discriminator 2))
[  102.427456][    C6] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 c6 07 00 0f 1f 00 f7 c6 00 02 00 00 74 01 fb 65 ff 0d 65 de 2d 03 <74> 05 c3 cc cc cc cc 0f 1f 44 00 00 c3 cc cc cc cc 0f 1f 40 00 90
All code
========
   0:	90                   	nop
   1:	90                   	nop
   2:	90                   	nop
   3:	90                   	nop
   4:	90                   	nop
   5:	90                   	nop
   6:	90                   	nop
   7:	90                   	nop
   8:	90                   	nop
   9:	90                   	nop
   a:	90                   	nop
   b:	90                   	nop
   c:	90                   	nop
   d:	90                   	nop
   e:	90                   	nop
   f:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  14:	c6 07 00             	movb   $0x0,(%rdi)
  17:	0f 1f 00             	nopl   (%rax)
  1a:	f7 c6 00 02 00 00    	test   $0x200,%esi
  20:	74 01                	je     0x23
  22:	fb                   	sti
  23:	65 ff 0d 65 de 2d 03 	decl   %gs:0x32dde65(%rip)        # 0x32dde8f
  2a:*	74 05                	je     0x31		<-- trapping instruction
  2c:	c3                   	ret
  2d:	cc                   	int3
  2e:	cc                   	int3
  2f:	cc                   	int3
  30:	cc                   	int3
  31:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  36:	c3                   	ret
  37:	cc                   	int3
  38:	cc                   	int3
  39:	cc                   	int3
  3a:	cc                   	int3
  3b:	0f 1f 40 00          	nopl   0x0(%rax)
  3f:	90                   	nop

Code starting with the faulting instruction
===========================================
   0:	74 05                	je     0x7
   2:	c3                   	ret
   3:	cc                   	int3
   4:	cc                   	int3
   5:	cc                   	int3
   6:	cc                   	int3
   7:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
   c:	c3                   	ret
   d:	cc                   	int3
   e:	cc                   	int3
   f:	cc                   	int3
  10:	cc                   	int3
  11:	0f 1f 40 00          	nopl   0x0(%rax)
  15:	90                   	nop
[  102.427458][    C6] RSP: 0018:ffffc900028cf6c8 EFLAGS: 00000246
[  102.427461][    C6] RAX: ffff88810cb32900 RBX: ffffc900028cfa38 RCX: ffff88810cb32910
[  102.427463][    C6] RDX: ffff88810cb32900 RSI: 0000000000000246 RDI: ffff88810cb328f8
[  102.427465][    C6] RBP: ffff88810cb32870 R08: 0000000000000001 R09: fffff52000519ece
[  102.427467][    C6] R10: 0000000000000003 R11: ffffc900028cf770 R12: ffffea0008400034
[  102.427469][    C6] R13: dffffc0000000000 R14: ffff88810cb32900 R15: ffff88810cb32870
[  102.427470][    C6] FS:  00007f2303266740(0000) GS:ffff8883f6bb8000(0000) knlGS:0000000000000000
[  102.427473][    C6] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  102.427475][    C6] CR2: 00005651624c9c70 CR3: 00000001e42c6002 CR4: 00000000003726f0
[  102.427477][    C6] Call Trace:
[  102.427478][    C6]  <TASK>
[  102.427480][    C6]  deferred_split_scan (mm/huge_memory.c:4166 mm/huge_memory.c:4240)
[  102.427485][    C6]  ? __list_lru_walk_one (include/linux/spinlock.h:392 mm/list_lru.c:110 mm/list_lru.c:331)
[  102.427490][    C6]  ? __pfx_deferred_split_scan (mm/huge_memory.c:4195)
[  102.427495][    C6]  ? list_lru_count_one (mm/list_lru.c:263 (discriminator 1))
[  102.427498][    C6]  ? super_cache_scan (fs/super.c:232)
[  102.427501][    C6]  ? super_cache_count (fs/super.c:265 (discriminator 1))
[  102.427504][    C6]  do_shrink_slab (mm/shrinker.c:438)
[  102.427509][    C6]  shrink_slab_memcg (mm/shrinker.c:551)
[  102.427513][    C6]  ? __pfx_shrink_slab_memcg (mm/shrinker.c:471)
[  102.427517][    C6]  ? do_shrink_slab (mm/shrinker.c:384)
[  102.427521][    C6]  shrink_slab (mm/shrinker.c:628)
[  102.427524][    C6]  ? __pfx_shrink_slab (mm/shrinker.c:616)
[  102.427528][    C6]  ? mem_cgroup_iter (include/linux/percpu-refcount.h:338 include/linux/percpu-refcount.h:351 include/linux/cgroup_refcnt.h:79 include/linux/cgroup_refcnt.h:76 mm/memcontrol.c:1084)
[  102.427531][    C6]  drop_slab (mm/vmscan.c:435 (discriminator 1) mm/vmscan.c:452 (discriminator 1))
[  102.427534][    C6]  drop_caches_sysctl_handler (include/linux/vmstat.h:68 (discriminator 1) fs/drop_caches.c:69 (discriminator 1) fs/drop_caches.c:51 (discriminator 1))
[  102.427538][    C6]  proc_sys_call_handler (fs/proc/proc_sysctl.c:606)
[  102.427542][    C6]  ? __pfx_proc_sys_call_handler (fs/proc/proc_sysctl.c:555)
[  102.427545][    C6]  ? __pfx_handle_pte_fault (mm/memory.c:6134)
[  102.427548][    C6]  ? rw_verify_area (fs/read_write.c:473)
[  102.427552][    C6]  vfs_write (fs/read_write.c:594 (discriminator 1) fs/read_write.c:686 (discriminator 1))
[  102.427556][    C6]  ? __pfx_vfs_write (fs/read_write.c:667)
[  102.427559][    C6]  ? __pfx_css_rstat_updated (kernel/cgroup/rstat.c:71)
[  102.427563][    C6]  ? fdget_pos (arch/x86/include/asm/atomic64_64.h:15 include/linux/atomic/atomic-arch-fallback.h:2583 include/linux/atomic/atomic-long.h:38 include/linux/atomic/atomic-instrumented.h:3189 include/linux/file_ref.h:215 fs/file.c:1204 fs/file.c:1230)
[  102.427566][    C6]  ? count_memcg_events (arch/x86/include/asm/atomic.h:23 include/linux/atomic/atomic-arch-fallback.h:457 include/linux/atomic/atomic-instrumented.h:33 mm/memcontrol.c:560 mm/memcontrol.c:583 mm/memcontrol.c:564 mm/memcontrol.c:846)
[  102.427568][    C6]  ksys_write (fs/read_write.c:738)
[  102.427572][    C6]  ? __pfx_ksys_write (fs/read_write.c:728)
[  102.427575][    C6]  ? handle_mm_fault (mm/memory.c:6360 mm/memory.c:6513)
[  102.427579][    C6]  do_syscall_64 (arch/x86/entry/syscall_64.c:63 (discriminator 1) arch/x86/entry/syscall_64.c:94 (discriminator 1))
[  102.427583][    C6]  entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
[  102.427586][    C6] RIP: 0033:0x7f23032f8687
[  102.427589][    C6] Code: 48 89 fa 4c 89 df e8 58 b3 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff
All code
========
   0:	48 89 fa             	mov    %rdi,%rdx
   3:	4c 89 df             	mov    %r11,%rdi
   6:	e8 58 b3 00 00       	call   0xb363
   b:	8b 93 08 03 00 00    	mov    0x308(%rbx),%edx
  11:	59                   	pop    %rcx
  12:	5e                   	pop    %rsi
  13:	48 83 f8 fc          	cmp    $0xfffffffffffffffc,%rax
  17:	74 1a                	je     0x33
  19:	5b                   	pop    %rbx
  1a:	c3                   	ret
  1b:	0f 1f 84 00 00 00 00 	nopl   0x0(%rax,%rax,1)
  22:	00 
  23:	48 8b 44 24 10       	mov    0x10(%rsp),%rax
  28:	0f 05                	syscall
  2a:*	5b                   	pop    %rbx		<-- trapping instruction
  2b:	c3                   	ret
  2c:	0f 1f 80 00 00 00 00 	nopl   0x0(%rax)
  33:	83 e2 39             	and    $0x39,%edx
  36:	83 fa 08             	cmp    $0x8,%edx
  39:	75 de                	jne    0x19
  3b:	e8 23 ff ff ff       	call   0xffffffffffffff63

Code starting with the faulting instruction
===========================================
   0:	5b                   	pop    %rbx
   1:	c3                   	ret
   2:	0f 1f 80 00 00 00 00 	nopl   0x0(%rax)
   9:	83 e2 39             	and    $0x39,%edx
   c:	83 fa 08             	cmp    $0x8,%edx
   f:	75 de                	jne    0xffffffffffffffef
  11:	e8 23 ff ff ff       	call   0xffffffffffffff39
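
A note on reading the first register dump: _raw_spin_unlock_irqrestore() takes the saved flags as its second argument, so RSI (0x246) is presumably that flags word, and the "test $0x200,%esi" / "sti" pair in the decoded code re-enables interrupts when bit 9 (the interrupt flag) is set. If that reading is right, the watchdog interrupt was taken just after interrupts were re-enabled, so the reported RIP marks where the interrupt could be delivered rather than where the 26s were necessarily spent. The sketch below only decodes the flag bit; the helper name is hypothetical and it is an illustration, not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 flag bit 9: the interrupt enable flag (IF).
 * This is the 0x200 mask tested by "test $0x200,%esi" in the dump. */
#define EFLAGS_IF 0x200u

/* Hypothetical helper: true if a saved flags word had interrupts enabled. */
bool irqs_were_enabled(uint64_t saved_flags)
{
	return (saved_flags & EFLAGS_IF) != 0;
}
```

With the values from this report, irqs_were_enabled(0x246) is true, which is consistent with the "sti" path being taken before the trapping instruction.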


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20251014/202510142245.b857a693-lkp@intel.com
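
Separately from the reproduce materials above, the call trace shows how the path was entered: a write to the drop_caches sysctl reached drop_slab(), which ran the shrinkers and ended up in deferred_split_scan(). A minimal user-space sketch of such a write follows. It assumes the value written has bit 1 set (2 or 3), which is what makes drop_caches_sysctl_handler() call drop_slab(); the exact value written by the test is not visible in the log, and the function names here are hypothetical. Running it needs root and does not by itself set up the THP state that the xfstests run creates.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write a short string to an existing sysctl-style file.
 * Returns 0 on success, -1 on any failure.
 */
int write_sysctl(const char *path, const char *value)
{
	size_t len = strlen(value);
	ssize_t n;
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -1;
	n = write(fd, value, len);
	if (close(fd) < 0)
		return -1;
	return (n == (ssize_t)len) ? 0 : -1;
}

/* "2" asks the kernel to free reclaimable slab objects via drop_slab(). */
int drop_slab_caches(void)
{
	return write_sysctl("/proc/sys/vm/drop_caches", "2");
}
```

Calling drop_slab_caches() as root corresponds to "echo 2 > /proc/sys/vm/drop_caches" from a shell.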



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


Thread overview: 19+ messages
2025-10-03 16:53 [PATCH v4 0/4] reparent the THP split queue Qi Zheng
2025-10-03 16:53 ` [PATCH v4 1/4] mm: thp: replace folio_memcg() with folio_memcg_charged() Qi Zheng
2025-10-03 16:53 ` [PATCH v4 2/4] mm: thp: introduce folio_split_queue_lock and its variants Qi Zheng
2025-10-03 16:53 ` [PATCH v4 3/4] mm: thp: use folio_batch to handle THP splitting in deferred_split_scan() Qi Zheng
2025-10-06 23:16   ` Shakeel Butt
2025-10-13  7:28     ` Qi Zheng
2025-10-14 14:25   ` kernel test robot [this message]
2025-10-03 16:53 ` [PATCH v4 4/4] mm: thp: reparent the split queue during memcg offline Qi Zheng
2025-10-03 16:58   ` Zi Yan
2025-10-04  7:52   ` Muchun Song
2025-10-06  6:46   ` David Hildenbrand
2025-10-07 17:56   ` Shakeel Butt
2025-10-13  7:29     ` Qi Zheng
2025-10-10 16:25 ` [PATCH v4 0/4] reparent the THP split queue Zi Yan
2025-10-11  0:51   ` Qi Zheng
2025-10-11 18:28     ` Andrew Morton
2025-10-13  7:23   ` Qi Zheng
2025-10-13 16:37     ` Zi Yan
2025-10-14  6:49       ` Qi Zheng
