From: kernel test robot <oliver.sang@intel.com>
To: Qi Zheng <qi.zheng@linux.dev>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
Qi Zheng <zhengqi.arch@bytedance.com>, Zi Yan <ziy@nvidia.com>,
David Hildenbrand <david@redhat.com>, <linux-mm@kvack.org>,
<hannes@cmpxchg.org>, <hughd@google.com>, <mhocko@suse.com>,
<roman.gushchin@linux.dev>, <shakeel.butt@linux.dev>,
<muchun.song@linux.dev>, <lorenzo.stoakes@oracle.com>,
<harry.yoo@oracle.com>, <baolin.wang@linux.alibaba.com>,
<Liam.Howlett@oracle.com>, <npache@redhat.com>,
<ryan.roberts@arm.com>, <dev.jain@arm.com>, <baohua@kernel.org>,
<lance.yang@linux.dev>, <akpm@linux-foundation.org>,
<linux-kernel@vger.kernel.org>, <cgroups@vger.kernel.org>,
Muchun Song <songmuchun@bytedance.com>, <oliver.sang@intel.com>
Subject: Re: [PATCH v4 3/4] mm: thp: use folio_batch to handle THP splitting in deferred_split_scan()
Date: Tue, 14 Oct 2025 22:25:11 +0800 [thread overview]
Message-ID: <202510142245.b857a693-lkp@intel.com> (raw)
In-Reply-To: <304df1ad1e8180e102c4d6931733bcc77774eb9e.1759510072.git.zhengqi.arch@bytedance.com>
Hello,
kernel test robot noticed "BUG:soft_lockup-CPU##stuck_for#s![#:#]" on:
commit: 9d46e7734bc76dcb83ab0591de7f7ba94234140a ("[PATCH v4 3/4] mm: thp: use folio_batch to handle THP splitting in deferred_split_scan()")
url: https://github.com/intel-lab-lkp/linux/commits/Qi-Zheng/mm-thp-replace-folio_memcg-with-folio_memcg_charged/20251004-005605
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git e406d57be7bd2a4e73ea512c1ae36a40a44e499e
patch link: https://lore.kernel.org/all/304df1ad1e8180e102c4d6931733bcc77774eb9e.1759510072.git.zhengqi.arch@bytedance.com/
patch subject: [PATCH v4 3/4] mm: thp: use folio_batch to handle THP splitting in deferred_split_scan()
in testcase: xfstests
version: xfstests-x86_64-5a9cd3ef-1_20250910
with following parameters:
disk: 4HDD
fs: f2fs
test: generic-group-39
config: x86_64-rhel-9.4-func
compiler: gcc-14
test machine: 8 threads Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz (Skylake) with 16G memory
(please refer to attached dmesg/kmsg for entire log/backtrace)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202510142245.b857a693-lkp@intel.com
[ 102.427367][ C6] watchdog: BUG: soft lockup - CPU#6 stuck for 26s! [391:4643]
[ 102.427373][ C6] Modules linked in: dm_mod f2fs binfmt_misc btrfs snd_hda_codec_intelhdmi blake2b_generic snd_hda_codec_hdmi xor zstd_compress intel_rapl_msr intel_rapl_common snd_ctl_led raid6_pq snd_hda_codec_alc269 snd_hda_scodec_component x86_pkg_temp_thermal snd_hda_codec_realtek_lib intel_powerclamp snd_hda_codec_generic coretemp sd_mod snd_hda_intel kvm_intel sg snd_soc_avs snd_soc_hda_codec snd_hda_ext_core i915 snd_hda_codec kvm intel_gtt snd_hda_core drm_buddy snd_intel_dspcfg ttm snd_intel_sdw_acpi snd_hwdep drm_display_helper snd_soc_core irqbypass cec ghash_clmulni_intel drm_client_lib snd_compress mei_wdt drm_kms_helper snd_pcm ahci rapl wmi_bmof mei_me libahci snd_timer intel_cstate i2c_i801 video snd libata soundcore pcspkr intel_uncore mei serio_raw i2c_smbus intel_pmc_core intel_pch_thermal pmt_telemetry pmt_discovery pmt_class wmi intel_pmc_ssram_telemetry acpi_pad intel_vsec drm fuse nfnetlink
[ 102.427441][ C6] CPU: 6 UID: 0 PID: 4643 Comm: 391 Tainted: G S 6.17.0-09102-g9d46e7734bc7 #1 PREEMPT(voluntary)
[ 102.427446][ C6] Tainted: [S]=CPU_OUT_OF_SPEC
[ 102.427448][ C6] Hardware name: HP HP Z240 SFF Workstation/802E, BIOS N51 Ver. 01.63 10/05/2017
[ 102.427449][ C6] RIP: 0010:_raw_spin_unlock_irqrestore (include/linux/spinlock_api_smp.h:152 (discriminator 2) kernel/locking/spinlock.c:194 (discriminator 2))
[ 102.427456][ C6] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 c6 07 00 0f 1f 00 f7 c6 00 02 00 00 74 01 fb 65 ff 0d 65 de 2d 03 <74> 05 c3 cc cc cc cc 0f 1f 44 00 00 c3 cc cc cc cc 0f 1f 40 00 90
All code
========
0: 90 nop
1: 90 nop
2: 90 nop
3: 90 nop
4: 90 nop
5: 90 nop
6: 90 nop
7: 90 nop
8: 90 nop
9: 90 nop
a: 90 nop
b: 90 nop
c: 90 nop
d: 90 nop
e: 90 nop
f: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
14: c6 07 00 movb $0x0,(%rdi)
17: 0f 1f 00 nopl (%rax)
1a: f7 c6 00 02 00 00 test $0x200,%esi
20: 74 01 je 0x23
22: fb sti
23: 65 ff 0d 65 de 2d 03 decl %gs:0x32dde65(%rip) # 0x32dde8f
2a:* 74 05 je 0x31 <-- trapping instruction
2c: c3 ret
2d: cc int3
2e: cc int3
2f: cc int3
30: cc int3
31: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
36: c3 ret
37: cc int3
38: cc int3
39: cc int3
3a: cc int3
3b: 0f 1f 40 00 nopl 0x0(%rax)
3f: 90 nop
Code starting with the faulting instruction
===========================================
0: 74 05 je 0x7
2: c3 ret
3: cc int3
4: cc int3
5: cc int3
6: cc int3
7: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
c: c3 ret
d: cc int3
e: cc int3
f: cc int3
10: cc int3
11: 0f 1f 40 00 nopl 0x0(%rax)
15: 90 nop
[ 102.427458][ C6] RSP: 0018:ffffc900028cf6c8 EFLAGS: 00000246
[ 102.427461][ C6] RAX: ffff88810cb32900 RBX: ffffc900028cfa38 RCX: ffff88810cb32910
[ 102.427463][ C6] RDX: ffff88810cb32900 RSI: 0000000000000246 RDI: ffff88810cb328f8
[ 102.427465][ C6] RBP: ffff88810cb32870 R08: 0000000000000001 R09: fffff52000519ece
[ 102.427467][ C6] R10: 0000000000000003 R11: ffffc900028cf770 R12: ffffea0008400034
[ 102.427469][ C6] R13: dffffc0000000000 R14: ffff88810cb32900 R15: ffff88810cb32870
[ 102.427470][ C6] FS: 00007f2303266740(0000) GS:ffff8883f6bb8000(0000) knlGS:0000000000000000
[ 102.427473][ C6] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 102.427475][ C6] CR2: 00005651624c9c70 CR3: 00000001e42c6002 CR4: 00000000003726f0
[ 102.427477][ C6] Call Trace:
[ 102.427478][ C6] <TASK>
[ 102.427480][ C6] deferred_split_scan (mm/huge_memory.c:4166 mm/huge_memory.c:4240)
[ 102.427485][ C6] ? __list_lru_walk_one (include/linux/spinlock.h:392 mm/list_lru.c:110 mm/list_lru.c:331)
[ 102.427490][ C6] ? __pfx_deferred_split_scan (mm/huge_memory.c:4195)
[ 102.427495][ C6] ? list_lru_count_one (mm/list_lru.c:263 (discriminator 1))
[ 102.427498][ C6] ? super_cache_scan (fs/super.c:232)
[ 102.427501][ C6] ? super_cache_count (fs/super.c:265 (discriminator 1))
[ 102.427504][ C6] do_shrink_slab (mm/shrinker.c:438)
[ 102.427509][ C6] shrink_slab_memcg (mm/shrinker.c:551)
[ 102.427513][ C6] ? __pfx_shrink_slab_memcg (mm/shrinker.c:471)
[ 102.427517][ C6] ? do_shrink_slab (mm/shrinker.c:384)
[ 102.427521][ C6] shrink_slab (mm/shrinker.c:628)
[ 102.427524][ C6] ? __pfx_shrink_slab (mm/shrinker.c:616)
[ 102.427528][ C6] ? mem_cgroup_iter (include/linux/percpu-refcount.h:338 include/linux/percpu-refcount.h:351 include/linux/cgroup_refcnt.h:79 include/linux/cgroup_refcnt.h:76 mm/memcontrol.c:1084)
[ 102.427531][ C6] drop_slab (mm/vmscan.c:435 (discriminator 1) mm/vmscan.c:452 (discriminator 1))
[ 102.427534][ C6] drop_caches_sysctl_handler (include/linux/vmstat.h:68 (discriminator 1) fs/drop_caches.c:69 (discriminator 1) fs/drop_caches.c:51 (discriminator 1))
[ 102.427538][ C6] proc_sys_call_handler (fs/proc/proc_sysctl.c:606)
[ 102.427542][ C6] ? __pfx_proc_sys_call_handler (fs/proc/proc_sysctl.c:555)
[ 102.427545][ C6] ? __pfx_handle_pte_fault (mm/memory.c:6134)
[ 102.427548][ C6] ? rw_verify_area (fs/read_write.c:473)
[ 102.427552][ C6] vfs_write (fs/read_write.c:594 (discriminator 1) fs/read_write.c:686 (discriminator 1))
[ 102.427556][ C6] ? __pfx_vfs_write (fs/read_write.c:667)
[ 102.427559][ C6] ? __pfx_css_rstat_updated (kernel/cgroup/rstat.c:71)
[ 102.427563][ C6] ? fdget_pos (arch/x86/include/asm/atomic64_64.h:15 include/linux/atomic/atomic-arch-fallback.h:2583 include/linux/atomic/atomic-long.h:38 include/linux/atomic/atomic-instrumented.h:3189 include/linux/file_ref.h:215 fs/file.c:1204 fs/file.c:1230)
[ 102.427566][ C6] ? count_memcg_events (arch/x86/include/asm/atomic.h:23 include/linux/atomic/atomic-arch-fallback.h:457 include/linux/atomic/atomic-instrumented.h:33 mm/memcontrol.c:560 mm/memcontrol.c:583 mm/memcontrol.c:564 mm/memcontrol.c:846)
[ 102.427568][ C6] ksys_write (fs/read_write.c:738)
[ 102.427572][ C6] ? __pfx_ksys_write (fs/read_write.c:728)
[ 102.427575][ C6] ? handle_mm_fault (mm/memory.c:6360 mm/memory.c:6513)
[ 102.427579][ C6] do_syscall_64 (arch/x86/entry/syscall_64.c:63 (discriminator 1) arch/x86/entry/syscall_64.c:94 (discriminator 1))
[ 102.427583][ C6] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
[ 102.427586][ C6] RIP: 0033:0x7f23032f8687
[ 102.427589][ C6] Code: 48 89 fa 4c 89 df e8 58 b3 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff
All code
========
0: 48 89 fa mov %rdi,%rdx
3: 4c 89 df mov %r11,%rdi
6: e8 58 b3 00 00 call 0xb363
b: 8b 93 08 03 00 00 mov 0x308(%rbx),%edx
11: 59 pop %rcx
12: 5e pop %rsi
13: 48 83 f8 fc cmp $0xfffffffffffffffc,%rax
17: 74 1a je 0x33
19: 5b pop %rbx
1a: c3 ret
1b: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
22: 00
23: 48 8b 44 24 10 mov 0x10(%rsp),%rax
28: 0f 05 syscall
2a:* 5b pop %rbx <-- trapping instruction
2b: c3 ret
2c: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
33: 83 e2 39 and $0x39,%edx
36: 83 fa 08 cmp $0x8,%edx
39: 75 de jne 0x19
3b: e8 23 ff ff ff call 0xffffffffffffff63
Code starting with the faulting instruction
===========================================
0: 5b pop %rbx
1: c3 ret
2: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
9: 83 e2 39 and $0x39,%edx
c: 83 fa 08 cmp $0x8,%edx
f: 75 de jne 0xffffffffffffffef
11: e8 23 ff ff ff call 0xffffffffffffff39
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20251014/202510142245.b857a693-lkp@intel.com
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
next prev parent reply other threads:[~2025-10-14 14:26 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-03 16:53 [PATCH v4 0/4] reparent the THP split queue Qi Zheng
2025-10-03 16:53 ` [PATCH v4 1/4] mm: thp: replace folio_memcg() with folio_memcg_charged() Qi Zheng
2025-10-03 16:53 ` [PATCH v4 2/4] mm: thp: introduce folio_split_queue_lock and its variants Qi Zheng
2025-10-03 16:53 ` [PATCH v4 3/4] mm: thp: use folio_batch to handle THP splitting in deferred_split_scan() Qi Zheng
2025-10-06 23:16 ` Shakeel Butt
2025-10-13 7:28 ` Qi Zheng
2025-10-14 14:25 ` kernel test robot [this message]
2025-10-03 16:53 ` [PATCH v4 4/4] mm: thp: reparent the split queue during memcg offline Qi Zheng
2025-10-03 16:58 ` Zi Yan
2025-10-04 7:52 ` Muchun Song
2025-10-06 6:46 ` David Hildenbrand
2025-10-07 17:56 ` Shakeel Butt
2025-10-13 7:29 ` Qi Zheng
2025-10-10 16:25 ` [PATCH v4 0/4] reparent the THP split queue Zi Yan
2025-10-11 0:51 ` Qi Zheng
2025-10-11 18:28 ` Andrew Morton
2025-10-13 7:23 ` Qi Zheng
2025-10-13 16:37 ` Zi Yan
2025-10-14 6:49 ` Qi Zheng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=202510142245.b857a693-lkp@intel.com \
--to=oliver.sang@intel.com \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=baohua@kernel.org \
--cc=baolin.wang@linux.alibaba.com \
--cc=cgroups@vger.kernel.org \
--cc=david@redhat.com \
--cc=dev.jain@arm.com \
--cc=hannes@cmpxchg.org \
--cc=harry.yoo@oracle.com \
--cc=hughd@google.com \
--cc=lance.yang@linux.dev \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lkp@intel.com \
--cc=lorenzo.stoakes@oracle.com \
--cc=mhocko@suse.com \
--cc=muchun.song@linux.dev \
--cc=npache@redhat.com \
--cc=oe-lkp@lists.linux.dev \
--cc=qi.zheng@linux.dev \
--cc=roman.gushchin@linux.dev \
--cc=ryan.roberts@arm.com \
--cc=shakeel.butt@linux.dev \
--cc=songmuchun@bytedance.com \
--cc=zhengqi.arch@bytedance.com \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.