From: Qi Zheng <qi.zheng@linux.dev>
To: Zi Yan <ziy@nvidia.com>
Cc: hannes@cmpxchg.org, hughd@google.com, mhocko@suse.com,
roman.gushchin@linux.dev, shakeel.butt@linux.dev,
muchun.song@linux.dev, david@redhat.com,
lorenzo.stoakes@oracle.com, harry.yoo@oracle.com,
baolin.wang@linux.alibaba.com, Liam.Howlett@oracle.com,
npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com,
baohua@kernel.org, lance.yang@linux.dev,
akpm@linux-foundation.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
Qi Zheng <zhengqi.arch@bytedance.com>
Subject: Re: [PATCH v4 0/4] reparent the THP split queue
Date: Mon, 13 Oct 2025 15:23:53 +0800 [thread overview]
Message-ID: <df9a3e22-caca-4298-b7d8-5334ce5446a0@linux.dev> (raw)
In-Reply-To: <925E0247-2976-4D85-A8AA-E8C92C64CED4@nvidia.com>
Hi Zi,
On 10/11/25 12:25 AM, Zi Yan wrote:
> On 3 Oct 2025, at 12:53, Qi Zheng wrote:
[snip]
>>
>
> Hi Qi,
>
> I got CPU soft locks when run "echo 3 | sudo tee /proc/sys/vm/drop_caches"
> with today's mm-new on a freshly booted system. Reverting Patch 3 (and Patch 4)
> of your patchset solves the issue.
>
> My config file is attached. My kernel relevant kernel parameters are:
> "cgroup_no_v1=all transparent_hugepage=always thp_shmem=2M:always".
> The machine is a 8GB 8-core x86_64 VM.
>
> The kernel log:
>
> [ 36.441539] watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [tee:810]
> [ 36.441549] Modules linked in:
> [ 36.441566] CPU: 0 UID: 0 PID: 810 Comm: tee Not tainted 6.17.0-mm-everything-2024-01-29-07-19-no-mglru+ #526 PREEMPT(voluntary)
> [ 36.441570] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-debian-1.17.0-1 04/01/2014
> [ 36.441574] RIP: 0010:_raw_spin_unlock_irqrestore+0x19/0x40
It seems that the softlockup is not caused by lock contention. I have
also seen the following stack locally:
[14819.914867][ C2] watchdog: BUG: soft lockup - CPU#2 stuck for 49s!
[kswapd0:202]
[14819.914873][ C2] Modules linked in:
[14819.914877][ C2] CPU: 2 UID: 0 PID: 202 Comm: kswapd0 Tainted: G
L 6.1
[14819.914880][ C2] Tainted: [L]=SOFTLOCKUP
[14819.914881][ C2] Hardware name: QEMU Standard PC (i440FX + PIIX,
1996), BIOS 1.12.0-1 04
[14819.914883][ C2] RIP: 0010:folios_put_refs+0x34/0x210
[14819.914889][ C2] Code: 57 41 56 49 89 fe 41 55 41 54 55 53 48 83
ec 18 65 48 8b 05 36 51
[14819.914891][ C2] RSP: 0000:ffffc900039db8e8 EFLAGS: 00000246
[14819.914893][ C2] RAX: 0000000000000000 RBX: ffff888109e64330 RCX:
ffffffff82848bb8
[14819.914895][ C2] RDX: 0000000000000001 RSI: 0000000000000000 RDI:
ffffc900039db960
[14819.914896][ C2] RBP: ffff888109e642a0 R08: ffff888100403058 R09:
ffff888100402fe8
[14819.914897][ C2] R10: 0000000000000000 R11: ffffffff82711328 R12:
ffff888109e64328
[14819.914898][ C2] R13: ffffc900039dbb30 R14: ffffc900039db960 R15:
ffffc900039dbb30
[14819.914916][ C2] FS: 0000000000000000(0000)
GS:ffff888332ad7000(0000) knlGS:00000000000
[14819.914918][ C2] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[14819.914919][ C2] CR2: 00007f8d211b3000 CR3: 000000010833c000 CR4:
00000000000006f0
[14819.914921][ C2] Call Trace:
[14819.914923][ C2] <TASK>
[14819.914927][ C2] deferred_split_scan+0x30d/0x420
[14819.914934][ C2] do_shrink_slab+0x105/0x350
[14819.914937][ C2] shrink_slab+0x2fa/0x3f0
[14819.914939][ C2] ? try_to_shrink_lruvec+0x1a6/0x270
[14819.914944][ C2] shrink_one+0x11d/0x1b0
[14819.914947][ C2] shrink_node+0x9a4/0xb90
[14819.914952][ C2] balance_pgdat+0x548/0x910
[14819.914955][ C2] ? __sched_fork+0x160/0x180
[14819.914964][ C2] ? __timer_delete_sync+0x2b/0x40
[14819.914970][ C2] kswapd+0x1b9/0x320
[14819.914973][ C2] ? __pfx_autoremove_wake_function+0x10/0x10
[14819.914978][ C2] ? __pfx_kswapd+0x10/0x10
[14819.914981][ C2] kthread+0xfd/0x250
[14819.914984][ C2] ? finish_task_switch+0x90/0x270
[14819.914987][ C2] ? __pfx_kthread+0x10/0x10
[14819.914989][ C2] ? __pfx_kthread+0x10/0x10
[14819.914991][ C2] ret_from_fork+0x170/0x190
[14819.914995][ C2] ? __pfx_kthread+0x10/0x10
[14819.914996][ C2] ret_from_fork_asm+0x1a/0x30
[14819.915002][ C2] </TASK>
It seems that the reason is that deferred_split_scan() has been executed
for too long.
Could you please help test the following changes? I applied them locally
and no softlockup occurred.
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index b5eea2091cdf6..5353c7bd2c9af 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -4286,8 +4286,10 @@ static unsigned long deferred_split_scan(struct
shrinker *shrink,
}
folios_put(&fbatch);
- if (sc->nr_to_scan)
+ if (sc->nr_to_scan) {
+ cond_resched();
goto retry;
+ }
/*
* Stop shrinker if we didn't split any page, but the queue is
empty.
Thanks,
Qi
> [ 36.441592] Code: 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 53 48 89 f3 e8 92 68 fd fe 80 e7 02 74 06 fb 0f 1f 44 00 00 <65> ff 0d d0 5f 7e 01 74 06 5b c3 cc cc cc cc 0f 1f 44 00 00 5b c3
> [ 36.441594] RSP: 0018:ffffc900029afb60 EFLAGS: 00000202
> [ 36.441598] RAX: 0000000000000001 RBX: 0000000000000286 RCX: ffff888101168670
> [ 36.441601] RDX: 0000000000000001 RSI: 0000000000000286 RDI: ffff888101168658
> [ 36.441602] RBP: 0000000000000001 R08: ffff88813ba44ec0 R09: 0000000000000000
> [ 36.441603] R10: 00000000000001a8 R11: 0000000000000000 R12: ffff8881011685e0
> [ 36.441604] R13: 0000000000000000 R14: ffff888101168000 R15: ffffc900029afd60
> [ 36.441606] FS: 00007f7fe3655740(0000) GS:ffff8881b7e5d000(0000) knlGS:0000000000000000
> [ 36.441607] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 36.441608] CR2: 0000563d4d439bf0 CR3: 000000010873c006 CR4: 0000000000370ef0
> [ 36.441614] Call Trace:
> [ 36.441616] <TASK>
> [ 36.441619] deferred_split_scan+0x1e0/0x480
> [ 36.441627] ? _raw_spin_unlock_irqrestore+0xe/0x40
> [ 36.441630] ? kvfree_rcu_queue_batch+0x96/0x1c0
> [ 36.441634] ? do_raw_spin_unlock+0x46/0xd0
> [ 36.441639] ? kfree_rcu_monitor+0x1da/0x2c0
> [ 36.441641] ? list_lru_count_one+0x47/0x90
> [ 36.441644] do_shrink_slab+0x153/0x360
> [ 36.441649] shrink_slab+0xd3/0x390
> [ 36.441652] drop_slab+0x7d/0x130
> [ 36.441655] drop_caches_sysctl_handler+0x98/0xb0
> [ 36.441660] proc_sys_call_handler+0x1c7/0x2c0
> [ 36.441664] vfs_write+0x221/0x450
> [ 36.441669] ksys_write+0x6c/0xe0
> [ 36.441672] do_syscall_64+0x50/0x200
> [ 36.441675] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 36.441678] RIP: 0033:0x7f7fe36e7687
> [ 36.441685] Code: 48 89 fa 4c 89 df e8 58 b3 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff
> [ 36.441686] RSP: 002b:00007ffdffcbba10 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
> [ 36.441688] RAX: ffffffffffffffda RBX: 00007f7fe3655740 RCX: 00007f7fe36e7687
> [ 36.441689] RDX: 0000000000000002 RSI: 00007ffdffcbbbb0 RDI: 0000000000000003
> [ 36.441690] RBP: 00007ffdffcbbbb0 R08: 0000000000000000 R09: 0000000000000000
> [ 36.441691] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000002
> [ 36.441692] R13: 0000558d40be64c0 R14: 00007f7fe383de80 R15: 0000000000000002
> [ 36.441694] </TASK>
> [ 64.441531] watchdog: BUG: soft lockup - CPU#0 stuck for 53s! [tee:810]
> [ 64.441537] Modules linked in:
> [ 64.441545] CPU: 0 UID: 0 PID: 810 Comm: tee Tainted: G L 6.17.0-mm-everything-2024-01-29-07-19-no-mglru+ #526 PREEMPT(voluntary)
> [ 64.441548] Tainted: [L]=SOFTLOCKUP
> [ 64.441552] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-debian-1.17.0-1 04/01/2014
> [ 64.441555] RIP: 0010:_raw_spin_unlock_irqrestore+0x19/0x40
> [ 64.441565] Code: 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 53 48 89 f3 e8 92 68 fd fe 80 e7 02 74 06 fb 0f 1f 44 00 00 <65> ff 0d d0 5f 7e 01 74 06 5b c3 cc cc cc cc 0f 1f 44 00 00 5b c3
> [ 64.441566] RSP: 0018:ffffc900029afb60 EFLAGS: 00000202
> [ 64.441568] RAX: 0000000000000001 RBX: 0000000000000286 RCX: ffff888101168670
> [ 64.441570] RDX: 0000000000000001 RSI: 0000000000000286 RDI: ffff888101168658
> [ 64.441571] RBP: 0000000000000001 R08: ffff88813ba44ec0 R09: 0000000000000000
> [ 64.441572] R10: 00000000000001a8 R11: 0000000000000000 R12: ffff8881011685e0
> [ 64.441573] R13: 0000000000000000 R14: ffff888101168000 R15: ffffc900029afd60
> [ 64.441574] FS: 00007f7fe3655740(0000) GS:ffff8881b7e5d000(0000) knlGS:0000000000000000
> [ 64.441576] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 64.441577] CR2: 0000563d4d439bf0 CR3: 000000010873c006 CR4: 0000000000370ef0
> [ 64.441581] Call Trace:
> [ 64.441583] <TASK>
> [ 64.441591] deferred_split_scan+0x1e0/0x480
> [ 64.441598] ? _raw_spin_unlock_irqrestore+0xe/0x40
> [ 64.441599] ? kvfree_rcu_queue_batch+0x96/0x1c0
> [ 64.441603] ? do_raw_spin_unlock+0x46/0xd0
> [ 64.441607] ? kfree_rcu_monitor+0x1da/0x2c0
> [ 64.441610] ? list_lru_count_one+0x47/0x90
> [ 64.441613] do_shrink_slab+0x153/0x360
> [ 64.441618] shrink_slab+0xd3/0x390
> [ 64.441621] drop_slab+0x7d/0x130
> [ 64.441624] drop_caches_sysctl_handler+0x98/0xb0
> [ 64.441629] proc_sys_call_handler+0x1c7/0x2c0
> [ 64.441632] vfs_write+0x221/0x450
> [ 64.441638] ksys_write+0x6c/0xe0
> [ 64.441641] do_syscall_64+0x50/0x200
> [ 64.441645] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 64.441648] RIP: 0033:0x7f7fe36e7687
> [ 64.441654] Code: 48 89 fa 4c 89 df e8 58 b3 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff
> [ 64.441656] RSP: 002b:00007ffdffcbba10 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
> [ 64.441658] RAX: ffffffffffffffda RBX: 00007f7fe3655740 RCX: 00007f7fe36e7687
> [ 64.441659] RDX: 0000000000000002 RSI: 00007ffdffcbbbb0 RDI: 0000000000000003
> [ 64.441660] RBP: 00007ffdffcbbbb0 R08: 0000000000000000 R09: 0000000000000000
> [ 64.441661] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000002
> [ 64.441662] R13: 0000558d40be64c0 R14: 00007f7fe383de80 R15: 0000000000000002
> [ 64.441663] </TASK>
>
>
>
> --
> Best Regards,
> Yan, Zi
next prev parent reply other threads:[~2025-10-13 7:24 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-03 16:53 [PATCH v4 0/4] reparent the THP split queue Qi Zheng
2025-10-03 16:53 ` [PATCH v4 1/4] mm: thp: replace folio_memcg() with folio_memcg_charged() Qi Zheng
2025-10-03 16:53 ` [PATCH v4 2/4] mm: thp: introduce folio_split_queue_lock and its variants Qi Zheng
2025-10-03 16:53 ` [PATCH v4 3/4] mm: thp: use folio_batch to handle THP splitting in deferred_split_scan() Qi Zheng
2025-10-06 23:16 ` Shakeel Butt
2025-10-13 7:28 ` Qi Zheng
2025-10-14 14:25 ` kernel test robot
2025-10-03 16:53 ` [PATCH v4 4/4] mm: thp: reparent the split queue during memcg offline Qi Zheng
2025-10-03 16:58 ` Zi Yan
2025-10-04 7:52 ` Muchun Song
2025-10-06 6:46 ` David Hildenbrand
2025-10-07 17:56 ` Shakeel Butt
2025-10-13 7:29 ` Qi Zheng
2025-10-10 16:25 ` [PATCH v4 0/4] reparent the THP split queue Zi Yan
2025-10-11 0:51 ` Qi Zheng
2025-10-11 18:28 ` Andrew Morton
2025-10-13 7:23 ` Qi Zheng [this message]
2025-10-13 16:37 ` Zi Yan
2025-10-14 6:49 ` Qi Zheng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=df9a3e22-caca-4298-b7d8-5334ce5446a0@linux.dev \
--to=qi.zheng@linux.dev \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=baohua@kernel.org \
--cc=baolin.wang@linux.alibaba.com \
--cc=cgroups@vger.kernel.org \
--cc=david@redhat.com \
--cc=dev.jain@arm.com \
--cc=hannes@cmpxchg.org \
--cc=harry.yoo@oracle.com \
--cc=hughd@google.com \
--cc=lance.yang@linux.dev \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=mhocko@suse.com \
--cc=muchun.song@linux.dev \
--cc=npache@redhat.com \
--cc=roman.gushchin@linux.dev \
--cc=ryan.roberts@arm.com \
--cc=shakeel.butt@linux.dev \
--cc=zhengqi.arch@bytedance.com \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.