From: Lance Yang <lance.yang@linux.dev>
To: Lorenzo Stoakes <ljs@kernel.org>,
Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Gerald Schaefer <gerald.schaefer@linux.ibm.com>,
Nico Pache <npache@redhat.com>,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org,
aarcange@redhat.com, anshuman.khandual@arm.com,
apopple@nvidia.com, baohua@kernel.org,
baolin.wang@linux.alibaba.com, byungchul@sk.com,
catalin.marinas@arm.com, cl@gentwo.org, corbet@lwn.net,
dave.hansen@linux.intel.com, david@kernel.org, dev.jain@arm.com,
gourry@gourry.net, hannes@cmpxchg.org, hughd@google.com,
jack@suse.cz, jackmanb@google.com, jannh@google.com,
jglisse@google.com, joshua.hahnjy@gmail.com, kas@kernel.org,
liam@infradead.org, mathieu.desnoyers@efficios.com,
matthew.brost@intel.com, mhiramat@kernel.org, mhocko@suse.com,
peterx@redhat.com, pfalcato@suse.de, rakie.kim@sk.com,
raquini@redhat.com, rdunlap@infradead.org,
richard.weiyang@gmail.com, rientjes@google.com,
rostedt@goodmis.org, rppt@kernel.org, ryan.roberts@arm.com,
shivankg@amd.com, sunnanyong@huawei.com, surenb@google.com,
thomas.hellstrom@linux.intel.com, tiwai@suse.de,
usamaarif642@gmail.com, vbabka@suse.cz, vishal.moola@gmail.com,
wangkefeng.wang@huawei.com, will@kernel.org, willy@infradead.org,
yang@os.amperecomputing.com, ying.huang@linux.alibaba.com,
ziy@nvidia.com, zokeefe@google.com, linux-s390@vger.kernel.org,
linux-next@vger.kernel.org
Subject: Re: [PATCH mm-hotfixes-unstable v18 00/14] khugepaged: add mTHP collapse support
Date: Tue, 2 Jun 2026 09:53:54 +0800
Message-ID: <153ba7fd-9121-4884-87c6-45822828545e@linux.dev>
In-Reply-To: <ah2z26OzPktchVeT@lucifer>
On 2026/6/2 01:08, Lorenzo Stoakes wrote:
> On Mon, Jun 01, 2026 at 05:58:08PM +0200, Alexander Gordeev wrote:
>> On Fri, May 22, 2026 at 01:47:24PM -0700, Andrew Morton wrote:
>>
>> Hi Andrew et al,
>>
>>> On Fri, 22 May 2026 08:59:55 -0600 Nico Pache <npache@redhat.com> wrote:
>>>
>>>> The following series provides khugepaged with the capability to collapse
>>>> anonymous memory regions to mTHPs.
>>>
>>> Thanks, I've updated mm.git's mm-unstable branch to this version.
>>>
>>> It sounds like I might be dropping it soon, haven't started looking at
>>> that yet. But let's at least eyeball the latest version at this time.
>>>
>>> Sashiko was able to apply this, so the base-it-on-hotfixes thing worked
>>> well, thanks. The AI checking made a few allegations:
>>
>> This series appears to cause hangs on s390 in linux-next.
>> The issue is not easily reproducible, so it is not yet confirmed.
>> Any ideas for a reliable reproducer that exercises the code path below?
>>
>> [ 2749.385719] sysrq: Show Blocked State
>> [ 2749.385730] task:khugepaged state:D stack:0 pid:209 tgid:209 ppid:2 task_flags:0x200040 flags:0x00000000
>> [ 2749.385735] Call Trace:
>> [ 2749.385736] [<0000017f63c8b226>] __schedule+0x316/0x890
>> [ 2749.385740] [<0000017f63c8b7dc>] schedule+0x3c/0xc0
>> [ 2749.385743] [<0000017f63c8b888>] schedule_preempt_disabled+0x28/0x40
>> [ 2749.385746] [<0000017f63c902ea>] rwsem_down_write_slowpath+0x2fa/0x8b0
>> [ 2749.385749] [<0000017f63c90910>] down_write+0x70/0x80
>> [ 2749.385752] [<0000017f6313407a>] collapse_huge_page+0x2ea/0x9e0
>> [ 2749.385755] [<0000017f6313491e>] mthp_collapse+0x1ae/0x1f0
>> [ 2749.385757] [<0000017f63134fda>] collapse_scan_pmd+0x67a/0x8f0
>> [ 2749.385760] [<0000017f6313751a>] collapse_single_pmd+0x15a/0x260
>> [ 2749.385762] [<0000017f6313792c>] collapse_scan_mm_slot.constprop.0+0x30c/0x470
>> [ 2749.385765] [<0000017f63137cb6>] khugepaged+0x226/0x240
>> [ 2749.385768] [<0000017f62db3128>] kthread+0x148/0x170
>> [ 2749.385770] [<0000017f62d2c238>] __ret_from_fork+0x48/0x220
>> [ 2749.385772] [<0000017f63c95d0a>] ret_from_fork+0xa/0x30
>>
>> Thanks!
>
> Hi Alexander,
>
> Thanks for the report.
>
> It's a pity it's non-repro, I had Claude have a look at it and it couldn't find
> a definite issue with the code at v18, all the locks seem balanced internally.
>
> Things it highlighted FWIW:
>
> - Far more mmap_write_lock()'s being taken - the stack-based approach calls
> collapse_huge_page() multiple times per-PMD each of which entails an mmap read
> lock/unlock and mmap write lock.
>
> - anon_vma write lock held for a much longer period over partial collapse.
>
> So maybe these are triggering issues rather than being the cause of them per-se?
>
> If you happen to see it again could you give the output for:
>
> 'echo t > /proc/sysrq-trigger' so we can track who holds the contended lock and
> get more details on it?
>
> Also the .config would be useful.
>
> I'm guessing you've also not enabled mTHP in any way on the system?
>
> Repro-wise you could also:
>
> # echo 1 > /sys/kernel/mm/transparent_hugepage/khugepaged/scan_sleep_millisecs
> # echo 1 > /sys/kernel/mm/transparent_hugepage/khugepaged/alloc_sleep_millisecs
>
> To get khugepaged going more aggressively:
>
> $ for f in /sys/kernel/mm/transparent_hugepage/hugepages-*; do echo always | sudo tee $f/enabled; done
>
> Then maybe some stress-ng like sudo stress-ng --vm 4 --vm-bytes 2G --vm-method
> all --timeout 5m (or maybe something more refined :)?
>
> Maybe some of this will help repro more reliably?
>

Cool!

Maybe also worth trying with CONFIG_DETECT_HUNG_TASK=y and
CONFIG_DETECT_HUNG_TASK_BLOCKER=y.

    # detect after 10s in D state instead of default 120s
    echo 10 > /proc/sys/kernel/hung_task_timeout_secs

    # optional: check more often; 0 means same as timeout
    echo 0 > /proc/sys/kernel/hung_task_check_interval_secs

With that enabled, the kernel should hopefully tell us which task likely
owns the rwsem. If it is writer-owned, I would expect that to be fairly
reliable.

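Before relying on the blocker report, it may be worth confirming that both
options really are set in the kernel under test. A minimal sketch, assuming
the config is available as a plain-text file (for example
/boot/config-$(uname -r) on many distros, or the output of
zcat /proc/config.gz if CONFIG_IKCONFIG_PROC is set; both locations are
assumptions about the test system). The helper name check_hung_task_config
is made up for illustration:

```shell
# Hypothetical helper: report whether the two hung-task options are built in.
# $1 is the path to a plain-text kernel config file (an assumed location).
# Returns 0 if both options are "=y", 1 if at least one is missing.
check_hung_task_config() {
    config_file="$1"
    missing=0
    for opt in CONFIG_DETECT_HUNG_TASK CONFIG_DETECT_HUNG_TASK_BLOCKER; do
        # Anchor on "=y" so that "# CONFIG_... is not set" does not match.
        if grep -q "^${opt}=y\$" "$config_file"; then
            echo "$opt: enabled"
        else
            echo "$opt: not enabled"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, check_hung_task_config "/boot/config-$(uname -r)" prints one
line per option, and a non-zero exit status means at least one of them is
missing.
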
Cheers, Lance