From: Lance Yang <lance.yang@linux.dev>
To: "David Hildenbrand (Arm)" <david@kernel.org>, hev <r@hev.cc>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
Andrew Morton <akpm@linux-foundation.org>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
Barry Song <baohua@kernel.org>,
Christian Brauner <brauner@kernel.org>,
Dev Jain <dev.jain@arm.com>, Jan Kara <jack@suse.cz>,
Kees Cook <kees@kernel.org>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Matthew Wilcox <willy@infradead.org>,
Nico Pache <npache@redhat.com>,
Ryan Roberts <ryan.roberts@arm.com>, Zi Yan <ziy@nvidia.com>,
linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, wangkefeng.wang@huawei.com
Subject: Re: [PATCH v4 1/2] huge_mm: add stubs for THP-disabled configs
Date: Fri, 13 Mar 2026 17:47:51 +0800
Message-ID: <1a886b5b-319c-4f3e-8db1-6af6696f4d84@linux.dev>
In-Reply-To: <60ba4311-01f8-4ff3-a2df-e1b3fb6db699@kernel.org>
On 2026/3/13 00:29, David Hildenbrand (Arm) wrote:
> On 3/12/26 17:12, hev wrote:
>> Hi David,
>>
>> On Thu, Mar 12, 2026 at 11:57 PM David Hildenbrand (Arm)
>> <david@kernel.org> wrote:
>>>
>>> On 3/12/26 16:53, David Hildenbrand (Arm) wrote:
>>>>
>>>> There are other ways to enable PMD THP. So I don't quite think this is
>>>> the right tool for the job.
>>>
>>> Ah, you care about file THPs ... gah.
>>>
>>> Why can't we simply do the alignment without considering the current
>>> setting?
>>
>> The main motivation of raising the alignment here is to increase the
>> chance of getting PMD-sized THPs for executable mappings.
>>
>> If THP is not in "always" mode, the kernel will not automatically
>> collapse file-backed mappings into THPs, so the increased alignment
>> would not actually improve THP usage.
>>
>> In that case we would only be introducing additional padding in the
>> virtual address layout, which slightly reduces ASLR entropy without
>> providing a practical benefit.
>
> Well, that parameter can get toggled at runtime later? Also, I think
> that readahead code could end up allocating a PMD THP (I might be
> wrong about that, the code is confusing).
Right. In do_sync_mmap_readahead(), if the VMA has VM_HUGEPAGE,
force_thp_readahead becomes true and ra->order is set to
HPAGE_PMD_ORDER, IIUC.

	/* Use the readahead code, even if readahead is disabled */
	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
	    (vm_flags & VM_HUGEPAGE) && HPAGE_PMD_ORDER <= MAX_PAGECACHE_ORDER)
		force_thp_readahead = true;

That order is then passed down to page_cache_ra_order() and finally to
filemap_alloc_folio().

	if (force_thp_readahead) {
		[...]
		ra->async_size = HPAGE_PMD_NR;
		ra->order = HPAGE_PMD_ORDER;
		page_cache_ra_order(&ractl, ra);
		return fpin;
	}

For plain VM_EXEC, the code starts from exec_folio_order(), not
HPAGE_PMD_ORDER.

	if (vm_flags & VM_EXEC) {
		[...]
		ra->order = exec_folio_order();
		[...]
		ra->async_size = 0;
	}

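To make the two branches above easier to compare, here is a rough
userspace model of how the initial order gets picked. It is only a
sketch: the MODEL_* flag values and model_ra_order() are made-up names,
the orders are passed in as parameters instead of coming from kernel
macros, and every case other than the two quoted branches is collapsed
to order 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins; NOT the kernel's real flag values. */
#define MODEL_VM_EXEC      0x1u
#define MODEL_VM_HUGEPAGE  0x2u

/*
 * Model of the initial readahead order on a synchronous mmap fault.
 * Returns pmd_order for the forced-THP path and exec_order for plain
 * VM_EXEC. Anything else is collapsed to 0, which is a simplification
 * of this model and not a statement about the real code.
 */
static unsigned int model_ra_order(unsigned int vm_flags, bool thp_configured,
				   unsigned int pmd_order,
				   unsigned int max_pagecache_order,
				   unsigned int exec_order)
{
	assert(exec_order <= max_pagecache_order);

	/* Mirrors the force_thp_readahead condition quoted above. */
	if (thp_configured && (vm_flags & MODEL_VM_HUGEPAGE) &&
	    pmd_order <= max_pagecache_order)
		return pmd_order;

	/* Plain VM_EXEC starts from the (smaller) exec folio order. */
	if (vm_flags & MODEL_VM_EXEC)
		return exec_order;

	return 0;
}
```

With pmd_order = 9 and exec_order = 4 (values assumed for illustration),
a VM_EXEC | VM_HUGEPAGE mapping gets 9 in this model, while a plain
VM_EXEC mapping gets 4.
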
The default exec_folio_order() is small, and arm64 overrides it to only
64K:

/*
 * Request exec memory is read into pagecache in at least 64K folios. This size
 * can be contpte-mapped when 4K base pages are in use (16 pages into 1 iTLB
 * entry), and HPA can coalesce it (4 pages into 1 TLB entry) when 16K base
 * pages are in use.
 */
#define exec_folio_order() ilog2(SZ_64K >> PAGE_SHIFT)
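
For reference, the arithmetic behind that macro can be checked in
userspace. This is a sketch only: model_ilog2() and
model_exec_folio_order() are made-up stand-ins for the kernel's ilog2()
and the macro above, with the page shift passed in as a parameter.

```c
#include <assert.h>

#define MODEL_SZ_64K 0x10000UL

/* Integer log2 of a non-zero value (stand-in for the kernel's ilog2()). */
static unsigned int model_ilog2(unsigned long v)
{
	unsigned int r = 0;

	assert(v != 0);
	while (v >>= 1)
		r++;
	return r;
}

/* Same expression as the macro above, parameterised by the page shift. */
static unsigned int model_exec_folio_order(unsigned int page_shift)
{
	assert(page_shift <= 16);	/* base page must not exceed 64K */
	return model_ilog2(MODEL_SZ_64K >> page_shift);
}
```

That gives order 4 for 4K pages (16 pages), order 2 for 16K pages
(4 pages) and order 0 for 64K pages, matching the page counts in the
comment. All of these are well below the PMD order (9 with 4K pages and
a 2M PMD).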
>
> Let's take a look at __get_unmapped_area(), where we don't care about
> ASLR entropy for anonymous memory:
>
> } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) && !file
> && !addr /* no hint */
> && IS_ALIGNED(len, PMD_SIZE)) {
Yeah. For anonymous memory, the kernel is willing to do THP-friendly
alignment, but it is constrained, of course :)
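
Spelled out as a standalone predicate, the quoted condition looks
roughly like the sketch below. model_wants_thp_align() and
MODEL_PMD_SIZE are made-up names, and the 2M PMD size is an assumption
for illustration (the real PMD_SIZE depends on the architecture and
base page size).

```c
#include <stdbool.h>

/* Assumed PMD size for illustration only. */
#define MODEL_PMD_SIZE (2UL << 20)

/*
 * Mirrors the quoted check: THP must be compiled in, the mapping must
 * be anonymous, the caller must not pass an address hint, and the
 * length must be a multiple of the PMD size.
 */
static bool model_wants_thp_align(bool thp_configured, bool has_file,
				  unsigned long addr_hint, unsigned long len)
{
	return thp_configured && !has_file && addr_hint == 0 &&
	       (len & (MODEL_PMD_SIZE - 1)) == 0;
}
```

So in this model a 4M anonymous mapping with no hint qualifies, while
4M + 4K, any file mapping, or any mapping with a hint does not.
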
> Interestingly we had:
>
> commit 34d7cf637c437d5c2a8a6ef23ea45193bad8a91c
> Author: Kefeng Wang <wangkefeng.wang@huawei.com>
> Date: Fri Dec 6 15:03:45 2024 +0800
>
> mm: don't try THP alignment for FS without get_unmapped_area
>
> Commit ed48e87c7df3 ("thp: add thp_get_unmapped_area_vmflags()") changes
> thp_get_unmapped_area() to thp_get_unmapped_area_vmflags() in
> __get_unmapped_area(), which doesn't initialize local get_area for
> anonymous mappings. This leads to us always trying THP alignment even for
> file_operations which have a NULL ->get_unmapped_area() callback.
>
> Since commit efa7df3e3bb5 ("mm: align larger anonymous mappings on THP
> boundaries") we only want to enable THP alignment for anonymous mappings,
> so add a !file check to avoid attempting THP alignment for file mappings.
>
> Found issue by code inspection. THP alignment is used for easy or more
> pmd mappings, from vma side. This may cause unnecessary VMA fragmentation
> and potentially worse performance on filesystems that do not actually
> support THPs and thus cannot benefit from the alignment.
Looks like this commit does not *ban* file-backed THP-friendly alignment
completely. It only prevents file mappings from getting it accidentally
via the generic fallback path.
Note that some filesystems still explicitly opt in with their own

	.get_unmapped_area = thp_get_unmapped_area

for example ext4, xfs, and btrfs.

So explicit filesystem opt-in is still allowed :)
> I'm not sure about the "VMA fragmentation" argument, really. We only consider
> stuff that is already multiples of PMD_SIZE.
>
> Filesystem support for THPs is also not really something you would handle, and it's
> a problem that solves itself over time as more filesystems keep adding support for
> large folios.
>
> So I think we should try limiting it to IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE),
> but not checking the runtime toggle.
Good point! ELF layout is decided once at exec time, while the runtime
THP mode can change later.
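
As a very rough sketch of what "compile-time only" could look like:
everything below is hypothetical, including the name
sketch_segment_align(), the 2M PMD size, and the eligibility rule
(segment at least PMD-sized), none of which is taken from the patch.
The only point is that the decision takes no runtime THP mode as input.

```c
#include <stdbool.h>

/* Assumed PMD size for illustration only. */
#define SKETCH_PMD_SIZE (2UL << 20)

/*
 * Hypothetical alignment choice made once at exec time. thp_configured
 * stands in for IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE); there is
 * deliberately no parameter for the runtime THP mode, since that can
 * change after the layout has been fixed.
 */
static unsigned long sketch_segment_align(unsigned long p_align,
					  unsigned long seg_len,
					  bool thp_configured)
{
	/* Only raise the alignment, never lower what the ELF asked for. */
	if (thp_configured && seg_len >= SKETCH_PMD_SIZE &&
	    p_align < SKETCH_PMD_SIZE)
		return SKETCH_PMD_SIZE;

	return p_align;
}
```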