* Re: [PATCH v5] binfmt_elf: Align eligible read-only PT_LOAD segments to PMD_SIZE for THP
[not found] <024d2480-df23-4c2c-9f2a-1c4a130f71b1@kernel.org>
@ 2026-03-20 17:11 ` WANG Rui
2026-03-21 14:21 ` WANG Rui
0 siblings, 1 reply; 2+ messages in thread
From: WANG Rui @ 2026-03-20 17:11 UTC
To: david, usama.arif
Cc: baolin.wang, brauner, jack, kees, lance.yang, linux-fsdevel,
linux-kernel, linux-mm, r, ryan.roberts, viro, willy,
Liam.Howlett, ajd, akpm, apopple, baohua, catalin.marinas,
dev.jain, kevin.brodsky, linux-arm-kernel, lorenzo.stoakes,
mhocko, npache, pasha.tatashin, rmclure, rppt, surenb, vbabka
>> Thanks! Also adding Ryan who did the exec_folio_order() work for ARM,
>> and also raised good concerns in [1]
>>
>> The problem is not just alignment for elf, we need to fix more things like
>> mmap heuristics [2] and how unmapped areas are gotten [3].
>
> I agree, ideally, that would all be tackled in one go.
From Usama’s v2 [1], it looks like we may be operating under slightly
different assumptions. His approach seems to key off page cache
characteristics when deciding segment alignment, while my patch is more
about proactively making things THP-friendly so that more code can end
up backed by large mappings. That helps in cases where a segment is only
slightly larger than a large-mapping size, since it can then hold a full
large mapping only if its start address is suitably aligned.
Maybe what we really need here is to make sure the virtual address is
properly aligned, while avoiding overly aggressive alignment (e.g. capping
it at something like 32M, which is fairly common across architectures).
Beyond that, we can just leave it to THP in “always” mode. THP already has
its own heuristics to decide whether collapsing into large pages makes sense.
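To make the capping idea above concrete, here is a rough user-space sketch.
It is not taken from the patch: ALIGN_CAP, capped_segment_align() and
align_up() are hypothetical names used only for illustration, and the 32M
value is simply the figure mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cap, using the 32M figure from the discussion above. */
#define ALIGN_CAP (32ULL << 20)

static bool is_pow2(uint64_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Raise p_align to large_size only when large_size is a power of two
 * and does not exceed the cap; otherwise leave p_align untouched.
 */
static uint64_t capped_segment_align(uint64_t p_align, uint64_t large_size)
{
	if (!is_pow2(large_size) || large_size > ALIGN_CAP)
		return p_align;
	return p_align < large_size ? large_size : p_align;
}

/* Round addr up to the next multiple of align (align must be 2^n). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
	assert(is_pow2(align));
	return (addr + align - 1) & ~(align - 1);
}
```

With a 2M large-mapping size, a segment that asks for 4K alignment would be
bumped to 2M; with a 512M size the cap applies and the segment keeps its
original alignment.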
It also looks like this approach would work fine with Usama’s cont-pte
mappings. If so, would it make sense to implement [1] along these lines
instead?
[1] https://lore.kernel.org/linux-fsdevel/20260320140315.979307-4-usama.arif@linux.dev
Thanks,
Rui
* Re: [PATCH v5] binfmt_elf: Align eligible read-only PT_LOAD segments to PMD_SIZE for THP
2026-03-20 17:11 ` [PATCH v5] binfmt_elf: Align eligible read-only PT_LOAD segments to PMD_SIZE for THP WANG Rui
@ 2026-03-21 14:21 ` WANG Rui
0 siblings, 0 replies; 2+ messages in thread
From: WANG Rui @ 2026-03-21 14:21 UTC
To: david, usama.arif, willy, baolin.wang
Cc: r, Liam.Howlett, ajd, akpm, apopple, baohua, brauner,
catalin.marinas, dev.jain, jack, kees, kevin.brodsky, lance.yang,
linux-arm-kernel, linux-fsdevel, linux-kernel, linux-mm,
lorenzo.stoakes, mhocko, npache, pasha.tatashin, rmclure, rppt,
ryan.roberts, surenb, vbabka, viro
One clarification regarding my earlier comment about compatibility with
cont-pte.
What I meant there is that the alignment logic in my patch does work for
systems with 4K and 16K base pages, where the PMD size remains within a
practical range. In those configurations, providing PMD-level alignment
already creates the conditions needed for both THP collapse and cont-pte
coalescing.
This does not fully extend to 64K base page systems. There the PMD size
can be quite large (e.g. 512M), which exceeds the 32M cap used in my
patch, so PMD-sized alignment may not be achievable in practice.
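For reference, the arithmetic behind those sizes can be sketched as below,
assuming 8-byte page-table entries so that one page-table page holds
page_size / 8 entries. pmd_size_for() and pmd_fits_cap() are hypothetical
helper names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 64-bit page-table entries. */
#define PTE_BYTES 8ULL

/*
 * One PMD entry covers a full page of PTEs, i.e.
 * (page_size / PTE_BYTES) pages of page_size bytes each.
 */
static uint64_t pmd_size_for(uint64_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return page_size * (page_size / PTE_BYTES);
}

/* Non-zero when PMD-sized alignment stays within the given cap. */
static int pmd_fits_cap(uint64_t page_size, uint64_t cap)
{
	return pmd_size_for(page_size) <= cap;
}
```

Under that assumption, 4K pages give a 2M PMD, 16K pages give 32M (exactly
at the cap), and 64K pages give 512M (above it).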
One way to structure this could be to treat alignment in a layered manner.
My patch focuses on establishing a reliable PMD-level alignment baseline
so that THP has the opportunity to form large mappings where it is practical.
On top of that, Usama's work can further improve behavior at smaller
granularities, for example by enabling cont-pte mappings when PMD alignment
is not feasible.
 			/* skip non-power of two alignments as invalid */
 			if (!is_power_of_2(p_align))
 				continue;

 			if (p_align < PMD_SIZE && should_align_to_pmd(&cmds[i]))
 				p_align = PMD_SIZE;
+			else if (p_align < CONT_PTE_SIZE && should_align_to_cont_pte(&cmds[i]))
+				p_align = CONT_PTE_SIZE;

 			alignment = max(alignment, p_align);
 		}
 	}
With that separation of roles, the two approaches complement each other,
and we can get the benefit of both without changing the core alignment
policy in binfmt_elf.
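As a stand-alone illustration of the layered selection, here is a small
user-space model. It is only one possible reading of the idea: struct
align_policy and pick_align() are hypothetical, the eligibility checks are
collapsed into a single flag, and the cap is folded into the decision rather
than living in separate should_align_*() helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-configuration sizes, supplied by the caller. */
struct align_policy {
	uint64_t pmd_size;      /* PMD-level mapping size */
	uint64_t cont_pte_size; /* contiguous-PTE block size */
	uint64_t cap;           /* upper bound on any alignment bump */
};

/*
 * Prefer PMD alignment when it fits under the cap; otherwise fall back
 * to cont-pte alignment; otherwise keep the segment's own p_align.
 */
static uint64_t pick_align(const struct align_policy *pol,
			   uint64_t p_align, bool eligible)
{
	assert(p_align != 0 && (p_align & (p_align - 1)) == 0);

	if (!eligible)
		return p_align;
	if (pol->pmd_size <= pol->cap)
		return p_align < pol->pmd_size ? pol->pmd_size : p_align;
	if (pol->cont_pte_size <= pol->cap && p_align < pol->cont_pte_size)
		return pol->cont_pte_size;
	return p_align;
}
```

For a configuration where the PMD size exceeds the cap, an eligible segment
ends up aligned to the cont-pte size instead, which is the fallback role
described above.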
Thanks,
Rui