From: Wei Yang <richard.weiyang@gmail.com>
To: Wei Yang <richard.weiyang@gmail.com>
Cc: akpm@linux-foundation.org, david@kernel.org,
lorenzo.stoakes@oracle.com, ziy@nvidia.com,
baolin.wang@linux.alibaba.com, Liam.Howlett@oracle.com,
npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com,
baohua@kernel.org, lance.yang@linux.dev, linux-mm@kvack.org
Subject: Re: [Patch v2] mm/huge_memory: consolidate order-related checks into folio_check_splittable()
Date: Sun, 4 Jan 2026 02:37:56 +0000 [thread overview]
Message-ID: <20260104023756.jufklyl3bl64fnck@master> (raw)
In-Reply-To: <20251223122539.10726-1-richard.weiyang@gmail.com>
On Tue, Dec 23, 2025 at 12:25:39PM +0000, Wei Yang wrote:
>The primary goal of the folio_check_splittable() function is to validate
>whether a folio is suitable for splitting and to bail out early if it is
>not.
>
>Currently, some order-related checks are scattered throughout the
>calling code rather than being centralized in folio_check_splittable().
>
>This commit moves all remaining order-related validation logic into
>folio_check_splittable(). This consolidation ensures that the function
>serves its intended purpose as a single point of failure and improves
>the clarity and maintainability of the surrounding code.
>
>Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>Cc: Zi Yan <ziy@nvidia.com>
>
>---
[...]
>@@ -3719,28 +3723,33 @@ int folio_check_splittable(struct folio *folio, unsigned int new_order,
> /* order-1 is not supported for anonymous THP. */
> if (new_order == 1)
> return -EINVAL;
>- } else if (split_type == SPLIT_TYPE_NON_UNIFORM || new_order) {
>- if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
>- !mapping_large_folio_support(folio->mapping)) {
>- /*
>- * We can always split a folio down to a single page
>- * (new_order == 0) uniformly.
>- *
>- * For any other scenario
>- * a) uniform split targeting a large folio
>- * (new_order > 0)
>- * b) any non-uniform split
>- * we must confirm that the file system supports large
>- * folios.
>- *
>- * Note that we might still have THPs in such
>- * mappings, which is created from khugepaged when
>- * CONFIG_READ_ONLY_THP_FOR_FS is enabled. But in that
>- * case, the mapping does not actually support large
>- * folios properly.
>- */
>- return -EINVAL;
>+ } else {
>+ if (split_type == SPLIT_TYPE_NON_UNIFORM || new_order) {
>+ if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
>+ !mapping_large_folio_support(folio->mapping)) {
>+ /*
>+ * We can always split a folio down to a
>+ * single page (new_order == 0) uniformly.
>+ *
>+ * For any other scenario
>+ * a) uniform split targeting a large folio
>+ * (new_order > 0)
>+ * b) any non-uniform split
>+ * we must confirm that the file system
>+ * supports large folios.
>+ *
>+ * Note that we might still have THPs in such
>+ * mappings, which is created from khugepaged
>+ * when CONFIG_READ_ONLY_THP_FOR_FS is
>+ * enabled. But in that case, the mapping does
>+ * not actually support large folios properly.
>+ */
>+ return -EINVAL;
>+ }
> }
Hi, Happy New Year to all :-)
Following the application of this patch, a warning was reported [5]. The root
cause is an attempt to uniformly split a page cache folio down to order-0 when
the mapping has a mapping_min_folio_order() > 0.
It is worth noting that the current upstream code also denies this split, but
it does so silently. This patch simply makes the violation visible.
Upon reviewing the code history, I believe the logic introduced here is
correct. The existing comment--"We can always split a folio down to a single
page"--appears to be misleading, as it does not account for modern
constraints where a minimum folio order is required by the mapping.
Below is my analysis and suggestion:
----
All the story came from [1] which introduced folio split to any lower order.
The check on order is fixed by [2], which is the base of current form.
When we split a large pagecache folio, it has two possibilities:
1) khugepaged collapsed folio
2) filesystem supported large folio
For case 1), the folio could only be splitted to order-0, so the check is
added, (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && !mapping_large_folio_support(folio->mapping))
For case 2), it looks implies mapping_large_folio_support() is true. And it
looks we assume it could be splitted to any lower order. Because the
mapping_min_folio_order() is introduced by [3] and the min_order check is
introduced in [4], both are later than [1] and [2]. So when [2] applied, we
don't have the knowledge of mapping_min_folio_order(). (I may lose some
background here, if not correct, please correct me.)
The introduction of [3] and [4] changes the assumption at [2], besides
checking mapping_large_folio_support(), the mapping_min_folio_order() should
be checked. So the comment in current code is misleading:
/*
* We can always split a folio down to a
* single page (new_order == 0) uniformly.
*/
For case 1), it still stands, but for case 2) we should also check min_order
with mapping_min_folio_order(), which [4] does.
If my understanding is correct, below is my suggestion for the comment change.
Feel free to correct me, if I miss something.
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 40cf59301c21..b0ba27b0f763 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3714,28 +3718,28 @@ int folio_check_splittable(struct folio *folio, unsigned int new_order,
/* order-1 is not supported for anonymous THP. */
if (new_order == 1)
return -EINVAL;
- } else if (split_type == SPLIT_TYPE_NON_UNIFORM || new_order) {
- if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
- !mapping_large_folio_support(folio->mapping)) {
- /*
- * We can always split a folio down to a single page
- * (new_order == 0) uniformly.
- *
- * For any other scenario
- * a) uniform split targeting a large folio
- * (new_order > 0)
- * b) any non-uniform split
- * we must confirm that the file system supports large
- * folios.
- *
- * Note that we might still have THPs in such
- * mappings, which is created from khugepaged when
- * CONFIG_READ_ONLY_THP_FOR_FS is enabled. But in that
- * case, the mapping does not actually support large
- * folios properly.
- */
- return -EINVAL;
+ } else {
+ /*
+ * When splitting a large pagecache folio, it has two
+ * possibilities:
+ *
+ * 1) khugepaged collapsed folio when
+ * CONFIG_READ_ONLY_THP_FOR_FS is enabled
+ * 2) filesystem supported folio
+ *
+ * For case 1), we only support uniform split to order-0.
+ * For case 2), we need to make sure new_order is not less
+ * than mapping_min_folio_order().
+ */
+ if (split_type == SPLIT_TYPE_NON_UNIFORM || new_order) {
+ if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
+ !mapping_large_folio_support(folio->mapping)) {
+ return -EINVAL;
+ }
}
+
+ if (new_order < mapping_min_folio_order(folio->mapping))
+ return -EINVAL;
}
The warning reported is in madvise_cold_or_pageout_pte_range(). I don't find a
way to eliminate it, since user may specify any range to madvise(). Uniformly
splitting to order-0 is a general way.
[1]: 2024-02-26 c010d47f107f mm: thp: split huge page to any lower order pages
[2]: 2024-06-07 6a50c9b512f7 mm: huge_memory: fix misused mapping_large_folio_support() for anon folios
[3]: 2024-08-22 84429b675bcf fs: Allow fine-grained control of folio sizes
[4]: 2024-08-22 e220917fa507 mm: split a folio in minimum folio order chunks
[5]: https://lore.kernel.org/all/694ac438.050a0220.35954c.0000.GAE@google.com/
--
Wei Yang
Help you, Help me
next prev parent reply other threads:[~2026-01-04 2:38 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-12-23 12:25 [Patch v2] mm/huge_memory: consolidate order-related checks into folio_check_splittable() Wei Yang
2025-12-23 17:50 ` [syzbot ci] " syzbot ci
2026-01-04 2:37 ` Wei Yang [this message]
2026-01-05 16:16 ` [Patch v2] " David Hildenbrand (Red Hat)
2026-01-05 16:29 ` Lorenzo Stoakes
2026-01-05 16:52 ` Matthew Wilcox
2026-01-06 9:54 ` Wei Yang
2026-01-06 12:28 ` Zi Yan
2026-01-06 12:51 ` Wei Yang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260104023756.jufklyl3bl64fnck@master \
--to=richard.weiyang@gmail.com \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=baohua@kernel.org \
--cc=baolin.wang@linux.alibaba.com \
--cc=david@kernel.org \
--cc=dev.jain@arm.com \
--cc=lance.yang@linux.dev \
--cc=linux-mm@kvack.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=npache@redhat.com \
--cc=ryan.roberts@arm.com \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.