* [RFC v2 0/3] Decoupling large folios dependency on THP
@ 2025-12-06 3:08 Pankaj Raghav
2025-12-06 3:08 ` [RFC v2 1/3] filemap: set max order to be min order if THP is disabled Pankaj Raghav
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: Pankaj Raghav @ 2025-12-06 3:08 UTC (permalink / raw)
To: Suren Baghdasaryan, Mike Rapoport, David Hildenbrand,
Ryan Roberts, Michal Hocko, Lance Yang, Lorenzo Stoakes,
Baolin Wang, Dev Jain, Barry Song, Andrew Morton, Nico Pache,
Zi Yan, Vlastimil Babka, Liam R . Howlett, Jens Axboe
Cc: linux-kernel, linux-mm, linux-block, linux-fsdevel, mcgrof,
gost.dev, kernel, tytso, Pankaj Raghav
File-backed large folios were initially implemented with a dependency on the
Transparent Huge Pages (THP) infrastructure. As large folio adoption expanded
across the kernel, CONFIG_TRANSPARENT_HUGEPAGE has become an overloaded
configuration option, sometimes used as a proxy for large folio support
[1][2][3].
This series is part of the LPC talk [4], and I am sending the RFC
series to start the discussion.
There are multiple ways to solve this problem; this is one of them, and it
needs only minimal changes. I plan to discuss other possible solutions
at the talk.
Based on my investigation, the only feature large folios depend on is
the THP splitting infrastructure. When a large folio has to be split,
either during truncation or under memory pressure, THP's splitting
infrastructure is used to split it into min-order folio chunks.
In this approach, we restrict the maximum order of a large folio to the
minimum order to ensure we never use the splitting infrastructure when
THP is disabled.
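As a rough illustration of that clamping, here is a userspace sketch (not code from this series). It mirrors the logic patch 1/3 puts into mapping_set_folio_order_range(); the struct, the function name and the value of SKETCH_MAX_PAGECACHE_ORDER are assumptions made only for this example.

```c
#include <stdbool.h>

/* Illustrative stand-in; the kernel derives MAX_PAGECACHE_ORDER itself. */
#define SKETCH_MAX_PAGECACHE_ORDER 11u

/* Hypothetical container for the resulting order range. */
struct sketch_order_range {
	unsigned int min;
	unsigned int max;
};

/*
 * Cap both orders, then collapse max to min when max is smaller than min
 * or when THP (and with it folio splitting) is not available.
 */
static struct sketch_order_range
sketch_folio_order_range(unsigned int min, unsigned int max, bool thp_enabled)
{
	struct sketch_order_range r;

	if (min > SKETCH_MAX_PAGECACHE_ORDER)
		min = SKETCH_MAX_PAGECACHE_ORDER;
	if (max > SKETCH_MAX_PAGECACHE_ORDER)
		max = SKETCH_MAX_PAGECACHE_ORDER;
	if (max < min || !thp_enabled)
		max = min;

	r.min = min;
	r.max = max;
	return r;
}
```

With THP disabled, a request for orders 2..9 ends up as 2..2, so every folio in that mapping already has the minimum order and never needs to be split further.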
I disabled THP and ran xfstests on XFS with 16k, 32k and 64k block sizes,
and the changes seem to survive the tests without any issues.
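For reference, the arithmetic behind those block sizes can be sketched as follows. This assumes 4 KiB base pages and that the filesystem's minimum folio order is the smallest order whose folio covers one block; the helper name and SKETCH_PAGE_SHIFT are hypothetical.

```c
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12u /* assumption: 4 KiB base pages */

/* Smallest order such that (page size << order) >= block_size. */
static unsigned int sketch_min_order_for_block_size(size_t block_size)
{
	unsigned int order = 0;

	while (((size_t)1 << (SKETCH_PAGE_SHIFT + order)) < block_size)
		order++;
	return order;
}
```

Under these assumptions 16k, 32k and 64k blocks correspond to minimum orders 2, 3 and 4, and with THP disabled the maximum order is capped to the same value.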
Looking forward to some productive discussion.
P.S: Thanks to Zi, David and willy for all the ideas they provided to
solve this problem.
[1] https://lore.kernel.org/linux-mm/731d8b44-1a45-40bc-a274-8f39a7ae0f7f@lucifer.local/
[2] https://lore.kernel.org/all/aGfNKGBz9lhuK1AF@casper.infradead.org/
[3] https://lore.kernel.org/linux-ext4/20251110043226.GD2988753@mit.edu/
[4] https://lpc.events/event/19/contributions/2139/
Pankaj Raghav (3):
filemap: set max order to be min order if THP is disabled
huge_memory: skip warning if min order and folio order are same in
split
blkdev: remove CONFIG_TRANSPARENT_HUGEPAGES dependency for LBS devices
include/linux/blkdev.h | 5 -----
include/linux/huge_mm.h | 40 ++++++++--------------------------------
include/linux/pagemap.h | 17 ++++++-----------
mm/memory.c | 41 +++++++++++++++++++++++++++++++++++++++++
4 files changed, 55 insertions(+), 48 deletions(-)
base-commit: e4c4d9892021888be6d874ec1be307e80382f431
--
2.50.1
* [RFC v2 1/3] filemap: set max order to be min order if THP is disabled
2025-12-06 3:08 [RFC v2 0/3] Decoupling large folios dependency on THP Pankaj Raghav
@ 2025-12-06 3:08 ` Pankaj Raghav
2025-12-09 7:45 ` Hannes Reinecke
2025-12-06 3:08 ` [RFC v2 2/3] huge_memory: skip warning if min order and folio order are same in split Pankaj Raghav
` (2 subsequent siblings)
3 siblings, 1 reply; 11+ messages in thread
From: Pankaj Raghav @ 2025-12-06 3:08 UTC (permalink / raw)
To: Suren Baghdasaryan, Mike Rapoport, David Hildenbrand,
Ryan Roberts, Michal Hocko, Lance Yang, Lorenzo Stoakes,
Baolin Wang, Dev Jain, Barry Song, Andrew Morton, Nico Pache,
Zi Yan, Vlastimil Babka, Liam R . Howlett, Jens Axboe
Cc: linux-kernel, linux-mm, linux-block, linux-fsdevel, mcgrof,
gost.dev, kernel, tytso, Pankaj Raghav
Large folios in the page cache depend on the splitting infrastructure from
THP. To remove the dependency between large folios and
CONFIG_TRANSPARENT_HUGEPAGE, cap the max order to the min order if THP is
disabled. This ensures that the splitting code is not required
when THP is disabled, thereby removing the dependency between large
folios and THP.
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
---
include/linux/pagemap.h | 17 ++++++-----------
1 file changed, 6 insertions(+), 11 deletions(-)
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 09b581c1d878..1bb0d4432d4b 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -397,9 +397,7 @@ static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
*/
static inline size_t mapping_max_folio_size_supported(void)
{
- if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
- return 1U << (PAGE_SHIFT + MAX_PAGECACHE_ORDER);
- return PAGE_SIZE;
+ return 1U << (PAGE_SHIFT + MAX_PAGECACHE_ORDER);
}
/*
@@ -422,16 +420,17 @@ static inline void mapping_set_folio_order_range(struct address_space *mapping,
unsigned int min,
unsigned int max)
{
- if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
- return;
-
if (min > MAX_PAGECACHE_ORDER)
min = MAX_PAGECACHE_ORDER;
if (max > MAX_PAGECACHE_ORDER)
max = MAX_PAGECACHE_ORDER;
- if (max < min)
+ /* Large folios depend on THP infrastructure for splitting.
+ * If THP is disabled, we cap the max order to min order to avoid
+ * splitting the folios.
+ */
+ if ((max < min) || !IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
max = min;
mapping->flags = (mapping->flags & ~AS_FOLIO_ORDER_MASK) |
@@ -463,16 +462,12 @@ static inline void mapping_set_large_folios(struct address_space *mapping)
static inline unsigned int
mapping_max_folio_order(const struct address_space *mapping)
{
- if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
- return 0;
return (mapping->flags & AS_FOLIO_ORDER_MAX_MASK) >> AS_FOLIO_ORDER_MAX;
}
static inline unsigned int
mapping_min_folio_order(const struct address_space *mapping)
{
- if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
- return 0;
return (mapping->flags & AS_FOLIO_ORDER_MIN_MASK) >> AS_FOLIO_ORDER_MIN;
}
--
2.50.1
* [RFC v2 2/3] huge_memory: skip warning if min order and folio order are same in split
2025-12-06 3:08 [RFC v2 0/3] Decoupling large folios dependency on THP Pankaj Raghav
2025-12-06 3:08 ` [RFC v2 1/3] filemap: set max order to be min order if THP is disabled Pankaj Raghav
@ 2025-12-06 3:08 ` Pankaj Raghav
2025-12-06 3:08 ` [RFC v2 3/3] blkdev: remove CONFIG_TRANSPARENT_HUGEPAGES dependency for LBS devices Pankaj Raghav
2025-12-09 16:03 ` [RFC v2 0/3] Decoupling large folios dependency on THP Zi Yan
3 siblings, 0 replies; 11+ messages in thread
From: Pankaj Raghav @ 2025-12-06 3:08 UTC (permalink / raw)
To: Suren Baghdasaryan, Mike Rapoport, David Hildenbrand,
Ryan Roberts, Michal Hocko, Lance Yang, Lorenzo Stoakes,
Baolin Wang, Dev Jain, Barry Song, Andrew Morton, Nico Pache,
Zi Yan, Vlastimil Babka, Liam R . Howlett, Jens Axboe
Cc: linux-kernel, linux-mm, linux-block, linux-fsdevel, mcgrof,
gost.dev, kernel, tytso, Pankaj Raghav
When THP is disabled, the max order of file-backed large folios is capped to
the min order to avoid using the splitting infrastructure.
Currently, split calls trigger a warning when called with THP
disabled. But a split call does not have to do anything when the min order
is the same as the folio order.
So skip the warning in the folio split functions if the min order is the same
as the folio order for file-backed folios.
Due to issues with circular dependency, move the definition of the split
functions for !CONFIG_TRANSPARENT_HUGEPAGE to mm/memory.c.
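The decision described above can be modelled in userspace roughly as follows. This is only a sketch: the result struct and function name are hypothetical, and the boolean warn field stands in for VM_WARN_ON_ONCE_*() firing.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical result type: the error returned and whether a warning fires. */
struct sketch_split_result {
	int err;
	bool warn;
};

/*
 * Model of a split request when THP is compiled out: the split always
 * fails with -EINVAL, but the warning is skipped for file-backed folios
 * that are already at the mapping's minimum order.
 */
static struct sketch_split_result
sketch_split_without_thp(bool is_anon, unsigned int min_order,
			 unsigned int folio_order)
{
	struct sketch_split_result r = { .err = -EINVAL, .warn = true };

	if (!is_anon && min_order == folio_order)
		r.warn = false;
	return r;
}
```

In this model a file-backed folio of order 2 in a mapping with min order 2 fails quietly, while an anonymous folio or a file-backed folio above the min order still warns.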
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
---
include/linux/huge_mm.h | 40 ++++++++--------------------------------
mm/memory.c | 41 +++++++++++++++++++++++++++++++++++++++++
2 files changed, 49 insertions(+), 32 deletions(-)
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 21162493a0a0..71e309f2d26a 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -612,42 +612,18 @@ can_split_folio(struct folio *folio, int caller_pins, int *pextra_pins)
{
return false;
}
-static inline int
-split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
- unsigned int new_order)
-{
- VM_WARN_ON_ONCE_PAGE(1, page);
- return -EINVAL;
-}
-static inline int split_huge_page_to_order(struct page *page, unsigned int new_order)
-{
- VM_WARN_ON_ONCE_PAGE(1, page);
- return -EINVAL;
-}
+int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
+ unsigned int new_order);
+int split_huge_page_to_order(struct page *page, unsigned int new_order);
static inline int split_huge_page(struct page *page)
{
- VM_WARN_ON_ONCE_PAGE(1, page);
- return -EINVAL;
-}
-
-static inline unsigned int min_order_for_split(struct folio *folio)
-{
- VM_WARN_ON_ONCE_FOLIO(1, folio);
- return 0;
-}
-
-static inline int split_folio_to_list(struct folio *folio, struct list_head *list)
-{
- VM_WARN_ON_ONCE_FOLIO(1, folio);
- return -EINVAL;
+ return split_huge_page_to_list_to_order(page, NULL, 0);
}
-static inline int try_folio_split_to_order(struct folio *folio,
- struct page *page, unsigned int new_order)
-{
- VM_WARN_ON_ONCE_FOLIO(1, folio);
- return -EINVAL;
-}
+unsigned int min_order_for_split(struct folio *folio);
+int split_folio_to_list(struct folio *folio, struct list_head *list);
+int try_folio_split_to_order(struct folio *folio,
+ struct page *page, unsigned int new_order);
static inline void deferred_split_folio(struct folio *folio, bool partially_mapped) {}
static inline void reparent_deferred_split_queue(struct mem_cgroup *memcg) {}
diff --git a/mm/memory.c b/mm/memory.c
index 6675e87eb7dd..4eccdf72a46e 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4020,6 +4020,47 @@ static bool __wp_can_reuse_large_anon_folio(struct folio *folio,
{
BUILD_BUG();
}
+
+int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
+ unsigned int new_order)
+{
+ struct folio *folio = page_folio(page);
+ unsigned int order = mapping_min_folio_order(folio->mapping);
+
+ if (!folio_test_anon(folio) && order == folio_order(folio))
+ return -EINVAL;
+
+ VM_WARN_ON_ONCE_PAGE(1, page);
+ return -EINVAL;
+}
+
+int split_huge_page_to_order(struct page *page, unsigned int new_order)
+{
+ return split_huge_page_to_list_to_order(page, NULL, new_order);
+}
+
+int split_folio_to_list(struct folio *folio, struct list_head *list)
+{
+ unsigned int order = mapping_min_folio_order(folio->mapping);
+
+ if (!folio_test_anon(folio) && order == folio_order(folio))
+ return -EINVAL;
+
+ VM_WARN_ON_ONCE_FOLIO(1, folio);
+ return -EINVAL;
+}
+
+unsigned int min_order_for_split(struct folio *folio)
+{
+ return split_folio_to_list(folio, NULL);
+}
+
+
+int try_folio_split_to_order(struct folio *folio, struct page *page,
+ unsigned int new_order)
+{
+ return split_folio_to_list(folio, NULL);
+}
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
static bool wp_can_reuse_anon_folio(struct folio *folio,
--
2.50.1
* [RFC v2 3/3] blkdev: remove CONFIG_TRANSPARENT_HUGEPAGES dependency for LBS devices
2025-12-06 3:08 [RFC v2 0/3] Decoupling large folios dependency on THP Pankaj Raghav
2025-12-06 3:08 ` [RFC v2 1/3] filemap: set max order to be min order if THP is disabled Pankaj Raghav
2025-12-06 3:08 ` [RFC v2 2/3] huge_memory: skip warning if min order and folio order are same in split Pankaj Raghav
@ 2025-12-06 3:08 ` Pankaj Raghav
2025-12-09 16:03 ` [RFC v2 0/3] Decoupling large folios dependency on THP Zi Yan
3 siblings, 0 replies; 11+ messages in thread
From: Pankaj Raghav @ 2025-12-06 3:08 UTC (permalink / raw)
To: Suren Baghdasaryan, Mike Rapoport, David Hildenbrand,
Ryan Roberts, Michal Hocko, Lance Yang, Lorenzo Stoakes,
Baolin Wang, Dev Jain, Barry Song, Andrew Morton, Nico Pache,
Zi Yan, Vlastimil Babka, Liam R . Howlett, Jens Axboe
Cc: linux-kernel, linux-mm, linux-block, linux-fsdevel, mcgrof,
gost.dev, kernel, tytso, Pankaj Raghav
Now that the dependency between CONFIG_TRANSPARENT_HUGEPAGE and large
folios is removed, enable LBS devices even when the THP config is disabled.
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
---
include/linux/blkdev.h | 5 -----
1 file changed, 5 deletions(-)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 70b671a9a7f7..b6379d73f546 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -270,16 +270,11 @@ static inline dev_t disk_devt(struct gendisk *disk)
return MKDEV(disk->major, disk->first_minor);
}
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
/*
* We should strive for 1 << (PAGE_SHIFT + MAX_PAGECACHE_ORDER)
* however we constrain this to what we can validate and test.
*/
#define BLK_MAX_BLOCK_SIZE SZ_64K
-#else
-#define BLK_MAX_BLOCK_SIZE PAGE_SIZE
-#endif
-
/* blk_validate_limits() validates bsize, so drivers don't usually need to */
static inline int blk_validate_block_size(unsigned long bsize)
--
2.50.1
* Re: [RFC v2 1/3] filemap: set max order to be min order if THP is disabled
2025-12-06 3:08 ` [RFC v2 1/3] filemap: set max order to be min order if THP is disabled Pankaj Raghav
@ 2025-12-09 7:45 ` Hannes Reinecke
2025-12-09 16:33 ` Pankaj Raghav
0 siblings, 1 reply; 11+ messages in thread
From: Hannes Reinecke @ 2025-12-09 7:45 UTC (permalink / raw)
To: Pankaj Raghav, Suren Baghdasaryan, Mike Rapoport,
David Hildenbrand, Ryan Roberts, Michal Hocko, Lance Yang,
Lorenzo Stoakes, Baolin Wang, Dev Jain, Barry Song, Andrew Morton,
Nico Pache, Zi Yan, Vlastimil Babka, Liam R . Howlett, Jens Axboe
Cc: linux-kernel, linux-mm, linux-block, linux-fsdevel, mcgrof,
gost.dev, kernel, tytso
On 12/6/25 04:08, Pankaj Raghav wrote:
> Large folios in the page cache depend on the splitting infrastructure from
> THP. To remove the dependency between large folios and
> CONFIG_TRANSPARENT_HUGEPAGE, set the min order == max order if THP is
> disabled. This will make sure the splitting code will not be required
> when THP is disabled, therefore, removing the dependency between large
> folios and THP.
>
The description is actually misleading.
It's not that you remove the dependency from THP for large folios
_in general_ (the CONFIG_THP is retained in this patch).
Rather you remove the dependency for large folios _for the block layer_.
And that should be made explicit in the description, otherwise the
description and the patch don't match.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
* Re: [RFC v2 0/3] Decoupling large folios dependency on THP
2025-12-06 3:08 [RFC v2 0/3] Decoupling large folios dependency on THP Pankaj Raghav
` (2 preceding siblings ...)
2025-12-06 3:08 ` [RFC v2 3/3] blkdev: remove CONFIG_TRANSPARENT_HUGEPAGES dependency for LBS devices Pankaj Raghav
@ 2025-12-09 16:03 ` Zi Yan
2025-12-10 4:27 ` Matthew Wilcox
3 siblings, 1 reply; 11+ messages in thread
From: Zi Yan @ 2025-12-09 16:03 UTC (permalink / raw)
To: Pankaj Raghav
Cc: Suren Baghdasaryan, Mike Rapoport, David Hildenbrand,
Ryan Roberts, Michal Hocko, Lance Yang, Lorenzo Stoakes,
Baolin Wang, Dev Jain, Barry Song, Andrew Morton, Nico Pache,
Vlastimil Babka, Liam R . Howlett, Jens Axboe, linux-kernel,
linux-mm, linux-block, linux-fsdevel, mcgrof, gost.dev, kernel,
tytso
On 5 Dec 2025, at 22:08, Pankaj Raghav wrote:
> File-backed Large folios were initially implemented with dependencies on Transparent
> Huge Pages (THP) infrastructure. As large folio adoption expanded across
> the kernel, CONFIG_TRANSPARENT_HUGEPAGE has become an overloaded
> configuration option, sometimes used as a proxy for large folio support
> [1][2][3].
>
> This series is a part of the LPC talk[4], and I am sending the RFC
> series to start the discussion.
>
> There are multiple solutions to solve this problem and this is one of
> them with minimal changes. I plan on discussing possible other solutions
> at the talk.
>
> Based on my investigation, the only feature large folios depend on is
> the THP splitting infrastructure. Either during truncation or memory
> pressure when the large folio has to be split, then THP's splitting
> infrastructure is used to split them into min order folio chunks.
>
> In this approach, we restrict the maximum order of the large folio to
> minimum order to ensure we never use the splitting infrastructure when
> THP is disabled.
>
> I disabled THP and ran xfstests on XFS with 16k, 32k and 64k block sizes,
> and the changes seem to survive the tests without any issues.
But are large folios really created?
IIUC, in do_sync_mmap_readahead(), when THP is disabled, force_thp_readahead
is never set to true and later ra->order is set to 0. Oh, page_cache_ra_order()
later bumps new_order to mapping_min_folio_order(). So large folios are
created there.
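A simplified userspace model of the order selection described here, assuming the requested readahead order is first capped by the mapping's max order and then raised to its min order. The function name is hypothetical, and the real page_cache_ra_order() takes more inputs into account.

```c
/*
 * Pick the folio order for a readahead request: never above the
 * mapping's max order, never below its min order.
 */
static unsigned int sketch_ra_order(unsigned int requested,
				    unsigned int min_order,
				    unsigned int max_order)
{
	unsigned int order = requested;

	if (order > max_order)
		order = max_order;
	if (order < min_order)
		order = min_order;
	return order;
}
```

With THP disabled the series sets max_order == min_order, so a request that starts at order 0 still produces min-order folios, for example order 2 for a 16k block size on 4k pages.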
I wonder if core-mm should move mTHP code out of CONFIG_TRANSPARENT_HUGEPAGE
and mTHP might just work. Hmm, folio split might need to be moved out of
mm/huge_memory.c in that case. khugepaged should work for mTHP without
CONFIG_TRANSPARENT_HUGEPAGE as well. OK, for anon folios, the changes might
be more involved.
>
> Looking forward to some productive discussion.
>
> P.S: Thanks to Zi, David and willy for all the ideas they provided to
> solve this problem.
>
> [1] https://lore.kernel.org/linux-mm/731d8b44-1a45-40bc-a274-8f39a7ae0f7f@lucifer.local/
> [2] https://lore.kernel.org/all/aGfNKGBz9lhuK1AF@casper.infradead.org/
> [3] https://lore.kernel.org/linux-ext4/20251110043226.GD2988753@mit.edu/
> [4] https://lpc.events/event/19/contributions/2139/
>
> Pankaj Raghav (3):
> filemap: set max order to be min order if THP is disabled
> huge_memory: skip warning if min order and folio order are same in
> split
> blkdev: remove CONFIG_TRANSPARENT_HUGEPAGES dependency for LBS devices
>
> include/linux/blkdev.h | 5 -----
> include/linux/huge_mm.h | 40 ++++++++--------------------------------
> include/linux/pagemap.h | 17 ++++++-----------
> mm/memory.c | 41 +++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 55 insertions(+), 48 deletions(-)
>
>
> base-commit: e4c4d9892021888be6d874ec1be307e80382f431
> --
> 2.50.1
Best Regards,
Yan, Zi
* Re: [RFC v2 1/3] filemap: set max order to be min order if THP is disabled
2025-12-09 7:45 ` Hannes Reinecke
@ 2025-12-09 16:33 ` Pankaj Raghav
2025-12-10 0:38 ` Hannes Reinecke
0 siblings, 1 reply; 11+ messages in thread
From: Pankaj Raghav @ 2025-12-09 16:33 UTC (permalink / raw)
To: Hannes Reinecke, Pankaj Raghav, Suren Baghdasaryan, Mike Rapoport,
David Hildenbrand, Ryan Roberts, Michal Hocko, Lance Yang,
Lorenzo Stoakes, Baolin Wang, Dev Jain, Barry Song, Andrew Morton,
Nico Pache, Zi Yan, Vlastimil Babka, Liam R . Howlett, Jens Axboe
Cc: linux-kernel, linux-mm, linux-block, linux-fsdevel, mcgrof,
gost.dev, tytso
On 12/9/25 13:15, Hannes Reinecke wrote:
> On 12/6/25 04:08, Pankaj Raghav wrote:
>> Large folios in the page cache depend on the splitting infrastructure from
>> THP. To remove the dependency between large folios and
>> CONFIG_TRANSPARENT_HUGEPAGE, set the min order == max order if THP is
>> disabled. This will make sure the splitting code will not be required
>> when THP is disabled, therefore, removing the dependency between large
>> folios and THP.
>>
> The description is actually misleading.
> It's not that you remove the dependency from THP for large folios
> _in general_ (the CONFIG_THP is retained in this patch).
> Rather you remove the dependency for large folios _for the block layer_.
> And that should be made explicit in the description, otherwise the
> description and the patch don't match.
>
Hmm, that is not what I am doing. This has nothing to do with the block layer directly.
I mentioned this in the cover letter, but I can reiterate it.
Large folios depended on THP infrastructure when they were introduced. When we added LBS support
to the block layer, we introduced an indirect dependency on CONFIG_THP. When we disabled CONFIG_THP
and had a block device with logical block size > page size, we ran into a panic.
That was fixed here [1].
If this patch is upstreamed, then we can disable THP but still have an LBS drive attached without any
issues.
Baolin added another CONFIG_THP block in ext4 [2]. With this support, we don't need to sprinkle THP
checks wherever file-backed large folios are used.
Happy to discuss this at LPC (if you are attending)!
[1] https://lore.kernel.org/all/20250704092134.289491-1-p.raghav@samsung.com/
[2] https://lwn.net/ml/all/20251121090654.631996-25-libaokun@huaweicloud.com/
--
Pankaj
* Re: [RFC v2 1/3] filemap: set max order to be min order if THP is disabled
2025-12-09 16:33 ` Pankaj Raghav
@ 2025-12-10 0:38 ` Hannes Reinecke
0 siblings, 0 replies; 11+ messages in thread
From: Hannes Reinecke @ 2025-12-10 0:38 UTC (permalink / raw)
To: Pankaj Raghav, Pankaj Raghav, Suren Baghdasaryan, Mike Rapoport,
David Hildenbrand, Ryan Roberts, Michal Hocko, Lance Yang,
Lorenzo Stoakes, Baolin Wang, Dev Jain, Barry Song, Andrew Morton,
Nico Pache, Zi Yan, Vlastimil Babka, Liam R . Howlett, Jens Axboe
Cc: linux-kernel, linux-mm, linux-block, linux-fsdevel, mcgrof,
gost.dev, tytso
On 12/9/25 17:33, Pankaj Raghav wrote:
> On 12/9/25 13:15, Hannes Reinecke wrote:
>> On 12/6/25 04:08, Pankaj Raghav wrote:
>>> Large folios in the page cache depend on the splitting infrastructure from
>>> THP. To remove the dependency between large folios and
>>> CONFIG_TRANSPARENT_HUGEPAGE, set the min order == max order if THP is
>>> disabled. This will make sure the splitting code will not be required
>>> when THP is disabled, therefore, removing the dependency between large
>>> folios and THP.
>>>
>> The description is actually misleading.
>> It's not that you remove the dependency from THP for large folios
>> _in general_ (the CONFIG_THP is retained in this patch).
>> Rather you remove the dependency for large folios _for the block layer_.
>> And that should be made explicit in the description, otherwise the
>> description and the patch don't match.
>>
>
> Hmm, that is not what I am doing. This has nothing to do with the block layer directly.
> I mentioned this in the cover letter, but I can reiterate it.
>
> Large folios depended on THP infrastructure when they were introduced. When we added LBS support
> to the block layer, we introduced an indirect dependency on CONFIG_THP. When we disabled CONFIG_THP
> and had a block device with logical block size > page size, we ran into a panic.
>
> That was fixed here[1].
>
Yes and no. That patch limited the maximum block size without THP to 4k,
effectively disabling LBS.
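To make the effect concrete, here is a userspace sketch of the limit before and after patch 3/3. It assumes 4 KiB pages; the function names are hypothetical, and the lower bound of 512 bytes and the power-of-two check are modelled on blk_validate_block_size() as an assumption, not quoted from the series.

```c
#include <stdbool.h>

#define SKETCH_PAGE_SIZE 4096ul  /* assumption: 4 KiB pages */
#define SKETCH_SZ_64K    65536ul

/*
 * Before the series the limit is 64k only with THP enabled and the page
 * size otherwise; with patch 3/3 applied it is 64k unconditionally.
 */
static unsigned long sketch_max_block_size(bool thp_enabled, bool series_applied)
{
	if (thp_enabled || series_applied)
		return SKETCH_SZ_64K;
	return SKETCH_PAGE_SIZE;
}

/* Returns 0 if the block size is acceptable under the limit, -1 otherwise. */
static int sketch_validate_block_size(unsigned long bsize, unsigned long max)
{
	if (bsize < 512 || bsize > max || (bsize & (bsize - 1)) != 0)
		return -1;
	return 0;
}
```

In this model a 16k block size is rejected on a kernel without THP before the series and accepted once the unconditional 64k limit is in place.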
> If this patch is upstreamed, then we can disable THP but still have an LBS drive attached without any
> issues.
>
But this is what I meant. We do _not_ remove the dependency on THP for
LBS; we just remove the checks for the config option itself in the block
layer code. The actual dependency on THP will remain, as LBS will only
be supported if THP is enabled.
> Baolin added another CONFIG_THP block in ext4 [2]. With this support, we don't need to sprinkle THP
> checks wherever file-backed large folios are used.
>
> Happy to discuss this at LPC (if you are attending)!
>
The very first presentation on the first day in the CXL track. Yes :-)
Let's discuss there; I would love to figure out whether we can remove the
actual dependency on THP, too.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
* Re: [RFC v2 0/3] Decoupling large folios dependency on THP
2025-12-09 16:03 ` [RFC v2 0/3] Decoupling large folios dependency on THP Zi Yan
@ 2025-12-10 4:27 ` Matthew Wilcox
2025-12-10 16:37 ` Zi Yan
0 siblings, 1 reply; 11+ messages in thread
From: Matthew Wilcox @ 2025-12-10 4:27 UTC (permalink / raw)
To: Zi Yan
Cc: Pankaj Raghav, Suren Baghdasaryan, Mike Rapoport,
David Hildenbrand, Ryan Roberts, Michal Hocko, Lance Yang,
Lorenzo Stoakes, Baolin Wang, Dev Jain, Barry Song, Andrew Morton,
Nico Pache, Vlastimil Babka, Liam R . Howlett, Jens Axboe,
linux-kernel, linux-mm, linux-block, linux-fsdevel, mcgrof,
gost.dev, kernel, tytso
On Tue, Dec 09, 2025 at 11:03:23AM -0500, Zi Yan wrote:
> I wonder if core-mm should move mTHP code out of CONFIG_TRANSPARENT_HUGEPAGE
> and mTHP might just work. Hmm, folio split might need to be moved out of
> mm/huge_memory.c in that case. khugepaged should work for mTHP without
> CONFIG_TRANSPARENT_HUGEPAGE as well. OK, for anon folios, the changes might
> be more involved.
I think this is the key question to be discussed at LPC. How much of
the current THP code should we say "OK, this is large folio support
and everybody needs it" and how much is "This is PMD (or mTHP) support;
this architecture doesn't have it, we don't need to compile it in".
* Re: [RFC v2 0/3] Decoupling large folios dependency on THP
2025-12-10 4:27 ` Matthew Wilcox
@ 2025-12-10 16:37 ` Zi Yan
2025-12-11 7:37 ` Matthew Wilcox
0 siblings, 1 reply; 11+ messages in thread
From: Zi Yan @ 2025-12-10 16:37 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Pankaj Raghav, Suren Baghdasaryan, Mike Rapoport,
David Hildenbrand, Ryan Roberts, Michal Hocko, Lance Yang,
Lorenzo Stoakes, Baolin Wang, Dev Jain, Barry Song, Andrew Morton,
Nico Pache, Vlastimil Babka, Liam R . Howlett, Jens Axboe,
linux-kernel, linux-mm, linux-block, linux-fsdevel, mcgrof,
gost.dev, kernel, tytso
On 9 Dec 2025, at 23:27, Matthew Wilcox wrote:
> On Tue, Dec 09, 2025 at 11:03:23AM -0500, Zi Yan wrote:
>> I wonder if core-mm should move mTHP code out of CONFIG_TRANSPARENT_HUGEPAGE
>> and mTHP might just work. Hmm, folio split might need to be moved out of
>> mm/huge_memory.c in that case. khugepaged should work for mTHP without
>> CONFIG_TRANSPARENT_HUGEPAGE as well. OK, for anon folios, the changes might
>> be more involved.
>
> I think this is the key question to be discussed at LPC. How much of
I am not going, so would like to get a summary afterwards. :)
> the current THP code should we say "OK, this is large folio support
> and everybody needs it" and how much is "This is PMD (or mTHP) support;
> this architecture doesn't have it, we don't need to compile it in".
I agree with most of it, except the mTHP part. mTHP should be part of large
folio support, since I see mTHP as the anon equivalent of file-backed large
folios. Both are >0 order folios mapped by PTEs (ignoring to-be-implemented
multi-PMD mapped large folios for now).
Best Regards,
Yan, Zi
* Re: [RFC v2 0/3] Decoupling large folios dependency on THP
2025-12-10 16:37 ` Zi Yan
@ 2025-12-11 7:37 ` Matthew Wilcox
0 siblings, 0 replies; 11+ messages in thread
From: Matthew Wilcox @ 2025-12-11 7:37 UTC (permalink / raw)
To: Zi Yan
Cc: Pankaj Raghav, Suren Baghdasaryan, Mike Rapoport,
David Hildenbrand, Ryan Roberts, Michal Hocko, Lance Yang,
Lorenzo Stoakes, Baolin Wang, Dev Jain, Barry Song, Andrew Morton,
Nico Pache, Vlastimil Babka, Liam R . Howlett, Jens Axboe,
linux-kernel, linux-mm, linux-block, linux-fsdevel, mcgrof,
gost.dev, kernel, tytso
On Wed, Dec 10, 2025 at 11:37:51AM -0500, Zi Yan wrote:
> On 9 Dec 2025, at 23:27, Matthew Wilcox wrote:
>
> > On Tue, Dec 09, 2025 at 11:03:23AM -0500, Zi Yan wrote:
> >> I wonder if core-mm should move mTHP code out of CONFIG_TRANSPARENT_HUGEPAGE
> >> and mTHP might just work. Hmm, folio split might need to be moved out of
> >> mm/huge_memory.c in that case. khugepaged should work for mTHP without
> >> CONFIG_TRANSPARENT_HUGEPAGE as well. OK, for anon folios, the changes might
> >> be more involved.
> >
> > I think this is the key question to be discussed at LPC. How much of
>
> I am not going, so would like to get a summary afterwards. :)
You can join the fun at meet.lpc.events, or there's apparently a YouTube
stream.
> > the current THP code should we say "OK, this is large folio support
> > and everybody needs it" and how much is "This is PMD (or mTHP) support;
> > this architecture doesn't have it, we don't need to compile it in".
>
> I agree with most of it, except the mTHP part. mTHP should be part of large
> folio support, since I see mTHP as the anon equivalent of file-backed large
> folios. Both are >0 order folios mapped by PTEs (ignoring to-be-implemented
> multi-PMD mapped large folios for now).
Maybe we disagree about what words mean ;-) When I said "mTHP" what
I meant was "support for TLB entries which cover more than one page".
I have no objection to supporting large folio allocation for anon memory
because I think that's beneficial even if there's no hardware support
for TLB entries that cover intermediate sizes between PMD and PTE.