linux-mm.kvack.org archive mirror
* [PATCH v2 0/2] __folio_split() clean up.
@ 2025-07-11 18:23 Zi Yan
  2025-07-11 18:23 ` [PATCH v2 1/2] mm/huge_memory: move unrelated code out of __split_unmapped_folio() Zi Yan
  2025-07-11 18:23 ` [PATCH v2 2/2] mm/huge_memory: use folio_expected_ref_count() to calculate ref_count Zi Yan
  0 siblings, 2 replies; 8+ messages in thread
From: Zi Yan @ 2025-07-11 18:23 UTC (permalink / raw)
  To: Balbir Singh, David Hildenbrand, linux-mm
  Cc: Andrew Morton, Hugh Dickins, Kirill Shutemov, Lorenzo Stoakes,
	Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts,
	Dev Jain, Barry Song, linux-kernel

Based on the prior discussion[1], this series improves
__split_unmapped_folio() by making it reusable for splitting unmapped
folios. This avoids the need for a new boolean "unmapped" parameter to
guard mapping-related code.

An additional benefit is that __split_unmapped_folio() can now be
called on after-split folios by __folio_split(), which enables new
split methods. For example, at deferred split time, unmapped subpages
can be scattered arbitrarily within a large folio, so neither a uniform
nor a non-uniform split can maximize the after-split folio orders for
the mapped subpages. The hope is that by calling
__split_unmapped_folio() multiple times, a better split result can be
achieved; a rough sketch of the idea is given below.
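
For illustration only (not part of this series), the multi-pass idea
could look roughly like the sketch below. needs_further_split() and
pick_split_target() are hypothetical placeholders; only the
__split_unmapped_folio() signature matches patch 1.

	/*
	 * Hypothetical sketch: after an initial split, re-split only the
	 * after-split folios that still need it, using whatever policy a
	 * future deferred-split method would choose.
	 */
	for (new_folio = folio; new_folio != next_folio; new_folio = next) {
		next = folio_next(new_folio);
		if (!needs_further_split(new_folio))
			continue;
		ret = __split_unmapped_folio(new_folio, new_order,
					     pick_split_target(new_folio),
					     &xas, mapping, uniform_split);
		if (ret)
			break;
	}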

The series passes mm selftests.


Changelog
===
From V1[2]:
1. Fixed indentation.
2. Used folio_expected_ref_count() to calculate ref_count instead of
   open coding.


[1] https://lore.kernel.org/linux-mm/94D8C1A4-780C-4BEC-A336-7D3613B54845@nvidia.com/
[2] https://lore.kernel.org/linux-mm/20250711030259.3574392-1-ziy@nvidia.com/

Zi Yan (2):
  mm/huge_memory: move unrelated code out of __split_unmapped_folio()
  mm/huge_memory: use folio_expected_ref_count() to calculate ref_count.

 mm/huge_memory.c | 271 +++++++++++++++++++++++------------------------
 1 file changed, 135 insertions(+), 136 deletions(-)

-- 
2.47.2




* [PATCH v2 1/2] mm/huge_memory: move unrelated code out of __split_unmapped_folio()
  2025-07-11 18:23 [PATCH v2 0/2] __folio_split() clean up Zi Yan
@ 2025-07-11 18:23 ` Zi Yan
  2025-07-14  1:15   ` Balbir Singh
  2025-07-14 15:30   ` David Hildenbrand
  2025-07-11 18:23 ` [PATCH v2 2/2] mm/huge_memory: use folio_expected_ref_count() to calculate ref_count Zi Yan
  1 sibling, 2 replies; 8+ messages in thread
From: Zi Yan @ 2025-07-11 18:23 UTC (permalink / raw)
  To: Balbir Singh, David Hildenbrand, linux-mm
  Cc: Andrew Morton, Hugh Dickins, Kirill Shutemov, Lorenzo Stoakes,
	Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts,
	Dev Jain, Barry Song, linux-kernel

remap(), folio_ref_unfreeze(), lru_add_split_folio() are not relevant to
splitting unmapped folio operations. Move them out to the caller so that
__split_unmapped_folio() only handles unmapped folio splits. This makes
__split_unmapped_folio() reusable.

Convert VM_BUG_ON(mapping) to use VM_WARN_ON_ONCE_FOLIO().
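
For orientation, the call flow after this change looks roughly like
this (a simplified summary of the diff below, not an exact excerpt):

	/*
	 * __folio_split()
	 *   folio_ref_freeze()
	 *   xa_lock(&swap_cache->i_pages), folio_lruvec_lock()
	 *   __split_unmapped_folio()        <- now does the split only
	 *   folio_ref_unfreeze() + lru_add_split_folio() on after-split
	 *     folios, page cache / swap cache updates
	 *   folio_ref_unfreeze() on @folio, unlock_page_lruvec()
	 *   remap_page(), then folio_unlock()/free_folio_and_swap_cache()
	 *     on the after-split folios
	 */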

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 mm/huge_memory.c | 273 ++++++++++++++++++++++++-----------------------
 1 file changed, 137 insertions(+), 136 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 3eb1c34be601..818a6bd9f0d4 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3396,10 +3396,6 @@ static void __split_folio_to_order(struct folio *folio, int old_order,
  *             order - 1 to new_order).
  * @split_at: in buddy allocator like split, the folio containing @split_at
  *            will be split until its order becomes @new_order.
- * @lock_at: the folio containing @lock_at is left locked for caller.
- * @list: the after split folios will be added to @list if it is not NULL,
- *        otherwise to LRU lists.
- * @end: the end of the file @folio maps to. -1 if @folio is anonymous memory.
  * @xas: xa_state pointing to folio->mapping->i_pages and locked by caller
  * @mapping: @folio->mapping
  * @uniform_split: if the split is uniform or not (buddy allocator like split)
@@ -3425,51 +3421,26 @@ static void __split_folio_to_order(struct folio *folio, int old_order,
  *    @page, which is split in next for loop.
  *
  * After splitting, the caller's folio reference will be transferred to the
- * folio containing @page. The other folios may be freed if they are not mapped.
- *
- * In terms of locking, after splitting,
- * 1. uniform split leaves @page (or the folio contains it) locked;
- * 2. buddy allocator like (non-uniform) split leaves @folio locked.
- *
+ * folio containing @page. The caller needs to unlock and/or free after-split
+ * folios if necessary.
  *
  * For !uniform_split, when -ENOMEM is returned, the original folio might be
  * split. The caller needs to check the input folio.
  */
 static int __split_unmapped_folio(struct folio *folio, int new_order,
-		struct page *split_at, struct page *lock_at,
-		struct list_head *list, pgoff_t end,
-		struct xa_state *xas, struct address_space *mapping,
-		bool uniform_split)
+		struct page *split_at, struct xa_state *xas,
+		struct address_space *mapping, bool uniform_split)
 {
-	struct lruvec *lruvec;
-	struct address_space *swap_cache = NULL;
-	struct folio *origin_folio = folio;
-	struct folio *next_folio = folio_next(folio);
-	struct folio *new_folio;
 	struct folio *next;
 	int order = folio_order(folio);
 	int split_order;
 	int start_order = uniform_split ? new_order : order - 1;
-	int nr_dropped = 0;
 	int ret = 0;
 	bool stop_split = false;
 
-	if (folio_test_swapcache(folio)) {
-		VM_BUG_ON(mapping);
-
-		/* a swapcache folio can only be uniformly split to order-0 */
-		if (!uniform_split || new_order != 0)
-			return -EINVAL;
-
-		swap_cache = swap_address_space(folio->swap);
-		xa_lock(&swap_cache->i_pages);
-	}
-
 	if (folio_test_anon(folio))
 		mod_mthp_stat(order, MTHP_STAT_NR_ANON, -1);
 
-	/* lock lru list/PageCompound, ref frozen by page_ref_freeze */
-	lruvec = folio_lruvec_lock(folio);
 
 	folio_clear_has_hwpoisoned(folio);
 
@@ -3480,9 +3451,9 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
 	for (split_order = start_order;
 	     split_order >= new_order && !stop_split;
 	     split_order--) {
-		int old_order = folio_order(folio);
-		struct folio *release;
 		struct folio *end_folio = folio_next(folio);
+		int old_order = folio_order(folio);
+		struct folio *new_folio;
 
 		/* order-1 anonymous folio is not supported */
 		if (folio_test_anon(folio) && split_order == 1)
@@ -3517,113 +3488,34 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
 
 after_split:
 		/*
-		 * Iterate through after-split folios and perform related
-		 * operations. But in buddy allocator like split, the folio
+		 * Iterate through after-split folios and update folio stats.
+		 * But in buddy allocator like split, the folio
 		 * containing the specified page is skipped until its order
 		 * is new_order, since the folio will be worked on in next
 		 * iteration.
 		 */
-		for (release = folio; release != end_folio; release = next) {
-			next = folio_next(release);
+		for (new_folio = folio; new_folio != end_folio; new_folio = next) {
+			next = folio_next(new_folio);
 			/*
-			 * for buddy allocator like split, the folio containing
-			 * page will be split next and should not be released,
-			 * until the folio's order is new_order or stop_split
-			 * is set to true by the above xas_split() failure.
+			 * for buddy allocator like split, new_folio containing
+			 * page could be split again, thus do not change stats
+			 * yet. Wait until new_folio's order is new_order or
+			 * stop_split is set to true by the above xas_split()
+			 * failure.
 			 */
-			if (release == page_folio(split_at)) {
-				folio = release;
+			if (new_folio == page_folio(split_at)) {
+				folio = new_folio;
 				if (split_order != new_order && !stop_split)
 					continue;
 			}
-			if (folio_test_anon(release)) {
-				mod_mthp_stat(folio_order(release),
+			if (folio_test_anon(new_folio)) {
+				mod_mthp_stat(folio_order(new_folio),
 						MTHP_STAT_NR_ANON, 1);
 			}
 
-			/*
-			 * origin_folio should be kept frozon until page cache
-			 * entries are updated with all the other after-split
-			 * folios to prevent others seeing stale page cache
-			 * entries.
-			 */
-			if (release == origin_folio)
-				continue;
-
-			folio_ref_unfreeze(release, 1 +
-					((mapping || swap_cache) ?
-						folio_nr_pages(release) : 0));
-
-			lru_add_split_folio(origin_folio, release, lruvec,
-					list);
-
-			/* Some pages can be beyond EOF: drop them from cache */
-			if (release->index >= end) {
-				if (shmem_mapping(mapping))
-					nr_dropped += folio_nr_pages(release);
-				else if (folio_test_clear_dirty(release))
-					folio_account_cleaned(release,
-						inode_to_wb(mapping->host));
-				__filemap_remove_folio(release, NULL);
-				folio_put_refs(release, folio_nr_pages(release));
-			} else if (mapping) {
-				__xa_store(&mapping->i_pages,
-						release->index, release, 0);
-			} else if (swap_cache) {
-				__xa_store(&swap_cache->i_pages,
-						swap_cache_index(release->swap),
-						release, 0);
-			}
 		}
 	}
 
-	/*
-	 * Unfreeze origin_folio only after all page cache entries, which used
-	 * to point to it, have been updated with new folios. Otherwise,
-	 * a parallel folio_try_get() can grab origin_folio and its caller can
-	 * see stale page cache entries.
-	 */
-	folio_ref_unfreeze(origin_folio, 1 +
-		((mapping || swap_cache) ? folio_nr_pages(origin_folio) : 0));
-
-	unlock_page_lruvec(lruvec);
-
-	if (swap_cache)
-		xa_unlock(&swap_cache->i_pages);
-	if (mapping)
-		xa_unlock(&mapping->i_pages);
-
-	/* Caller disabled irqs, so they are still disabled here */
-	local_irq_enable();
-
-	if (nr_dropped)
-		shmem_uncharge(mapping->host, nr_dropped);
-
-	remap_page(origin_folio, 1 << order,
-			folio_test_anon(origin_folio) ?
-				RMP_USE_SHARED_ZEROPAGE : 0);
-
-	/*
-	 * At this point, folio should contain the specified page.
-	 * For uniform split, it is left for caller to unlock.
-	 * For buddy allocator like split, the first after-split folio is left
-	 * for caller to unlock.
-	 */
-	for (new_folio = origin_folio; new_folio != next_folio; new_folio = next) {
-		next = folio_next(new_folio);
-		if (new_folio == page_folio(lock_at))
-			continue;
-
-		folio_unlock(new_folio);
-		/*
-		 * Subpages may be freed if there wasn't any mapping
-		 * like if add_to_swap() is running on a lru page that
-		 * had its mapping zapped. And freeing these pages
-		 * requires taking the lru_lock so we do the put_page
-		 * of the tail pages after the split is complete.
-		 */
-		free_folio_and_swap_cache(new_folio);
-	}
 	return ret;
 }
 
@@ -3706,10 +3598,13 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 {
 	struct deferred_split *ds_queue = get_deferred_split_queue(folio);
 	XA_STATE(xas, &folio->mapping->i_pages, folio->index);
+	struct folio *next_folio = folio_next(folio);
 	bool is_anon = folio_test_anon(folio);
 	struct address_space *mapping = NULL;
 	struct anon_vma *anon_vma = NULL;
 	int order = folio_order(folio);
+	struct folio *new_folio, *next;
+	int nr_shmem_dropped = 0;
 	int extra_pins, ret;
 	pgoff_t end;
 	bool is_hzp;
@@ -3833,13 +3728,18 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 		 */
 		xas_lock(&xas);
 		xas_reset(&xas);
-		if (xas_load(&xas) != folio)
+		if (xas_load(&xas) != folio) {
+			ret = -EAGAIN;
 			goto fail;
+		}
 	}
 
 	/* Prevent deferred_split_scan() touching ->_refcount */
 	spin_lock(&ds_queue->split_queue_lock);
 	if (folio_ref_freeze(folio, 1 + extra_pins)) {
+		struct address_space *swap_cache = NULL;
+		struct lruvec *lruvec;
+
 		if (folio_order(folio) > 1 &&
 		    !list_empty(&folio->_deferred_list)) {
 			ds_queue->split_queue_len--;
@@ -3873,18 +3773,119 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 			}
 		}
 
-		ret = __split_unmapped_folio(folio, new_order,
-				split_at, lock_at, list, end, &xas, mapping,
-				uniform_split);
+		if (folio_test_swapcache(folio)) {
+			if (mapping) {
+				VM_WARN_ON_ONCE_FOLIO(mapping, folio);
+				ret = -EINVAL;
+				goto fail;
+			}
+
+			/*
+			 * a swapcache folio can only be uniformly split to
+			 * order-0
+			 */
+			if (!uniform_split || new_order != 0) {
+				ret = -EINVAL;
+				goto fail;
+			}
+
+			swap_cache = swap_address_space(folio->swap);
+			xa_lock(&swap_cache->i_pages);
+		}
+
+		/* lock lru list/PageCompound, ref frozen by page_ref_freeze */
+		lruvec = folio_lruvec_lock(folio);
+
+		ret = __split_unmapped_folio(folio, new_order, split_at, &xas,
+					     mapping, uniform_split);
+
+		/*
+		 * Unfreeze after-split folios and put them back to the right
+		 * list. @folio should be kept frozen until page cache entries
+		 * are updated with all the other after-split folios to prevent
+		 * others seeing stale page cache entries.
+		 */
+		for (new_folio = folio_next(folio); new_folio != next_folio;
+		     new_folio = next) {
+			next = folio_next(new_folio);
+
+			folio_ref_unfreeze(
+				new_folio,
+				1 + ((mapping || swap_cache) ?
+					     folio_nr_pages(new_folio) :
+					     0));
+
+			lru_add_split_folio(folio, new_folio, lruvec, list);
+
+			/* Some pages can be beyond EOF: drop them from cache */
+			if (new_folio->index >= end) {
+				if (shmem_mapping(mapping))
+					nr_shmem_dropped += folio_nr_pages(new_folio);
+				else if (folio_test_clear_dirty(new_folio))
+					folio_account_cleaned(
+						new_folio,
+						inode_to_wb(mapping->host));
+				__filemap_remove_folio(new_folio, NULL);
+				folio_put_refs(new_folio,
+					       folio_nr_pages(new_folio));
+			} else if (mapping) {
+				__xa_store(&mapping->i_pages, new_folio->index,
+					   new_folio, 0);
+			} else if (swap_cache) {
+				__xa_store(&swap_cache->i_pages,
+					   swap_cache_index(new_folio->swap),
+					   new_folio, 0);
+			}
+		}
+		/*
+		 * Unfreeze @folio only after all page cache entries, which
+		 * used to point to it, have been updated with new folios.
+		 * Otherwise, a parallel folio_try_get() can grab origin_folio
+		 * and its caller can see stale page cache entries.
+		 */
+		folio_ref_unfreeze(folio, 1 +
+			((mapping || swap_cache) ? folio_nr_pages(folio) : 0));
+
+		unlock_page_lruvec(lruvec);
+
+		if (swap_cache)
+			xa_unlock(&swap_cache->i_pages);
 	} else {
 		spin_unlock(&ds_queue->split_queue_lock);
-fail:
-		if (mapping)
-			xas_unlock(&xas);
-		local_irq_enable();
-		remap_page(folio, folio_nr_pages(folio), 0);
 		ret = -EAGAIN;
 	}
+fail:
+	if (mapping)
+		xas_unlock(&xas);
+
+	local_irq_enable();
+
+	if (nr_shmem_dropped)
+		shmem_uncharge(mapping->host, nr_shmem_dropped);
+
+	remap_page(folio, 1 << order,
+		   !ret && folio_test_anon(folio) ? RMP_USE_SHARED_ZEROPAGE :
+						    0);
+
+	/*
+	 * Unlock all after-split folios except the one containing @lock_at
+	 * page. If @folio is not split, it will be kept locked.
+	 */
+	for (new_folio = folio; new_folio != next_folio; new_folio = next) {
+		next = folio_next(new_folio);
+		if (new_folio == page_folio(lock_at))
+			continue;
+
+		folio_unlock(new_folio);
+		/*
+		 * Subpages may be freed if there wasn't any mapping
+		 * like if add_to_swap() is running on a lru page that
+		 * had its mapping zapped. And freeing these pages
+		 * requires taking the lru_lock so we do the put_page
+		 * of the tail pages after the split is complete.
+		 */
+		free_folio_and_swap_cache(new_folio);
+	}
 
 out_unlock:
 	if (anon_vma) {
-- 
2.47.2




* [PATCH v2 2/2] mm/huge_memory: use folio_expected_ref_count() to calculate ref_count.
  2025-07-11 18:23 [PATCH v2 0/2] __folio_split() clean up Zi Yan
  2025-07-11 18:23 ` [PATCH v2 1/2] mm/huge_memory: move unrelated code out of __split_unmapped_folio() Zi Yan
@ 2025-07-11 18:23 ` Zi Yan
  2025-07-14  0:43   ` Balbir Singh
  2025-07-14 15:11   ` David Hildenbrand
  1 sibling, 2 replies; 8+ messages in thread
From: Zi Yan @ 2025-07-11 18:23 UTC (permalink / raw)
  To: Balbir Singh, David Hildenbrand, linux-mm
  Cc: Andrew Morton, Hugh Dickins, Kirill Shutemov, Lorenzo Stoakes,
	Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts,
	Dev Jain, Barry Song, linux-kernel

Instead of open coding the ref_count calculation, use
folio_expected_ref_count().
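
Concretely, this replaces the open-coded value with the helper plus the
caller's own reference (pattern taken from the diff below):

	/* Before: 1 (caller) + nr_pages if in page cache or swap cache. */
	folio_ref_unfreeze(folio, 1 +
		((mapping || swap_cache) ? folio_nr_pages(folio) : 0));

	/* After: folio_expected_ref_count() accounts for those references. */
	expected_refs = folio_expected_ref_count(folio) + 1;
	folio_ref_unfreeze(folio, expected_refs);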

Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 mm/huge_memory.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 818a6bd9f0d4..57e5699cf638 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3739,6 +3739,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 	if (folio_ref_freeze(folio, 1 + extra_pins)) {
 		struct address_space *swap_cache = NULL;
 		struct lruvec *lruvec;
+		int expected_refs;
 
 		if (folio_order(folio) > 1 &&
 		    !list_empty(&folio->_deferred_list)) {
@@ -3809,11 +3810,8 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 		     new_folio = next) {
 			next = folio_next(new_folio);
 
-			folio_ref_unfreeze(
-				new_folio,
-				1 + ((mapping || swap_cache) ?
-					     folio_nr_pages(new_folio) :
-					     0));
+			expected_refs = folio_expected_ref_count(new_folio) + 1;
+			folio_ref_unfreeze(new_folio, expected_refs);
 
 			lru_add_split_folio(folio, new_folio, lruvec, list);
 
@@ -3843,8 +3841,8 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 		 * Otherwise, a parallel folio_try_get() can grab origin_folio
 		 * and its caller can see stale page cache entries.
 		 */
-		folio_ref_unfreeze(folio, 1 +
-			((mapping || swap_cache) ? folio_nr_pages(folio) : 0));
+		expected_refs = folio_expected_ref_count(folio) + 1;
+		folio_ref_unfreeze(folio, expected_refs);
 
 		unlock_page_lruvec(lruvec);
 
-- 
2.47.2




* Re: [PATCH v2 2/2] mm/huge_memory: use folio_expected_ref_count() to calculate ref_count.
  2025-07-11 18:23 ` [PATCH v2 2/2] mm/huge_memory: use folio_expected_ref_count() to calculate ref_count Zi Yan
@ 2025-07-14  0:43   ` Balbir Singh
  2025-07-14 15:11   ` David Hildenbrand
  1 sibling, 0 replies; 8+ messages in thread
From: Balbir Singh @ 2025-07-14  0:43 UTC (permalink / raw)
  To: Zi Yan, David Hildenbrand, linux-mm
  Cc: Andrew Morton, Hugh Dickins, Kirill Shutemov, Lorenzo Stoakes,
	Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
	Barry Song, linux-kernel

On 7/12/25 04:23, Zi Yan wrote:
> Instead of open coding the ref_count calculation, use
> folio_expected_ref_count().
> 
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---

Acked-by: Balbir Singh <balbirs@nvidia.com>



* Re: [PATCH v2 1/2] mm/huge_memory: move unrelated code out of __split_unmapped_folio()
  2025-07-11 18:23 ` [PATCH v2 1/2] mm/huge_memory: move unrelated code out of __split_unmapped_folio() Zi Yan
@ 2025-07-14  1:15   ` Balbir Singh
  2025-07-14 15:30   ` David Hildenbrand
  1 sibling, 0 replies; 8+ messages in thread
From: Balbir Singh @ 2025-07-14  1:15 UTC (permalink / raw)
  To: Zi Yan, David Hildenbrand, linux-mm
  Cc: Andrew Morton, Hugh Dickins, Kirill Shutemov, Lorenzo Stoakes,
	Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
	Barry Song, linux-kernel

On 7/12/25 04:23, Zi Yan wrote:
> remap(), folio_ref_unfreeze(), lru_add_split_folio() are not relevant to
> splitting unmapped folio operations. Move them out to the caller so that
> __split_unmapped_folio() only handles unmapped folio splits. This makes
> __split_unmapped_folio() reusable.
> 
> Convert VM_BUG_ON(mapping) to use VM_WARN_ON_ONCE_FOLIO().
> 
> Signed-off-by: Zi Yan <ziy@nvidia.com>


Acked-by: Balbir Singh <balbirs@nvidia.com>



* Re: [PATCH v2 2/2] mm/huge_memory: use folio_expected_ref_count() to calculate ref_count.
  2025-07-11 18:23 ` [PATCH v2 2/2] mm/huge_memory: use folio_expected_ref_count() to calculate ref_count Zi Yan
  2025-07-14  0:43   ` Balbir Singh
@ 2025-07-14 15:11   ` David Hildenbrand
  1 sibling, 0 replies; 8+ messages in thread
From: David Hildenbrand @ 2025-07-14 15:11 UTC (permalink / raw)
  To: Zi Yan, Balbir Singh, linux-mm
  Cc: Andrew Morton, Hugh Dickins, Kirill Shutemov, Lorenzo Stoakes,
	Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
	Barry Song, linux-kernel

On 11.07.25 20:23, Zi Yan wrote:
> Instead of open coding the ref_count calculation, use
> folio_expected_ref_count().
> 
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb




* Re: [PATCH v2 1/2] mm/huge_memory: move unrelated code out of __split_unmapped_folio()
  2025-07-11 18:23 ` [PATCH v2 1/2] mm/huge_memory: move unrelated code out of __split_unmapped_folio() Zi Yan
  2025-07-14  1:15   ` Balbir Singh
@ 2025-07-14 15:30   ` David Hildenbrand
  2025-07-14 15:33     ` Zi Yan
  1 sibling, 1 reply; 8+ messages in thread
From: David Hildenbrand @ 2025-07-14 15:30 UTC (permalink / raw)
  To: Zi Yan, Balbir Singh, linux-mm
  Cc: Andrew Morton, Hugh Dickins, Kirill Shutemov, Lorenzo Stoakes,
	Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
	Barry Song, linux-kernel

On 11.07.25 20:23, Zi Yan wrote:
> remap(), folio_ref_unfreeze(), lru_add_split_folio() are not relevant to
> splitting unmapped folio operations. Move them out to the caller so that
> __split_unmapped_folio() only handles unmapped folio splits. This makes
> __split_unmapped_folio() reusable.
> 
> Convert VM_BUG_ON(mapping) to use VM_WARN_ON_ONCE_FOLIO().
> 
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---

[...]

> -	if (folio_test_swapcache(folio)) {
> -		VM_BUG_ON(mapping);
> -
> -		/* a swapcache folio can only be uniformly split to order-0 */
> -		if (!uniform_split || new_order != 0)
> -			return -EINVAL;
> -
> -		swap_cache = swap_address_space(folio->swap);
> -		xa_lock(&swap_cache->i_pages);
> -	}
> -
>   	if (folio_test_anon(folio))
>   		mod_mthp_stat(order, MTHP_STAT_NR_ANON, -1);
>   
> -	/* lock lru list/PageCompound, ref frozen by page_ref_freeze */
> -	lruvec = folio_lruvec_lock(folio);
>   

Nit: now double empty line.

>   	folio_clear_has_hwpoisoned(folio);
>   
> @@ -3480,9 +3451,9 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
>   	for (split_order = start_order;
>   	     split_order >= new_order && !stop_split;
>   	     split_order--) {
> -		int old_order = folio_order(folio);
> -		struct folio *release;
>   		struct folio *end_folio = folio_next(folio);
> +		int old_order = folio_order(folio);
> +		struct folio *new_folio;
>   
>   		/* order-1 anonymous folio is not supported */
>   		if (folio_test_anon(folio) && split_order == 1)
> @@ -3517,113 +3488,34 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
>   
>   after_split:
>   		/*
> -		 * Iterate through after-split folios and perform related
> -		 * operations. But in buddy allocator like split, the folio
> +		 * Iterate through after-split folios and update folio stats.
> +		 * But in buddy allocator like split, the folio
>   		 * containing the specified page is skipped until its order
>   		 * is new_order, since the folio will be worked on in next
>   		 * iteration.
>   		 */
> -		for (release = folio; release != end_folio; release = next) {
> -			next = folio_next(release);
> +		for (new_folio = folio; new_folio != end_folio; new_folio = next) {
> +			next = folio_next(new_folio);
>   			/*
> -			 * for buddy allocator like split, the folio containing
> -			 * page will be split next and should not be released,
> -			 * until the folio's order is new_order or stop_split
> -			 * is set to true by the above xas_split() failure.
> +			 * for buddy allocator like split, new_folio containing
> +			 * page could be split again, thus do not change stats
> +			 * yet. Wait until new_folio's order is new_order or
> +			 * stop_split is set to true by the above xas_split()
> +			 * failure.
>   			 */
> -			if (release == page_folio(split_at)) {
> -				folio = release;
> +			if (new_folio == page_folio(split_at)) {
> +				folio = new_folio;
>   				if (split_order != new_order && !stop_split)
>   					continue;
>   			}
> -			if (folio_test_anon(release)) {
> -				mod_mthp_stat(folio_order(release),
> +			if (folio_test_anon(new_folio)) {
> +				mod_mthp_stat(folio_order(new_folio),
>   						MTHP_STAT_NR_ANON, 1);
>   			}

Nit: {} can be dropped

Code is still confusing, so could be that I miss something, but in general
looks like an improvement to me.

I think we can easily get rid of the goto label in __split_unmapped_folio() doing something like

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 14bc0b54cf9f0..db0ae957a0ba8 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3435,18 +3435,18 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
                                 if (xas_error(xas)) {
                                         ret = xas_error(xas);
                                         stop_split = true;
-                                       goto after_split;
                                 }
                         }
                 }
  
-               folio_split_memcg_refs(folio, old_order, split_order);
-               split_page_owner(&folio->page, old_order, split_order);
-               pgalloc_tag_split(folio, old_order, split_order);
+               if (!stop_split) {
+                       folio_split_memcg_refs(folio, old_order, split_order);
+                       split_page_owner(&folio->page, old_order, split_order);
+                       pgalloc_tag_split(folio, old_order, split_order);
  
-               __split_folio_to_order(folio, old_order, split_order);
+                       __split_folio_to_order(folio, old_order, split_order);
+               }
  
-after_split:
                 /*
                  * Iterate through after-split folios and update folio stats.
                  * But in buddy allocator like split, the folio



-- 
Cheers,

David / dhildenb




* Re: [PATCH v2 1/2] mm/huge_memory: move unrelated code out of __split_unmapped_folio()
  2025-07-14 15:30   ` David Hildenbrand
@ 2025-07-14 15:33     ` Zi Yan
  0 siblings, 0 replies; 8+ messages in thread
From: Zi Yan @ 2025-07-14 15:33 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Balbir Singh, linux-mm, Andrew Morton, Hugh Dickins,
	Kirill Shutemov, Lorenzo Stoakes, Baolin Wang, Liam R. Howlett,
	Nico Pache, Ryan Roberts, Dev Jain, Barry Song, linux-kernel

On 14 Jul 2025, at 11:30, David Hildenbrand wrote:

> On 11.07.25 20:23, Zi Yan wrote:
>> remap(), folio_ref_unfreeze(), lru_add_split_folio() are not relevant to
>> splitting unmapped folio operations. Move them out to the caller so that
>> __split_unmapped_folio() only handles unmapped folio splits. This makes
>> __split_unmapped_folio() reusable.
>>
>> Convert VM_BUG_ON(mapping) to use VM_WARN_ON_ONCE_FOLIO().
>>
>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>> ---
>
> [...]
>
>> -	if (folio_test_swapcache(folio)) {
>> -		VM_BUG_ON(mapping);
>> -
>> -		/* a swapcache folio can only be uniformly split to order-0 */
>> -		if (!uniform_split || new_order != 0)
>> -			return -EINVAL;
>> -
>> -		swap_cache = swap_address_space(folio->swap);
>> -		xa_lock(&swap_cache->i_pages);
>> -	}
>> -
>>   	if (folio_test_anon(folio))
>>   		mod_mthp_stat(order, MTHP_STAT_NR_ANON, -1);
>>  -	/* lock lru list/PageCompound, ref frozen by page_ref_freeze */
>> -	lruvec = folio_lruvec_lock(folio);
>>
>
> Nit: now double empty line.
Will fix it.

>
>>   	folio_clear_has_hwpoisoned(folio);
>>  @@ -3480,9 +3451,9 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
>>   	for (split_order = start_order;
>>   	     split_order >= new_order && !stop_split;
>>   	     split_order--) {
>> -		int old_order = folio_order(folio);
>> -		struct folio *release;
>>   		struct folio *end_folio = folio_next(folio);
>> +		int old_order = folio_order(folio);
>> +		struct folio *new_folio;
>>    		/* order-1 anonymous folio is not supported */
>>   		if (folio_test_anon(folio) && split_order == 1)
>> @@ -3517,113 +3488,34 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
>>    after_split:
>>   		/*
>> -		 * Iterate through after-split folios and perform related
>> -		 * operations. But in buddy allocator like split, the folio
>> +		 * Iterate through after-split folios and update folio stats.
>> +		 * But in buddy allocator like split, the folio
>>   		 * containing the specified page is skipped until its order
>>   		 * is new_order, since the folio will be worked on in next
>>   		 * iteration.
>>   		 */
>> -		for (release = folio; release != end_folio; release = next) {
>> -			next = folio_next(release);
>> +		for (new_folio = folio; new_folio != end_folio; new_folio = next) {
>> +			next = folio_next(new_folio);
>>   			/*
>> -			 * for buddy allocator like split, the folio containing
>> -			 * page will be split next and should not be released,
>> -			 * until the folio's order is new_order or stop_split
>> -			 * is set to true by the above xas_split() failure.
>> +			 * for buddy allocator like split, new_folio containing
>> +			 * page could be split again, thus do not change stats
>> +			 * yet. Wait until new_folio's order is new_order or
>> +			 * stop_split is set to true by the above xas_split()
>> +			 * failure.
>>   			 */
>> -			if (release == page_folio(split_at)) {
>> -				folio = release;
>> +			if (new_folio == page_folio(split_at)) {
>> +				folio = new_folio;
>>   				if (split_order != new_order && !stop_split)
>>   					continue;
>>   			}
>> -			if (folio_test_anon(release)) {
>> -				mod_mthp_stat(folio_order(release),
>> +			if (folio_test_anon(new_folio)) {
>> +				mod_mthp_stat(folio_order(new_folio),
>>   						MTHP_STAT_NR_ANON, 1);
>>   			}
>
> Nit: {} can be dropped

Sure.

>
> Code is still confusing, so could be that I miss something, but in general
> looks like an improvement to me.
>
> I think we can easily get rid of the goto label in __split_unmapped_folio() doing something like
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 14bc0b54cf9f0..db0ae957a0ba8 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3435,18 +3435,18 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
>                                 if (xas_error(xas)) {
>                                         ret = xas_error(xas);
>                                         stop_split = true;
> -                                       goto after_split;
>                                 }
>                         }
>                 }
>  -               folio_split_memcg_refs(folio, old_order, split_order);
> -               split_page_owner(&folio->page, old_order, split_order);
> -               pgalloc_tag_split(folio, old_order, split_order);
> +               if (!stop_split) {
> +                       folio_split_memcg_refs(folio, old_order, split_order);
> +                       split_page_owner(&folio->page, old_order, split_order);
> +                       pgalloc_tag_split(folio, old_order, split_order);
>  -               __split_folio_to_order(folio, old_order, split_order);
> +                       __split_folio_to_order(folio, old_order, split_order);
> +               }
>  -after_split:
>                 /*
>                  * Iterate through after-split folios and update folio stats.
>                  * But in buddy allocator like split, the folio
>

Yep, looks much better to me. Let me fix it in V3. Thank you for the review
and suggestions.

Best Regards,
Yan, Zi


