* [PATCH v2 0/2] __folio_split() clean up.
From: Zi Yan @ 2025-07-11 18:23 UTC
To: Balbir Singh, David Hildenbrand, linux-mm
Cc: Andrew Morton, Hugh Dickins, Kirill Shutemov, Lorenzo Stoakes, Zi Yan,
    Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
    Barry Song, linux-kernel

Based on the prior discussion[1], this patchset makes __split_unmapped_folio()
reusable for splitting unmapped folios, which avoids the need for a new
boolean "unmapped" parameter to guard mapping-related code.

An additional benefit is that __split_unmapped_folio() could then be called on
after-split folios by __folio_split(), enabling new split methods. For
example, at deferred split time, unmapped subpages can scatter arbitrarily
within a large folio, so neither uniform nor non-uniform split can maximize
after-split folio orders for the mapped subpages. The hope is that by calling
__split_unmapped_folio() multiple times, a better split result can be
achieved.

The patchset passed mm selftests.

Changelog
===
From V1[2]:
1. Fixed indentation.
2. Used folio_expected_ref_count() to calculate ref_count instead of open
   coding it.

[1] https://lore.kernel.org/linux-mm/94D8C1A4-780C-4BEC-A336-7D3613B54845@nvidia.com/
[2] https://lore.kernel.org/linux-mm/20250711030259.3574392-1-ziy@nvidia.com/

Zi Yan (2):
  mm/huge_memory: move unrelated code out of __split_unmapped_folio()
  mm/huge_memory: use folio_expected_ref_count() to calculate ref_count.

 mm/huge_memory.c | 271 +++++++++++++++++++++++------------------------
 1 file changed, 135 insertions(+), 136 deletions(-)

-- 
2.47.2

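For orientation, a minimal sketch of the calling-convention change, using only
the before/after signatures of __split_unmapped_folio() and caller-side names
that appear in the diffs below; the fragment is illustrative and is not itself
part of the posted patches:

	/* Before the series: the split core also unfroze, remapped and
	 * unlocked the after-split folios itself. */
	ret = __split_unmapped_folio(folio, new_order, split_at, lock_at,
				     list, end, &xas, mapping, uniform_split);

	/* After the series: it only splits the frozen, unmapped folio ... */
	ret = __split_unmapped_folio(folio, new_order, split_at, &xas,
				     mapping, uniform_split);

	/* ... and __folio_split() itself unfreezes each after-split folio,
	 * puts it back on the LRU list or into the page cache, and unlocks
	 * it, which is what would allow __split_unmapped_folio() to be
	 * called again on an after-split folio before that point. */
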
* [PATCH v2 1/2] mm/huge_memory: move unrelated code out of __split_unmapped_folio()
From: Zi Yan @ 2025-07-11 18:23 UTC
To: Balbir Singh, David Hildenbrand, linux-mm
Cc: Andrew Morton, Hugh Dickins, Kirill Shutemov, Lorenzo Stoakes, Zi Yan,
    Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
    Barry Song, linux-kernel

remap(), folio_ref_unfreeze(), lru_add_split_folio() are not relevant to
splitting unmapped folio operations. Move them out to the caller so that
__split_unmapped_folio() only handles unmapped folio splits. This makes
__split_unmapped_folio() reusable.

Convert VM_BUG_ON(mapping) to use VM_WARN_ON_ONCE_FOLIO().

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 mm/huge_memory.c | 273 ++++++++++++++++++++++++-----------------------
 1 file changed, 137 insertions(+), 136 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 3eb1c34be601..818a6bd9f0d4 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3396,10 +3396,6 @@ static void __split_folio_to_order(struct folio *folio, int old_order,
  * order - 1 to new_order).
  * @split_at: in buddy allocator like split, the folio containing @split_at
  * will be split until its order becomes @new_order.
- * @lock_at: the folio containing @lock_at is left locked for caller.
- * @list: the after split folios will be added to @list if it is not NULL,
- * otherwise to LRU lists.
- * @end: the end of the file @folio maps to. -1 if @folio is anonymous memory.
  * @xas: xa_state pointing to folio->mapping->i_pages and locked by caller
  * @mapping: @folio->mapping
  * @uniform_split: if the split is uniform or not (buddy allocator like split)
@@ -3425,51 +3421,26 @@ static void __split_folio_to_order(struct folio *folio, int old_order,
  * @page, which is split in next for loop.
  *
  * After splitting, the caller's folio reference will be transferred to the
- * folio containing @page. The other folios may be freed if they are not mapped.
- *
- * In terms of locking, after splitting,
- * 1. uniform split leaves @page (or the folio contains it) locked;
- * 2. buddy allocator like (non-uniform) split leaves @folio locked.
- *
+ * folio containing @page. The caller needs to unlock and/or free after-split
+ * folios if necessary.
  *
  * For !uniform_split, when -ENOMEM is returned, the original folio might be
  * split. The caller needs to check the input folio.
  */
 static int __split_unmapped_folio(struct folio *folio, int new_order,
-		struct page *split_at, struct page *lock_at,
-		struct list_head *list, pgoff_t end,
-		struct xa_state *xas, struct address_space *mapping,
-		bool uniform_split)
+		struct page *split_at, struct xa_state *xas,
+		struct address_space *mapping, bool uniform_split)
 {
-	struct lruvec *lruvec;
-	struct address_space *swap_cache = NULL;
-	struct folio *origin_folio = folio;
-	struct folio *next_folio = folio_next(folio);
-	struct folio *new_folio;
 	struct folio *next;
 	int order = folio_order(folio);
 	int split_order;
 	int start_order = uniform_split ? new_order : order - 1;
-	int nr_dropped = 0;
 	int ret = 0;
 	bool stop_split = false;
 
-	if (folio_test_swapcache(folio)) {
-		VM_BUG_ON(mapping);
-
-		/* a swapcache folio can only be uniformly split to order-0 */
-		if (!uniform_split || new_order != 0)
-			return -EINVAL;
-
-		swap_cache = swap_address_space(folio->swap);
-		xa_lock(&swap_cache->i_pages);
-	}
-
 	if (folio_test_anon(folio))
 		mod_mthp_stat(order, MTHP_STAT_NR_ANON, -1);
 
-	/* lock lru list/PageCompound, ref frozen by page_ref_freeze */
-	lruvec = folio_lruvec_lock(folio);
 
 	folio_clear_has_hwpoisoned(folio);
 
@@ -3480,9 +3451,9 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
 	for (split_order = start_order;
 	     split_order >= new_order && !stop_split;
 	     split_order--) {
-		int old_order = folio_order(folio);
-		struct folio *release;
 		struct folio *end_folio = folio_next(folio);
+		int old_order = folio_order(folio);
+		struct folio *new_folio;
 
 		/* order-1 anonymous folio is not supported */
 		if (folio_test_anon(folio) && split_order == 1)
@@ -3517,113 +3488,34 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
 
 after_split:
 		/*
-		 * Iterate through after-split folios and perform related
-		 * operations. But in buddy allocator like split, the folio
+		 * Iterate through after-split folios and update folio stats.
+		 * But in buddy allocator like split, the folio
 		 * containing the specified page is skipped until its order
 		 * is new_order, since the folio will be worked on in next
 		 * iteration.
 		 */
-		for (release = folio; release != end_folio; release = next) {
-			next = folio_next(release);
+		for (new_folio = folio; new_folio != end_folio; new_folio = next) {
+			next = folio_next(new_folio);
 			/*
-			 * for buddy allocator like split, the folio containing
-			 * page will be split next and should not be released,
-			 * until the folio's order is new_order or stop_split
-			 * is set to true by the above xas_split() failure.
+			 * for buddy allocator like split, new_folio containing
+			 * page could be split again, thus do not change stats
+			 * yet. Wait until new_folio's order is new_order or
+			 * stop_split is set to true by the above xas_split()
+			 * failure.
 			 */
-			if (release == page_folio(split_at)) {
-				folio = release;
+			if (new_folio == page_folio(split_at)) {
+				folio = new_folio;
 				if (split_order != new_order && !stop_split)
 					continue;
 			}
-			if (folio_test_anon(release)) {
-				mod_mthp_stat(folio_order(release),
+			if (folio_test_anon(new_folio)) {
+				mod_mthp_stat(folio_order(new_folio),
 					      MTHP_STAT_NR_ANON, 1);
 			}
-			/*
-			 * origin_folio should be kept frozon until page cache
-			 * entries are updated with all the other after-split
-			 * folios to prevent others seeing stale page cache
-			 * entries.
-			 */
-			if (release == origin_folio)
-				continue;
-
-			folio_ref_unfreeze(release, 1 +
-				((mapping || swap_cache) ?
-					folio_nr_pages(release) : 0));
-
-			lru_add_split_folio(origin_folio, release, lruvec,
-					list);
-
-			/* Some pages can be beyond EOF: drop them from cache */
-			if (release->index >= end) {
-				if (shmem_mapping(mapping))
-					nr_dropped += folio_nr_pages(release);
-				else if (folio_test_clear_dirty(release))
-					folio_account_cleaned(release,
-						inode_to_wb(mapping->host));
-				__filemap_remove_folio(release, NULL);
-				folio_put_refs(release, folio_nr_pages(release));
-			} else if (mapping) {
-				__xa_store(&mapping->i_pages,
-						release->index, release, 0);
-			} else if (swap_cache) {
-				__xa_store(&swap_cache->i_pages,
-						swap_cache_index(release->swap),
-						release, 0);
-			}
 		}
 	}
 
-	/*
-	 * Unfreeze origin_folio only after all page cache entries, which used
-	 * to point to it, have been updated with new folios. Otherwise,
-	 * a parallel folio_try_get() can grab origin_folio and its caller can
-	 * see stale page cache entries.
-	 */
-	folio_ref_unfreeze(origin_folio, 1 +
-		((mapping || swap_cache) ? folio_nr_pages(origin_folio) : 0));
-
-	unlock_page_lruvec(lruvec);
-
-	if (swap_cache)
-		xa_unlock(&swap_cache->i_pages);
-	if (mapping)
-		xa_unlock(&mapping->i_pages);
-
-	/* Caller disabled irqs, so they are still disabled here */
-	local_irq_enable();
-
-	if (nr_dropped)
-		shmem_uncharge(mapping->host, nr_dropped);
-
-	remap_page(origin_folio, 1 << order,
-		   folio_test_anon(origin_folio) ?
-			   RMP_USE_SHARED_ZEROPAGE : 0);
-
-	/*
-	 * At this point, folio should contain the specified page.
-	 * For uniform split, it is left for caller to unlock.
-	 * For buddy allocator like split, the first after-split folio is left
-	 * for caller to unlock.
-	 */
-	for (new_folio = origin_folio; new_folio != next_folio; new_folio = next) {
-		next = folio_next(new_folio);
-		if (new_folio == page_folio(lock_at))
-			continue;
-
-		folio_unlock(new_folio);
-		/*
-		 * Subpages may be freed if there wasn't any mapping
-		 * like if add_to_swap() is running on a lru page that
-		 * had its mapping zapped. And freeing these pages
-		 * requires taking the lru_lock so we do the put_page
-		 * of the tail pages after the split is complete.
-		 */
-		free_folio_and_swap_cache(new_folio);
-	}
 	return ret;
 }
@@ -3706,10 +3598,13 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 {
 	struct deferred_split *ds_queue = get_deferred_split_queue(folio);
 	XA_STATE(xas, &folio->mapping->i_pages, folio->index);
+	struct folio *next_folio = folio_next(folio);
 	bool is_anon = folio_test_anon(folio);
 	struct address_space *mapping = NULL;
 	struct anon_vma *anon_vma = NULL;
 	int order = folio_order(folio);
+	struct folio *new_folio, *next;
+	int nr_shmem_dropped = 0;
 	int extra_pins, ret;
 	pgoff_t end;
 	bool is_hzp;
@@ -3833,13 +3728,18 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 		 */
 		xas_lock(&xas);
 		xas_reset(&xas);
-		if (xas_load(&xas) != folio)
+		if (xas_load(&xas) != folio) {
+			ret = -EAGAIN;
 			goto fail;
+		}
 	}
 
 	/* Prevent deferred_split_scan() touching ->_refcount */
 	spin_lock(&ds_queue->split_queue_lock);
 	if (folio_ref_freeze(folio, 1 + extra_pins)) {
+		struct address_space *swap_cache = NULL;
+		struct lruvec *lruvec;
+
 		if (folio_order(folio) > 1 &&
 		    !list_empty(&folio->_deferred_list)) {
 			ds_queue->split_queue_len--;
@@ -3873,18 +3773,119 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 			}
 		}
 
-		ret = __split_unmapped_folio(folio, new_order,
-				split_at, lock_at, list, end, &xas, mapping,
-				uniform_split);
+		if (folio_test_swapcache(folio)) {
+			if (mapping) {
+				VM_WARN_ON_ONCE_FOLIO(mapping, folio);
+				ret = -EINVAL;
+				goto fail;
+			}
+
+			/*
+			 * a swapcache folio can only be uniformly split to
+			 * order-0
+			 */
+			if (!uniform_split || new_order != 0) {
+				ret = -EINVAL;
+				goto fail;
+			}
+
+			swap_cache = swap_address_space(folio->swap);
+			xa_lock(&swap_cache->i_pages);
+		}
+
+		/* lock lru list/PageCompound, ref frozen by page_ref_freeze */
+		lruvec = folio_lruvec_lock(folio);
+
+		ret = __split_unmapped_folio(folio, new_order, split_at, &xas,
+					     mapping, uniform_split);
+
+		/*
+		 * Unfreeze after-split folios and put them back to the right
+		 * list. @folio should be kept frozon until page cache entries
+		 * are updated with all the other after-split folios to prevent
+		 * others seeing stale page cache entries.
+		 */
+		for (new_folio = folio_next(folio); new_folio != next_folio;
+		     new_folio = next) {
+			next = folio_next(new_folio);
+
+			folio_ref_unfreeze(
+				new_folio,
+				1 + ((mapping || swap_cache) ?
+					     folio_nr_pages(new_folio) :
+					     0));
+
+			lru_add_split_folio(folio, new_folio, lruvec, list);
+
+			/* Some pages can be beyond EOF: drop them from cache */
+			if (new_folio->index >= end) {
+				if (shmem_mapping(mapping))
+					nr_shmem_dropped += folio_nr_pages(new_folio);
+				else if (folio_test_clear_dirty(new_folio))
+					folio_account_cleaned(
+						new_folio,
+						inode_to_wb(mapping->host));
+				__filemap_remove_folio(new_folio, NULL);
+				folio_put_refs(new_folio,
+					       folio_nr_pages(new_folio));
+			} else if (mapping) {
+				__xa_store(&mapping->i_pages, new_folio->index,
+					   new_folio, 0);
+			} else if (swap_cache) {
+				__xa_store(&swap_cache->i_pages,
+					   swap_cache_index(new_folio->swap),
+					   new_folio, 0);
+			}
+		}
+		/*
+		 * Unfreeze @folio only after all page cache entries, which
+		 * used to point to it, have been updated with new folios.
+		 * Otherwise, a parallel folio_try_get() can grab origin_folio
+		 * and its caller can see stale page cache entries.
+		 */
+		folio_ref_unfreeze(folio, 1 +
+			((mapping || swap_cache) ? folio_nr_pages(folio) : 0));
+
+		unlock_page_lruvec(lruvec);
+
+		if (swap_cache)
+			xa_unlock(&swap_cache->i_pages);
 	} else {
 		spin_unlock(&ds_queue->split_queue_lock);
-fail:
-		if (mapping)
-			xas_unlock(&xas);
-		local_irq_enable();
-		remap_page(folio, folio_nr_pages(folio), 0);
 		ret = -EAGAIN;
 	}
+fail:
+	if (mapping)
+		xas_unlock(&xas);
+
+	local_irq_enable();
+
+	if (nr_shmem_dropped)
+		shmem_uncharge(mapping->host, nr_shmem_dropped);
+
+	remap_page(folio, 1 << order,
+		   !ret && folio_test_anon(folio) ? RMP_USE_SHARED_ZEROPAGE :
+						    0);
+
+	/*
+	 * Unlock all after-split folios except the one containing @lock_at
+	 * page. If @folio is not split, it will be kept locked.
+	 */
+	for (new_folio = folio; new_folio != next_folio; new_folio = next) {
+		next = folio_next(new_folio);
+		if (new_folio == page_folio(lock_at))
+			continue;
+
+		folio_unlock(new_folio);
+		/*
+		 * Subpages may be freed if there wasn't any mapping
+		 * like if add_to_swap() is running on a lru page that
+		 * had its mapping zapped. And freeing these pages
+		 * requires taking the lru_lock so we do the put_page
+		 * of the tail pages after the split is complete.
+		 */
+		free_folio_and_swap_cache(new_folio);
+	}
 
 out_unlock:
 	if (anon_vma) {
-- 
2.47.2

* Re: [PATCH v2 1/2] mm/huge_memory: move unrelated code out of __split_unmapped_folio()
From: Balbir Singh @ 2025-07-14  1:15 UTC
To: Zi Yan, David Hildenbrand, linux-mm
Cc: Andrew Morton, Hugh Dickins, Kirill Shutemov, Lorenzo Stoakes,
    Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
    Barry Song, linux-kernel

On 7/12/25 04:23, Zi Yan wrote:
> remap(), folio_ref_unfreeze(), lru_add_split_folio() are not relevant to
> splitting unmapped folio operations. Move them out to the caller so that
> __split_unmapped_folio() only handles unmapped folio splits. This makes
> __split_unmapped_folio() reusable.
>
> Convert VM_BUG_ON(mapping) to use VM_WARN_ON_ONCE_FOLIO().
>
> Signed-off-by: Zi Yan <ziy@nvidia.com>

Acked-by: Balbir Singh <balbirs@nvidia.com>

* Re: [PATCH v2 1/2] mm/huge_memory: move unrelated code out of __split_unmapped_folio()
From: David Hildenbrand @ 2025-07-14 15:30 UTC
To: Zi Yan, Balbir Singh, linux-mm
Cc: Andrew Morton, Hugh Dickins, Kirill Shutemov, Lorenzo Stoakes,
    Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
    Barry Song, linux-kernel

On 11.07.25 20:23, Zi Yan wrote:
> remap(), folio_ref_unfreeze(), lru_add_split_folio() are not relevant to
> splitting unmapped folio operations. Move them out to the caller so that
> __split_unmapped_folio() only handles unmapped folio splits. This makes
> __split_unmapped_folio() reusable.
>
> Convert VM_BUG_ON(mapping) to use VM_WARN_ON_ONCE_FOLIO().
>
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---

[...]

> -	if (folio_test_swapcache(folio)) {
> -		VM_BUG_ON(mapping);
> -
> -		/* a swapcache folio can only be uniformly split to order-0 */
> -		if (!uniform_split || new_order != 0)
> -			return -EINVAL;
> -
> -		swap_cache = swap_address_space(folio->swap);
> -		xa_lock(&swap_cache->i_pages);
> -	}
> -
>  	if (folio_test_anon(folio))
>  		mod_mthp_stat(order, MTHP_STAT_NR_ANON, -1);
>
> -	/* lock lru list/PageCompound, ref frozen by page_ref_freeze */
> -	lruvec = folio_lruvec_lock(folio);
>

Nit: now double empty line.

>  	folio_clear_has_hwpoisoned(folio);
>
> @@ -3480,9 +3451,9 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
>  	for (split_order = start_order;
>  	     split_order >= new_order && !stop_split;
>  	     split_order--) {
> -		int old_order = folio_order(folio);
> -		struct folio *release;
>  		struct folio *end_folio = folio_next(folio);
> +		int old_order = folio_order(folio);
> +		struct folio *new_folio;
>
>  		/* order-1 anonymous folio is not supported */
>  		if (folio_test_anon(folio) && split_order == 1)
> @@ -3517,113 +3488,34 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
>
>  after_split:
>  		/*
> -		 * Iterate through after-split folios and perform related
> -		 * operations. But in buddy allocator like split, the folio
> +		 * Iterate through after-split folios and update folio stats.
> +		 * But in buddy allocator like split, the folio
>  		 * containing the specified page is skipped until its order
>  		 * is new_order, since the folio will be worked on in next
>  		 * iteration.
>  		 */
> -		for (release = folio; release != end_folio; release = next) {
> -			next = folio_next(release);
> +		for (new_folio = folio; new_folio != end_folio; new_folio = next) {
> +			next = folio_next(new_folio);
>  			/*
> -			 * for buddy allocator like split, the folio containing
> -			 * page will be split next and should not be released,
> -			 * until the folio's order is new_order or stop_split
> -			 * is set to true by the above xas_split() failure.
> +			 * for buddy allocator like split, new_folio containing
> +			 * page could be split again, thus do not change stats
> +			 * yet. Wait until new_folio's order is new_order or
> +			 * stop_split is set to true by the above xas_split()
> +			 * failure.
>  			 */
> -			if (release == page_folio(split_at)) {
> -				folio = release;
> +			if (new_folio == page_folio(split_at)) {
> +				folio = new_folio;
>  				if (split_order != new_order && !stop_split)
>  					continue;
>  			}
> -			if (folio_test_anon(release)) {
> -				mod_mthp_stat(folio_order(release),
> +			if (folio_test_anon(new_folio)) {
> +				mod_mthp_stat(folio_order(new_folio),
>  					      MTHP_STAT_NR_ANON, 1);
>  			}

Nit: {} can be dropped

Code is still confusing, so could be that I miss something, but in general looks like an improvement to me.

I think we can easily get rid of the goto label in __split_unmapped_folio() doing something like

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 14bc0b54cf9f0..db0ae957a0ba8 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3435,18 +3435,18 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
 				if (xas_error(xas)) {
 					ret = xas_error(xas);
 					stop_split = true;
-					goto after_split;
 				}
 			}
 		}
 
-		folio_split_memcg_refs(folio, old_order, split_order);
-		split_page_owner(&folio->page, old_order, split_order);
-		pgalloc_tag_split(folio, old_order, split_order);
+		if (!stop_split) {
+			folio_split_memcg_refs(folio, old_order, split_order);
+			split_page_owner(&folio->page, old_order, split_order);
+			pgalloc_tag_split(folio, old_order, split_order);
 
-		__split_folio_to_order(folio, old_order, split_order);
+			__split_folio_to_order(folio, old_order, split_order);
+		}
 
-after_split:
 		/*
 		 * Iterate through after-split folios and update folio stats.
 		 * But in buddy allocator like split, the folio

-- 
Cheers,

David / dhildenb

* Re: [PATCH v2 1/2] mm/huge_memory: move unrelated code out of __split_unmapped_folio()
From: Zi Yan @ 2025-07-14 15:33 UTC
To: David Hildenbrand
Cc: Balbir Singh, linux-mm, Andrew Morton, Hugh Dickins, Kirill Shutemov,
    Lorenzo Stoakes, Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts,
    Dev Jain, Barry Song, linux-kernel

On 14 Jul 2025, at 11:30, David Hildenbrand wrote:

> On 11.07.25 20:23, Zi Yan wrote:
>> remap(), folio_ref_unfreeze(), lru_add_split_folio() are not relevant to
>> splitting unmapped folio operations. Move them out to the caller so that
>> __split_unmapped_folio() only handles unmapped folio splits. This makes
>> __split_unmapped_folio() reusable.
>>
>> Convert VM_BUG_ON(mapping) to use VM_WARN_ON_ONCE_FOLIO().
>>
>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>> ---
>
> [...]
>
>> -	if (folio_test_swapcache(folio)) {
>> -		VM_BUG_ON(mapping);
>> -
>> -		/* a swapcache folio can only be uniformly split to order-0 */
>> -		if (!uniform_split || new_order != 0)
>> -			return -EINVAL;
>> -
>> -		swap_cache = swap_address_space(folio->swap);
>> -		xa_lock(&swap_cache->i_pages);
>> -	}
>> -
>>  	if (folio_test_anon(folio))
>>  		mod_mthp_stat(order, MTHP_STAT_NR_ANON, -1);
>>
>> -	/* lock lru list/PageCompound, ref frozen by page_ref_freeze */
>> -	lruvec = folio_lruvec_lock(folio);
>>
>
> Nit: now double empty line.

Will fix it.

>
>>  	folio_clear_has_hwpoisoned(folio);
>>
>> @@ -3480,9 +3451,9 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
>>  	for (split_order = start_order;
>>  	     split_order >= new_order && !stop_split;
>>  	     split_order--) {
>> -		int old_order = folio_order(folio);
>> -		struct folio *release;
>>  		struct folio *end_folio = folio_next(folio);
>> +		int old_order = folio_order(folio);
>> +		struct folio *new_folio;
>>
>>  		/* order-1 anonymous folio is not supported */
>>  		if (folio_test_anon(folio) && split_order == 1)
>> @@ -3517,113 +3488,34 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
>>
>>  after_split:
>>  		/*
>> -		 * Iterate through after-split folios and perform related
>> -		 * operations. But in buddy allocator like split, the folio
>> +		 * Iterate through after-split folios and update folio stats.
>> +		 * But in buddy allocator like split, the folio
>>  		 * containing the specified page is skipped until its order
>>  		 * is new_order, since the folio will be worked on in next
>>  		 * iteration.
>>  		 */
>> -		for (release = folio; release != end_folio; release = next) {
>> -			next = folio_next(release);
>> +		for (new_folio = folio; new_folio != end_folio; new_folio = next) {
>> +			next = folio_next(new_folio);
>>  			/*
>> -			 * for buddy allocator like split, the folio containing
>> -			 * page will be split next and should not be released,
>> -			 * until the folio's order is new_order or stop_split
>> -			 * is set to true by the above xas_split() failure.
>> +			 * for buddy allocator like split, new_folio containing
>> +			 * page could be split again, thus do not change stats
>> +			 * yet. Wait until new_folio's order is new_order or
>> +			 * stop_split is set to true by the above xas_split()
>> +			 * failure.
>>  			 */
>> -			if (release == page_folio(split_at)) {
>> -				folio = release;
>> +			if (new_folio == page_folio(split_at)) {
>> +				folio = new_folio;
>>  				if (split_order != new_order && !stop_split)
>>  					continue;
>>  			}
>> -			if (folio_test_anon(release)) {
>> -				mod_mthp_stat(folio_order(release),
>> +			if (folio_test_anon(new_folio)) {
>> +				mod_mthp_stat(folio_order(new_folio),
>>  					      MTHP_STAT_NR_ANON, 1);
>>  			}
>
> Nit: {} can be dropped

Sure.

>
> Code is still confusing, so could be that I miss something, but in general looks like an improvement to me.
>
> I think we can easily get rid of the goto label in __split_unmapped_folio() doing something like
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 14bc0b54cf9f0..db0ae957a0ba8 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3435,18 +3435,18 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
>  				if (xas_error(xas)) {
>  					ret = xas_error(xas);
>  					stop_split = true;
> -					goto after_split;
>  				}
>  			}
>  		}
>
> -		folio_split_memcg_refs(folio, old_order, split_order);
> -		split_page_owner(&folio->page, old_order, split_order);
> -		pgalloc_tag_split(folio, old_order, split_order);
> +		if (!stop_split) {
> +			folio_split_memcg_refs(folio, old_order, split_order);
> +			split_page_owner(&folio->page, old_order, split_order);
> +			pgalloc_tag_split(folio, old_order, split_order);
>
> -		__split_folio_to_order(folio, old_order, split_order);
> +			__split_folio_to_order(folio, old_order, split_order);
> +		}
>
> -after_split:
>  		/*
>  		 * Iterate through after-split folios and update folio stats.
>  		 * But in buddy allocator like split, the folio
>

Yep, looks much better to me. Let me fix it in V3.

Thank you for the review and suggestions.

Best Regards,
Yan, Zi

* [PATCH v2 2/2] mm/huge_memory: use folio_expected_ref_count() to calculate ref_count.
From: Zi Yan @ 2025-07-11 18:23 UTC
To: Balbir Singh, David Hildenbrand, linux-mm
Cc: Andrew Morton, Hugh Dickins, Kirill Shutemov, Lorenzo Stoakes, Zi Yan,
    Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
    Barry Song, linux-kernel

Instead of open coding the ref_count calculation, use
folio_expected_ref_count().

Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 mm/huge_memory.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 818a6bd9f0d4..57e5699cf638 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3739,6 +3739,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 	if (folio_ref_freeze(folio, 1 + extra_pins)) {
 		struct address_space *swap_cache = NULL;
 		struct lruvec *lruvec;
+		int expected_refs;
 
 		if (folio_order(folio) > 1 &&
 		    !list_empty(&folio->_deferred_list)) {
@@ -3809,11 +3810,8 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 		     new_folio = next) {
 			next = folio_next(new_folio);
 
-			folio_ref_unfreeze(
-				new_folio,
-				1 + ((mapping || swap_cache) ?
-					     folio_nr_pages(new_folio) :
-					     0));
+			expected_refs = folio_expected_ref_count(new_folio) + 1;
+			folio_ref_unfreeze(new_folio, expected_refs);
 
 			lru_add_split_folio(folio, new_folio, lruvec, list);
 
@@ -3843,8 +3841,8 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 		 * Otherwise, a parallel folio_try_get() can grab origin_folio
 		 * and its caller can see stale page cache entries.
 		 */
-		folio_ref_unfreeze(folio, 1 +
-			((mapping || swap_cache) ? folio_nr_pages(folio) : 0));
+		expected_refs = folio_expected_ref_count(folio) + 1;
+		folio_ref_unfreeze(folio, expected_refs);
 
 		unlock_page_lruvec(lruvec);
 
-- 
2.47.2

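The two forms are expected to be equivalent for the folios handled here,
assuming folio_expected_ref_count() counts one reference per page held by the
page cache or swap cache plus one per page table mapping (zero at this point,
since the folio is frozen and unmapped); the helper's implementation is not
shown in this thread, so the comparison below is a sketch rather than a
definitive statement:

	/* Open coded (before): one reference per page if the folio is in the
	 * page cache or swap cache, plus the caller's reference that froze
	 * it. */
	folio_ref_unfreeze(folio, 1 +
		((mapping || swap_cache) ? folio_nr_pages(folio) : 0));

	/* With the helper (after): the same value for a frozen, unmapped
	 * folio, without open coding the rule at the call site. */
	expected_refs = folio_expected_ref_count(folio) + 1;
	folio_ref_unfreeze(folio, expected_refs);
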
* Re: [PATCH v2 2/2] mm/huge_memory: use folio_expected_ref_count() to calculate ref_count.
From: Balbir Singh @ 2025-07-14  0:43 UTC
To: Zi Yan, David Hildenbrand, linux-mm
Cc: Andrew Morton, Hugh Dickins, Kirill Shutemov, Lorenzo Stoakes,
    Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
    Barry Song, linux-kernel

On 7/12/25 04:23, Zi Yan wrote:
> Instead of open coding the ref_count calculation, use
> folio_expected_ref_count().
>
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---

Acked-by: Balbir Singh <balbirs@nvidia.com>

* Re: [PATCH v2 2/2] mm/huge_memory: use folio_expected_ref_count() to calculate ref_count.
From: David Hildenbrand @ 2025-07-14 15:11 UTC
To: Zi Yan, Balbir Singh, linux-mm
Cc: Andrew Morton, Hugh Dickins, Kirill Shutemov, Lorenzo Stoakes,
    Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
    Barry Song, linux-kernel

On 11.07.25 20:23, Zi Yan wrote:
> Instead of open coding the ref_count calculation, use
> folio_expected_ref_count().
>
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb