Date: Mon, 24 Nov 2025 11:41:50 +0100
Subject: Re: [PATCH v2 2/4] mm/huge_memory: replace can_split_folio() with
 direct refcount calculation
To: Zi Yan , Lorenzo Stoakes
Cc: Andrew Morton , Baolin Wang , "Liam R. Howlett" , Nico Pache ,
 Ryan Roberts , Dev Jain , Barry Song , Lance Yang , Miaohe Lin ,
 Naoya Horiguchi , Wei Yang , Balbir Singh , linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
References: <20251122025529.1562592-1-ziy@nvidia.com>
 <20251122025529.1562592-3-ziy@nvidia.com>
From: "David Hildenbrand (Red Hat)"
In-Reply-To: <20251122025529.1562592-3-ziy@nvidia.com>

On 11/22/25 03:55, Zi Yan wrote:
> can_split_folio() is just a refcount comparison, making sure only the
> split caller holds an extra pin. Open code it with
> folio_expected_ref_count() != folio_ref_count() - 1. For the extra_pins
> used by folio_ref_freeze(), add folio_cache_references() to calculate it.
>
> Suggested-by: David Hildenbrand (Red Hat)
> Signed-off-by: Zi Yan
> ---
>  include/linux/huge_mm.h |  1 -
>  mm/huge_memory.c        | 43 ++++++++++++++++-------------------------
>  mm/vmscan.c             |  3 ++-
>  3 files changed, 19 insertions(+), 28 deletions(-)
>
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index 97686fb46e30..1ecaeccf39c9 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -369,7 +369,6 @@ enum split_type {
>  	SPLIT_TYPE_NON_UNIFORM,
>  };
>
> -bool can_split_folio(struct folio *folio, int caller_pins, int *pextra_pins);
>  int __split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>  		unsigned int new_order);
>  int folio_split_unmapped(struct folio *folio, unsigned int new_order);
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index c1f1055165dd..6c821c1c0ac3 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3455,23 +3455,6 @@ static void lru_add_split_folio(struct folio *folio, struct folio *new_folio,
>  	}
>  }
>
> -/* Racy check whether the huge page can be split */
> -bool can_split_folio(struct folio *folio, int caller_pins, int *pextra_pins)
> -{
> -	int extra_pins;
> -
> -	/* Additional pins from page cache */
> -	if (folio_test_anon(folio))
> -		extra_pins = folio_test_swapcache(folio) ?
> -				folio_nr_pages(folio) : 0;
> -	else
> -		extra_pins = folio_nr_pages(folio);
> -	if (pextra_pins)
> -		*pextra_pins = extra_pins;
> -	return folio_mapcount(folio) == folio_ref_count(folio) - extra_pins -
> -		caller_pins;
> -}
> -
>  static bool page_range_has_hwpoisoned(struct page *page, long nr_pages)
>  {
>  	for (; nr_pages; page++, nr_pages--)
> @@ -3776,17 +3759,26 @@ int folio_check_splittable(struct folio *folio, unsigned int new_order,
>  	return 0;
>  }
>
> +/* Number of folio references from the pagecache or the swapcache.
> + */
> +static unsigned int folio_cache_references(const struct folio *folio)
> +{
> +	if (folio_test_anon(folio) && !folio_test_swapcache(folio))
> +		return 0;
> +	return folio_nr_pages(folio);
> +}
> +
>  static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int new_order,
>  		struct page *split_at, struct xa_state *xas,
>  		struct address_space *mapping, bool do_lru,
>  		struct list_head *list, enum split_type split_type,
> -		pgoff_t end, int *nr_shmem_dropped, int extra_pins)
> +		pgoff_t end, int *nr_shmem_dropped)
>  {
>  	struct folio *end_folio = folio_next(folio);
>  	struct folio *new_folio, *next;
>  	int old_order = folio_order(folio);
>  	int ret = 0;
>  	struct deferred_split *ds_queue;
> +	int extra_pins = folio_cache_references(folio);

Can we just inline the call to folio_cache_references() and get rid of
extra_pins (which is a bad name either way)?

if (folio_ref_freeze(folio, folio_cache_references(folio) + 1)) {

BTW, now that we have this helper, I wonder if we should then also do
the following on the unfreeze path, for clarification:

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 0acdc2f26ee0c..7cbcf61b7971d 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3824,8 +3824,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
 
 	zone_device_private_split_cb(folio, new_folio);
 
-	expected_refs = folio_expected_ref_count(new_folio) + 1;
-	folio_ref_unfreeze(new_folio, expected_refs);
+	folio_ref_unfreeze(new_folio, folio_cache_references(new_folio) + 1);
 
 	if (do_lru)
 		lru_add_split_folio(folio, new_folio, lruvec, list);
@@ -3868,8 +3867,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
 	 * Otherwise, a parallel folio_try_get() can grab @folio
 	 * and its caller can see stale page cache entries.
 	 */
-	expected_refs = folio_expected_ref_count(folio) + 1;
-	folio_ref_unfreeze(folio, expected_refs);
+	folio_ref_unfreeze(folio, folio_cache_references(folio) + 1);
 
 	if (do_lru)
 		unlock_page_lruvec(lruvec);

-- 
Cheers

David