Date: Thu, 14 May 2026 11:50:03 +0100
From: Lorenzo Stoakes
To: Hrushikesh Salunke
Cc: akpm@linux-foundation.org, david@kernel.org, Liam.Howlett@oracle.com,
	vbabka@kernel.org, rppt@kernel.org, surenb@google.com, mhocko@suse.com,
	jackmanb@google.com, hannes@cmpxchg.org, ziy@nvidia.com,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org, rkodsara@amd.com,
	bharata@amd.com, ankur.a.arora@oracle.com, shivankg@amd.com
Subject: Re: [PATCH v4] mm/page_alloc: replace kernel_init_pages() with batch page clearing
References: <20260504063942.553438-1-hsalunke@amd.com>
In-Reply-To: <20260504063942.553438-1-hsalunke@amd.com>

On Mon, May 04, 2026 at 06:39:19AM +0000, Hrushikesh Salunke wrote:
> When init_on_alloc is enabled, kernel_init_pages() clears every page
> one at a time via clear_highpage_kasan_tagged(), which incurs per-page
> kmap_local_page()/kunmap_local() overhead and prevents the architecture
> clearing primitive from operating on contiguous ranges.
>
> Introduce clear_highpages_kasan_tagged() as a static batch clearing
> helper in page_alloc.c that calls clear_pages() for the full contiguous
> range on !HIGHMEM systems, bypassing the per-page kmap overhead and
> allowing a single invocation of the arch clearing primitive across the
> entire allocation. The HIGHMEM path falls back to per-page clearing
> since those pages require kmap.
>
> Replace kernel_init_pages() with direct calls to the new helper, as it
> becomes a trivial wrapper.
>
> Allocating 8192 x 2MB HugeTLB pages (16GB) with init_on_alloc=1:
>
> Before: 0.445s
> After:  0.166s (-62.7%, 2.68x faster)

Wow nice!
>
> Kernel time (sys) reduction per workload with init_on_alloc=1:
>
> Workload          Before     After      Change
> Graph500 64C128T  30m 41.8s  15m 14.8s  -50.3%
> Graph500 16C32T   15m 56.7s   9m 43.7s  -39.0%
> Pagerank 32T       1m 58.5s   1m 12.8s  -38.5%
> Pagerank 128T      2m 36.3s   1m 40.4s  -35.7%

Lovely :)

>
> Signed-off-by: Hrushikesh Salunke

All looks sensible to me so:

Acked-by: Lorenzo Stoakes

Cheers, Lorenzo

> Acked-by: Vlastimil Babka (SUSE)
> Acked-by: Zi Yan
> Acked-by: Pankaj Gupta
> ---
> Hi Andrew,
>
> This is v4 of the batch page clearing patch. v3 is already in
> mm-unstable, please replace it with this one.
> The only change is moving clear_highpages_kasan_tagged() from
> include/linux/highmem.h to mm/page_alloc.c as a static function,
> addressing the code size concern you raised on ARM allmodconfig.
>
> Thanks,
> Hrushikesh
>
> base commit: 2bcc13c29c711381d815c1ba5d5b25737400c71a
>
> v3: https://lore.kernel.org/all/20260422102729.166599-1-hsalunke@amd.com/
> v2: https://lore.kernel.org/all/20260421042451.76918-1-hsalunke@amd.com/
> v1: https://lore.kernel.org/all/20260408092441.435133-1-hsalunke@amd.com/
>
> Changes since v3:
> - Moved clear_highpages_kasan_tagged() from include/linux/highmem.h to
>   mm/page_alloc.c as a static function to avoid code size increase. As
>   the function is only used within page_alloc.c.
>
> Changes since v2:
> - Moved kasan_disable_current()/kasan_enable_current() into
>   clear_highpages_kasan_tagged(), per David and Zi Yan's suggestion.
> - Removed kernel_init_pages() and replaced its two call sites with
>   direct calls to the helper.
>
> Changes since v1:
> - Dropped cond_resched() and PROCESS_PAGES_NON_PREEMPT_BATCH as
>   kernel_init_pages() runs inside the page allocator and can be
>   called from atomic context, making cond_resched() unsafe. The
>   original code never had a cond_resched() here, and the
>   performance gain comes from batching, not rescheduling.
>
> - Moved the !HIGHMEM/HIGHMEM branching into a new
>   clear_highpages_kasan_tagged() helper in highmem.h, per David's
>   suggestion.
>
>  mm/page_alloc.c | 18 +++++++++++-------
>  1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 65e205111553..3a59577f58a5 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1208,14 +1208,18 @@ static inline bool should_skip_kasan_poison(struct page *page)
>  	return page_kasan_tag(page) == KASAN_TAG_KERNEL;
>  }
>
> -static void kernel_init_pages(struct page *page, int numpages)
> +static void clear_highpages_kasan_tagged(struct page *page, int numpages)
>  {
> -	int i;
> -
>  	/* s390's use of memset() could override KASAN redzones. */
>  	kasan_disable_current();
> -	for (i = 0; i < numpages; i++)
> -		clear_highpage_kasan_tagged(page + i);
> +	if (!IS_ENABLED(CONFIG_HIGHMEM)) {

I hope that, soon, we won't need this :)

> +		clear_pages(kasan_reset_tag(page_address(page)), numpages);
> +	} else {
> +		int i;
> +
> +		for (i = 0; i < numpages; i++)
> +			clear_highpage_kasan_tagged(page + i);
> +	}
>  	kasan_enable_current();
>  }
>
> @@ -1428,7 +1432,7 @@ __always_inline bool __free_pages_prepare(struct page *page,
>  		init = false;
>  	}
>  	if (init)
> -		kernel_init_pages(page, 1 << order);
> +		clear_highpages_kasan_tagged(page, 1 << order);
>
>  	/*
>  	 * arch_free_page() can make the page's contents inaccessible. s390
> @@ -1853,7 +1857,7 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
>  	}
>  	/* If memory is still not initialized, initialize it now. */
>  	if (init)
> -		kernel_init_pages(page, 1 << order);
> +		clear_highpages_kasan_tagged(page, 1 << order);
>
>  	set_page_owner(page, order, gfp_flags);
>  	page_table_check_alloc(page, order);
> --
> 2.43.0
>