Message-ID: <22b6ff3c-9d41-4eb0-9beb-cb92f3ada89f@kernel.org>
Date: Wed, 8 Apr 2026 11:47:30 +0200
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/page_alloc: use batch page clearing in kernel_init_pages()
To: Hrushikesh Salunke, akpm@linux-foundation.org, surenb@google.com,
 mhocko@suse.com, jackmanb@google.com, hannes@cmpxchg.org, ziy@nvidia.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, rkodsara@amd.com,
 bharata@amd.com, ankur.a.arora@oracle.com, shivankg@amd.com,
 David Hildenbrand
References: <20260408092441.435133-1-hsalunke@amd.com>
From: "Vlastimil Babka (SUSE)"
In-Reply-To: <20260408092441.435133-1-hsalunke@amd.com>

On 4/8/26 11:24, Hrushikesh Salunke wrote:
> When init_on_alloc is enabled, kernel_init_pages() clears every page
> one at a time, calling clear_page() per page. This is unnecessarily
> slow for large contiguous allocations (mTHPs, HugeTLB) that dominate
> real workloads.
>
> On 64-bit (!HIGHMEM) systems, switch to clearing pages in batch via
> clear_pages(), bypassing the per-page kmap_local_page()/kunmap_local()
> overhead and allowing the arch clearing primitive to operate on the full
> contiguous range in a single invocation. The batch size is the full
> allocation when the preempt model is preemptible (preemption points are
> implicit), or PROCESS_PAGES_NON_PREEMPT_BATCH otherwise, with
> cond_resched() between batches to limit scheduling latency under
> cooperative preemption.
>
> The HIGHMEM path is kept as-is since those pages require kmap.
>
> Allocating 8192 x 2MB HugeTLB pages (16GB) with init_on_alloc=1:
>
>   Before: 0.445s
>   After:  0.166s (-62.7%, 2.68x faster)
>
> Kernel time (sys) reduction per workload with init_on_alloc=1:
>
>   Workload            Before      After       Change
>   Graph500 64C128T    30m 41.8s   15m 14.8s   -50.3%
>   Graph500 16C32T     15m 56.7s    9m 43.7s   -39.0%
>   Pagerank 32T         1m 58.5s    1m 12.8s   -38.5%
>   Pagerank 128T        2m 36.3s    1m 40.4s   -35.7%
>
> Signed-off-by: Hrushikesh Salunke
> ---
> base commit: 1a2fbbe3653f0ebb24af9b306a8a968287344a35

Any way to reuse the code added by [1], e.g. clear_user_highpages()?

[1] https://lore.kernel.org/linux-mm/20250917152418.4077386-1-ankur.a.arora@oracle.com/

>
>  mm/page_alloc.c | 19 +++++++++++++++++--
>  1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index b1c5430cad4e..178cbebadd50 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1224,8 +1224,23 @@ static void kernel_init_pages(struct page *page, int numpages)
>  
>  	/* s390's use of memset() could override KASAN redzones. */
>  	kasan_disable_current();
> -	for (i = 0; i < numpages; i++)
> -		clear_highpage_kasan_tagged(page + i);
> +
> +	if (!IS_ENABLED(CONFIG_HIGHMEM)) {
> +		void *addr = kasan_reset_tag(page_address(page));
> +		unsigned int unit = preempt_model_preemptible() ?
> +					numpages : PROCESS_PAGES_NON_PREEMPT_BATCH;
> +		int count;
> +
> +		for (i = 0; i < numpages; i += count) {
> +			cond_resched();
> +			count = min_t(int, unit, numpages - i);
> +			clear_pages(addr + (i << PAGE_SHIFT), count);
> +		}
> +	} else {
> +		for (i = 0; i < numpages; i++)
> +			clear_highpage_kasan_tagged(page + i);
> +	}
> +
>  	kasan_enable_current();
>  }
>
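
For anyone following the thread without a tree at hand, the batching in the
quoted hunk can be modelled in userspace roughly as below. This is only a
sketch: model_clear_pages(), model_init_pages(), MODEL_PAGE_SHIFT and
MODEL_NON_PREEMPT_BATCH are hypothetical stand-ins invented for illustration,
memset() replaces the arch clearing primitive, and the batch value of 8 is an
assumption, not the kernel's PROCESS_PAGES_NON_PREEMPT_BATCH.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
#define MODEL_PAGE_SHIFT	12
#define MODEL_PAGE_SIZE		((size_t)1 << MODEL_PAGE_SHIFT)
#define MODEL_NON_PREEMPT_BATCH	8	/* assumed value, for illustration only */

/* Stand-in for clear_pages(): zero npages contiguous pages with memset(). */
static void model_clear_pages(void *addr, unsigned int npages)
{
	memset(addr, 0, (size_t)npages << MODEL_PAGE_SHIFT);
}

/*
 * Model of the !HIGHMEM branch in the quoted hunk: clear numpages pages
 * starting at base, in batches of at most "unit" pages. Returns the number
 * of batches issued so that the chunking can be checked.
 */
static int model_init_pages(void *base, int numpages, int preemptible)
{
	int unit = preemptible ? numpages : MODEL_NON_PREEMPT_BATCH;
	int batches = 0;
	int i, count;

	assert(numpages >= 0);

	for (i = 0; i < numpages; i += count) {
		/* The patch calls cond_resched() at this point. */
		count = unit < numpages - i ? unit : numpages - i;
		model_clear_pages((char *)base + ((size_t)i << MODEL_PAGE_SHIFT),
				  (unsigned int)count);
		batches++;
	}

	return batches;
}
```

With these assumed values, a 20-page range is cleared in one call in the
preemptible case and in three calls (8 + 8 + 4 pages) otherwise; the
rescheduling point sits before each of those calls.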