From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from out-172.mta1.migadu.com (out-172.mta1.migadu.com [95.215.58.172])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 01A3414EC73
	for <linux-kernel@vger.kernel.org>; Tue, 23 Jun 2026 00:46:27 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.172
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1782175591; cv=none; b=IAUVVnxb6/LwnmIKbIFRGiCP2bv1yd7z6ENY5gD/Pun69fuq4bqt+vvjVUro6AW+DD/eZRdS3xkGDHRKlt/uapckC/NS3lh7CCNVNSZ4iL8KKl6A6s2eU1sc4gWErwfZJHWXDgichQQoRIR2qkY/iVOx7gS8v1ZcLGwGDK8jZXc=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1782175591; c=relaxed/simple;
	bh=ZrJverUyxkEAcVgM9Na+U77TauFU/KgXHRND13Z1BDo=;
	h=From:To:Cc:Subject:Date:Message-ID:MIME-Version; b=SiMy0qe5Zw6dFrgujIbkQsNJ2Diun5cWv2Wu9wuI54rD6Y4O8yu4vLzXrC1fjNxu1xzkqACe2tH0aEEvKC0tYOnQnZuN5nwlb+HXaHwXO5O+I59aXn6+WS1AYFH/z3HENpcz5LtDXCG2V8O4oiuOt51CYkV6YF7WRe+sSbMpAig=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=fpHGXV+b; arc=none smtp.client-ip=95.215.58.172
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="fpHGXV+b"
X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers.
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1;
	t=1782175586;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	 to:to:cc:cc:mime-version:mime-version:
	 content-transfer-encoding:content-transfer-encoding;
	bh=XlkqxyqOJoqvIllh9mUeQc2L3VjewKUXQAdYDGn19nU=;
	b=fpHGXV+bOWANXRUf4vFTJWaXsRwbL6mq5UVW/e2SdaLe860HN79yRDQ0c4NRJ851uGB9KZ
	oqCNbaDImHw9u08HRL4lY237DBueGOycSHhY6+++pidQlRl0zHnF8T/9alStrooX/nhIOq
	BN3Qa2m/1b/fzSyjeF7OxJ18ZV1+gs8=
From: JP Kobryn <jp.kobryn@linux.dev>
To: akpm@linux-foundation.org,
	david@kernel.org,
	ljs@kernel.org,
	liam@infradead.org,
	vbabka@kernel.org,
	rppt@kernel.org,
	surenb@google.com,
	mhocko@suse.com,
	jackmanb@google.com,
	hannes@cmpxchg.org,
	ziy@nvidia.com,
	fvdl@google.com,
	linux-mm@kvack.org
Cc: shakeel.butt@linux.dev,
	usama.arif@linux.dev,
	linux-kernel@vger.kernel.org
Subject: [PATCH v3] mm/page_alloc: use existing highatomic reserves on the buddy fastpath
Date: Mon, 22 Jun 2026 17:46:00 -0700
Message-ID: <20260623004600.113347-1-jp.kobryn@linux.dev>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Migadu-Flow: FLOW_OUT

ALLOC_HIGHATOMIC currently grants both access to free MIGRATE_HIGHATOMIC
pages and permission to create new highatomic pageblock reserves. Because
reserve growth should only happen in the slowpath, the flag in its current
form is unsuitable for the fastpath.

However, the fastpath can reach rmqueue_buddy() while MIGRATE_HIGHATOMIC
reserves have free pages available. In this situation, the allocation can
fall back to other migratetypes without trying those reserves first.

Allow high-priority non-blocking allocations to use existing
MIGRATE_HIGHATOMIC reserves on the buddy fastpath without growing them.
First tighten the criteria for reserving pageblocks so that growth may only
occur in the slowpath. Then allow fastpath usage by enabling
ALLOC_HIGHATOMIC when the GFP mask describes a non-blocking, high-priority
allocation of order > 0. The flag derivation is factored out of
gfp_to_alloc_flags() into a new helper, gfp_to_alloc_flags_nonblocking(),
which both the slowpath and the fastpath now call.
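
As a rough illustration of the flag derivation described above, the
following user-space sketch mirrors the decision order of
gfp_to_alloc_flags_nonblocking(). It is not kernel code: the MODEL_*
names and their bit values are placeholders chosen for this example, and
model_flags_selfcheck() is a hypothetical helper.

```c
#include <assert.h>

/*
 * Placeholder bit values for this sketch only; the real definitions live
 * in the kernel headers and may differ.
 */
#define MODEL_GFP_HIGH			0x01u
#define MODEL_GFP_DIRECT_RECLAIM	0x02u
#define MODEL_GFP_NOMEMALLOC		0x04u

#define MODEL_ALLOC_NON_BLOCK		0x10u
#define MODEL_ALLOC_HIGHATOMIC		0x200u

/* Mirrors the decision order of gfp_to_alloc_flags_nonblocking(). */
static unsigned int model_nonblocking_flags(unsigned int gfp,
					    unsigned int order)
{
	unsigned int alloc_flags = 0;

	/* Callers that may enter direct reclaim can block: no extra flags. */
	if (gfp & MODEL_GFP_DIRECT_RECLAIM)
		return 0;

	/* Not worth trying harder for no-memalloc callers. */
	if (gfp & MODEL_GFP_NOMEMALLOC)
		return 0;

	alloc_flags |= MODEL_ALLOC_NON_BLOCK;

	/* Only high-order, high-priority requests get highatomic access. */
	if (order > 0 && (gfp & MODEL_GFP_HIGH))
		alloc_flags |= MODEL_ALLOC_HIGHATOMIC;

	return alloc_flags;
}

/* The fastpath keeps only the highatomic bit from the helper's result. */
static unsigned int model_fastpath_extra_flags(unsigned int gfp,
					       unsigned int order)
{
	return model_nonblocking_flags(gfp, order) & MODEL_ALLOC_HIGHATOMIC;
}

/* Hypothetical self-check covering the interesting cases. */
int model_flags_selfcheck(void)
{
	/* Non-blocking, high-priority, order-1: eligible for the reserves. */
	assert(model_fastpath_extra_flags(MODEL_GFP_HIGH, 1) ==
	       MODEL_ALLOC_HIGHATOMIC);
	/* Order-0 requests never get highatomic access. */
	assert(model_fastpath_extra_flags(MODEL_GFP_HIGH, 0) == 0);
	/* Blocking or no-memalloc requests get nothing from the helper. */
	assert(model_nonblocking_flags(MODEL_GFP_HIGH |
				       MODEL_GFP_DIRECT_RECLAIM, 1) == 0);
	assert(model_nonblocking_flags(MODEL_GFP_HIGH |
				       MODEL_GFP_NOMEMALLOC, 1) == 0);
	return 0;
}
```

Masking the helper's result with the highatomic bit keeps the fastpath
from also picking up the non-blocking flag, which the sketch treats as a
slowpath-only concern.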

A UDP receive workload was run with free MIGRATE_HIGHATOMIC pageblocks
available in the target zone. Before this patch, the workload did not
consume these blocks. With this patch, eligible order-1 allocations
reaching the buddy path consumed existing MIGRATE_HIGHATOMIC pageblocks,
with no highatomic misses observed. The workload did not grow highatomic
reserves and NAPI page-frag allocations remained healthy with no failures
or order-0 fallbacks.

Signed-off-by: JP Kobryn <jp.kobryn@linux.dev>
---
v3:
  - remove ALLOC_HIGHATOMIC_RESERVE and let ALLOC_HIGHATOMIC keep original behavior
  - use ALLOC_WMARK_MIN to identify slowpath before growing reserve
  - factor out non-blocking logic from gfp_to_alloc_flags() into *_nonblocking() helper
  - drop Reviewed-by tag
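
The ALLOC_WMARK_MIN check from the v3 notes above can be pictured with the
user-space sketch below. It is only a model: the MODEL_* names and values
are placeholders, model_reserve_selfcheck() is a hypothetical helper, and
the sketch assumes that the fastpath attempt runs against the low
watermark while the slowpath retries against the min watermark.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Placeholder values for this sketch only; the kernel's own definitions
 * may differ.
 */
#define MODEL_ALLOC_WMARK_MIN	0x0u
#define MODEL_ALLOC_WMARK_LOW	0x1u
#define MODEL_ALLOC_WMARK_MASK	0x3u
#define MODEL_ALLOC_HIGHATOMIC	0x200u

/* Using existing reserves only needs the highatomic bit. */
static bool model_may_use_reserve(unsigned int alloc_flags)
{
	return (alloc_flags & MODEL_ALLOC_HIGHATOMIC) != 0;
}

/* Growing the reserve additionally requires the min watermark. */
static bool model_may_grow_reserve(unsigned int alloc_flags)
{
	return (alloc_flags & MODEL_ALLOC_HIGHATOMIC) &&
	       (alloc_flags & MODEL_ALLOC_WMARK_MASK) == MODEL_ALLOC_WMARK_MIN;
}

/* Hypothetical self-check contrasting the two paths. */
int model_reserve_selfcheck(void)
{
	/* Assumption: the fastpath attempt runs against the low watermark. */
	unsigned int fast = MODEL_ALLOC_WMARK_LOW | MODEL_ALLOC_HIGHATOMIC;
	/* Assumption: the slowpath retries against the min watermark. */
	unsigned int slow = MODEL_ALLOC_WMARK_MIN | MODEL_ALLOC_HIGHATOMIC;

	assert(model_may_use_reserve(fast) && !model_may_grow_reserve(fast));
	assert(model_may_use_reserve(slow) && model_may_grow_reserve(slow));
	/* Without the highatomic bit, neither use nor growth applies. */
	assert(!model_may_use_reserve(MODEL_ALLOC_WMARK_MIN));
	assert(!model_may_grow_reserve(MODEL_ALLOC_WMARK_MIN));
	return 0;
}
```

Under these assumptions, the watermark bits alone distinguish a fastpath
caller from a slowpath one, so no separate reserve-growth flag is needed.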

v2: https://lore.kernel.org/linux-mm/20260617234958.150339-1-jp.kobryn@linux.dev/
  - decouple use semantics from ALLOC_HIGHATOMIC_RESERVE
  - update changelog to reflect above change and reword test paragraph
  - adjust comment in PCP path

v1: https://lore.kernel.org/linux-mm/20260616191420.52556-1-jp.kobryn@linux.dev/

 mm/page_alloc.c | 44 ++++++++++++++++++++++++++++++--------------
 1 file changed, 30 insertions(+), 14 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f7db8f049bd2..7330f22e3f8f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3247,10 +3247,11 @@ struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone,
 	} while (check_new_pages(page, order));
 
 	/*
-	 * If this is a high-order atomic allocation then check
-	 * if the pageblock should be reserved for the future
+	 * Slowpath (precarious) high-atomic allocations may reserve
+	 * a pageblock for future use.
 	 */
-	if (unlikely(alloc_flags & ALLOC_HIGHATOMIC))
+	if (unlikely((alloc_flags & ALLOC_HIGHATOMIC) &&
+			((alloc_flags & ALLOC_WMARK_MASK) == ALLOC_WMARK_MIN)))
 		reserve_highatomic_pageblock(page, order, zone);
 
 	__count_zid_vm_events(PGALLOC, page_zonenum(page), 1 << order);
@@ -4473,6 +4474,29 @@ static void wake_all_kswapds(unsigned int order, gfp_t gfp_mask,
 	}
 }
 
+static inline unsigned int
+gfp_to_alloc_flags_nonblocking(gfp_t gfp_mask, unsigned int order)
+{
+	unsigned int alloc_flags = 0;
+
+	if (gfp_mask & __GFP_DIRECT_RECLAIM)
+		return 0;
+
+	/*
+	 * Not worth trying to allocate harder for __GFP_NOMEMALLOC even
+	 * if it can't schedule.
+	 */
+	if (gfp_mask & __GFP_NOMEMALLOC)
+		return 0;
+
+	alloc_flags |= ALLOC_NON_BLOCK;
+
+	if (order > 0 && (gfp_mask & __GFP_HIGH))
+		alloc_flags |= ALLOC_HIGHATOMIC;
+
+	return alloc_flags;
+}
+
 static inline unsigned int
 gfp_to_alloc_flags(gfp_t gfp_mask, unsigned int order)
 {
@@ -4495,18 +4519,9 @@ gfp_to_alloc_flags(gfp_t gfp_mask, unsigned int order)
 	alloc_flags |= (__force int)
 		(gfp_mask & (__GFP_HIGH | __GFP_KSWAPD_RECLAIM));
 
-	if (!(gfp_mask & __GFP_DIRECT_RECLAIM)) {
-		/*
-		 * Not worth trying to allocate harder for __GFP_NOMEMALLOC even
-		 * if it can't schedule.
-		 */
-		if (!(gfp_mask & __GFP_NOMEMALLOC)) {
-			alloc_flags |= ALLOC_NON_BLOCK;
-
-			if (order > 0 && (alloc_flags & ALLOC_MIN_RESERVE))
-				alloc_flags |= ALLOC_HIGHATOMIC;
-		}
+	alloc_flags |= gfp_to_alloc_flags_nonblocking(gfp_mask, order);
 
+	if (!(gfp_mask & __GFP_DIRECT_RECLAIM)) {
 		/*
 		 * Ignore cpuset mems for non-blocking __GFP_HIGH (probably
 		 * GFP_ATOMIC) rather than fail, see the comment for
@@ -5299,6 +5314,7 @@ struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
 	 * memory until all local zones are considered.
 	 */
 	alloc_flags |= alloc_flags_nofragment(zonelist_zone(ac.preferred_zoneref), gfp);
+	alloc_flags |= gfp_to_alloc_flags_nonblocking(gfp, order) & ALLOC_HIGHATOMIC;
 
 	/* First allocation attempt */
 	page = get_page_from_freelist(alloc_gfp, order, alloc_flags, &ac);
-- 
2.54.0