From: JP Kobryn
To: akpm@linux-foundation.org, david@kernel.org, ljs@kernel.org,
	liam@infradead.org, vbabka@kernel.org, rppt@kernel.org,
	surenb@google.com, mhocko@suse.com, jackmanb@google.com,
	hannes@cmpxchg.org, ziy@nvidia.com, fvdl@google.com,
	linux-mm@kvack.org
Cc: shakeel.butt@linux.dev, usama.arif@linux.dev,
	linux-kernel@vger.kernel.org
Subject: [PATCH v3] mm/page_alloc: use existing highatomic reserves on the buddy fastpath
Date: Mon, 22 Jun 2026 17:46:00 -0700
Message-ID: <20260623004600.113347-1-jp.kobryn@linux.dev>

ALLOC_HIGHATOMIC currently provides both access to MIGRATE_HIGHATOMIC free
pages and permission to create new highatomic pageblock reserves. This
makes it unsuitable for the fastpath.

However, the fastpath can reach rmqueue_buddy() while MIGRATE_HIGHATOMIC
reserves have free pages available. In this situation, the allocation can
fall back to other migratetypes without trying those reserves first.

Allow high-priority non-blocking allocations to use existing
MIGRATE_HIGHATOMIC reserves on the buddy fastpath without growing them.
First tighten the criteria for reserving pageblocks so that growth may only
occur in the slowpath. Then allow fastpath usage by enabling
ALLOC_HIGHATOMIC when the GFP mask describes a non-blocking high-priority
allocation.
This logic has been factored out from gfp_to_alloc_flags() to a new
function gfp_to_alloc_flags_nonblocking().

A UDP receive workload was run with free MIGRATE_HIGHATOMIC pageblocks
available in the target zone. Before this patch, the workload did not
consume these blocks. With this patch, eligible order-1 allocations
reaching the buddy path consumed existing MIGRATE_HIGHATOMIC pageblocks,
with no highatomic misses observed. The workload did not grow highatomic
reserves and NAPI page-frag allocations remained healthy with no failures
or order-0 fallbacks.

Signed-off-by: JP Kobryn
---
v3:
 - remove ALLOC_HIGHATOMIC_RESERVE and let ALLOC_HIGHATOMIC keep original
   behavior
 - use ALLOC_WMARK_MIN to identify slowpath before growing reserve
 - factor out non-blocking logic from gfp_to_alloc_flags() into
   *_nonblocking() helper
 - dropped reviewed-by tag

v2: https://lore.kernel.org/linux-mm/20260617234958.150339-1-jp.kobryn@linux.dev/
 - decouple use semantics from ALLOC_HIGHATOMIC_RESERVE
 - update changelog to reflect above change and reword test paragraph
 - adjust comment in PCP path

v1: https://lore.kernel.org/linux-mm/20260616191420.52556-1-jp.kobryn@linux.dev/

 mm/page_alloc.c | 44 ++++++++++++++++++++++++++++++--------------
 1 file changed, 30 insertions(+), 14 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f7db8f049bd2..7330f22e3f8f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3247,10 +3247,11 @@ struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone,
 	} while (check_new_pages(page, order));

 	/*
-	 * If this is a high-order atomic allocation then check
-	 * if the pageblock should be reserved for the future
+	 * Slowpath (precarious) high-atomic allocations may reserve
+	 * a pageblock for future use.
 	 */
-	if (unlikely(alloc_flags & ALLOC_HIGHATOMIC))
+	if (unlikely((alloc_flags & ALLOC_HIGHATOMIC) &&
+		     ((alloc_flags & ALLOC_WMARK_MASK) == ALLOC_WMARK_MIN)))
 		reserve_highatomic_pageblock(page, order, zone);

 	__count_zid_vm_events(PGALLOC, page_zonenum(page), 1 << order);
@@ -4473,6 +4474,29 @@ static void wake_all_kswapds(unsigned int order, gfp_t gfp_mask,
 	}
 }

+static inline unsigned int
+gfp_to_alloc_flags_nonblocking(gfp_t gfp_mask, unsigned int order)
+{
+	unsigned int alloc_flags = 0;
+
+	if (gfp_mask & __GFP_DIRECT_RECLAIM)
+		return 0;
+
+	/*
+	 * Not worth trying to allocate harder for __GFP_NOMEMALLOC even
+	 * if it can't schedule.
+	 */
+	if (gfp_mask & __GFP_NOMEMALLOC)
+		return 0;
+
+	alloc_flags |= ALLOC_NON_BLOCK;
+
+	if (order > 0 && (gfp_mask & __GFP_HIGH))
+		alloc_flags |= ALLOC_HIGHATOMIC;
+
+	return alloc_flags;
+}
+
 static inline unsigned int
 gfp_to_alloc_flags(gfp_t gfp_mask, unsigned int order)
 {
@@ -4495,18 +4519,9 @@ gfp_to_alloc_flags(gfp_t gfp_mask, unsigned int order)
 	alloc_flags |= (__force int)
 		(gfp_mask & (__GFP_HIGH | __GFP_KSWAPD_RECLAIM));

-	if (!(gfp_mask & __GFP_DIRECT_RECLAIM)) {
-		/*
-		 * Not worth trying to allocate harder for __GFP_NOMEMALLOC even
-		 * if it can't schedule.
-		 */
-		if (!(gfp_mask & __GFP_NOMEMALLOC)) {
-			alloc_flags |= ALLOC_NON_BLOCK;
-
-			if (order > 0 && (alloc_flags & ALLOC_MIN_RESERVE))
-				alloc_flags |= ALLOC_HIGHATOMIC;
-		}
+	alloc_flags |= gfp_to_alloc_flags_nonblocking(gfp_mask, order);

+	if (!(gfp_mask & __GFP_DIRECT_RECLAIM)) {
 		/*
 		 * Ignore cpuset mems for non-blocking __GFP_HIGH (probably
 		 * GFP_ATOMIC) rather than fail, see the comment for
@@ -5299,6 +5314,7 @@ struct page *__alloc_frozen_pages_noprof(gfp_t gfp, unsigned int order,
 	 * memory until all local zones are considered.
 	 */
 	alloc_flags |= alloc_flags_nofragment(zonelist_zone(ac.preferred_zoneref), gfp);
+	alloc_flags |= gfp_to_alloc_flags_nonblocking(gfp, order) & ALLOC_HIGHATOMIC;

 	/* First allocation attempt */
 	page = get_page_from_freelist(alloc_gfp, order, alloc_flags, &ac);
-- 
2.54.0
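
For readers following the flag logic, the derivation in
gfp_to_alloc_flags_nonblocking() and the tightened reserve check in
rmqueue_buddy() can be sketched as a small userspace model. This is only an
illustration, not kernel code: every MODEL_* name and bit value below is a
hypothetical placeholder rather than the kernel's definition, and the two
*_path_flags() helpers assume the fastpath starts from the low watermark
while the slowpath starts from the min watermark, as the v3 note about
ALLOC_WMARK_MIN suggests.

```c
/* Userspace model only. All MODEL_* values are placeholders, NOT the kernel's. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the GFP bits used by the patch. */
#define MODEL_GFP_HIGH           0x1u
#define MODEL_GFP_DIRECT_RECLAIM 0x2u
#define MODEL_GFP_NOMEMALLOC     0x4u

/* Hypothetical stand-ins for ALLOC_* bits; the watermark lives in the low bits. */
#define MODEL_ALLOC_WMARK_MIN    0x0u
#define MODEL_ALLOC_WMARK_LOW    0x1u
#define MODEL_ALLOC_WMARK_MASK   0x3u
#define MODEL_ALLOC_NON_BLOCK    0x10u
#define MODEL_ALLOC_HIGHATOMIC   0x20u

/* Follows the control flow of gfp_to_alloc_flags_nonblocking() in the diff. */
static unsigned int model_nonblocking_flags(unsigned int gfp, unsigned int order)
{
	unsigned int alloc_flags = 0;

	if (gfp & MODEL_GFP_DIRECT_RECLAIM)
		return 0;	/* caller may block: not a non-blocking request */
	if (gfp & MODEL_GFP_NOMEMALLOC)
		return 0;	/* not worth trying harder */

	alloc_flags |= MODEL_ALLOC_NON_BLOCK;
	if (order > 0 && (gfp & MODEL_GFP_HIGH))
		alloc_flags |= MODEL_ALLOC_HIGHATOMIC;
	return alloc_flags;
}

/* Follows the new condition guarding reserve_highatomic_pageblock(). */
static bool model_may_grow_reserve(unsigned int alloc_flags)
{
	return (alloc_flags & MODEL_ALLOC_HIGHATOMIC) &&
	       ((alloc_flags & MODEL_ALLOC_WMARK_MASK) == MODEL_ALLOC_WMARK_MIN);
}

/* Assumed fastpath: low watermark, only the HIGHATOMIC bit is borrowed. */
static unsigned int model_fastpath_flags(unsigned int gfp, unsigned int order)
{
	return MODEL_ALLOC_WMARK_LOW |
	       (model_nonblocking_flags(gfp, order) & MODEL_ALLOC_HIGHATOMIC);
}

/* Assumed slowpath: min watermark plus the full helper result. */
static unsigned int model_slowpath_flags(unsigned int gfp, unsigned int order)
{
	return MODEL_ALLOC_WMARK_MIN | model_nonblocking_flags(gfp, order);
}

/* Checks the behavior described in the changelog against the model. */
void model_check_examples(void)
{
	unsigned int atomic = MODEL_GFP_HIGH;	/* non-blocking, high priority */
	unsigned int fast = model_fastpath_flags(atomic, 1);
	unsigned int slow = model_slowpath_flags(atomic, 1);

	assert(fast & MODEL_ALLOC_HIGHATOMIC);		/* fastpath may use reserves */
	assert(!model_may_grow_reserve(fast));		/* but may not grow them */
	assert(model_may_grow_reserve(slow));		/* slowpath may still grow */
	assert(!(model_fastpath_flags(atomic, 0) & MODEL_ALLOC_HIGHATOMIC));
	assert(model_nonblocking_flags(atomic | MODEL_GFP_DIRECT_RECLAIM, 1) == 0);
}
```

Calling model_check_examples() from a test program exercises the three
cases the changelog describes: fastpath use of existing reserves, no
fastpath growth, and slowpath growth left intact.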