From: Johannes Weiner <hannes@cmpxchg.org>
To: Brendan Jackman <jackmanb@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Vlastimil Babka <vbabka@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>, Zi Yan <ziy@nvidia.com>,
"Harry Yoo (Oracle)" <harry@kernel.org>,
Gregory Price <gourry@gourry.net>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/page_alloc: drop flag-conversion "optimisation"
Date: Fri, 12 Jun 2026 12:33:52 -0400 [thread overview]
Message-ID: <aiw08Iy6BYCPX9am@cmpxchg.org> (raw)
In-Reply-To: <20260612-gfp-pessimisation-v1-1-936eb04202e7@google.com>
On Fri, Jun 12, 2026 at 02:15:44PM +0000, Brendan Jackman wrote:
> This code uses flag equivalences to try to optimise conversion from
> GFP_ to ALLOC_ but there's no clear reason to believe it makes things
> faster. Even if it gets rid of conditional branches, it just trades them
> for a data dependency.
>
> CPUs are pretty good at conditional branches. But, in my GCC x86 build
> it doesn't look like there are any branches anyway, the compiler found
> some conditional instruction tricks. (Caveat: This was extracted &
> annotated by Gemini AI, I did not actually read the disasm myself)
>
> Old code:
>
> ae50: 8b 04 24 mov (%rsp),%eax # Load gfp_mask
> ...
> ae5d: 41 89 c4 mov %eax,%r12d
> ae64: 41 81 e4 20 08 00 00 and $0x820,%r12d # Mask both flags at once
> ...
> ae6f: 44 89 e1 mov %r12d,%ecx
> ae77: 83 c9 40 or $0x40,%ecx # OR with ALLOC_CPUSET (0x40)
> ae7a: 89 4c 24 60 mov %ecx,0x60(%rsp) # Store to alloc_flags
>
> New code:
>
> For __GFP_HIGH ( 0x20 ):
> It uses the Carry Flag (via sbb ) to conditionally add 0x20 to the base 0x40 ( ALLOC_CPUSET ) flag:
>
> ae63: 83 e0 20 and $0x20,%eax # Test __GFP_HIGH
> ...
> ae6a: 83 f8 01 cmp $0x1,%eax # Set carry flag if 0
> ae6f: 45 19 e4 sbb %r12d,%r12d # %r12d = (gfp & 0x20) ? 0 : -1
> ae80: 41 83 e4 e0 and $0xffffffe0,%r12d # %r12d = (gfp & 0x20) ? 0 : -32
> ae87: 41 83 c4 60 add $0x60,%r12d # %r12d = (gfp & 0x20) ? 0x60 : 0x40
>
> For __GFP_KSWAPD_RECLAIM ( 0x800 ):
> It uses a conditional move ( cmov ) later in the function to set the ALLOC_KSWAPD ( 0x800 ) bit:
>
> ae72: 25 00 08 00 00 and $0x800,%eax # Test __GFP_KSWAPD_RECLAIM
> ae77: 89 44 24 30 mov %eax,0x30(%rsp) # Store result
> ...
> af2c: 80 cf 08 or $0x8,%bh # Set ALLOC_KSWAPD (0x800) in temp reg
> af2f: 45 85 c9 test %r9d,%r9d # Check if __GFP_KSWAPD_RECLAIM was set
> af32: 0f 44 d8 cmove %eax,%ebx # If not, revert to flags without it
>
> Testing with a modified version[0] of lib/free_pages_test.c (adding
> printks with timing)...
>
> [0] https://github.com/bjackman/aethelred/blob/2ccdc84ef087c2a631914f58e106e99e19bd3b98/page-alloc-test/page-alloc-test.c
>
> Old results from a Sapphire Rapids consumer CPU:
>
> [ 67.157118] page_alloc_test: Testing with GFP_KERNEL
> [ 67.157122] page_alloc_test: Starting 1,000,000 allocations...
> [ 70.704446] page_alloc_test: Completed. Time: 3543002 us (Avg: 3543.00 ns per alloc+free loop)
> [ 70.704456] page_alloc_test: Testing with GFP_KERNEL | __GFP_COMP
> [ 70.704460] page_alloc_test: Starting 1,000,000 allocations...
> [ 70.944672] page_alloc_test: Completed. Time: 239980 us (Avg: 239.98 ns per alloc+free loop)
> [ 70.944675] page_alloc_test: Test completed
>
> New results:
>
> [ 70.079015] page_alloc_test: Testing with GFP_KERNEL
> [ 70.079020] page_alloc_test: Starting 1,000,000 allocations...
> [ 73.669396] page_alloc_test: Completed. Time: 3586954 us (Avg: 3586.95 ns per alloc+free loop)
> [ 73.669402] page_alloc_test: Testing with GFP_KERNEL | __GFP_COMP
> [ 73.669405] page_alloc_test: Starting 1,000,000 allocations...
> [ 73.905084] page_alloc_test: Completed. Time: 235496 us (Avg: 235.49 ns per alloc+free loop)
> [ 73.905086] page_alloc_test: Test completed
>
> Seems like a wash.
>
> So, drop the flag value coupling here and let the compiler and CPU do
> their job. Superscalar CPUs are pretty neat after all.
>
> (Used AI for the disasm but the rest is all manual).
>
> Signed-off-by: Brendan Jackman <jackmanb@google.com>
Nice. With the _nofragment() bits folded:
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
prev parent reply other threads:[~2026-06-12 16:33 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-12 14:15 [PATCH] mm/page_alloc: drop flag-conversion "optimisation" Brendan Jackman
2026-06-12 14:23 ` Brendan Jackman
2026-06-12 14:24 ` Vlastimil Babka (SUSE)
2026-06-12 14:53 ` Brendan Jackman
2026-06-12 15:06 ` Zi Yan
2026-06-12 16:04 ` Gregory Price
2026-06-12 16:33 ` Johannes Weiner [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aiw08Iy6BYCPX9am@cmpxchg.org \
--to=hannes@cmpxchg.org \
--cc=akpm@linux-foundation.org \
--cc=gourry@gourry.net \
--cc=harry@kernel.org \
--cc=jackmanb@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=surenb@google.com \
--cc=vbabka@kernel.org \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.