From: Dmitry Ilvokhin <d@ilvokhin.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: dan.j.williams@intel.com, Vlastimil Babka <vbabka@kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@kernel.org>,
Lorenzo Stoakes <ljs@kernel.org>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>,
Brendan Jackman <jackmanb@google.com>,
Johannes Weiner <hannes@cmpxchg.org>, Zi Yan <ziy@nvidia.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
kernel-team@meta.com
Subject: Re: [PATCH 1/8] mm: use zone lock guard in reserve_highatomic_pageblock()
Date: Tue, 28 Apr 2026 10:58:41 +0000
Message-ID: <afCS4d4YccQFtvpi@shell.ilvokhin.com>
In-Reply-To: <20260309164516.GE606826@noisy.programming.kicks-ass.net>

On Mon, Mar 09, 2026 at 05:45:16PM +0100, Peter Zijlstra wrote:
> On Sat, Mar 07, 2026 at 02:09:41PM +0000, Dmitry Ilvokhin wrote:
> > On Sat, Mar 07, 2026 at 02:16:41PM +0100, Peter Zijlstra wrote:
> > > On Fri, Mar 06, 2026 at 07:24:56PM +0100, Vlastimil Babka wrote:
> > >
> > > > Yeah I don't think the guard construct in this case should be doing anything
> > > > here that wouldn't allow the compiler to compile to the exactly same result
> > > > as before? Either there's some problem with the infra, or we're just victim
> > > > of compiler heuristics. In both cases imho worth looking into rather than
> > > > rejecting the construct.
> > >
> > > I'd love to look into it, but I can't seem to apply these patches to
> > > anything.
> > >
> > > By virtue of not actually having the patches, I had to resort to b4, and
> > > I think the incantation is something like:
> > >
> > > b4 shazam cover.1772811429.git.d@ilvokhin.com
> > >
> > > but it doesn't want to apply to anything I have at hand. Specifically, I
> > > tried Linus' tree and tip, which is most of what I have at hand.
> >
> > Thanks for taking a look, Peter.
> >
> > This series is based on mm-new and depends on my earlier patchset:
> >
> > https://lore.kernel.org/all/cover.1772206930.git.d@ilvokhin.com/
> >
> > Those patches are currently only in Andrew's mm-new tree, so this series
> > won't apply cleanly on Linus' tree or tip.
> >
> > It should apply on top of mm-new from:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>
> OK, so the big problem is __GUARD_IS_ERR(), and that came up before, but
> while Linus told me how to fix it, he didn't actually like it very much:
>
> https://lore.kernel.org/all/20250513085001.GC25891@noisy.programming.kicks-ass.net/
>
> However it does help with this:
>
> $ ./scripts/bloat-o-meter defconfig-build/mm/page_alloc-pre-gcc-16.o defconfig-build/mm/page_alloc-post-gcc-16.o | grep -v __UNIQUE
> add/remove: 24/24 grow/shrink: 3/2 up/down: 296/-224 (72)
> Function                                     old     new   delta
> get_page_from_freelist                      6158    6198     +40
> free_pcppages_bulk                           678     714     +36
> unreserve_highatomic_pageblock               708     736     +28
> make_alloc_exact                             280     264     -16
> alloc_pages_bulk_noprof                     1415    1399     -16
> Total: Before=45299, After=45371, chg +0.16%
>
> $ ./scripts/bloat-o-meter defconfig-build/mm/page_alloc-pre-gcc-16.o defconfig-build/mm/page_alloc.o | grep -v __UNIQUE
> add/remove: 24/24 grow/shrink: 3/15 up/down: 277/-363 (-86)
> Function                                     old     new   delta
> unreserve_highatomic_pageblock               708     757     +49
> free_pcppages_bulk                           678     707     +29
> get_page_from_freelist                      6158    6165      +7
> try_to_claim_block                          1729    1726      -3
> setup_per_zone_wmarks                        656     653      -3
> free_pages_prepare                           924     921      -3
> calculate_totalreserve_pages                 282     279      -3
> alloc_frozen_pages_nolock_noprof             622     619      -3
> __free_pages_prepare                         924     921      -3
> __free_pages_ok                             1197    1194      -3
> __free_one_page                             1330    1327      -3
> __free_frozen_pages                         1303    1300      -3
> __rmqueue_pcplist                           2786    2777      -9
> free_unref_folios                           1905    1894     -11
> setup_per_zone_lowmem_reserve                388     374     -14
> make_alloc_exact                             280     264     -16
> __alloc_frozen_pages_noprof                 5411    5368     -43
> nr_free_zone_pages                           189     138     -51
> Total: Before=45299, After=45213, chg -0.19%
>
>
>
> However, looking at things again, I think we can get rid of that
> unconditional __GUARD_IS_ERR(), something like the below, Dan?
>
> This then gives:
>
> $ ./scripts/bloat-o-meter defconfig-build/mm/page_alloc-pre-gcc-16.o defconfig-build/mm/page_alloc.o | grep -v __UNIQUE
> add/remove: 24/24 grow/shrink: 1/16 up/down: 213/-486 (-273)
> Function                                     old     new   delta
> free_pcppages_bulk                           678     699     +21
> try_to_claim_block                          1729    1723      -6
> setup_per_zone_wmarks                        656     650      -6
> free_pages_prepare                           924     918      -6
> calculate_totalreserve_pages                 282     276      -6
> alloc_frozen_pages_nolock_noprof             622     616      -6
> __free_pages_prepare                         924     918      -6
> __free_pages_ok                             1197    1191      -6
> __free_one_page                             1330    1324      -6
> __free_frozen_pages                         1303    1297      -6
> free_pages_exact                             199     183     -16
> setup_per_zone_lowmem_reserve                388     371     -17
> free_unref_folios                           1905    1888     -17
> __rmqueue_pcplist                           2786    2768     -18
> nr_free_zone_pages                           189     138     -51
> __alloc_frozen_pages_noprof                 5411    5359     -52
> get_page_from_freelist                      6158    6089     -69
> Total: Before=45299, After=45026, chg -0.60%
>
>
> Anyway, if you all care about the size of things -- those tracepoints
> consume *WAAY* more bytes than any of this.
>
>
> ---
> --- a/include/linux/cleanup.h
> +++ b/include/linux/cleanup.h
> @@ -286,15 +286,18 @@ static __always_inline _type class_##_na
> __no_context_analysis \
> { _type t = _init; return t; }
>
> -#define EXTEND_CLASS(_name, ext, _init, _init_args...) \
> -typedef lock_##_name##_t lock_##_name##ext##_t; \
> +#define EXTEND_CLASS_COND(_name, ext, _cond, _init, _init_args...) \
> +typedef lock_##_name##_t lock_##_name##ext##_t; \
> typedef class_##_name##_t class_##_name##ext##_t; \
> -static __always_inline void class_##_name##ext##_destructor(class_##_name##_t *p) \
> -{ class_##_name##_destructor(p); } \
> +static __always_inline void class_##_name##ext##_destructor(class_##_name##_t *_T) \
> +{ if (_cond) return; class_##_name##_destructor(_T); } \
> static __always_inline class_##_name##_t class_##_name##ext##_constructor(_init_args) \
> __no_context_analysis \
> { class_##_name##_t t = _init; return t; }
>
> +#define EXTEND_CLASS(_name, ext, _init, _init_args...) \
> + EXTEND_CLASS_COND(_name, ext, 0, _init, _init_args)
> +
> #define CLASS(_name, var) \
> class_##_name##_t var __cleanup(class_##_name##_destructor) = \
> class_##_name##_constructor
> @@ -394,12 +397,12 @@ static __maybe_unused const bool class_#
> __DEFINE_GUARD_LOCK_PTR(_name, _T)
>
> #define DEFINE_GUARD(_name, _type, _lock, _unlock) \
> - DEFINE_CLASS(_name, _type, if (!__GUARD_IS_ERR(_T)) { _unlock; }, ({ _lock; _T; }), _type _T); \
> + DEFINE_CLASS(_name, _type, if (_T) { _unlock; }, ({ _lock; _T; }), _type _T); \
> DEFINE_CLASS_IS_GUARD(_name)
>
> #define DEFINE_GUARD_COND_4(_name, _ext, _lock, _cond) \
> __DEFINE_CLASS_IS_CONDITIONAL(_name##_ext, true); \
> - EXTEND_CLASS(_name, _ext, \
> + EXTEND_CLASS_COND(_name, _ext, __GUARD_IS_ERR(*_T), \
> ({ void *_t = _T; int _RET = (_lock); if (_T && !(_cond)) _t = ERR_PTR(_RET); _t; }), \
> class_##_name##_t _T) \
> static __always_inline void * class_##_name##_ext##_lock_ptr(class_##_name##_t *_T) \
> @@ -488,7 +491,7 @@ typedef struct { \
> static __always_inline void class_##_name##_destructor(class_##_name##_t *_T) \
> __no_context_analysis \
> { \
> - if (!__GUARD_IS_ERR(_T->lock)) { _unlock; } \
> + if (_T->lock) { _unlock; } \
> } \
> \
> __DEFINE_GUARD_LOCK_PTR(_name, &_T->lock)
> @@ -565,7 +568,7 @@ __DEFINE_LOCK_GUARD_0(_name, _lock)
>
> #define DEFINE_LOCK_GUARD_1_COND_4(_name, _ext, _lock, _cond) \
> __DEFINE_CLASS_IS_CONDITIONAL(_name##_ext, true); \
> - EXTEND_CLASS(_name, _ext, \
> + EXTEND_CLASS_COND(_name, _ext, __GUARD_IS_ERR(_T->lock), \
> ({ class_##_name##_t _t = { .lock = l }, *_T = &_t;\
> int _RET = (_lock); \
> if (_T->lock && !(_cond)) _T->lock = ERR_PTR(_RET);\

I re-tested my original patchset after rebasing and can still reproduce
the regression (though smaller). It appears to depend on compiler
inlining decisions: in some cases the compiler is able to deduplicate
the cleanup path across multiple return sites, while in others it is
not.

Given that, I think we can go further than just removing
__GUARD_IS_ERR(). It should be possible to eliminate this branch
entirely and simplify the cleanup flow.

https://lore.kernel.org/all/20260427165037.205337-1-d@ilvokhin.com/

Reposting here to increase visibility, as several people involved in
this code have participated in this thread already.

Any feedback would be appreciated.
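
For readers following the thread, here is a minimal userspace sketch of
the pattern being discussed: a scope-based guard built on the compiler's
cleanup attribute, used in a function with several return sites. It is
not the kernel's cleanup.h implementation. The names toy_lock,
toy_guard(), classify() and toy_selftest() are hypothetical, and the
"lock" is only a counter. The destructor runs at every scope exit, so
the compiler either emits the release path once per return site or
merges them; the NULL check in the destructor is the kind of branch the
discussion is about.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy lock: counters stand in for a real spinlock. */
struct toy_lock {
	int depth;	/* 1 while held, 0 otherwise */
	int unlocks;	/* number of releases performed by the guard */
};

static void toy_lock_acquire(struct toy_lock *l)
{
	l->depth++;
}

static void toy_lock_release(struct toy_lock *l)
{
	l->depth--;
	l->unlocks++;
}

/* Guard constructor: take the lock and hand back the pointer to track. */
static inline struct toy_lock *toy_guard_enter(struct toy_lock *l)
{
	toy_lock_acquire(l);
	return l;
}

/* Guard destructor: called automatically when the guard leaves scope. */
static inline void toy_guard_exit(struct toy_lock **p)
{
	if (*p)		/* branch that may be duplicated per return site */
		toy_lock_release(*p);
}

/* Declare a guard variable whose destructor fires on every scope exit. */
#define toy_guard(l)							\
	struct toy_lock *toy_guard_var					\
		__attribute__((cleanup(toy_guard_exit), unused)) =	\
		toy_guard_enter(l)

/* Three return sites, hence up to three copies of the cleanup path. */
static int classify(struct toy_lock *l, int v)
{
	toy_guard(l);

	if (v < 0)
		return -1;
	if (v == 0)
		return 0;
	return 1;
}

/* Exercise every return site and check the lock is always released. */
static int toy_selftest(void)
{
	struct toy_lock l = { 0, 0 };

	assert(classify(&l, -5) == -1);
	assert(classify(&l, 0) == 0);
	assert(classify(&l, 7) == 1);
	return l.depth == 0 && l.unlocks == 3;
}
```

Building this with different optimization levels and inspecting the
generated code for classify() is one way to see whether a given compiler
merges the release path or repeats it at each return.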