Re: [PATCH 1/8] mm: use zone lock guard in reserve_highatomic_pageblock()

public inbox for linux-mm@kvack.org
 help / color / mirror / Atom feed

From: Dmitry Ilvokhin <d@ilvokhin.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: dan.j.williams@intel.com, Vlastimil Babka <vbabka@kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@kernel.org>,
	Lorenzo Stoakes <ljs@kernel.org>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>,
	Brendan Jackman <jackmanb@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>, Zi Yan <ziy@nvidia.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kernel-team@meta.com
Subject: Re: [PATCH 1/8] mm: use zone lock guard in reserve_highatomic_pageblock()
Date: Tue, 28 Apr 2026 10:58:41 +0000	[thread overview]
Message-ID: <afCS4d4YccQFtvpi@shell.ilvokhin.com> (raw)
In-Reply-To: <20260309164516.GE606826@noisy.programming.kicks-ass.net>

On Mon, Mar 09, 2026 at 05:45:16PM +0100, Peter Zijlstra wrote:
> On Sat, Mar 07, 2026 at 02:09:41PM +0000, Dmitry Ilvokhin wrote:
> > On Sat, Mar 07, 2026 at 02:16:41PM +0100, Peter Zijlstra wrote:
> > > On Fri, Mar 06, 2026 at 07:24:56PM +0100, Vlastimil Babka wrote:
> > > 
> > > > Yeah I don't think the guard construct in this case should be doing anything
> > > > here that wouldn't allow the compiler to compile to the exactly same result
> > > > as before? Either there's some problem with the infra, or we're just victim
> > > > of compiler heuristics. In both cases imho worth looking into rather than
> > > > rejecting the construct.
> > > 
> > > I'd love to look into it, but I can't seem to apply these patches to
> > > anything.
> > > 
> > > By virtue of not actually having the patches, I had to resort to b4, and
> > > I think the incantation is something like:
> > > 
> > >   b4 shazam cover.1772811429.git.d@ilvokhin.com
> > > 
> > > but it doesn't want to apply to anything I have at hand. Specifically, I
> > > tried Linus' tree and tip, which is most of what I have at hand.
> > 
> > Thanks for taking a look, Peter.
> > 
> > This series is based on mm-new and depends on my earlier patchset:
> > 
> > https://lore.kernel.org/all/cover.1772206930.git.d@ilvokhin.com/
> > 
> > Those patches are currently only in Andrew's mm-new tree, so this series
> > won't apply cleanly on Linus' tree or tip.
> > 
> > It should apply on top of mm-new from:
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> 
> OK, so the big problem is __GUARD_IS_ERR(), and that came up before, but
> while Linus told me how to fix it, he didn't actually like it very much:
> 
>   https://lore.kernel.org/all/20250513085001.GC25891@noisy.programming.kicks-ass.net/
> 
> However it does help with this:
> 
> $ ./scripts/bloat-o-meter defconfig-build/mm/page_alloc-pre-gcc-16.o defconfig-build/mm/page_alloc-post-gcc-16.o | grep -v __UNIQUE
> add/remove: 24/24 grow/shrink: 3/2 up/down: 296/-224 (72)
> Function                                     old     new   delta
> get_page_from_freelist                      6158    6198     +40
> free_pcppages_bulk                           678     714     +36
> unreserve_highatomic_pageblock               708     736     +28
> make_alloc_exact                             280     264     -16
> alloc_pages_bulk_noprof                     1415    1399     -16
> Total: Before=45299, After=45371, chg +0.16%
> 
> $ ./scripts/bloat-o-meter defconfig-build/mm/page_alloc-pre-gcc-16.o defconfig-build/mm/page_alloc.o | grep -v __UNIQUE
> add/remove: 24/24 grow/shrink: 3/15 up/down: 277/-363 (-86)
> Function                                     old     new   delta
> unreserve_highatomic_pageblock               708     757     +49
> free_pcppages_bulk                           678     707     +29
> get_page_from_freelist                      6158    6165      +7
> try_to_claim_block                          1729    1726      -3
> setup_per_zone_wmarks                        656     653      -3
> free_pages_prepare                           924     921      -3
> calculate_totalreserve_pages                 282     279      -3
> alloc_frozen_pages_nolock_noprof             622     619      -3
> __free_pages_prepare                         924     921      -3
> __free_pages_ok                             1197    1194      -3
> __free_one_page                             1330    1327      -3
> __free_frozen_pages                         1303    1300      -3
> __rmqueue_pcplist                           2786    2777      -9
> free_unref_folios                           1905    1894     -11
> setup_per_zone_lowmem_reserve                388     374     -14
> make_alloc_exact                             280     264     -16
> __alloc_frozen_pages_noprof                 5411    5368     -43
> nr_free_zone_pages                           189     138     -51
> Total: Before=45299, After=45213, chg -0.19%
> 
> 
> 
> However, looking at things again, I think we can get rid of that
> unconditional __GUARD_IS_ERR(), something like the below, Dan?
> 
> This then gives:
> 
> $ ./scripts/bloat-o-meter defconfig-build/mm/page_alloc-pre-gcc-16.o defconfig-build/mm/page_alloc.o | grep -v __UNIQUE
> add/remove: 24/24 grow/shrink: 1/16 up/down: 213/-486 (-273)
> Function                                     old     new   delta
> free_pcppages_bulk                           678     699     +21
> try_to_claim_block                          1729    1723      -6
> setup_per_zone_wmarks                        656     650      -6
> free_pages_prepare                           924     918      -6
> calculate_totalreserve_pages                 282     276      -6
> alloc_frozen_pages_nolock_noprof             622     616      -6
> __free_pages_prepare                         924     918      -6
> __free_pages_ok                             1197    1191      -6
> __free_one_page                             1330    1324      -6
> __free_frozen_pages                         1303    1297      -6
> free_pages_exact                             199     183     -16
> setup_per_zone_lowmem_reserve                388     371     -17
> free_unref_folios                           1905    1888     -17
> __rmqueue_pcplist                           2786    2768     -18
> nr_free_zone_pages                           189     138     -51
> __alloc_frozen_pages_noprof                 5411    5359     -52
> get_page_from_freelist                      6158    6089     -69
> Total: Before=45299, After=45026, chg -0.60%
> 
> 
> Anyway, if you all care about the size of things -- those tracepoints
> consume *WAAY* more bytes than any of this.
> 
> 
> ---
> --- a/include/linux/cleanup.h
> +++ b/include/linux/cleanup.h
> @@ -286,15 +286,18 @@ static __always_inline _type class_##_na
>  	__no_context_analysis						\
>  { _type t = _init; return t; }
>  
> -#define EXTEND_CLASS(_name, ext, _init, _init_args...)			\
> -typedef lock_##_name##_t lock_##_name##ext##_t;			\
> +#define EXTEND_CLASS_COND(_name, ext, _cond, _init, _init_args...)	\
> +typedef lock_##_name##_t lock_##_name##ext##_t;				\
>  typedef class_##_name##_t class_##_name##ext##_t;			\
> -static __always_inline void class_##_name##ext##_destructor(class_##_name##_t *p) \
> -{ class_##_name##_destructor(p); }					\
> +static __always_inline void class_##_name##ext##_destructor(class_##_name##_t *_T) \
> +{ if (_cond) return; class_##_name##_destructor(_T); }			\
>  static __always_inline class_##_name##_t class_##_name##ext##_constructor(_init_args) \
>  	__no_context_analysis \
>  { class_##_name##_t t = _init; return t; }
>  
> +#define EXTEND_CLASS(_name, ext, _init, _init_args...)			\
> +	EXTEND_CLASS_COND(_name, ext, 0, _init, _init_args)
> +
>  #define CLASS(_name, var)						\
>  	class_##_name##_t var __cleanup(class_##_name##_destructor) =	\
>  		class_##_name##_constructor
> @@ -394,12 +397,12 @@ static __maybe_unused const bool class_#
>  	__DEFINE_GUARD_LOCK_PTR(_name, _T)
>  
>  #define DEFINE_GUARD(_name, _type, _lock, _unlock) \
> -	DEFINE_CLASS(_name, _type, if (!__GUARD_IS_ERR(_T)) { _unlock; }, ({ _lock; _T; }), _type _T); \
> +	DEFINE_CLASS(_name, _type, if (_T) { _unlock; }, ({ _lock; _T; }), _type _T); \
>  	DEFINE_CLASS_IS_GUARD(_name)
>  
>  #define DEFINE_GUARD_COND_4(_name, _ext, _lock, _cond) \
>  	__DEFINE_CLASS_IS_CONDITIONAL(_name##_ext, true); \
> -	EXTEND_CLASS(_name, _ext, \
> +	EXTEND_CLASS_COND(_name, _ext, __GUARD_IS_ERR(*_T), \
>  		     ({ void *_t = _T; int _RET = (_lock); if (_T && !(_cond)) _t = ERR_PTR(_RET); _t; }), \
>  		     class_##_name##_t _T) \
>  	static __always_inline void * class_##_name##_ext##_lock_ptr(class_##_name##_t *_T) \
> @@ -488,7 +491,7 @@ typedef struct {							\
>  static __always_inline void class_##_name##_destructor(class_##_name##_t *_T) \
>  	__no_context_analysis						\
>  {									\
> -	if (!__GUARD_IS_ERR(_T->lock)) { _unlock; }			\
> +	if (_T->lock) { _unlock; }					\
>  }									\
>  									\
>  __DEFINE_GUARD_LOCK_PTR(_name, &_T->lock)
> @@ -565,7 +568,7 @@ __DEFINE_LOCK_GUARD_0(_name, _lock)
>  
>  #define DEFINE_LOCK_GUARD_1_COND_4(_name, _ext, _lock, _cond)		\
>  	__DEFINE_CLASS_IS_CONDITIONAL(_name##_ext, true);		\
> -	EXTEND_CLASS(_name, _ext,					\
> +	EXTEND_CLASS_COND(_name, _ext, __GUARD_IS_ERR(_T->lock),	\
>  		     ({ class_##_name##_t _t = { .lock = l }, *_T = &_t;\
>  		        int _RET = (_lock);                             \
>  		        if (_T->lock && !(_cond)) _T->lock = ERR_PTR(_RET);\

I re-tested my original patchset after rebasing and can still reproduce
the regression (though smaller). It appears to depend on compiler
inlining decisions: in some cases the compiler is able to deduplicate
the cleanup path across multiple return sites, while in others it is
not.

Given that, I think we can go further than just removing
__GUARD_IS_ERR(). It should be possible to eliminate this branch
entirely and simplify the cleanup flow.

    https://lore.kernel.org/all/20260427165037.205337-1-d@ilvokhin.com/

Reposting here to increase visibility, as several people involved in
this code have participated in this thread already.

Any feedback would be appreciated.

next prev parent reply	other threads:[~2026-04-28 10:58 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-06 16:05 [PATCH 0/8] mm: introduce zone lock guards Dmitry Ilvokhin
2026-03-06 16:05 ` [PATCH 1/8] mm: use zone lock guard in reserve_highatomic_pageblock() Dmitry Ilvokhin
2026-03-06 17:53   ` Andrew Morton
2026-03-06 18:00     ` Steven Rostedt
2026-03-06 18:24       ` Vlastimil Babka
2026-03-06 18:33         ` Andrew Morton
2026-03-06 18:46           ` Steven Rostedt
2026-03-07 13:16         ` Peter Zijlstra
2026-03-07 14:09           ` Dmitry Ilvokhin
2026-03-09 16:45             ` Peter Zijlstra
2026-03-10 12:57               ` Dmitry Ilvokhin
2026-03-12 23:40               ` Dan Williams
2026-03-13  8:36                 ` Peter Zijlstra
2026-04-28 10:58               ` Dmitry Ilvokhin [this message]
2026-04-28 11:47                 ` Peter Zijlstra
2026-04-28 13:41                   ` Dmitry Ilvokhin
2026-03-26 18:04     ` Dmitry Ilvokhin
2026-03-26 18:51       ` Andrew Morton
2026-03-06 16:05 ` [PATCH 2/8] mm: use zone lock guard in unset_migratetype_isolate() Dmitry Ilvokhin
2026-03-06 16:05 ` [PATCH 3/8] mm: use zone lock guard in unreserve_highatomic_pageblock() Dmitry Ilvokhin
2026-03-06 16:10   ` Steven Rostedt
2026-03-06 16:05 ` [PATCH 4/8] mm: use zone lock guard in set_migratetype_isolate() Dmitry Ilvokhin
2026-03-06 16:05 ` [PATCH 5/8] mm: use zone lock guard in take_page_off_buddy() Dmitry Ilvokhin
2026-03-06 16:05 ` [PATCH 6/8] mm: use zone lock guard in put_page_back_buddy() Dmitry Ilvokhin
2026-03-06 16:05 ` [PATCH 7/8] mm: use zone lock guard in free_pcppages_bulk() Dmitry Ilvokhin
2026-03-06 16:05 ` [PATCH 8/8] mm: use zone lock guard in __offline_isolated_pages() Dmitry Ilvokhin
2026-03-06 16:15 ` [PATCH 0/8] mm: introduce zone lock guards Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=afCS4d4YccQFtvpi@shell.ilvokhin.com \
    --to=d@ilvokhin.com \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=david@kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=jackmanb@google.com \
    --cc=kernel-team@meta.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=ljs@kernel.org \
    --cc=mhocko@suse.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=rppt@kernel.org \
    --cc=surenb@google.com \
    --cc=vbabka@kernel.org \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox