From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail.ilvokhin.com (mail.ilvokhin.com [178.62.254.231]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6249F2BD030 for ; Tue, 28 Apr 2026 10:58:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=178.62.254.231 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777373929; cv=none; b=NE/oL+CW30ogdUo4ZiKz6gff81kieQhAWt8TylUc1CObOph//WD7K/j7NDkFjJysBsGR7GrlaHP9nDvR53Z1W+zKFiokynAYKZQPGPvtBYDhV/vAmAqMl4fsae5b9S7vOrjD6njFr7COpAGVmHgKZdtBZocWlPaciqHtJo4t1uM= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777373929; c=relaxed/simple; bh=prlYLsObd4wV9t6D56Ald1QfwI4zooX6ZfpR7AvtsMY=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=VPMY35geJXm+N3BLSf27weqIxVJY5U6LJ/ptTf0ZPlN3Isec8z5q360tiqlvJW7sC+sLisy4AhF0PF+UdxKkmdhTsaR1zCyqhf/zUBeYswrMfXvLO6jlbVFc5QzSdTjbl8Y/2ViJ9NVHN7bJKxYGQjEzCMNcMRvX9wPjpGNH84E= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=ilvokhin.com; spf=pass smtp.mailfrom=ilvokhin.com; dkim=pass (1024-bit key) header.d=ilvokhin.com header.i=@ilvokhin.com header.b=MUayeyGk; arc=none smtp.client-ip=178.62.254.231 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=ilvokhin.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ilvokhin.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=ilvokhin.com header.i=@ilvokhin.com header.b="MUayeyGk" Received: from shell.ilvokhin.com (shell.ilvokhin.com [138.68.190.75]) (Authenticated sender: d@ilvokhin.com) by mail.ilvokhin.com (Postfix) with ESMTPSA id 554F7C79BE; Tue, 28 Apr 2026 10:58:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ilvokhin.com; s=mail; t=1777373925; bh=zIO9Z6eCCgsu+TbVbco8MmNWF3wPidcpQDa9ZnLSo4g=; h=Date:From:To:Cc:Subject:References:In-Reply-To; b=MUayeyGkeosTrf5GoC1Az5Guj7bmnpCNjX2ArIoYI/IyvY7Gndh4PmSifuHFLFOi7 lxJf6mElp2LUh+7jxUA4X3dnuCnLT2hf9FpJj+N4HxGnzm7IO5mrBfEMo3zQHHRxIE Seq2I3A7dly9h4vQVHkLl6mhpzls8CsBAtByTI8M= Date: Tue, 28 Apr 2026 10:58:41 +0000 From: Dmitry Ilvokhin To: Peter Zijlstra Cc: dan.j.williams@intel.com, Vlastimil Babka , Steven Rostedt , Andrew Morton , David Hildenbrand , Lorenzo Stoakes , "Liam R. Howlett" , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Brendan Jackman , Johannes Weiner , Zi Yan , linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@meta.com Subject: Re: [PATCH 1/8] mm: use zone lock guard in reserve_highatomic_pageblock() Message-ID: References: <20260306095336.a79fcc869a7f6d2b2e97501b@linux-foundation.org> <20260306130052.7da8eab3@gandalf.local.home> <20260307131641.GX606826@noisy.programming.kicks-ass.net> <20260309164516.GE606826@noisy.programming.kicks-ass.net> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20260309164516.GE606826@noisy.programming.kicks-ass.net> On Mon, Mar 09, 2026 at 05:45:16PM +0100, Peter Zijlstra wrote: > On Sat, Mar 07, 2026 at 02:09:41PM +0000, Dmitry Ilvokhin wrote: > > On Sat, Mar 07, 2026 at 02:16:41PM +0100, Peter Zijlstra wrote: > > > On Fri, Mar 06, 2026 at 07:24:56PM +0100, Vlastimil Babka wrote: > > > > > > > Yeah I don't think the guard construct in this case should be doing anything > > > > here that wouldn't allow the compiler to compile to the exactly same result > > > > as before? Either there's some problem with the infra, or we're just victim > > > > of compiler heuristics. In both cases imho worth looking into rather than > > > > rejecting the construct. > > > > > > I'd love to look into it, but I can't seem to apply these patches to > > > anything. > > > > > > By virtue of not actually having the patches, I had to resort to b4, and > > > I think the incantation is something like: > > > > > > b4 shazam cover.1772811429.git.d@ilvokhin.com > > > > > > but it doesn't want to apply to anything I have at hand. Specifically, I > > > tried Linus' tree and tip, which is most of what I have at hand. > > > > Thanks for taking a look, Peter. > > > > This series is based on mm-new and depends on my earlier patchset: > > > > https://lore.kernel.org/all/cover.1772206930.git.d@ilvokhin.com/ > > > > Those patches are currently only in Andrew's mm-new tree, so this series > > won't apply cleanly on Linus' tree or tip. > > > > It should apply on top of mm-new from: > > > > https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm > > OK, so the big problem is __GUARD_IS_ERR(), and that came up before, but > while Linus told me how to fix it, he didn't actually like it very much: > > https://lore.kernel.org/all/20250513085001.GC25891@noisy.programming.kicks-ass.net/ > > However it does help with this: > > $ ./scripts/bloat-o-meter defconfig-build/mm/page_alloc-pre-gcc-16.o defconfig-build/mm/page_alloc-post-gcc-16.o | grep -v __UNIQUE > add/remove: 24/24 grow/shrink: 3/2 up/down: 296/-224 (72) > Function old new delta > get_page_from_freelist 6158 6198 +40 > free_pcppages_bulk 678 714 +36 > unreserve_highatomic_pageblock 708 736 +28 > make_alloc_exact 280 264 -16 > alloc_pages_bulk_noprof 1415 1399 -16 > Total: Before=45299, After=45371, chg +0.16% > > $ ./scripts/bloat-o-meter defconfig-build/mm/page_alloc-pre-gcc-16.o defconfig-build/mm/page_alloc.o | grep -v __UNIQUE > add/remove: 24/24 grow/shrink: 3/15 up/down: 277/-363 (-86) > Function old new delta > unreserve_highatomic_pageblock 708 757 +49 > free_pcppages_bulk 678 707 +29 > get_page_from_freelist 6158 6165 +7 > try_to_claim_block 1729 1726 -3 > setup_per_zone_wmarks 656 653 -3 > free_pages_prepare 924 921 -3 > calculate_totalreserve_pages 282 279 -3 > alloc_frozen_pages_nolock_noprof 622 619 -3 > __free_pages_prepare 924 921 -3 > __free_pages_ok 1197 1194 -3 > __free_one_page 1330 1327 -3 > __free_frozen_pages 1303 1300 -3 > __rmqueue_pcplist 2786 2777 -9 > free_unref_folios 1905 1894 -11 > setup_per_zone_lowmem_reserve 388 374 -14 > make_alloc_exact 280 264 -16 > __alloc_frozen_pages_noprof 5411 5368 -43 > nr_free_zone_pages 189 138 -51 > Total: Before=45299, After=45213, chg -0.19% > > > > However, looking at things again, I think we can get rid of that > unconditional __GUARD_IS_ERR(), something like the below, Dan? > > This then gives: > > $ ./scripts/bloat-o-meter defconfig-build/mm/page_alloc-pre-gcc-16.o defconfig-build/mm/page_alloc.o | grep -v __UNIQUE > add/remove: 24/24 grow/shrink: 1/16 up/down: 213/-486 (-273) > Function old new delta > free_pcppages_bulk 678 699 +21 > try_to_claim_block 1729 1723 -6 > setup_per_zone_wmarks 656 650 -6 > free_pages_prepare 924 918 -6 > calculate_totalreserve_pages 282 276 -6 > alloc_frozen_pages_nolock_noprof 622 616 -6 > __free_pages_prepare 924 918 -6 > __free_pages_ok 1197 1191 -6 > __free_one_page 1330 1324 -6 > __free_frozen_pages 1303 1297 -6 > free_pages_exact 199 183 -16 > setup_per_zone_lowmem_reserve 388 371 -17 > free_unref_folios 1905 1888 -17 > __rmqueue_pcplist 2786 2768 -18 > nr_free_zone_pages 189 138 -51 > __alloc_frozen_pages_noprof 5411 5359 -52 > get_page_from_freelist 6158 6089 -69 > Total: Before=45299, After=45026, chg -0.60% > > > Anyway, if you all care about the size of things -- those tracepoints > consume *WAAY* more bytes than any of this. > > > --- > --- a/include/linux/cleanup.h > +++ b/include/linux/cleanup.h > @@ -286,15 +286,18 @@ static __always_inline _type class_##_na > __no_context_analysis \ > { _type t = _init; return t; } > > -#define EXTEND_CLASS(_name, ext, _init, _init_args...) \ > -typedef lock_##_name##_t lock_##_name##ext##_t; \ > +#define EXTEND_CLASS_COND(_name, ext, _cond, _init, _init_args...) \ > +typedef lock_##_name##_t lock_##_name##ext##_t; \ > typedef class_##_name##_t class_##_name##ext##_t; \ > -static __always_inline void class_##_name##ext##_destructor(class_##_name##_t *p) \ > -{ class_##_name##_destructor(p); } \ > +static __always_inline void class_##_name##ext##_destructor(class_##_name##_t *_T) \ > +{ if (_cond) return; class_##_name##_destructor(_T); } \ > static __always_inline class_##_name##_t class_##_name##ext##_constructor(_init_args) \ > __no_context_analysis \ > { class_##_name##_t t = _init; return t; } > > +#define EXTEND_CLASS(_name, ext, _init, _init_args...) \ > + EXTEND_CLASS_COND(_name, ext, 0, _init, _init_args) > + > #define CLASS(_name, var) \ > class_##_name##_t var __cleanup(class_##_name##_destructor) = \ > class_##_name##_constructor > @@ -394,12 +397,12 @@ static __maybe_unused const bool class_# > __DEFINE_GUARD_LOCK_PTR(_name, _T) > > #define DEFINE_GUARD(_name, _type, _lock, _unlock) \ > - DEFINE_CLASS(_name, _type, if (!__GUARD_IS_ERR(_T)) { _unlock; }, ({ _lock; _T; }), _type _T); \ > + DEFINE_CLASS(_name, _type, if (_T) { _unlock; }, ({ _lock; _T; }), _type _T); \ > DEFINE_CLASS_IS_GUARD(_name) > > #define DEFINE_GUARD_COND_4(_name, _ext, _lock, _cond) \ > __DEFINE_CLASS_IS_CONDITIONAL(_name##_ext, true); \ > - EXTEND_CLASS(_name, _ext, \ > + EXTEND_CLASS_COND(_name, _ext, __GUARD_IS_ERR(*_T), \ > ({ void *_t = _T; int _RET = (_lock); if (_T && !(_cond)) _t = ERR_PTR(_RET); _t; }), \ > class_##_name##_t _T) \ > static __always_inline void * class_##_name##_ext##_lock_ptr(class_##_name##_t *_T) \ > @@ -488,7 +491,7 @@ typedef struct { \ > static __always_inline void class_##_name##_destructor(class_##_name##_t *_T) \ > __no_context_analysis \ > { \ > - if (!__GUARD_IS_ERR(_T->lock)) { _unlock; } \ > + if (_T->lock) { _unlock; } \ > } \ > \ > __DEFINE_GUARD_LOCK_PTR(_name, &_T->lock) > @@ -565,7 +568,7 @@ __DEFINE_LOCK_GUARD_0(_name, _lock) > > #define DEFINE_LOCK_GUARD_1_COND_4(_name, _ext, _lock, _cond) \ > __DEFINE_CLASS_IS_CONDITIONAL(_name##_ext, true); \ > - EXTEND_CLASS(_name, _ext, \ > + EXTEND_CLASS_COND(_name, _ext, __GUARD_IS_ERR(_T->lock), \ > ({ class_##_name##_t _t = { .lock = l }, *_T = &_t;\ > int _RET = (_lock); \ > if (_T->lock && !(_cond)) _T->lock = ERR_PTR(_RET);\ I re-tested my original patchset after rebasing and can still reproduce the regression (though smaller). It appears to depend on compiler inlining decisions: in some cases the compiler is able to deduplicate the cleanup path across multiple return sites, while in others it is not. Given that, I think we can go further than just removing __GUARD_IS_ERR(). It should be possible to eliminate this branch entirely and simplify the cleanup flow. https://lore.kernel.org/all/20260427165037.205337-1-d@ilvokhin.com/ Reposting here to increase visibility, as several people involved in this code have participated in this thread already. Any feedback would be appreciated.