From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail.ilvokhin.com (mail.ilvokhin.com [178.62.254.231]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9D49B3B19BB for ; Tue, 28 Apr 2026 13:41:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=178.62.254.231 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777383696; cv=none; b=j6/6Y7T1CZl+ar1SmemD9foe5exBEA7O4vbOXiCTD1u2OgiH6fnVY7HCs0QBjcdDA2eG+mZmPHqfs8tKGkJJaLs6D2KD+3YqC4Y1hYDXyjf6OPQTG4ijyo92iPaUDT3MoFS3k0xeMtPARmyQr9MHqZLRZF4bcm2pErK7SZv2qmY= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777383696; c=relaxed/simple; bh=AqFNrfj3qCpo1gqPIjkuBjXnVpO7qE4x7uakU1V+Rys=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=TD7tyoGbGMrwjdyHXu5q5vfgHgB5LIgTSA3JcXbQvoNAYzkCPlTctWBcTGceEGpoEJ1ix4yuc6OV49aqyjXwdazdGWxPD9JWdxeSy90Irx6pMxmfHMg3hcVx8eFgNobf2MaVo3nu3gg5LuwQvnZXaAR1A5BZzgVR3Z1TeG8AlXo= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=ilvokhin.com; spf=pass smtp.mailfrom=ilvokhin.com; dkim=pass (1024-bit key) header.d=ilvokhin.com header.i=@ilvokhin.com header.b=bqtKGOP5; arc=none smtp.client-ip=178.62.254.231 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=ilvokhin.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ilvokhin.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=ilvokhin.com header.i=@ilvokhin.com header.b="bqtKGOP5" Received: from shell.ilvokhin.com (shell.ilvokhin.com [138.68.190.75]) (Authenticated sender: d@ilvokhin.com) by mail.ilvokhin.com (Postfix) with ESMTPSA id 31D14C79E0; Tue, 28 Apr 2026 13:41:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ilvokhin.com; s=mail; t=1777383692; bh=V6dtIaTvbG8kHNlcuTrpQlv4YvEPbNRPpSEHmCm4a4E=; h=Date:From:To:Cc:Subject:References:In-Reply-To; b=bqtKGOP5PM05ry5CtpcM6JwZwXvsc1Nid+SzBNnvadJGEG1J3oIJQ2e/JH2DvC/X7 z35tO8Di/OLbLYTVbK/Q6vHgtk5VnMYdP5kOtsQKGpafJg4OjuckiA35HaJmYRcA81 xpvKLdkAWnDp/TfEbtkvN5YZ9TxcHv/KQ3t+00xY= Date: Tue, 28 Apr 2026 13:41:31 +0000 From: Dmitry Ilvokhin To: Peter Zijlstra Cc: dan.j.williams@intel.com, Vlastimil Babka , Steven Rostedt , Andrew Morton , David Hildenbrand , Lorenzo Stoakes , "Liam R. Howlett" , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Brendan Jackman , Johannes Weiner , Zi Yan , linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@meta.com Subject: Re: [PATCH 1/8] mm: use zone lock guard in reserve_highatomic_pageblock() Message-ID: References: <20260306095336.a79fcc869a7f6d2b2e97501b@linux-foundation.org> <20260306130052.7da8eab3@gandalf.local.home> <20260307131641.GX606826@noisy.programming.kicks-ass.net> <20260309164516.GE606826@noisy.programming.kicks-ass.net> <20260428114721.GV3102624@noisy.programming.kicks-ass.net> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20260428114721.GV3102624@noisy.programming.kicks-ass.net> On Tue, Apr 28, 2026 at 01:47:21PM +0200, Peter Zijlstra wrote: > On Tue, Apr 28, 2026 at 10:58:41AM +0000, Dmitry Ilvokhin wrote: > > > I re-tested my original patchset after rebasing and can still reproduce > > the regression (though smaller). It appears to depend on compiler > > inlining decisions: in some cases the compiler is able to deduplicate > > the cleanup path across multiple return sites, while in others it is > > not. > > I'm confused, all this has __always_inline on. And the compilers should > be able to track the assignment of the variable and eliminate the test > themselves if value-tracking excludes NULL. I'd expect that too, but seems it is not the case. I was looking at this patchset rebased on top of the mm-stable build with GCC 16 with defconfig. https://lore.kernel.org/all/cover.1775654118.git.d@ilvokhin.com/ $ ./scripts/bloat-o-meter /tmp/bloat-gcc-16/page_alloc.o.before /tmp/bloat-gcc-16/page_alloc.o.after | grep -v UNIQUE add/remove: 24/24 grow/shrink: 2/1 up/down: 232/-193 (39) Function old new delta free_pcppages_bulk 577 601 +24 free_pages_exact 153 169 +16 unreserve_highatomic_pageblock 608 607 -1 Total: Before=42366, After=42405, chg +0.09% I looked at free_pcppages_bulk() generated code diff and got the following differences (only relevant parts). [...] - mov 0x20(%rsp),%r12 - mov 0x28(%rsp),%r15 + mov 0x20(%rsp),%r13 + mov 0x28(%rsp),%r12 + test %r13,%r13 ; guard NULL branch + je 2ba6 [...] add $0x30,%rsp - mov %r15,%rsi - mov %r12,%rdi + mov %r12,%rsi + mov %r13,%rdi pop %rbx pop %rbp pop %r12 pop %r13 pop %r14 pop %r15 - jmp 2b8b + jmp 2b90 mov $0x800,%eax jmp 2af3 mov 0x1c(%rsp),%edx mov %ebp,%r14d jmp 29ae - data16 cs nopw 0x0(%rax,%rax,1) + add $0x30,%rsp ; epilogue is duplicated once again. + pop %rbx + pop %rbp + pop %r12 + pop %r13 + pop %r14 + pop %r15 + jmp 2bb9 So, NULL check is not eliminated and in addition compiler couldn't merge two almost identical epilogues here. I suspect the function calls between construction and destruction (e.g., __free_one_page()) prevent the compiler from proving .lock is non-NULL across the scope, but this is just speculation, I am not a compiler expert. > > > Given that, I think we can go further than just removing > > __GUARD_IS_ERR(). It should be possible to eliminate this branch > > entirely and simplify the cleanup flow. > > > > https://lore.kernel.org/all/20260427165037.205337-1-d@ilvokhin.com/ > > > > Reposting here to increase visibility, as several people involved in > > this code have participated in this thread already. > > > > Any feedback would be appreciated. > > This would require very careful audit of all the current users, it's had > this behaviour from the start. My thinking process was the following: the constructor doesn't check for NULL either, so if something is passing NULL to begin with it should already crash in constructor not even reaching the destructor. The only case when we'd need the NULL check in the destructor is if something changes .lock to NULL after construction, but before destruction. This can't happen: - The guard infrastructure provides no API to clear a guard after construction (no_free_ptr()/retain_ptr() only work with __free, not guards). - guard() creates an anonymous variable via __UNIQUE_ID, so user code can't reference it to modify .lock directly. - CLASS_INIT() is the only macro that allows bypassing the constructor, but it is only used for fd_prepare in include/linux/file.h, not for any lock guards. - Conditional (_try) variants have their own destructor via EXTEND_CLASS_COND() that checks __GUARD_IS_ERR() and returns early before reaching the base destructor. They are not affected by this change.