Date: Tue, 30 Jun 2026 19:41:57 +0200
From: Peter Zijlstra
To: Alexander Potapenko
Cc: Dmitry Antipov, elver@google.com, dvyukov@google.com,
    Josh Poimboeuf, Thomas Gleixner, linux-kernel@vger.kernel.org,
    nathan@kernel.org, nick.desaulniers+lkml@gmail.com, morbo@google.com,
    justinstitt@google.com
Subject: Re: objtool: undefined stack state in folio_zero_user()
Message-ID: <20260630174157.GE48970@noisy.programming.kicks-ass.net>
References: <35822cf3c35fc6621621f858e94a2b0ce19abf88.camel@yandex.ru>
    <20260630104434.GC751831@noisy.programming.kicks-ass.net>
    <20260630135450.GA921102@noisy.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jun 30, 2026 at 04:14:35PM +0200, Alexander Potapenko wrote:

> > diff --git a/tools/objtool/check.c b/tools/objtool/check.c
> > index 10b18cf9c360..53a67b322856 100644
> > --- a/tools/objtool/check.c
> > +++ b/tools/objtool/check.c
> > @@ -3149,8 +3149,25 @@ static int update_cfi_state(struct instruction *insn,
> >                                  /* drap: mov disp(%rbp), %reg */
> >                                  restore_reg(cfi, op->dest.reg);
> >
> > +                        } else if (op->src.reg == CFI_SP &&
> > +                                   regs[CFI_SP].base == CFI_CFA &&
> > +                                   op->src.offset == regs[CFI_SP].offset + cfi->stack_size) {
> > +
> > +                                /*
> > +                                 * Clang RSP musical chains:
>
> s/chains/chairs if you're going to submit that ;)

:-)

> I am not sure we can do much on the compiler side here.
> KMSAN just heavily increases register pressure, and this is how the
> backend handles it.
> We can't even influence it from the middle-end where the instrumentation occurs.
> I remember Clang having more than one regallocator (we used to fall
> back to PBQP for some huge files when instrumenting Chrome), but
> switching to the non-default one will probably open a can of worms.

Something in that compiler is smoking very potent dope. The code I have
here has the form:

        mov %rsp, %rcx
1:      mov %rcx, %rsp
        ...
        mov %rsp, 0x68(%rsp)
        ...
        mov 0x68(%rsp), %rcx
        test
        je 1b
        mov %rcx, %r12
        ...
        mov %r12, %rcx
        jmp 1b

Which is really really stupid, it spills the rsp value to the stack,
only to then load it into another register. Simply doing:

        mov %rsp, %rcx
1:      mov %rcx, %rsp
        ...
        mov %rsp, %rcx
        test
        je 1b
        mov %rcx, %r12
        ...
        mov %r12, %rcx
        jmp 1b

Would have made it so much better.

But I'm not at all sure why it is playing these rsp games to begin with;
that code just doesn't make much sense to me at all.

Gemini is suggesting it is:

  The rsp manipulation occurs for two primary reasons:

  - Strict Stack Alignment: Most Application Binary Interfaces (ABIs),
    such as the System V AMD64 ABI, require the stack pointer (rsp) to
    be 16-byte aligned (rsp (mod 16) = 0) immediately before a function
    call. In functions with highly optimized local variables or
    dynamically allocated stack memory using alloca(), the stack
    pointer can easily drift. Clang temporarily aligns the stack by
    rounding it down, but must stash the original rsp to restore it
    properly after the tracking function completes.

  - Dynamic Shadow/Origin Mapping: The function __msan_chain_origin
    modifies origin metadata. Passing localized stack data or updating
    origin chains can cause unpredictable frame offsets or displacement
    inside the compiler's temporary spilling phase. Stashing the stack
    pointer guarantees that the instrumentation code will not corrupt
    the compiler-generated local variables if it relies on a consistent
    frame pointer.

But if this is the former (alignment), then it already notices the stack
is properly aligned because there are no actual alignment instructions
issued, at which point it can then elide the restore too, but it
doesn't.

Gemini further elaborates:

  The Call Site "Opaque Wrap"

  When the KMSAN pass runs, it treats the injection of
  __msan_chain_origin as a highly specific helper callback rather than
  a standard C function call. To prevent the compiler's backend from
  optimizing away or rearranging the timing of this tracking, the
  instrumentation framework wraps the call inside an execution envelope
  that dictates: "Save the CPU state, call the hook, restore the CPU
  state."

  Even if the backend later calculates that no alignment modification
  is needed, the instruction slots for the save/restore actions have
  already been allocated in the compiler's intermediate representation
  (LLVM IR). Because x86-64 requires rsp tracking for non-leaf
  functions, LLVM assigns a virtual register to stash rsp.

  ...

  When the compiler’s register allocator reaches the instruction
  sequence to save rsp, it discovers it has zero free registers
  available to hold the value temporarily. Its fallback mechanism for a
  lack of registers is to "spill" the value to memory. Because there is
  no frame pointer (rbp), the only way it knows how to address memory
  is relative to rsp. It emits the command to copy rsp to
  [rsp + offset], unknowingly creating the circular logic failure.

Here, that last thing, surely it can be taught to detect this logical
loop, storing rsp using rsp.
Additionally, the moment it realizes it doesn't need to re-align the
stack (and it does), it can also kill the restore.

Also, there is always a 'free' register to store RSP, it is called: RSP
:-)

Now, clearly I don't actually know much of LLVM internals, but this is
all quite insane.
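
The observation that the spill/reload pair could be a plain register move
can be checked with a toy model. The sketch below is an illustration
only, not objtool or LLVM code: struct toy_cpu, spill_rsp(), reload_rcx(),
rcx_via_spill() and rcx_via_mov() are hypothetical names, and it assumes
that the elided instructions ("...") between the spill and the reload
change neither rsp nor the spill slot.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy machine: only the state this example needs. */
#define STACK_WORDS 64

struct toy_cpu {
	uint64_t rsp;                /* byte offset into stack[], 8-byte aligned */
	uint64_t rcx;
	uint64_t stack[STACK_WORDS]; /* toy stack memory */
};

/* Models: mov %rsp, disp(%rsp) -- rsp is stored in a slot addressed via rsp. */
void spill_rsp(struct toy_cpu *c, uint64_t disp)
{
	uint64_t addr = c->rsp + disp;

	assert(addr % 8 == 0 && addr / 8 < STACK_WORDS);
	c->stack[addr / 8] = c->rsp;
}

/* Models: mov disp(%rsp), %rcx -- the slot is reloaded into rcx. */
void reload_rcx(struct toy_cpu *c, uint64_t disp)
{
	uint64_t addr = c->rsp + disp;

	assert(addr % 8 == 0 && addr / 8 < STACK_WORDS);
	c->rcx = c->stack[addr / 8];
}

/* The emitted form: spill, then reload. Returns the resulting rcx. */
uint64_t rcx_via_spill(uint64_t rsp, uint64_t disp)
{
	struct toy_cpu c = { .rsp = rsp };

	spill_rsp(&c, disp);
	/* Assumption: the "..." in the listing leaves rsp and the slot alone. */
	reload_rcx(&c, disp);
	return c.rcx;
}

/* The suggested form: mov %rsp, %rcx. Returns the resulting rcx. */
uint64_t rcx_via_mov(uint64_t rsp)
{
	struct toy_cpu c = { .rsp = rsp };

	c.rcx = c.rsp;
	return c.rcx;
}
```

Under that assumption both helpers leave the same value in rcx for any
in-range rsp and displacement, for example rsp = 0x80 with the 0x68
displacement from the listing. If the elided code did modify rsp between
the two instructions, the reload would address a different slot and the
equivalence would no longer hold.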