Re: objtool: undefined stack state in folio_zero_user()

From: Peter Zijlstra <peterz@infradead.org>
To: Alexander Potapenko <glider@google.com>
Cc: Dmitry Antipov <dmantipov@yandex.ru>,
	elver@google.com, dvyukov@google.com,
	Josh Poimboeuf <jpoimboe@kernel.org>,
	Thomas Gleixner <tglx@kernel.org>,
	linux-kernel@vger.kernel.org, nathan@kernel.org,
	nick.desaulniers+lkml@gmail.com, morbo@google.com,
	justinstitt@google.com
Subject: Re: objtool: undefined stack state in folio_zero_user()
Date: Tue, 30 Jun 2026 19:41:57 +0200
Message-ID: <20260630174157.GE48970@noisy.programming.kicks-ass.net>
In-Reply-To: <CAG_fn=Ux8Dvvs1qGXTVD9m0pMrF48riqd54NTwW8tJo=mJr0WA@mail.gmail.com>

On Tue, Jun 30, 2026 at 04:14:35PM +0200, Alexander Potapenko wrote:
> > diff --git a/tools/objtool/check.c b/tools/objtool/check.c
> > index 10b18cf9c360..53a67b322856 100644
> > --- a/tools/objtool/check.c
> > +++ b/tools/objtool/check.c
> > @@ -3149,8 +3149,25 @@ static int update_cfi_state(struct instruction *insn,
> >                                 /* drap: mov disp(%rbp), %reg */
> >                                 restore_reg(cfi, op->dest.reg);
> >
> > +                       } else if (op->src.reg == CFI_SP &&
> > +                                  regs[CFI_SP].base == CFI_CFA &&
> > +                                  op->src.offset == regs[CFI_SP].offset + cfi->stack_size) {
> > +
> > +                               /*
> > +                                * Clang RSP musical chains:
> 
> s/chains/chairs if you're going to submit that ;)

:-)
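
For anyone following along, the offset comparison in the quoted hunk can be read as "does this load from disp(%rsp) hit the slot where rsp was saved?". A minimal sketch of that arithmetic, assuming the usual convention that saved-register offsets are relative to the CFA and that CFA == rsp + stack_size. The struct and function names here are hypothetical stand-ins, not objtool's real ones:

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in; not objtool's real cfi_state. */
struct toy_cfi {
	int stack_size;     /* assumed: CFA == rsp + stack_size */
	int sp_save_offset; /* assumed: saved rsp lives at CFA + sp_save_offset */
};

/* Does a load from disp(%rsp) read the slot where rsp was saved? */
static bool loads_saved_sp(const struct toy_cfi *cfi, int disp)
{
	/*
	 * slot = CFA + sp_save_offset
	 *      = rsp + stack_size + sp_save_offset
	 * so the displacement off rsp must equal the sum below.
	 */
	return disp == cfi->sp_save_offset + cfi->stack_size;
}
```

With made-up numbers, a 0x100-byte frame with rsp saved at CFA - 0x98 matches a load from 0x68(%rsp) and nothing else.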

> I am not sure we can do much on the compiler side here.
> KMSAN just heavily increases register pressure, and this is how the
> backend handles it.
> We can't even influence it from the middle-end where the instrumentation occurs.
> I remember Clang having more than one regallocator (we used to fall
> back to PBQP for some huge files when instrumenting Chrome), but
> switching to the non-default one will probably open a can of worms.

Something in that compiler is smoking very potent dope.

The code I have here has the form:

    mov     %rsp, %rcx
 1: mov     %rcx, %rsp
    ...
    mov     %rsp, 0x68(%rsp)
    ...
    mov     0x68(%rsp), %rcx
    test
    je 1b
    mov     %rcx, %r12
    ...
    mov     %r12, %rcx
    jmp 1b

Which is really, really stupid: it spills the rsp value to the stack,
only to then load it into another register. Simply doing:

    mov     %rsp, %rcx
 1: mov     %rcx, %rsp
    ...
    mov     %rsp, %rcx
    test
    je 1b
    mov     %rcx, %r12
    ...
    mov     %r12, %rcx
    jmp 1b

Would have made it so much better. But I'm not at all sure why it is
playing these rsp games to begin with; that code just doesn't make much
sense to me at all.
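
To make the equivalence concrete, here is a toy model of the two sequences, under the assumption that neither rsp nor the spill slot is modified between the store and the load (the "..." in the listings hides that part, so this is an assumption, not something the disassembly proves). The toy_cpu type and helper names are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_STACK_WORDS 64

/* Hypothetical toy machine: rsp is a byte offset into stack[]. */
struct toy_cpu {
	uint64_t rsp;
	uint64_t rcx;
	uint64_t stack[TOY_STACK_WORDS];
};

/* Models: mov %rsp, disp(%rsp) ; mov disp(%rsp), %rcx */
static void spill_and_reload(struct toy_cpu *cpu, uint64_t disp)
{
	uint64_t slot = (cpu->rsp + disp) / 8;

	assert(slot < TOY_STACK_WORDS);
	cpu->stack[slot] = cpu->rsp;   /* rsp stored via an rsp-relative address */
	cpu->rcx = cpu->stack[slot];   /* and read straight back */
}

/* Models: mov %rsp, %rcx */
static void direct_copy(struct toy_cpu *cpu)
{
	cpu->rcx = cpu->rsp;
}
```

Both helpers leave the same value in rcx; the only difference is that the first one needs a memory slot whose address already depends on the value being saved.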

Gemini suggests the following explanation:

The rsp manipulation occurs for two primary reasons:

 - Strict Stack Alignment: Most Application Binary Interfaces (ABIs),
   such as the System V AMD64 ABI, require the stack pointer (rsp) to
   be 16-byte aligned (rsp (mod 16) = 0) immediately before a function
   call. In functions with highly optimized local variables or
   dynamically allocated stack memory using alloca(), the stack pointer
   can easily drift. Clang temporarily aligns the stack by rounding it
   down, but must stash the original rsp to restore it properly after
   the tracking function completes.

 - Dynamic Shadow/Origin Mapping: The function __msan_chain_origin
   modifies origin metadata. Passing localized stack data or updating
   origin chains can cause unpredictable frame offsets or displacement
   inside the compiler's temporary spilling phase. Stashing the stack
   pointer guarantees that the instrumentation code will not corrupt the
   compiler-generated local variables if it relies on a consistent frame
   pointer.

But if this is the former (alignment), then it has already noticed that
the stack is properly aligned, since no actual alignment instructions
are issued. At that point it could elide the restore too, but it
doesn't.
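
For reference, the re-alignment in question is just rounding rsp down to a 16-byte boundary, which changes nothing when rsp is already aligned. A small sketch of that arithmetic (helper names are mine, purely illustrative):

```c
#include <stdbool.h>
#include <stdint.h>

/* Round a stack pointer down to the nearest 16-byte boundary. */
static uint64_t align_down_16(uint64_t sp)
{
	return sp & ~(uint64_t)15;
}

/* True when re-aligning would leave sp untouched, so there is nothing to undo. */
static bool realign_is_noop(uint64_t sp)
{
	return align_down_16(sp) == sp;
}
```

When realign_is_noop() holds, the saved copy of rsp equals the current rsp, so a restore from it has no effect.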

Gemini further elaborates:

  The Call Site "Opaque Wrap"

  When the KMSAN pass runs, it treats the injection of
  __msan_chain_origin as a highly specific helper callback rather than a
  standard C function call.

  To prevent the compiler's backend from optimizing away or rearranging
  the timing of this tracking, the instrumentation framework wraps the
  call inside an execution envelope that dictates: "Save the CPU state,
  call the hook, restore the CPU state."

  Even if the backend later calculates that no alignment modification is
  needed, the instruction slots for the save/restore actions have
  already been allocated in the compiler's intermediate representation
  (LLVM IR). Because x86-64 requires rsp tracking for non-leaf
  functions, LLVM assigns a virtual register to stash rsp.

  ...

  When the compiler’s register allocator reaches the instruction
  sequence to save rsp, it discovers it has zero free registers
  available to hold the value temporarily.

  Its fallback mechanism for a lack of registers is to "spill" the value
  to memory. Because there is no frame pointer (rbp), the only way it
  knows how to address memory is relative to rsp. It emits the command to
  copy rsp to [rsp + offset], unknowingly creating the circular logic
  failure.

As for that last part: surely it can be taught to detect this logical
loop of storing rsp using rsp. Additionally, the moment it realizes it
doesn't need to re-align the stack (and it does realize that), it can
also kill the restore.

Also, there is always a 'free' register to store RSP; it is called RSP
:-)

Now, clearly I don't actually know much of LLVM internals, but this is
all quite insane.
