* Re: objtool: undefined stack state in folio_zero_user()
       [not found] <35822cf3c35fc6621621f858e94a2b0ce19abf88.camel@yandex.ru>
From: Peter Zijlstra @ 2026-06-30 10:44 UTC
To: Dmitry Antipov; +Cc: Josh Poimboeuf, Thomas Gleixner, linux-kernel

On Mon, Jun 22, 2026 at 04:23:46PM +0300, Dmitry Antipov wrote:
> As of ef0c9f75a195 ("lib: Add stale 'raid6' directory to .gitignore file")
> with clang 22.1.8 and KMSAN enabled, objtool stucks in folio_zero_user():
>
> $ ./tools/objtool/objtool --hacks=jump_label --hacks=noinstr \
>   --hacks=skylake --ibt --prefix=16 --orc --retpoline --rethunk \
>   --static-call --uaccess --no-unreachable --noinstr --unret --link \
>   vmlinux.o
> vmlinux.o: warning: objtool: folio_zero_user+0x947: undefined stack state
> vmlinux.o: error: objtool: folio_zero_user+0x947: unknown CFA base reg -1
>
> Dmitry

> 0000000001533940 <folio_zero_user>:

> 1534272:  48 89 e1           mov    %rsp,%rcx
> 1534275:  48 85 ed           test   %rbp,%rbp
> 1534278:  8b 54 24 1c        mov    0x1c(%rsp),%edx
> 153427c:  0f 85 c2 00 00 00  jne    1534344 <folio_zero_user+0xa04>
> 1534282:  31 c0              xor    %eax,%eax
> 1534284:  48 89 cc           mov    %rcx,%rsp
> 1534287:  4c 89 f7           mov    %r14,%rdi   ;; HERE

...

> 1534327:  48 89 64 24 78     mov    %rsp,0x78(%rsp)

...

> 153433a:  48 8b 4c 24 78     mov    0x78(%rsp),%rcx
> 153433f:  e9 31 ff ff ff     jmp    1534275 <folio_zero_user+0x935>

This is well insane codegen, and I cannot blame objtool for hating on it
-- in fact, I hate on it too.

Let me try and figure out how best to fix this insane compiler output.

* Re: objtool: undefined stack state in folio_zero_user()
From: Dmitry Antipov @ 2026-06-30 12:31 UTC
To: Peter Zijlstra; +Cc: Josh Poimboeuf, Thomas Gleixner, linux-kernel

On 6/30/26 1:44 PM, Peter Zijlstra wrote:

> This is well insane codegen, and I cannot blame objtool for hating on it
> -- in fact, I hate on it too.
>
> Let me try and figure out how best to fix this insane compiler output.

Hm... is there a chance to work around this by disabling some particular
optimization with -fno-xxx or similar?

Dmitry

* Re: objtool: undefined stack state in folio_zero_user()
From: Peter Zijlstra @ 2026-06-30 13:54 UTC
To: Dmitry Antipov, glider, elver, dvyukov
Cc: Josh Poimboeuf, Thomas Gleixner, linux-kernel, nathan,
	nick.desaulniers+lkml, morbo, justinstitt

+ KMSAN / clang folks

On Tue, Jun 30, 2026 at 12:44:35PM +0200, Peter Zijlstra wrote:
> On Mon, Jun 22, 2026 at 04:23:46PM +0300, Dmitry Antipov wrote:
> > As of ef0c9f75a195 ("lib: Add stale 'raid6' directory to .gitignore file")
> > with clang 22.1.8 and KMSAN enabled, objtool stucks in folio_zero_user():
> >
> > $ ./tools/objtool/objtool --hacks=jump_label --hacks=noinstr \
> >   --hacks=skylake --ibt --prefix=16 --orc --retpoline --rethunk \
> >   --static-call --uaccess --no-unreachable --noinstr --unret --link \
> >   vmlinux.o
> > vmlinux.o: warning: objtool: folio_zero_user+0x947: undefined stack state
> > vmlinux.o: error: objtool: folio_zero_user+0x947: unknown CFA base reg -1
> >
> > Dmitry
>
> > 0000000001533940 <folio_zero_user>:
>
> > 1534272:  48 89 e1           mov    %rsp,%rcx
> > 1534275:  48 85 ed           test   %rbp,%rbp
> > 1534278:  8b 54 24 1c        mov    0x1c(%rsp),%edx
> > 153427c:  0f 85 c2 00 00 00  jne    1534344 <folio_zero_user+0xa04>
> > 1534282:  31 c0              xor    %eax,%eax
> > 1534284:  48 89 cc           mov    %rcx,%rsp
> > 1534287:  4c 89 f7           mov    %r14,%rdi   ;; HERE
> > ...
> > 1534327:  48 89 64 24 78     mov    %rsp,0x78(%rsp)
> ...
> > 153433a:  48 8b 4c 24 78     mov    0x78(%rsp),%rcx
> > 153433f:  e9 31 ff ff ff     jmp    1534275 <folio_zero_user+0x935>
>
>
> This is well insane codegen, and I cannot blame objtool for hating on it
> -- in fact, I hate on it too.
>
> Let me try and figure out how best to fix this insane compiler output.

This seems to 'work', but it is somewhat yuck.

Josh, any better ideas?

---
diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index 1b387d5a195b..839c91d3c28c 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -391,7 +391,7 @@ int arch_decode_instruction(struct objtool_file *file, const struct section *sec

 		break;

-	case 0x89:
+	case 0x89: /* mov r16/32/64,r/m16/32/64 */
 		if (!rex_w)
 			break;

@@ -430,7 +430,7 @@ int arch_decode_instruction(struct objtool_file *file, const struct section *sec
 		}
 		fallthrough;

-	case 0x88:
+	case 0x88: /* mov r8, r/m8 */
 		if (!rex_w)
 			break;

@@ -462,7 +462,7 @@ int arch_decode_instruction(struct objtool_file *file, const struct section *sec

 		break;

-	case 0x8b:
+	case 0x8b: /* mov r/m16/32/64, r16/32/64 */
 		if (!rex_w)
 			break;

@@ -494,6 +494,9 @@ int arch_decode_instruction(struct objtool_file *file, const struct section *sec

 		break;

+	case 0x8a: /* mov r/m8, r8 */
+		break;
+
 	case 0x8d:
 		if (mod_is_reg()) {
 			WARN("invalid LEA encoding at %s:0x%lx", sec->name, offset);
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 10b18cf9c360..53a67b322856 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -3149,8 +3149,25 @@ static int update_cfi_state(struct instruction *insn,
 				/* drap: mov disp(%rbp), %reg */
 				restore_reg(cfi, op->dest.reg);

+			} else if (op->src.reg == CFI_SP &&
+				   regs[CFI_SP].base == CFI_CFA &&
+				   op->src.offset == regs[CFI_SP].offset + cfi->stack_size) {
+
+				/*
+				 * Clang RSP musical chains:
+				 *
+				 *   mov %rsp, disp(%rsp)
+				 *   ...
+				 *   mov disp(%rsp), %reg	[handled here]
+				 *   ...
+				 *   mov %reg, %rsp
+				 */
+				cfi->vals[op->dest.reg].base = CFI_CFA;
+				cfi->vals[op->dest.reg].offset = -cfi->stack_size;
+				restore_reg(cfi, CFI_SP);
+
 			} else if (op->src.reg == cfa->base &&
-			    op->src.offset == regs[op->dest.reg].offset + cfa->offset) {
+				   op->src.offset == regs[op->dest.reg].offset + cfa->offset) {

 				/* mov disp(%rbp), %reg */
 				/* mov disp(%rsp), %reg */
@@ -3233,6 +3250,12 @@ static int update_cfi_state(struct instruction *insn,

 			} else if (op->dest.reg == cfa->base) {

+				/* mov %rsp, disp(%rsp) */
+				if (op->src.reg == CFI_SP && cfi->regs[CFI_SP].base == CFI_UNDEFINED) {
+					cfi->regs[CFI_SP].base = CFI_CFA;
+					cfi->regs[CFI_SP].offset = op->dest.offset - cfi->stack_size;
+				}
+
 				/* mov reg, disp(%rbp) */
 				/* mov reg, disp(%rsp) */
 				save_reg(cfi, op->src.reg, CFI_CFA,

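The idea behind the check.c hunks above can be illustrated with a toy model. This is a minimal sketch and not objtool code: `struct val`, `struct state`, and every function name here are hypothetical, only one spill slot is tracked, and every value is modelled as "CFA plus offset" or unknown. It shows why a checker that forgets what was stored by `mov %rsp, disp(%rsp)` loses the stack pointer once the reloaded register is copied back into `%rsp`, and why remembering the slot keeps the state defined.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy registers; only the two the loop needs. */
enum reg { REG_SP, REG_CX, REG_COUNT };

/* A tracked value: either unknown, or equal to CFA + cfa_off. */
struct val {
	bool known;
	long cfa_off;
};

struct state {
	struct val regs[REG_COUNT];
	bool track_spill;	/* model the extra tracking or not */
	bool slot_valid;	/* one remembered spill slot */
	long slot_addr;		/* slot address, relative to the CFA */
	struct val slot_val;	/* value stored in that slot */
};

static void state_init(struct state *s, long stack_size, bool track_spill)
{
	assert(stack_size >= 0);
	*s = (struct state){0};
	s->track_spill = track_spill;
	s->regs[REG_SP].known = true;
	s->regs[REG_SP].cfa_off = -stack_size;	/* %rsp == CFA - stack_size */
}

/* mov %src, %dst */
static void mov_reg_reg(struct state *s, enum reg src, enum reg dst)
{
	s->regs[dst] = s->regs[src];
}

/* mov %src, disp(%rsp) */
static void mov_reg_to_stack(struct state *s, enum reg src, long disp)
{
	if (!s->track_spill || !s->regs[REG_SP].known) {
		s->slot_valid = false;
		return;
	}
	s->slot_valid = true;
	s->slot_addr = s->regs[REG_SP].cfa_off + disp;
	s->slot_val = s->regs[src];
}

/* mov disp(%rsp), %dst */
static void mov_stack_to_reg(struct state *s, long disp, enum reg dst)
{
	bool hit = s->slot_valid && s->regs[REG_SP].known &&
		   s->slot_addr == s->regs[REG_SP].cfa_off + disp;

	if (hit)
		s->regs[dst] = s->slot_val;	/* reload of a known value */
	else
		s->regs[dst].known = false;	/* arbitrary memory load */
}

/* Runs the spill/reload loop twice and returns the final %rsp state. */
struct val simulate_loop(bool track_spill)
{
	struct state s;

	state_init(&s, 0x80, track_spill);
	mov_reg_reg(&s, REG_SP, REG_CX);		/* mov %rsp, %rcx */
	for (int iter = 0; iter < 2; iter++) {
		mov_reg_reg(&s, REG_CX, REG_SP);	/* 1: mov %rcx, %rsp */
		mov_reg_to_stack(&s, REG_SP, 0x78);	/* mov %rsp, 0x78(%rsp) */
		mov_stack_to_reg(&s, 0x78, REG_CX);	/* mov 0x78(%rsp), %rcx */
	}
	mov_reg_reg(&s, REG_CX, REG_SP);		/* back at 1: */
	return s.regs[REG_SP];
}
```

With `track_spill` set, `simulate_loop()` ends with `%rsp` still known at its original CFA offset; without it, the reload produces an unknown value and the next `mov %rcx, %rsp` makes the stack state undefined, which is the situation objtool reports.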
* Re: objtool: undefined stack state in folio_zero_user()
From: Alexander Potapenko @ 2026-06-30 14:14 UTC
To: Peter Zijlstra
Cc: Dmitry Antipov, elver, dvyukov, Josh Poimboeuf, Thomas Gleixner,
	linux-kernel, nathan, nick.desaulniers+lkml, morbo, justinstitt

> diff --git a/tools/objtool/check.c b/tools/objtool/check.c
> index 10b18cf9c360..53a67b322856 100644
> --- a/tools/objtool/check.c
> +++ b/tools/objtool/check.c
> @@ -3149,8 +3149,25 @@ static int update_cfi_state(struct instruction *insn,
>  				/* drap: mov disp(%rbp), %reg */
>  				restore_reg(cfi, op->dest.reg);
>
> +			} else if (op->src.reg == CFI_SP &&
> +				   regs[CFI_SP].base == CFI_CFA &&
> +				   op->src.offset == regs[CFI_SP].offset + cfi->stack_size) {
> +
> +				/*
> +				 * Clang RSP musical chains:

s/chains/chairs if you're going to submit that ;)

I am not sure we can do much on the compiler side here.
KMSAN just heavily increases register pressure, and this is how the
backend handles it.
We can't even influence it from the middle-end where the instrumentation occurs.
I remember Clang having more than one regallocator (we used to fall
back to PBQP for some huge files when instrumenting Chrome), but
switching to the non-default one will probably open a can of worms.

* Re: objtool: undefined stack state in folio_zero_user()
From: Peter Zijlstra @ 2026-06-30 17:41 UTC
To: Alexander Potapenko
Cc: Dmitry Antipov, elver, dvyukov, Josh Poimboeuf, Thomas Gleixner,
	linux-kernel, nathan, nick.desaulniers+lkml, morbo, justinstitt

On Tue, Jun 30, 2026 at 04:14:35PM +0200, Alexander Potapenko wrote:
> > diff --git a/tools/objtool/check.c b/tools/objtool/check.c
> > index 10b18cf9c360..53a67b322856 100644
> > --- a/tools/objtool/check.c
> > +++ b/tools/objtool/check.c
> > @@ -3149,8 +3149,25 @@ static int update_cfi_state(struct instruction *insn,
> >  				/* drap: mov disp(%rbp), %reg */
> >  				restore_reg(cfi, op->dest.reg);
> >
> > +			} else if (op->src.reg == CFI_SP &&
> > +				   regs[CFI_SP].base == CFI_CFA &&
> > +				   op->src.offset == regs[CFI_SP].offset + cfi->stack_size) {
> > +
> > +				/*
> > +				 * Clang RSP musical chains:
>
> s/chains/chairs if you're going to submit that ;)

:-)

> I am not sure we can do much on the compiler side here.
> KMSAN just heavily increases register pressure, and this is how the
> backend handles it.
> We can't even influence it from the middle-end where the instrumentation occurs.
> I remember Clang having more than one regallocator (we used to fall
> back to PBQP for some huge files when instrumenting Chrome), but
> switching to the non-default one will probably open a can of worms.

Something in that compiler is smoking very potent dope. The code I have
here has the form:

	mov %rsp, %rcx
1:	mov %rcx, %rsp
	...
	mov %rsp, 0x68(%rsp)
	...
	mov 0x68(%rsp), %rcx
	test
	je 1b
	mov %rcx, %r12
	...
	mov %r12, %rcx
	jmp 1b

Which is really really stupid, it spills the rsp value to the stack,
only to then load it into another register. Simply doing:

	mov %rsp, %rcx
1:	mov %rcx, %rsp
	...
	mov %rsp, %rcx
	test
	je 1b
	mov %rcx, %r12
	...
	mov %r12, %rcx
	jmp 1b

Would have made it so much better.

But I'm not at all sure why it is playing these rsp games to begin with;
that code just doesn't make much sense to me at all.

Gemini is suggesting it is:

  The rsp manipulation occurs for two primary reasons:

  - Strict Stack Alignment: Most Application Binary Interfaces (ABIs),
    such as the System V AMD64 ABI, require the stack pointer (rsp) to
    be 16-byte aligned (rsp (mod 16) = 0) immediately before a function
    call. In functions with highly optimized local variables or
    dynamically allocated stack memory using alloca(), the stack pointer
    can easily drift. Clang temporarily aligns the stack by rounding it
    down, but must stash the original rsp to restore it properly after
    the tracking function completes.

  - Dynamic Shadow/Origin Mapping: The function __msan_chain_origin
    modifies origin metadata. Passing localized stack data or updating
    origin chains can cause unpredictable frame offsets or displacement
    inside the compiler's temporary spilling phase. Stashing the stack
    pointer guarantees that the instrumentation code will not corrupt
    the compiler-generated local variables if it relies on a consistent
    frame pointer.

But if this is the former (alignment), then it already notices the stack
is properly aligned because there are no actual alignment instructions
issued, at which point it can then elide the restore too, but it
doesn't.

Gemini further elaborates:

  The Call Site "Opaque Wrap"

  When the KMSAN pass runs, it treats the injection of
  __msan_chain_origin as a highly specific helper callback rather than a
  standard C function call. To prevent the compiler's backend from
  optimizing away or rearranging the timing of this tracking, the
  instrumentation framework wraps the call inside an execution envelope
  that dictates: "Save the CPU state, call the hook, restore the CPU
  state."

  Even if the backend later calculates that no alignment modification is
  needed, the instruction slots for the save/restore actions have
  already been allocated in the compiler's intermediate representation
  (LLVM IR). Because x86-64 requires rsp tracking for non-leaf
  functions, LLVM assigns a virtual register to stash rsp.

  ...

  When the compiler’s register allocator reaches the instruction
  sequence to save rsp, it discovers it has zero free registers
  available to hold the value temporarily. Its fallback mechanism for a
  lack of registers is to "spill" the value to memory. Because there is
  no frame pointer (rbp), the only way it knows how to address memory is
  relative to rsp. It emits the command to copy rsp to [rsp + offset],
  unknowingly creating the circular logic failure.

Here, that last thing, surely it can be taught to detect this logical
loop, storing rsp using rsp. Additionally, the moment it realizes it
doesn't need to re-align the stack (and it does), it can also kill the
restore.

Also, there is always a 'free' register to store RSP, it is called: RSP
:-)

Now, clearly I don't actually know much of LLVM internals, but this is
all quite insane.

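The simplification asked for above (reload from the slot becomes a plain `mov %rsp, %reg`) can be sketched as a toy peephole. This is only an illustration under stated assumptions, not how LLVM represents or rewrites machine code: `enum kind`, `struct insn`, and `forward_sp_spills()` are hypothetical names, the pass looks at a single straight-line block, tracks one spill slot, and assumes that stores to different displacements never overlap the tracked slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy instruction kinds, not LLVM MachineInstr opcodes. */
enum kind {
	MOV_SP_TO_SLOT,		/* mov %rsp, disp(%rsp) */
	MOV_SLOT_TO_REG,	/* mov disp(%rsp), %reg */
	MOV_SP_TO_REG,		/* mov %rsp, %reg */
	WRITE_SP,		/* anything that changes %rsp */
	WRITE_SLOT,		/* any other store to disp(%rsp) */
	OTHER			/* touches neither %rsp nor the stack */
};

struct insn {
	enum kind kind;
	long disp;
	int reg;
};

/*
 * Rewrite "mov disp(%rsp), %reg" into "mov %rsp, %reg" whenever the slot
 * still holds the current %rsp, i.e. neither %rsp nor the slot was
 * written since the spill. Returns the number of rewritten reloads.
 */
size_t forward_sp_spills(struct insn *code, size_t n)
{
	bool live = false;	/* slot currently mirrors %rsp */
	long slot = 0;
	size_t rewritten = 0;

	for (size_t i = 0; i < n; i++) {
		struct insn *in = &code[i];

		switch (in->kind) {
		case MOV_SP_TO_SLOT:
			live = true;
			slot = in->disp;
			break;
		case WRITE_SP:
			live = false;	/* slot no longer equals %rsp */
			break;
		case WRITE_SLOT:
			if (live && in->disp == slot)
				live = false;
			break;
		case MOV_SLOT_TO_REG:
			if (live && in->disp == slot) {
				in->kind = MOV_SP_TO_REG;
				in->disp = 0;
				rewritten++;
			}
			break;
		default:
			break;
		}
	}
	return rewritten;
}

/* Spill, unrelated work, reload: the reload is forwarded. */
size_t demo_forwarded(void)
{
	struct insn code[] = {
		{ MOV_SP_TO_SLOT, 0x68, 0 },
		{ OTHER, 0, 0 },
		{ MOV_SLOT_TO_REG, 0x68, 1 },
	};
	size_t n = forward_sp_spills(code, 3);

	assert(code[2].kind == MOV_SP_TO_REG);
	return n;
}

/* %rsp changes between spill and reload: nothing may be rewritten. */
size_t demo_sp_clobbered(void)
{
	struct insn code[] = {
		{ MOV_SP_TO_SLOT, 0x68, 0 },
		{ WRITE_SP, 0, 0 },
		{ MOV_SLOT_TO_REG, 0x68, 1 },
	};

	return forward_sp_spills(code, 3);
}
```

The sketch is deliberately conservative: any write to `%rsp` invalidates the slot, so it only fires in the case described in the message, where the spilled value is still the live stack pointer at the point of the reload.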
* Re: objtool: undefined stack state in folio_zero_user()
From: Peter Zijlstra @ 2026-06-30 20:24 UTC
To: Alexander Potapenko
Cc: Dmitry Antipov, elver, dvyukov, Josh Poimboeuf, Thomas Gleixner,
	linux-kernel, nathan, nick.desaulniers+lkml, morbo, justinstitt

On Tue, Jun 30, 2026 at 07:41:57PM +0200, Peter Zijlstra wrote:

> Also, there is always a 'free' register to store RSP, it is called: RSP
> :-)
>
> Now, clearly I don't actually know much of LLVM internals, but this is
> all quite insane.

I had Gemini talk me through trying to do this, and while I got the
modified llvm to build, I could not actually get it to 'work'. It builds
a kernel fine, but it still does the same stupid.

The idea was to explicitly allow rematerialization of RSP 'loads'. But
like I said, it isn't actually helping.

FWIW...

---
diff --git a/llvm/lib/Target/X86/X86InstrInfo.cpp b/llvm/lib/Target/X86/X86InstrInfo.cpp
index 86a5a631ce73..ebec3a7563ca 100644
--- a/llvm/lib/Target/X86/X86InstrInfo.cpp
+++ b/llvm/lib/Target/X86/X86InstrInfo.cpp
@@ -816,6 +816,13 @@ bool X86InstrInfo::isReMaterializableImpl(
   case X86::PTILEZEROV:
     return true;

+  case X86::MOV64rr: {
+    const MachineOperand &SrcOp = MI.getOperand(1);
+    if (SrcOp.isReg() && SrcOp.getReg() == X86::RSP)
+      return true;
+    break;
+  }
+
   case X86::MOV8rm:
   case X86::MOV8rm_NOREX:
   case X86::MOV16rm:
@@ -964,6 +971,15 @@ void X86InstrInfo::reMaterialize(MachineBasicBlock &MBB,
                                  Register DestReg, unsigned SubIdx,
                                  const MachineInstr &Orig,
                                  LaneBitmask UsedLanes) const {
+  const DebugLoc &DL = Orig.getDebugLoc();
+  if (Orig.getOpcode() == X86::MOV64rr &&
+      Orig.getOperand(1).isReg() &&
+      Orig.getOperand(1).getReg() == X86::RSP) {
+    BuildMI(MBB, I, DL, get(X86::MOV64rr), DestReg)
+        .addReg(X86::RSP);
+    return;
+  }
+
   bool ClobbersEFLAGS = Orig.modifiesRegister(X86::EFLAGS, &TRI);
   if (ClobbersEFLAGS &&
       MBB.computeRegisterLiveness(&TRI, X86::EFLAGS, I) !=
           MachineBasicBlock::LQR_Dead) {
@@ -984,7 +1000,6 @@ void X86InstrInfo::reMaterialize(MachineBasicBlock &MBB,
       llvm_unreachable("Unexpected instruction!");
     }

-    const DebugLoc &DL = Orig.getDebugLoc();
     BuildMI(MBB, I, DL, get(X86::MOV32ri))
         .add(Orig.getOperand(0))
         .addImm(Value);
diff --git a/llvm/lib/Target/X86/X86RegisterInfo.cpp b/llvm/lib/Target/X86/X86RegisterInfo.cpp
index c84e0f441a45..913c28740eef 100644
--- a/llvm/lib/Target/X86/X86RegisterInfo.cpp
+++ b/llvm/lib/Target/X86/X86RegisterInfo.cpp
@@ -19,6 +19,7 @@
 #include "llvm/ADT/BitVector.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SmallSet.h"
+#include "llvm/ADT/StringRef.h"
 #include "llvm/CodeGen/LiveRegMatrix.h"
 #include "llvm/CodeGen/MachineFrameInfo.h"
 #include "llvm/CodeGen/MachineRegisterInfo.h"
@@ -1167,6 +1168,32 @@ bool X86RegisterInfo::getRegAllocationHints(Register VirtReg,
   if (!VRM)
     return BaseImplRetVal;

+  if (MachineInstr *DefMI = MRI->getVRegDef(VirtReg)) {
+    if (DefMI->getOpcode() == X86::MOV64rr &&
+        DefMI->getOperand(1).isReg() &&
+        DefMI->getOperand(1).getReg() == X86::RSP) {
+      bool IsKMSANTrackingBlock = false;
+      const MachineBasicBlock *MBB = DefMI->getParent();
+
+      for (const MachineInstr &MI : *MBB) {
+        if (MI.isCall() && MI.getOperand(0).isSymbol()) {
+          StringRef SymName(MI.getOperand(0).getSymbolName());
+          if (SymName == "__msan_chain_origin") {
+            IsKMSANTrackingBlock = true;
+            break;
+          }
+        }
+      }
+
+      if (IsKMSANTrackingBlock) {
+        if (llvm::is_contained(Order, X86::RSP)) {
+          Hints.insert(Hints.begin(), X86::RSP);
+          return true;
+        }
+      }
+    }
+  }
+
   if (ID != X86::TILERegClassID) {
     if (DisableRegAllocNDDHints || !ST.hasNDD() ||
         !TRI.isGeneralPurposeRegisterClass(&RC))

* Re: objtool: undefined stack state in folio_zero_user()
From: Thomas Gleixner @ 2026-06-30 18:36 UTC
To: Peter Zijlstra, Dmitry Antipov, glider, elver, dvyukov
Cc: Josh Poimboeuf, linux-kernel, nathan, nick.desaulniers+lkml,
	morbo, justinstitt

On Tue, Jun 30 2026 at 15:54, Peter Zijlstra wrote:
> + KMSAN / clang folks
>
> On Tue, Jun 30, 2026 at 12:44:35PM +0200, Peter Zijlstra wrote:
>> On Mon, Jun 22, 2026 at 04:23:46PM +0300, Dmitry Antipov wrote:
>> > As of ef0c9f75a195 ("lib: Add stale 'raid6' directory to .gitignore file")
>> > with clang 22.1.8 and KMSAN enabled, objtool stucks in folio_zero_user():
>> >
>> > $ ./tools/objtool/objtool --hacks=jump_label --hacks=noinstr \
>> >   --hacks=skylake --ibt --prefix=16 --orc --retpoline --rethunk \
>> >   --static-call --uaccess --no-unreachable --noinstr --unret --link \
>> >   vmlinux.o
>> > vmlinux.o: warning: objtool: folio_zero_user+0x947: undefined stack state
>> > vmlinux.o: error: objtool: folio_zero_user+0x947: unknown CFA base reg -1
>> >
>> > Dmitry
>>
>> > 0000000001533940 <folio_zero_user>:
>>
>> > 1534272:  48 89 e1           mov    %rsp,%rcx
>> > 1534275:  48 85 ed           test   %rbp,%rbp
>> > 1534278:  8b 54 24 1c        mov    0x1c(%rsp),%edx
>> > 153427c:  0f 85 c2 00 00 00  jne    1534344 <folio_zero_user+0xa04>
>> > 1534282:  31 c0              xor    %eax,%eax
>> > 1534284:  48 89 cc           mov    %rcx,%rsp
>> > 1534287:  4c 89 f7           mov    %r14,%rdi   ;; HERE
>>
>> ...
>> > 1534327:  48 89 64 24 78     mov    %rsp,0x78(%rsp)
>> ...
>> > 153433a:  48 8b 4c 24 78     mov    0x78(%rsp),%rcx
>> > 153433f:  e9 31 ff ff ff     jmp    1534275 <folio_zero_user+0x935>
>>
>>
>> This is well insane codegen, and I cannot blame objtool for hating on it
>> -- in fact, I hate on it too.
>>
>> Let me try and figure out how best to fix this insane compiler output.
>
>
> This seems to 'work', but it is somewhat yuck.

It makes the build fail go away, but the resulting kernel compiled with
clang22 refuses to boot. It stops here:

[    0.283753] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.433144] stackdepot: allocating hash table via alloc_large_system_hash
[    0.433656] stackdepot hash table entries: 524288 (order: 11, 8388608 bytes, linear)
[    0.435775] stackdepot: allocating space for 8192 stack pools via memblock
[    0.462747] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[    0.463268] Starting KernelMemorySanitizer
[    0.463527] ATTENTION: KMSAN is a debugging tool! Do not use it on production machines!

When I attach gdb to the VM then it sits in the ASM entry code of the
page fault handler, but the stack looks damaged and it seems to loop
somewhere around there forever. Haven't had time to dig into it further.

.config is here: https://tglx.de/~tglx/config.fail

Thanks,

        tglx

* Re: objtool: undefined stack state in folio_zero_user()
From: Alexander Potapenko @ 2026-07-01 15:18 UTC
To: Thomas Gleixner
Cc: Peter Zijlstra, Dmitry Antipov, elver, dvyukov, Josh Poimboeuf,
	linux-kernel, nathan, nick.desaulniers+lkml, morbo, justinstitt

On Tue, Jun 30, 2026 at 8:36 PM Thomas Gleixner <tglx@kernel.org> wrote:
>
> On Tue, Jun 30 2026 at 15:54, Peter Zijlstra wrote:
>
> > + KMSAN / clang folks
> >
> > On Tue, Jun 30, 2026 at 12:44:35PM +0200, Peter Zijlstra wrote:
> >> On Mon, Jun 22, 2026 at 04:23:46PM +0300, Dmitry Antipov wrote:
> >> > As of ef0c9f75a195 ("lib: Add stale 'raid6' directory to .gitignore file")
> >> > with clang 22.1.8 and KMSAN enabled, objtool stucks in folio_zero_user():
> >> >
> >> > $ ./tools/objtool/objtool --hacks=jump_label --hacks=noinstr \
> >> >   --hacks=skylake --ibt --prefix=16 --orc --retpoline --rethunk \
> >> >   --static-call --uaccess --no-unreachable --noinstr --unret --link \
> >> >   vmlinux.o
> >> > vmlinux.o: warning: objtool: folio_zero_user+0x947: undefined stack state
> >> > vmlinux.o: error: objtool: folio_zero_user+0x947: unknown CFA base reg -1
> >> >
> >> > Dmitry
> >>
> >> > 0000000001533940 <folio_zero_user>:
> >>
> >> > 1534272:  48 89 e1           mov    %rsp,%rcx
> >> > 1534275:  48 85 ed           test   %rbp,%rbp
> >> > 1534278:  8b 54 24 1c        mov    0x1c(%rsp),%edx
> >> > 153427c:  0f 85 c2 00 00 00  jne    1534344 <folio_zero_user+0xa04>
> >> > 1534282:  31 c0              xor    %eax,%eax
> >> > 1534284:  48 89 cc           mov    %rcx,%rsp
> >> > 1534287:  4c 89 f7           mov    %r14,%rdi   ;; HERE
> >>
> >> ...
> >> > 1534327:  48 89 64 24 78     mov    %rsp,0x78(%rsp)
> >> ...
> >> > 153433a:  48 8b 4c 24 78     mov    0x78(%rsp),%rcx
> >> > 153433f:  e9 31 ff ff ff     jmp    1534275 <folio_zero_user+0x935>
> >>
> >>
> >> This is well insane codegen, and I cannot blame objtool for hating on it
> >> -- in fact, I hate on it too.
> >>
> >> Let me try and figure out how best to fix this insane compiler output.
> >
> >
> > This seems to 'work', but it is somewhat yuck.
>
> It makes the build fail go away, but the resulting kernel compiled with
> clang22 refuses to boot. It stops here:
>
> [    0.283753] mem auto-init: stack:off, heap alloc:off, heap free:off
> [    0.433144] stackdepot: allocating hash table via alloc_large_system_hash
> [    0.433656] stackdepot hash table entries: 524288 (order: 11, 8388608 bytes, linear)
> [    0.435775] stackdepot: allocating space for 8192 stack pools via memblock
> [    0.462747] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
> [    0.463268] Starting KernelMemorySanitizer
> [    0.463527] ATTENTION: KMSAN is a debugging tool! Do not use it on production machines!
>
> When I attach gdb to the VM then it sits in the ASM entry code of the
> page fault handler, but the stack looks damaged and it seems to loop
> somewhere around there forever. Haven't had time to dig into it further.
>
> .config is here: https://tglx.de/~tglx/config.fail

If I switch to the frame pointer unwinder the kernel actually boots.
Could ORC also need some massaging for this fix to work?

* Re: objtool: undefined stack state in folio_zero_user()
From: Alexander Potapenko @ 2026-07-01 16:23 UTC
To: Thomas Gleixner
Cc: Peter Zijlstra, Dmitry Antipov, elver, dvyukov, Josh Poimboeuf,
	linux-kernel, nathan, nick.desaulniers+lkml, morbo, justinstitt

On Wed, Jul 1, 2026 at 5:18 PM Alexander Potapenko <glider@google.com> wrote:
>
> On Tue, Jun 30, 2026 at 8:36 PM Thomas Gleixner <tglx@kernel.org> wrote:
> >
> > On Tue, Jun 30 2026 at 15:54, Peter Zijlstra wrote:
> >
> > > + KMSAN / clang folks
> > >
> > > On Tue, Jun 30, 2026 at 12:44:35PM +0200, Peter Zijlstra wrote:
> > >> On Mon, Jun 22, 2026 at 04:23:46PM +0300, Dmitry Antipov wrote:
> > >> > As of ef0c9f75a195 ("lib: Add stale 'raid6' directory to .gitignore file")
> > >> > with clang 22.1.8 and KMSAN enabled, objtool stucks in folio_zero_user():
> > >> >
> > >> > $ ./tools/objtool/objtool --hacks=jump_label --hacks=noinstr \
> > >> >   --hacks=skylake --ibt --prefix=16 --orc --retpoline --rethunk \
> > >> >   --static-call --uaccess --no-unreachable --noinstr --unret --link \
> > >> >   vmlinux.o
> > >> > vmlinux.o: warning: objtool: folio_zero_user+0x947: undefined stack state
> > >> > vmlinux.o: error: objtool: folio_zero_user+0x947: unknown CFA base reg -1
> > >> >
> > >> > Dmitry
> > >>
> > >> > 0000000001533940 <folio_zero_user>:
> > >>
> > >> > 1534272:  48 89 e1           mov    %rsp,%rcx
> > >> > 1534275:  48 85 ed           test   %rbp,%rbp
> > >> > 1534278:  8b 54 24 1c        mov    0x1c(%rsp),%edx
> > >> > 153427c:  0f 85 c2 00 00 00  jne    1534344 <folio_zero_user+0xa04>
> > >> > 1534282:  31 c0              xor    %eax,%eax
> > >> > 1534284:  48 89 cc           mov    %rcx,%rsp
> > >> > 1534287:  4c 89 f7           mov    %r14,%rdi   ;; HERE
> > >>
> > >> ...
> > >> > 1534327:  48 89 64 24 78     mov    %rsp,0x78(%rsp)
> > >> ...
> > >> > 153433a:  48 8b 4c 24 78     mov    0x78(%rsp),%rcx
> > >> > 153433f:  e9 31 ff ff ff     jmp    1534275 <folio_zero_user+0x935>
> > >>
> > >>
> > >> This is well insane codegen, and I cannot blame objtool for hating on it
> > >> -- in fact, I hate on it too.
> > >>
> > >> Let me try and figure out how best to fix this insane compiler output.
> > >
> > >
> > > This seems to 'work', but it is somewhat yuck.
> >
> > It makes the build fail go away, but the resulting kernel compiled with
> > clang22 refuses to boot. It stops here:
> >
> > [    0.283753] mem auto-init: stack:off, heap alloc:off, heap free:off
> > [    0.433144] stackdepot: allocating hash table via alloc_large_system_hash
> > [    0.433656] stackdepot hash table entries: 524288 (order: 11, 8388608 bytes, linear)
> > [    0.435775] stackdepot: allocating space for 8192 stack pools via memblock
> > [    0.462747] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
> > [    0.463268] Starting KernelMemorySanitizer
> > [    0.463527] ATTENTION: KMSAN is a debugging tool! Do not use it on production machines!
> >
> > When I attach gdb to the VM then it sits in the ASM entry code of the
> > page fault handler, but the stack looks damaged and it seems to loop
> > somewhere around there forever. Haven't had time to dig into it further.
> >
> > .config is here: https://tglx.de/~tglx/config.fail
>
> If I switch to the frame pointer unwinder the kernel actually boots.
> Could ORC also need some massaging for this fix to work?

Something along the lines of:

--- a/tools/objtool/arch/x86/orc.c
+++ b/tools/objtool/arch/x86/orc.c
@@ -22,6 +22,11 @@ int init_orc_entry(struct orc_entry *orc, struct cfi_state *cfi, struct instruct
 		return 0;
 	}

+	if (cfi->cfa.base == CFI_UNDEFINED) {
+		orc->type = ORC_TYPE_UNDEFINED;
+		return 0;
+	}
+
 	switch (cfi->type) {
 	case UNWIND_HINT_TYPE_UNDEFINED:
 		orc->type = ORC_TYPE_UNDEFINED;
@@ -70,9 +75,10 @@ int init_orc_entry(struct orc_entry *orc, struct cfi_state *cfi, struct instruct
 	case CFI_DX:
 		orc->sp_reg = ORC_REG_DX;
 		break;
+	case CFI_UNDEFINED:
 	default:
-		ERROR_INSN(insn, "unknown CFA base reg %d", cfi->cfa.base);
-		return -1;
+		orc->type = ORC_TYPE_UNDEFINED;
+		return 0;
 	}