* odd objtool 'unreachable instruction' warning
@ 2025-10-28 19:29 Linus Torvalds
2025-10-29 0:21 ` Josh Poimboeuf
` (2 more replies)
0 siblings, 3 replies; 14+ messages in thread
From: Linus Torvalds @ 2025-10-28 19:29 UTC (permalink / raw)
To: Josh Poimboeuf, Peter Zijlstra; +Cc: Linux Kernel Mailing List
[-- Attachment #1: Type: text/plain, Size: 1169 bytes --]
Josh, Peter,
due to another entirely unrelated discussion, I ended up resurrecting
my "make asm readable" patch that I have had in my local tree when I
want to look at the actual generated code for user accesses.
That is a local hack that just removes the alternative noise for the
common ops, so that I actually see the fences and clac/stac
instructions as such, instead of seeing them as nops in the object
file or as horrible noise in the assembler output.
So that patch is not something I'd ever commit in general, but it's
really useful for checking code generation - but it results in odd
objdump warnings these days (I say "these days", because I've used
that patch locally over the years, and the objdump warning hasn't
always been there).
It's a pretty odd warning, because the code looks fine to me, but I
might be missing something obvious.
Anyway, this is clearly not a big and urgent problem, but I'd love for
you to take a look. I'm attaching the patch I use so you can see what
I mean.. Any ideas what triggers that warning? Because I'd love to
keep this patch in my local tree without having objtool be upset with
me....
Linus
[-- Attachment #2: 0001-Avoid-alternative-assembler-noise-for-common-ops.patch --]
[-- Type: text/x-patch, Size: 3214 bytes --]
From 97c16bd1f44287f95f1be82185631d654a6022d8 Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed, 3 Apr 2024 10:33:12 -0700
Subject: [PATCH] Avoid alternative assembler noise for common ops
Just hardcode the ones I have for SMP and LFENCE
---
arch/x86/include/asm/alternative.h | 7 +------
arch/x86/include/asm/barrier.h | 2 +-
arch/x86/include/asm/smap.h | 18 ++++++------------
arch/x86/lib/getuser.S | 2 +-
4 files changed, 9 insertions(+), 20 deletions(-)
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 15bc07a5ebb3..281c823a869e 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -42,12 +42,7 @@
*/
#ifdef CONFIG_SMP
-#define LOCK_PREFIX_HERE \
- ".pushsection .smp_locks,\"a\"\n" \
- ".balign 4\n" \
- ".long 671f - .\n" /* offset */ \
- ".popsection\n" \
- "671:"
+#define LOCK_PREFIX_HERE
#define LOCK_PREFIX LOCK_PREFIX_HERE "\n\tlock "
diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index db70832232d4..2b911c88467b 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -45,7 +45,7 @@
__mask; })
/* Prevent speculative execution past this barrier. */
-#define barrier_nospec() alternative("", "lfence", X86_FEATURE_LFENCE_RDTSC)
+#define barrier_nospec() asm volatile("lfence":::"memory")
#define __dma_rmb() barrier()
#define __dma_wmb() barrier()
diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
index 4f84d421d1cf..613b24a58b26 100644
--- a/arch/x86/include/asm/smap.h
+++ b/arch/x86/include/asm/smap.h
@@ -15,24 +15,20 @@
#ifdef __ASSEMBLER__
-#define ASM_CLAC \
- ALTERNATIVE "", "clac", X86_FEATURE_SMAP
+#define ASM_CLAC "clac"
-#define ASM_STAC \
- ALTERNATIVE "", "stac", X86_FEATURE_SMAP
+#define ASM_STAC "stac"
#else /* __ASSEMBLER__ */
static __always_inline void clac(void)
{
- /* Note: a barrier is implicit in alternative() */
- alternative("", "clac", X86_FEATURE_SMAP);
+ asm volatile("clac": : :"memory");
}
static __always_inline void stac(void)
{
- /* Note: a barrier is implicit in alternative() */
- alternative("", "stac", X86_FEATURE_SMAP);
+ asm volatile("stac": : :"memory");
}
static __always_inline unsigned long smap_save(void)
@@ -58,10 +54,8 @@ static __always_inline void smap_restore(unsigned long flags)
}
/* These macros can be used in asm() statements */
-#define ASM_CLAC \
- ALTERNATIVE("", "clac", X86_FEATURE_SMAP)
-#define ASM_STAC \
- ALTERNATIVE("", "stac", X86_FEATURE_SMAP)
+#define ASM_CLAC "clac"
+#define ASM_STAC "stac"
#define ASM_CLAC_UNSAFE \
ALTERNATIVE("", ANNOTATE_IGNORE_ALTERNATIVE "clac", X86_FEATURE_SMAP)
diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S
index 9d5654b8a72a..1c5952c637ce 100644
--- a/arch/x86/lib/getuser.S
+++ b/arch/x86/lib/getuser.S
@@ -37,7 +37,7 @@
#include <asm/smap.h>
#include <asm/runtime-const.h>
-#define ASM_BARRIER_NOSPEC ALTERNATIVE "", "lfence", X86_FEATURE_LFENCE_RDTSC
+#define ASM_BARRIER_NOSPEC lfence
.macro check_range size:req
.if IS_ENABLED(CONFIG_X86_64)
--
2.51.1.534.g6d2b4bf2f4
^ permalink raw reply related [flat|nested] 14+ messages in thread* Re: odd objtool 'unreachable instruction' warning
2025-10-28 19:29 odd objtool 'unreachable instruction' warning Linus Torvalds
@ 2025-10-29 0:21 ` Josh Poimboeuf
2025-10-29 0:59 ` Linus Torvalds
2025-10-29 9:56 ` David Laight
2025-11-01 7:05 ` Ingo Molnar
2 siblings, 1 reply; 14+ messages in thread
From: Josh Poimboeuf @ 2025-10-29 0:21 UTC (permalink / raw)
To: Linus Torvalds
Cc: Peter Zijlstra, Linux Kernel Mailing List, Alexandre Chartre
On Tue, Oct 28, 2025 at 12:29:11PM -0700, Linus Torvalds wrote:
> Josh, Peter,
> due to another entirely unrelated discussion, I ended up resurrecting
> my "make asm readable" patch that I have had in my local tree when I
> want to look at the actual generated code for user accesses.
>
> That is a local hack that just removes the alternative noise for the
> common ops, so that I actually see the fences and clac/stac
> instructions as such, instead of seeing them as nops in the object
> file or as horrible noise in the assembler output.
>
> So that patch is not something I'd ever commit in general, but it's
> really useful for checking code generation - but it results in odd
> objdump warnings these days (I say "these days", because I've used
> that patch locally over the years, and the objdump warning hasn't
> always been there).
>
> It's a pretty odd warning, because the code looks fine to me, but I
> might be missing something obvious.
>
> Anyway, this is clearly not a big and urgent problem, but I'd love for
> you to take a look. I'm attaching the patch I use so you can see what
> I mean.. Any ideas what triggers that warning? Because I'd love to
> keep this patch in my local tree without having objtool be upset with
> me....
Ironically I think this "bug" was introduced by a patch whose goal was
to fix a regression in STAC/CLAC code readability:
2d12c6fb7875 ("objtool: Remove ANNOTATE_IGNORE_ALTERNATIVE from CLAC/STAC")
See the diff below, hopefully that fixes things for you? I can post a
proper patch in a bit.
On a related note, it would be nice if we could make that codegen more
readable... Here were a few formats I had played with before, any
thoughts?
Option 1:
# <ALTERNATIVE>
# ORIGINAL:
771: nop
772: .skip -(((775f-774f)-(772b-771b)) > 0) * ((775f-774f)-(772b-771b)), 0x90; 773: .pushsection .altinstructions, "a"; .long 771b - .; .long 774f - .; .4byte ( 9*32+20); .byte 773b-771b; .byte 775f-774f; .popsection; .pushsection .altinstr_replacement, "ax"
# REPLACEMENT: [X86_FEATURE_SMAP]
774: stac
775: .popsection
# </ALTERNATIVE>
Option 2:
# <ALTERNATIVE>
# ORIGINAL: nop
# REPLACEMENT: stac [X86_FEATURE_SMAP]
771: nop
772: .skip -(((775f-774f)-(772b-771b)) > 0) * ((775f-774f)-(772b-771b)), 0x90; 773: .pushsection .altinstructions, "a"; .long 771b - .; .long 774f - .; .4byte ( 9*32+20); .byte 773b-771b; .byte 775f-774f; .popsection; .pushsection .altinstr_replacement, "ax"
774: stac
775: .popsection
# </ALTERNATIVE>
Note there's also an objtool "disas" feature Alexandre is working on
which will show the disassembly annotated with runtime patching info
(alternatives, static branches, etc):
https://lore.kernel.org/20250619145659.1377970-1-alexandre.chartre@oracle.com
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 620854fdaaf63..9004fbc067693 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -3516,8 +3516,11 @@ static bool skip_alt_group(struct instruction *insn)
{
struct instruction *alt_insn = insn->alts ? insn->alts->insn : NULL;
+ if (!insn->alt_group)
+ return false;
+
/* ANNOTATE_IGNORE_ALTERNATIVE */
- if (insn->alt_group && insn->alt_group->ignore)
+ if (insn->alt_group->ignore)
return true;
/*
^ permalink raw reply related [flat|nested] 14+ messages in thread* Re: odd objtool 'unreachable instruction' warning
2025-10-29 0:21 ` Josh Poimboeuf
@ 2025-10-29 0:59 ` Linus Torvalds
2025-10-29 1:50 ` Josh Poimboeuf
0 siblings, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2025-10-29 0:59 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: Peter Zijlstra, Linux Kernel Mailing List, Alexandre Chartre
On Tue, 28 Oct 2025 at 17:21, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
>
> See the diff below, hopefully that fixes things for you?
Bingo, that patch works for me.
> On a related note, it would be nice if we could make that codegen more
> readable... Here were a few formats I had played with before, any
> thoughts?
I don't think any of this is ever going to be remotely readable,
because of the whole horrible section noise.
If I recall correctly, one of the ideas you had was to use macros to
make things more legible, but that sadly didn't work with clang or
something like that?
Anyway, I certainly agree that we can make it *slightly* better, and
I'm not objecting to trying to do some incremental improvement, but
it's just always going to be pretty bad, and so I'd probably always
end up doing my ugly hack anyway.
Because for the assembler readability case, I do think the "avoid
alternatives" is just fundamentally nicer and simpler.
Now, at one point I tried to do a build config option to just have the
"assume feature XYZ" - so that you could just have a config option
that turns on SMAP and other features unconditionally.
That made the asm obviously look fine, but my simplistic patch at the
time made the source code itself so much less legible that I trashed
it. It just ended up being a mess of ifdefs.
(I did think that maybe I could make my patch cleaner with a few
helper macros, but it all ended up feeling like I was overthinking
things. I suspect very few people actually care about that legible
assembler thing. End result: I just ended up with the very simple
local hack in my tree)
> Note there's also an objtool "disas" feature Alexandre is working on
> which will show the disassembly annotated with runtime patching info
> (alternatives, static branches, etc):
Yeah, that would be a good feature to have, although I wouldn't
obviate the asm cleanup for me.
Depending on what I am looking for, I end up looking at either the asm
output and the objdump output (or at both) because they each have
their good and bad parts.
Sometimes I *want* the comments that gcc in particular adds to the
assembler, and sometimes I prefer the "raw objdump" format.
Oh, and Alexandre: if you are working on improving objdump - are you
perhaps also looking at making the relocation output saner?
Because I absolutely detest the odd relocation output format. I use it
occasionally, but it's just crazy to see things like
<delayed_put_task_struct>:
...
call <delayed_put_task_struct+0x1a>
R_X86_64_PLT32 rethook_flush_task-0x4
when any *sane* tool would just output the simple
call rethook_flush_task
instead. Because as it is, the line without the relocation info is
useless, and then the separate relocation line (with -r) is just
crazy.
I kind of do understand why it happens - few people care, objdump
supports many different instruction formats, the instruction decoding
is a separate thing from the relocation logic etc etc. But if somebody
is looking at making objdump generate prettier output, I'd love for
this to be one of the things fixed.
(And yes, I see this horrid output because I run objdump on individual
object files before linking, because it's *so* much simpler and
quicker)
Again - the real problem is that I'm just doing odd things that are
probably unusual enough that very few people care. And I've done it so
long that I can deal with it and read the horrors when necessary.
So it's not like this is a huge problem, it's just that I go "Oh Gods,
objdump output is not meant for humans" every time I do this.
Linus
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: odd objtool 'unreachable instruction' warning
2025-10-29 0:59 ` Linus Torvalds
@ 2025-10-29 1:50 ` Josh Poimboeuf
2025-10-29 16:17 ` Linus Torvalds
0 siblings, 1 reply; 14+ messages in thread
From: Josh Poimboeuf @ 2025-10-29 1:50 UTC (permalink / raw)
To: Linus Torvalds
Cc: Peter Zijlstra, Linux Kernel Mailing List, Alexandre Chartre
On Tue, Oct 28, 2025 at 05:59:01PM -0700, Linus Torvalds wrote:
> On Tue, 28 Oct 2025 at 17:21, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> >
> > See the diff below, hopefully that fixes things for you?
>
> Bingo, that patch works for me.
Thanks, let me run that through some testing and then I'll post a proper
patch.
> > On a related note, it would be nice if we could make that codegen more
> > readable... Here were a few formats I had played with before, any
> > thoughts?
>
> I don't think any of this is ever going to be remotely readable,
> because of the whole horrible section noise.
>
> If I recall correctly, one of the ideas you had was to use macros to
> make things more legible, but that sadly didn't work with clang or
> something like that?
Right. Those macros were actually pretty nice. They worked great with
GCC but Clang wouldn't cooperate. Maybe I can talk to some Clang people
to try to get that working.
> > Note there's also an objtool "disas" feature Alexandre is working on
> > which will show the disassembly annotated with runtime patching info
> > (alternatives, static branches, etc):
>
> Yeah, that would be a good feature to have, although I wouldn't
> obviate the asm cleanup for me.
>
> Depending on what I am looking for, I end up looking at either the asm
> output and the objdump output (or at both) because they each have
> their good and bad parts.
>
> Sometimes I *want* the comments that gcc in particular adds to the
> assembler, and sometimes I prefer the "raw objdump" format.
>
> Oh, and Alexandre: if you are working on improving objdump - are you
> perhaps also looking at making the relocation output saner?
Just to clarify, Alexandre is working on objTOOL, not objDUMP :-)
> Because I absolutely detest the odd relocation output format. I use it
> occasionally, but it's just crazy to see things like
>
> <delayed_put_task_struct>:
> ...
> call <delayed_put_task_struct+0x1a>
> R_X86_64_PLT32 rethook_flush_task-0x4
>
> when any *sane* tool would just output the simple
>
> call rethook_flush_task
>
> instead. Because as it is, the line without the relocation info is
> useless, and then the separate relocation line (with -r) is just
> crazy.
FWIW, this can be slightly improved by adding '-w' (objdump -drw) which
at least puts the function name on the same line:
call 34d <delayed_put_task_struct+0x1d> 349: R_X86_64_PLT32 rethook_flush_task-0x4
The good news is that we will have the ability to print things however
we want with "objtool disas".
--
Josh
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: odd objtool 'unreachable instruction' warning
2025-10-29 1:50 ` Josh Poimboeuf
@ 2025-10-29 16:17 ` Linus Torvalds
2025-10-30 9:51 ` Alexandre Chartre
2025-11-05 19:51 ` David Sterba
0 siblings, 2 replies; 14+ messages in thread
From: Linus Torvalds @ 2025-10-29 16:17 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: Peter Zijlstra, Linux Kernel Mailing List, Alexandre Chartre
On Tue, 28 Oct 2025 at 18:50, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
>
> Just to clarify, Alexandre is working on objTOOL, not objDUMP :-)
Bah. Not the first time I misread - or miswrite - one for the other.
I actually have a stupid alias for disassembly, because I can't ever
remember the long argument names:
alias disassemble='objdump --disassemble --no-show-raw-insn --no-addresses'
so one day when somebody has an improved script around this, I'll just
change the alias.
(Those flags, btw, are really bad flags in general, because you can't
figure out where the jumps go when you don't see addresses. But my
use-case is generally to look at the compiler-generated asm for
"details", and then I use that objdump disassembly to get a filtered
view of what the code looks like without all the section annotations
and without all the other noise. So that alias is more about my
workflow than about any kind of sane use).
Now, if objtool disassembly ever gives the kind of good disassembly
that 'perf report' does - with branches turned into arrows etc - that
would be lovely, and I'd replace my hacky alias in a heartbeat.
> FWIW, this can be slightly improved by adding '-w' (objdump -drw) which
> at least puts the function name on the same line:
>
> call 34d <delayed_put_task_struct+0x1d> 349: R_X86_64_PLT32 rethook_flush_task-0x4
Yeah, that's better, and I think I could massage it with some
scripting even more.
But I've thought about writing some simple scripting to make objdump
output clearer many times, and I never really end up caring enough.
So I think I'll wait for Alexandre to do that for me ;)
Linus
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: odd objtool 'unreachable instruction' warning
2025-10-29 16:17 ` Linus Torvalds
@ 2025-10-30 9:51 ` Alexandre Chartre
2025-11-05 19:51 ` David Sterba
1 sibling, 0 replies; 14+ messages in thread
From: Alexandre Chartre @ 2025-10-30 9:51 UTC (permalink / raw)
To: Linus Torvalds, Josh Poimboeuf
Cc: alexandre.chartre, Peter Zijlstra, Linux Kernel Mailing List
On 10/29/25 17:17, Linus Torvalds wrote:
> On Tue, 28 Oct 2025 at 18:50, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
>>
>> Just to clarify, Alexandre is working on objTOOL, not objDUMP :-)
>
> Bah. Not the first time I misread - or miswrite - one for the other.
>
> I actually have a stupid alias for disassembly, because I can't ever
> remember the long argument names:
>
> alias disassemble='objdump --disassemble --no-show-raw-insn --no-addresses'
>
> so one day when somebody has an improved script around this, I'll just
> change the alias.
>
> (Those flags, btw, are really bad flags in general, because you can't
> figure out where the jumps go when you don't see addresses. But my
> use-case is generally to look at the compiler-generated asm for
> "details", and then I use that objdump disassembly to get a filtered
> view of what the code looks like without all the section annotations
> and without all the other noise. So that alias is more about my
> workflow than about any kind of sane use).
>
> Now, if objtool disassembly ever gives the kind of good disassembly
> that 'perf report' does - with branches turned into arrows etc - that
> would be lovely, and I'd replace my hacky alias in a heartbeat.
>> FWIW, this can be slightly improved by adding '-w' (objdump -drw) which
>> at least puts the function name on the same line:
>>
>> call 34d <delayed_put_task_struct+0x1d> 349: R_X86_64_PLT32 rethook_flush_task-0x4
>
> Yeah, that's better, and I think I could massage it with some
> scripting even more.
>
> But I've thought about writing some simple scripting to make objdump
> output clearer many times, and I never really end up caring enough.
>
> So I think I'll wait for Alexandre to do that for me ;)
>
I am currently working on a new version (V3) which is mostly done.
Alternatives are now disassemble side-by-side and reloc is handled.
I am finishing cleaning up patches and I will hopefully send them
soon.
Here are some examples:
Example with stac/clac:
4e6b6: read_ldt+0xb6 test %rsi,%rsi
4e6b9: read_ldt+0xb9 jne 0x4e6e6 <read_ldt+0xe6>
4e6bb: read_ldt+0xbb | <alternative.4e6bb> | X86_FEATURE_SMAP
4e6bb: read_ldt+0xbb | NOP1 | stac
4e6bc: read_ldt+0xbc | NOP1 |
4e6bd: read_ldt+0xbd | NOP1 |
4e6be: read_ldt+0xbe xor %eax,%eax
4e6c0: read_ldt+0xc0 | <alternative.4e6c0> | EXCEPTION | !X86_FEATURE_FSRS
4e6c0: read_ldt+0xc0 | rep stos %al,%es:(%rdi) | jmp 4e6c5 <read_ldt+0xc5> | callq 0x1707 <rep_stos_alternative>
4e6c2: read_ldt+0xc2 | NOP1 | |
4e6c3: read_ldt+0xc3 | NOP1 | |
4e6c4: read_ldt+0xc4 | NOP1 | |
4e6c5: read_ldt+0xc5 | <alternative.4e6c5> | X86_FEATURE_SMAP
4e6c5: read_ldt+0xc5 | NOP1 | clac
4e6c6: read_ldt+0xc6 | NOP1 |
4e6c7: read_ldt+0xc7 | NOP1 |
4e6c8: read_ldt+0xc8 test %rcx,%rcx
4e6cb: read_ldt+0xcb jne 0x4e6e6 <read_ldt+0xe6>
4e6cd: read_ldt+0xcd mov %rbp,%rdi
4e6d0: read_ldt+0xd0 callq 0x4e6d5 <up_read>
__switch_to_asm disassembly:
$ ./tools/objtool/objtool --disas=__switch_to_asm --link vmlinux.o
__switch_to_asm:
82c0: __switch_to_asm push %rbp
82c1: __switch_to_asm+0x1 push %rbx
82c2: __switch_to_asm+0x2 push %r12
82c4: __switch_to_asm+0x4 push %r13
82c6: __switch_to_asm+0x6 push %r14
82c8: __switch_to_asm+0x8 push %r15
82ca: __switch_to_asm+0xa mov %rsp,0x1670(%rdi)
82d1: __switch_to_asm+0x11 mov 0x1670(%rsi),%rsp
82d8: __switch_to_asm+0x18 mov 0xad8(%rsi),%rbx
82df: __switch_to_asm+0x1f mov %rbx,%gs:0x0(%rip) # 0x82e7 <__stack_chk_guard>
82e7: __switch_to_asm+0x27 | <alternative.82e7> | !X86_FEATURE_ALWAYS | X86_FEATURE_RSB_CTXSW
82e7: __switch_to_asm+0x27 | jmp 0x8312 <__switch_to_asm+0x52> | NOP1 | mov $0x10,%r12
82e8: __switch_to_asm+0x28 | | NOP1 |
82e9: __switch_to_asm+0x29 | NOP1 | callq 0x82ef <__switch_to_asm+0x2f> |
82ea: __switch_to_asm+0x2a | NOP1 | |
82eb: __switch_to_asm+0x2b | NOP1 | |
82ec: __switch_to_asm+0x2c | NOP1 | |
82ed: __switch_to_asm+0x2d | NOP1 | |
82ee: __switch_to_asm+0x2e | NOP1 | int3 | callq 0x82f4 <__switch_to_asm+0x34>
82ef: __switch_to_asm+0x2f | NOP1 | add $0x8,%rsp |
82f0: __switch_to_asm+0x30 | NOP1 | |
82f1: __switch_to_asm+0x31 | NOP1 | |
82f2: __switch_to_asm+0x32 | NOP1 | |
82f3: __switch_to_asm+0x33 | NOP1 | lfence | int3
82f4: __switch_to_asm+0x34 | NOP1 | | callq 0x82fa <__switch_to_asm+0x3a>
82f5: __switch_to_asm+0x35 | NOP1 | |
82f6: __switch_to_asm+0x36 | NOP1 | |
82f7: __switch_to_asm+0x37 | NOP1 | |
82f8: __switch_to_asm+0x38 | NOP1 | |
82f9: __switch_to_asm+0x39 | NOP1 | | int3
82fa: __switch_to_asm+0x3a | NOP1 | | add $0x10,%rsp
82fb: __switch_to_asm+0x3b | NOP1 | |
82fc: __switch_to_asm+0x3c | NOP1 | |
82fd: __switch_to_asm+0x3d | NOP1 | |
82fe: __switch_to_asm+0x3e | NOP1 | | dec %r12
82ff: __switch_to_asm+0x3f | NOP1 | |
8300: __switch_to_asm+0x40 | NOP1 | |
8301: __switch_to_asm+0x41 | NOP1 | | jne 0x82ee <__switch_to_asm+0x2e>
8302: __switch_to_asm+0x42 | NOP1 | |
8303: __switch_to_asm+0x43 | NOP1 | | lfence
8304: __switch_to_asm+0x44 | NOP1 | |
8305: __switch_to_asm+0x45 | NOP1 | |
8306: __switch_to_asm+0x46 | NOP1 | | movq $0xffffffffffffffff,%gs:0x0(%rip) # 0x20b <__x86_call_depth>
8307: __switch_to_asm+0x47 | NOP1 | |
8308: __switch_to_asm+0x48 | NOP1 | |
8309: __switch_to_asm+0x49 | NOP1 | |
830a: __switch_to_asm+0x4a | NOP1 | |
830b: __switch_to_asm+0x4b | NOP1 | |
830c: __switch_to_asm+0x4c | NOP1 | |
830d: __switch_to_asm+0x4d | NOP1 | |
830e: __switch_to_asm+0x4e | NOP1 | |
830f: __switch_to_asm+0x4f | NOP1 | |
8310: __switch_to_asm+0x50 | NOP1 | |
8311: __switch_to_asm+0x51 | NOP1 | |
8312: __switch_to_asm+0x52 pop %r15
8314: __switch_to_asm+0x54 pop %r14
8316: __switch_to_asm+0x56 pop %r13
8318: __switch_to_asm+0x58 pop %r12
831a: __switch_to_asm+0x5a pop %rbx
831b: __switch_to_asm+0x5b pop %rbp
831c: __switch_to_asm+0x5c jmpq 0x8321 <__switch_to>
alex.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: odd objtool 'unreachable instruction' warning
2025-10-29 16:17 ` Linus Torvalds
2025-10-30 9:51 ` Alexandre Chartre
@ 2025-11-05 19:51 ` David Sterba
2025-11-07 0:05 ` Linus Torvalds
1 sibling, 1 reply; 14+ messages in thread
From: David Sterba @ 2025-11-05 19:51 UTC (permalink / raw)
To: Linus Torvalds
Cc: Josh Poimboeuf, Peter Zijlstra, Linux Kernel Mailing List,
Alexandre Chartre
On Wed, Oct 29, 2025 at 09:17:36AM -0700, Linus Torvalds wrote:
> Now, if objtool disassembly ever gives the kind of good disassembly
> that 'perf report' does - with branches turned into arrows etc - that
> would be lovely, and I'd replace my hacky alias in a heartbeat.
FWIW, 'objdump --visualize-jumps' shows the arrows, with
'--visualize-jumps=extended-color' they're colored and it's readable.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: odd objtool 'unreachable instruction' warning
2025-11-05 19:51 ` David Sterba
@ 2025-11-07 0:05 ` Linus Torvalds
0 siblings, 0 replies; 14+ messages in thread
From: Linus Torvalds @ 2025-11-07 0:05 UTC (permalink / raw)
To: dsterba
Cc: Josh Poimboeuf, Peter Zijlstra, Linux Kernel Mailing List,
Alexandre Chartre
On Wed, 5 Nov 2025 at 11:51, David Sterba <dsterba@suse.cz> wrote:
>
> FWIW, 'objdump --visualize-jumps' shows the arrows, with
> '--visualize-jumps=extended-color' they're colored and it's readable.
Hey, hey, hey - when did that happen? I've clearly not read the
man-page in ages.
Because that's _almost_ lovely. If objdump just handled relocations
more sanely, it would be really nice.
[ Looking around in the binutil sources, it's been there for five
years, and was in the 2.34 release ]
Because of the insanity of relocation handling, it shows regular
"call" and "jmp" instructions out of the function as branches to the
next instruction (because that's the non-relocated info).
And because it puts the visualized things in front of the
instructions, the end result has strange random indentation depending
on complexity of the function.
But it's tantalizingly close to great, and it's certainly an
improvement over not having that at all.
Added to my "disassemble" alias.
Linus
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: odd objtool 'unreachable instruction' warning
2025-10-28 19:29 odd objtool 'unreachable instruction' warning Linus Torvalds
2025-10-29 0:21 ` Josh Poimboeuf
@ 2025-10-29 9:56 ` David Laight
2025-10-29 10:05 ` Peter Zijlstra
2025-10-29 14:05 ` Peter Zijlstra
2025-11-01 7:05 ` Ingo Molnar
2 siblings, 2 replies; 14+ messages in thread
From: David Laight @ 2025-10-29 9:56 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Josh Poimboeuf, Peter Zijlstra, Linux Kernel Mailing List
On Tue, 28 Oct 2025 12:29:11 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:
> Josh, Peter,
> due to another entirely unrelated discussion, I ended up resurrecting
> my "make asm readable" patch that I have had in my local tree when I
> want to look at the actual generated code for user accesses.
>
> That is a local hack that just removes the alternative noise for the
> common ops, so that I actually see the fences and clac/stac
> instructions as such, instead of seeing them as nops in the object
> file or as horrible noise in the assembler output.
I've toyed with using explicit nop sequences that would be identifiable
as stac, clac and lfence.
At least that would tell you which is which.
Since the flags can be trashed there are plenty to choose from.
(eg all the cmpb $n,%reg if you don't mind a false dependency.)
David
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: odd objtool 'unreachable instruction' warning
2025-10-29 9:56 ` David Laight
@ 2025-10-29 10:05 ` Peter Zijlstra
2025-10-29 13:00 ` David Laight
2025-10-29 14:05 ` Peter Zijlstra
1 sibling, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2025-10-29 10:05 UTC (permalink / raw)
To: David Laight; +Cc: Linus Torvalds, Josh Poimboeuf, Linux Kernel Mailing List
On Wed, Oct 29, 2025 at 09:56:38AM +0000, David Laight wrote:
> On Tue, 28 Oct 2025 12:29:11 -0700
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> > Josh, Peter,
> > due to another entirely unrelated discussion, I ended up resurrecting
> > my "make asm readable" patch that I have had in my local tree when I
> > want to look at the actual generated code for user accesses.
> >
> > That is a local hack that just removes the alternative noise for the
> > common ops, so that I actually see the fences and clac/stac
> > instructions as such, instead of seeing them as nops in the object
> > file or as horrible noise in the assembler output.
>
> I've toyed with using explicit nop sequences that would be identifiable
> as stac, clac and lfence.
>
> At least that would tell you which is which.
>
> Since the flags can be trashed there are plenty to choose from.
> (eg all the cmpb $n,%reg if you don't mind a false dependency.)
As long as you ensure that insn_is_nop() recognises it as such, they
won't actually ever get ran after alternative patching, since they'll be
rewritten in canonical nops by optimize_nops().
(be sure to use the tip/master branch, since I unified the various
is_nop implementations).
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: odd objtool 'unreachable instruction' warning
2025-10-29 10:05 ` Peter Zijlstra
@ 2025-10-29 13:00 ` David Laight
0 siblings, 0 replies; 14+ messages in thread
From: David Laight @ 2025-10-29 13:00 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Linus Torvalds, Josh Poimboeuf, Linux Kernel Mailing List
On Wed, 29 Oct 2025 11:05:33 +0100
Peter Zijlstra <peterz@infradead.org> wrote:
> On Wed, Oct 29, 2025 at 09:56:38AM +0000, David Laight wrote:
> > On Tue, 28 Oct 2025 12:29:11 -0700
> > Linus Torvalds <torvalds@linux-foundation.org> wrote:
> >
> > > Josh, Peter,
> > > due to another entirely unrelated discussion, I ended up resurrecting
> > > my "make asm readable" patch that I have had in my local tree when I
> > > want to look at the actual generated code for user accesses.
> > >
> > > That is a local hack that just removes the alternative noise for the
> > > common ops, so that I actually see the fences and clac/stac
> > > instructions as such, instead of seeing them as nops in the object
> > > file or as horrible noise in the assembler output.
> >
> > I've toyed with using explicit nop sequences that would be identifiable
> > as stac, clac and lfence.
> >
> > At least that would tell you which is which.
> >
> > Since the flags can be trashed there are plenty to choose from.
> > (eg all the cmpb $n,%reg if you don't mind a false dependency.)
>
> As long as you ensure that insn_is_nop() recognises it as such, they
> won't actually ever get ran after alternative patching, since they'll be
> rewritten in canonical nops by optimize_nops().
Ah - I've wondered about the 'nop; nop; nop' being slightly sub-optimal.
But it is hard enough to see what happens to the alternative that is
selected never mind the ones that aren't.
Instructions that change the flags can't be changed to nops.
So possibly the best options are the 0f-1f-cx (for x 0..7).
So 'long_nop %reg' rather than 0f-1f-00 'long_nop [%rax]'.
(But I've not checked the full rules for the addressing modes of
'long_nop'.)
David
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: odd objtool 'unreachable instruction' warning
2025-10-29 9:56 ` David Laight
2025-10-29 10:05 ` Peter Zijlstra
@ 2025-10-29 14:05 ` Peter Zijlstra
1 sibling, 0 replies; 14+ messages in thread
From: Peter Zijlstra @ 2025-10-29 14:05 UTC (permalink / raw)
To: David Laight; +Cc: Linus Torvalds, Josh Poimboeuf, Linux Kernel Mailing List
On Wed, Oct 29, 2025 at 09:56:38AM +0000, David Laight wrote:
> On Tue, 28 Oct 2025 12:29:11 -0700
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> > Josh, Peter,
> > due to another entirely unrelated discussion, I ended up resurrecting
> > my "make asm readable" patch that I have had in my local tree when I
> > want to look at the actual generated code for user accesses.
> >
> > That is a local hack that just removes the alternative noise for the
> > common ops, so that I actually see the fences and clac/stac
> > instructions as such, instead of seeing them as nops in the object
> > file or as horrible noise in the assembler output.
>
> I've toyed with using explicit nop sequences that would be identifiable
> as stac, clac and lfence.
>
> At least that would tell you which is which.
>
> Since the flags can be trashed there are plenty to choose from.
> (eg all the cmpb $n,%reg if you don't mind a false dependency.)
things like:
mov %reg, %reg
are 3 bytes on x86_64 and otherwise unused, since the canonical NOP for
>=3 bytes is NOPL.
So if you do something like:
mov %rax, %rax -- stac
mov %rcx, %rcx -- clac
that should be readily recognisable. Trouble is, this doesn't readily
work for 32bit, so you'll need to complicate the code to pick different
NOPs there. I suppose you can CS-prefix stuff the same MOVs, like:
cs mov %eax, %eax
cs mov %ecx, %ecx
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: odd objtool 'unreachable instruction' warning
2025-10-28 19:29 odd objtool 'unreachable instruction' warning Linus Torvalds
2025-10-29 0:21 ` Josh Poimboeuf
2025-10-29 9:56 ` David Laight
@ 2025-11-01 7:05 ` Ingo Molnar
2025-11-01 16:59 ` Linus Torvalds
2 siblings, 1 reply; 14+ messages in thread
From: Ingo Molnar @ 2025-11-01 7:05 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Josh Poimboeuf, Peter Zijlstra, Linux Kernel Mailing List
* Linus Torvalds <torvalds@linux-foundation.org> wrote:
> That is a local hack that just removes the alternative noise for the
> common ops, so that I actually see the fences and clac/stac
> instructions as such, instead of seeing them as nops in the object
> file or as horrible noise in the assembler output.
>
> So that patch is not something I'd ever commit in general, but it's
> really useful for checking code generation [...]
> Subject: [PATCH] Avoid alternative assembler noise for common ops
>
> Just hardcode the ones I have for SMP and LFENCE
> ---
> arch/x86/include/asm/alternative.h | 7 +------
> arch/x86/include/asm/barrier.h | 2 +-
> arch/x86/include/asm/smap.h | 18 ++++++------------
> arch/x86/lib/getuser.S | 2 +-
> 4 files changed, 9 insertions(+), 20 deletions(-)
>
> diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
> index 15bc07a5ebb3..281c823a869e 100644
> --- a/arch/x86/include/asm/alternative.h
> +++ b/arch/x86/include/asm/alternative.h
> @@ -42,12 +42,7 @@
> */
>
> #ifdef CONFIG_SMP
> -#define LOCK_PREFIX_HERE \
> - ".pushsection .smp_locks,\"a\"\n" \
> - ".balign 4\n" \
> - ".long 671f - .\n" /* offset */ \
> - ".popsection\n" \
> - "671:"
> +#define LOCK_PREFIX_HERE
>
> #define LOCK_PREFIX LOCK_PREFIX_HERE "\n\tlock "
>
> diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
> index db70832232d4..2b911c88467b 100644
> --- a/arch/x86/include/asm/barrier.h
> +++ b/arch/x86/include/asm/barrier.h
> @@ -45,7 +45,7 @@
> __mask; })
>
> /* Prevent speculative execution past this barrier. */
> -#define barrier_nospec() alternative("", "lfence", X86_FEATURE_LFENCE_RDTSC)
> +#define barrier_nospec() asm volatile("lfence":::"memory")
So we could actually integrate much of this and avoid all the
alternatives noise by keying it off CONFIG_X86_NATIVE_CPU and passing
in two new config flags whether /proc/cpuinfo in the build environment
has X86_FEATURE_LFENCE_RDTSC and X86_FEATURE_SMAP ("smap").
CONFIG_X86_NATIVE_CPU is not supposed to boot/work on any other CPU
than the host CPU, so CPU-variant runtime patchery is pointless there.
Put differently: we could make alternative() use build-provided CPU
flags if CONFIG_X86_NATIVE_CPU=y - or at least introduce an
alternative() variant that can be used in the simpler cases.
We could even avoid .smp_locks if the build environment is SMP, or even
just hardcode the LOCK prefix on CONFIG_X86_NATIVE_CPU, there's no
modern UP processor in existence that we support and care about, and
patching out the prefix should be of dubious value anyway on anything
supported. LOCK used to have horrible overhead on some ancient CPUs but
that's it - it should be ignored on any x86 CPU built in the last 20
years?
Ie. I think we could integrate 100% of the functionality of your
disassembly readability hack-patch. :-)
Thanks,
Ingo
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: odd objtool 'unreachable instruction' warning
2025-11-01 7:05 ` Ingo Molnar
@ 2025-11-01 16:59 ` Linus Torvalds
0 siblings, 0 replies; 14+ messages in thread
From: Linus Torvalds @ 2025-11-01 16:59 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Josh Poimboeuf, Peter Zijlstra, Linux Kernel Mailing List
On Sat, 1 Nov 2025 at 00:05, Ingo Molnar <mingo@kernel.org> wrote:
>
> So we could actually integrate much of this and avoid all the
> alternatives noise by keying it off CONFIG_X86_NATIVE_CPU and passing
> in two new config flags whether /proc/cpuinfo in the build environment
> has X86_FEATURE_LFENCE_RDTSC and X86_FEATURE_SMAP ("smap").
So I think I mentioned in one other email (maybe another thread) that
I did at one point have a slightly bigger patch that then had more
#ifdef's etc to make it configurable and thus upstreamable.
The problem was that it just made the source uglier. The patch simply
looked pretty hacky, and it's not like it really improves anything for
anybody sane: the actual code at runtime ends up being identical.
(It might improve things for really old machines that don't support
SMAP at all, in that they'd avoid the nop instructions, but I don't
feel that is something worth optimizing for).
Now, if there are enough people that actually look at the assembler
code to make this worthwhile, that would be one thing, but I suspect
there's a handful of odd people who - like me - are then happy to have
local crud that doesn't affect anybody else.
But hey - if you think this would be good upstream, and you (and by
"you" I mean "the x86 maintainers" in general) don't mind having a
slightly more complicated ugly header file, I obviously will not
complain at all.
(I also at one point was hoping there was some way to make the
<asm/alternative.h> code itself have a "assume these flags are
set/clear" model, but that looks unworkable: you'd need a more capable
preprocessor than the regular stupid C preprocessor - something that
could do conditionals inside the macros).
Linus
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2025-11-07 0:05 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-28 19:29 odd objtool 'unreachable instruction' warning Linus Torvalds
2025-10-29 0:21 ` Josh Poimboeuf
2025-10-29 0:59 ` Linus Torvalds
2025-10-29 1:50 ` Josh Poimboeuf
2025-10-29 16:17 ` Linus Torvalds
2025-10-30 9:51 ` Alexandre Chartre
2025-11-05 19:51 ` David Sterba
2025-11-07 0:05 ` Linus Torvalds
2025-10-29 9:56 ` David Laight
2025-10-29 10:05 ` Peter Zijlstra
2025-10-29 13:00 ` David Laight
2025-10-29 14:05 ` Peter Zijlstra
2025-11-01 7:05 ` Ingo Molnar
2025-11-01 16:59 ` Linus Torvalds
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox