public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* odd objtool 'unreachable instruction' warning
@ 2025-10-28 19:29 Linus Torvalds
  2025-10-29  0:21 ` Josh Poimboeuf
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Linus Torvalds @ 2025-10-28 19:29 UTC (permalink / raw)
  To: Josh Poimboeuf, Peter Zijlstra; +Cc: Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 1169 bytes --]

Josh, Peter,
 due to another entirely unrelated discussion, I ended up resurrecting
my "make asm readable" patch that I have had in my local tree when I
want to look at the actual generated code for user accesses.

That is a local hack that just removes the alternative noise for the
common ops, so that I actually see the fences and clac/stac
instructions as such, instead of seeing them as nops in the object
file or as horrible noise in the assembler output.

So that patch is not something I'd ever commit in general, but it's
really useful for checking code generation - but it results in odd
objdump warnings these days (I say "these days", because I've used
that patch locally over the years, and the objdump warning hasn't
always been there).

It's a pretty odd warning, because the code looks fine to me, but I
might be missing something obvious.

Anyway, this is clearly not a big and urgent problem, but I'd love for
you to take a look. I'm attaching the patch I use so  you can see what
I mean.. Any ideas what triggers that warning? Because I'd love to
keep this patch in my local tree without having objtool be upset with
me....

                   Linus

[-- Attachment #2: 0001-Avoid-alternative-assembler-noise-for-common-ops.patch --]
[-- Type: text/x-patch, Size: 3214 bytes --]

From 97c16bd1f44287f95f1be82185631d654a6022d8 Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed, 3 Apr 2024 10:33:12 -0700
Subject: [PATCH] Avoid alternative assembler noise for common ops

Just hardcode the ones I have for SMP and LFENCE
---
 arch/x86/include/asm/alternative.h |  7 +------
 arch/x86/include/asm/barrier.h     |  2 +-
 arch/x86/include/asm/smap.h        | 18 ++++++------------
 arch/x86/lib/getuser.S             |  2 +-
 4 files changed, 9 insertions(+), 20 deletions(-)

diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 15bc07a5ebb3..281c823a869e 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -42,12 +42,7 @@
  */
 
 #ifdef CONFIG_SMP
-#define LOCK_PREFIX_HERE \
-		".pushsection .smp_locks,\"a\"\n"	\
-		".balign 4\n"				\
-		".long 671f - .\n" /* offset */		\
-		".popsection\n"				\
-		"671:"
+#define LOCK_PREFIX_HERE
 
 #define LOCK_PREFIX LOCK_PREFIX_HERE "\n\tlock "
 
diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index db70832232d4..2b911c88467b 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -45,7 +45,7 @@
 	__mask; })
 
 /* Prevent speculative execution past this barrier. */
-#define barrier_nospec() alternative("", "lfence", X86_FEATURE_LFENCE_RDTSC)
+#define barrier_nospec() asm volatile("lfence":::"memory")
 
 #define __dma_rmb()	barrier()
 #define __dma_wmb()	barrier()
diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
index 4f84d421d1cf..613b24a58b26 100644
--- a/arch/x86/include/asm/smap.h
+++ b/arch/x86/include/asm/smap.h
@@ -15,24 +15,20 @@
 
 #ifdef __ASSEMBLER__
 
-#define ASM_CLAC \
-	ALTERNATIVE "", "clac", X86_FEATURE_SMAP
+#define ASM_CLAC "clac"
 
-#define ASM_STAC \
-	ALTERNATIVE "", "stac", X86_FEATURE_SMAP
+#define ASM_STAC "stac"
 
 #else /* __ASSEMBLER__ */
 
 static __always_inline void clac(void)
 {
-	/* Note: a barrier is implicit in alternative() */
-	alternative("", "clac", X86_FEATURE_SMAP);
+	asm volatile("clac": : :"memory");
 }
 
 static __always_inline void stac(void)
 {
-	/* Note: a barrier is implicit in alternative() */
-	alternative("", "stac", X86_FEATURE_SMAP);
+	asm volatile("stac": : :"memory");
 }
 
 static __always_inline unsigned long smap_save(void)
@@ -58,10 +54,8 @@ static __always_inline void smap_restore(unsigned long flags)
 }
 
 /* These macros can be used in asm() statements */
-#define ASM_CLAC \
-	ALTERNATIVE("", "clac", X86_FEATURE_SMAP)
-#define ASM_STAC \
-	ALTERNATIVE("", "stac", X86_FEATURE_SMAP)
+#define ASM_CLAC "clac"
+#define ASM_STAC "stac"
 
 #define ASM_CLAC_UNSAFE \
 	ALTERNATIVE("", ANNOTATE_IGNORE_ALTERNATIVE "clac", X86_FEATURE_SMAP)
diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S
index 9d5654b8a72a..1c5952c637ce 100644
--- a/arch/x86/lib/getuser.S
+++ b/arch/x86/lib/getuser.S
@@ -37,7 +37,7 @@
 #include <asm/smap.h>
 #include <asm/runtime-const.h>
 
-#define ASM_BARRIER_NOSPEC ALTERNATIVE "", "lfence", X86_FEATURE_LFENCE_RDTSC
+#define ASM_BARRIER_NOSPEC lfence
 
 .macro check_range size:req
 .if IS_ENABLED(CONFIG_X86_64)
-- 
2.51.1.534.g6d2b4bf2f4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: odd objtool 'unreachable instruction' warning
  2025-10-28 19:29 odd objtool 'unreachable instruction' warning Linus Torvalds
@ 2025-10-29  0:21 ` Josh Poimboeuf
  2025-10-29  0:59   ` Linus Torvalds
  2025-10-29  9:56 ` David Laight
  2025-11-01  7:05 ` Ingo Molnar
  2 siblings, 1 reply; 14+ messages in thread
From: Josh Poimboeuf @ 2025-10-29  0:21 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Peter Zijlstra, Linux Kernel Mailing List, Alexandre Chartre

On Tue, Oct 28, 2025 at 12:29:11PM -0700, Linus Torvalds wrote:
> Josh, Peter,
>  due to another entirely unrelated discussion, I ended up resurrecting
> my "make asm readable" patch that I have had in my local tree when I
> want to look at the actual generated code for user accesses.
> 
> That is a local hack that just removes the alternative noise for the
> common ops, so that I actually see the fences and clac/stac
> instructions as such, instead of seeing them as nops in the object
> file or as horrible noise in the assembler output.
> 
> So that patch is not something I'd ever commit in general, but it's
> really useful for checking code generation - but it results in odd
> objdump warnings these days (I say "these days", because I've used
> that patch locally over the years, and the objdump warning hasn't
> always been there).
> 
> It's a pretty odd warning, because the code looks fine to me, but I
> might be missing something obvious.
> 
> Anyway, this is clearly not a big and urgent problem, but I'd love for
> you to take a look. I'm attaching the patch I use so  you can see what
> I mean.. Any ideas what triggers that warning? Because I'd love to
> keep this patch in my local tree without having objtool be upset with
> me....

Ironically I think this "bug" was introduced by a patch whose goal was
to fix a regression in STAC/CLAC code readability:

  2d12c6fb7875 ("objtool: Remove ANNOTATE_IGNORE_ALTERNATIVE from CLAC/STAC")

See the diff below, hopefully that fixes things for you?  I can post a
proper patch in a bit.

On a related note, it would be nice if we could make that codegen more
readable...  Here were a few formats I had played with before, any
thoughts?

Option 1:

# <ALTERNATIVE>
# ORIGINAL:
771:	nop
772:	.skip -(((775f-774f)-(772b-771b)) > 0) * ((775f-774f)-(772b-771b)), 0x90; 773: .pushsection .altinstructions, "a"; .long 771b - .; .long 774f - .; .4byte ( 9*32+20); .byte 773b-771b; .byte 775f-774f; .popsection; .pushsection .altinstr_replacement, "ax"
# REPLACEMENT: [X86_FEATURE_SMAP]
774:	stac
775:	.popsection
# </ALTERNATIVE>

Option 2:

# <ALTERNATIVE>
# ORIGINAL:    nop
# REPLACEMENT: stac [X86_FEATURE_SMAP]
771:	nop
772:	.skip -(((775f-774f)-(772b-771b)) > 0) * ((775f-774f)-(772b-771b)), 0x90; 773: .pushsection .altinstructions, "a"; .long 771b - .; .long 774f - .; .4byte ( 9*32+20); .byte 773b-771b; .byte 775f-774f; .popsection; .pushsection .altinstr_replacement, "ax"
774:	stac
775:	.popsection
# </ALTERNATIVE>

Note there's also an objtool "disas" feature Alexandre is working on
which will show the disassembly annotated with runtime patching info
(alternatives, static branches, etc):

  https://lore.kernel.org/20250619145659.1377970-1-alexandre.chartre@oracle.com


diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 620854fdaaf63..9004fbc067693 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -3516,8 +3516,11 @@ static bool skip_alt_group(struct instruction *insn)
 {
 	struct instruction *alt_insn = insn->alts ? insn->alts->insn : NULL;
 
+	if (!insn->alt_group)
+		return false;
+
 	/* ANNOTATE_IGNORE_ALTERNATIVE */
-	if (insn->alt_group && insn->alt_group->ignore)
+	if (insn->alt_group->ignore)
 		return true;
 
 	/*

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: odd objtool 'unreachable instruction' warning
  2025-10-29  0:21 ` Josh Poimboeuf
@ 2025-10-29  0:59   ` Linus Torvalds
  2025-10-29  1:50     ` Josh Poimboeuf
  0 siblings, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2025-10-29  0:59 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Peter Zijlstra, Linux Kernel Mailing List, Alexandre Chartre

On Tue, 28 Oct 2025 at 17:21, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
>
> See the diff below, hopefully that fixes things for you?

Bingo, that patch works for me.

> On a related note, it would be nice if we could make that codegen more
> readable...  Here were a few formats I had played with before, any
> thoughts?

I don't think any of this is ever going to be remotely readable,
because of the whole horrible section noise.

If I recall correctly, one of the ideas you had was to use macros to
make things more legible, but that sadly didn't work with clang or
something like that?

Anyway, I certainly agree that we can make it *slightly* better, and
I'm not objecting to trying to do some incremental improvement, but
it's just always going to be pretty bad, and so I'd probably always
end up doing my ugly hack anyway.

Because for the assembler readability case, I do think the "avoid
alternatives" is just fundamentally nicer and simpler.

Now, at one point I tried to do a build config option to just have the
"assume feature XYZ" - so that you could just have a config option
that turns on SMAP and other features unconditionally.

That made the asm obviously look fine, but my simplistic patch at the
time made the source code itself so much less legible that I trashed
it. It just ended up being a mess of ifdefs.

(I did think that maybe I could make my patch cleaner with a few
helper macros, but it all ended up feeling like I was overthinking
things.  I suspect very few people actually care about that legible
assembler thing. End result: I just ended up with the very simple
local hack in my tree)

> Note there's also an objtool "disas" feature Alexandre is working on
> which will show the disassembly annotated with runtime patching info
> (alternatives, static branches, etc):

Yeah, that would be a good feature to have, although I wouldn't
obviate the asm cleanup for me.

Depending on what I am looking for, I end up looking at either the asm
output and the objdump output (or at both) because they each have
their good and bad parts.

Sometimes I *want* the comments that gcc in particular adds to the
assembler, and sometimes I prefer the "raw objdump" format.

Oh, and Alexandre: if you are working on improving objdump - are you
perhaps also looking at making the relocation output saner?

Because I absolutely detest the odd relocation output format. I use it
occasionally, but it's just crazy to see things like

<delayed_put_task_struct>:
        ...
        call   <delayed_put_task_struct+0x1a>
                        R_X86_64_PLT32  rethook_flush_task-0x4

when any *sane* tool would just output the simple

        call   rethook_flush_task

instead.  Because as it is, the line without the relocation info is
useless, and then the separate relocation line (with -r) is just
crazy.

I kind of do understand why it happens - few people care, objdump
supports many different instruction formats, the instruction decoding
is a separate thing from the relocation logic etc etc. But if somebody
is looking at making objdump generate prettier output, I'd love for
this to be one of the things fixed.

(And yes, I see this horrid output because I run objdump on individual
object files before linking, because it's *so* much simpler and
quicker)

Again - the real problem is that I'm just doing odd things that are
probably unusual enough that very few people care. And I've done it so
long that I can deal with it and read the horrors when necessary.

So it's not like this is a huge problem, it's just that I go "Oh Gods,
objdump output is not meant for humans" every time I do this.

                Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: odd objtool 'unreachable instruction' warning
  2025-10-29  0:59   ` Linus Torvalds
@ 2025-10-29  1:50     ` Josh Poimboeuf
  2025-10-29 16:17       ` Linus Torvalds
  0 siblings, 1 reply; 14+ messages in thread
From: Josh Poimboeuf @ 2025-10-29  1:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Peter Zijlstra, Linux Kernel Mailing List, Alexandre Chartre

On Tue, Oct 28, 2025 at 05:59:01PM -0700, Linus Torvalds wrote:
> On Tue, 28 Oct 2025 at 17:21, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> >
> > See the diff below, hopefully that fixes things for you?
> 
> Bingo, that patch works for me.

Thanks, let me run that through some testing and then I'll post a proper
patch.

> > On a related note, it would be nice if we could make that codegen more
> > readable...  Here were a few formats I had played with before, any
> > thoughts?
> 
> I don't think any of this is ever going to be remotely readable,
> because of the whole horrible section noise.
> 
> If I recall correctly, one of the ideas you had was to use macros to
> make things more legible, but that sadly didn't work with clang or
> something like that?

Right.  Those macros were actually pretty nice.  They worked great with
GCC but Clang wouldn't cooperate.  Maybe I can talk to some Clang people
to try to get that working.

> > Note there's also an objtool "disas" feature Alexandre is working on
> > which will show the disassembly annotated with runtime patching info
> > (alternatives, static branches, etc):
> 
> Yeah, that would be a good feature to have, although I wouldn't
> obviate the asm cleanup for me.
> 
> Depending on what I am looking for, I end up looking at either the asm
> output and the objdump output (or at both) because they each have
> their good and bad parts.
> 
> Sometimes I *want* the comments that gcc in particular adds to the
> assembler, and sometimes I prefer the "raw objdump" format.
> 
> Oh, and Alexandre: if you are working on improving objdump - are you
> perhaps also looking at making the relocation output saner?

Just to clarify, Alexandre is working on objTOOL, not objDUMP :-)

> Because I absolutely detest the odd relocation output format. I use it
> occasionally, but it's just crazy to see things like
> 
> <delayed_put_task_struct>:
>         ...
>         call   <delayed_put_task_struct+0x1a>
>                         R_X86_64_PLT32  rethook_flush_task-0x4
> 
> when any *sane* tool would just output the simple
> 
>         call   rethook_flush_task
> 
> instead.  Because as it is, the line without the relocation info is
> useless, and then the separate relocation line (with -r) is just
> crazy.

FWIW, this can be slightly improved by adding '-w' (objdump -drw) which
at least puts the function name on the same line:

     call   34d <delayed_put_task_struct+0x1d>       349: R_X86_64_PLT32     rethook_flush_task-0x4

The good news is that we will have the ability to print things however
we want with "objtool disas".

-- 
Josh

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: odd objtool 'unreachable instruction' warning
  2025-10-28 19:29 odd objtool 'unreachable instruction' warning Linus Torvalds
  2025-10-29  0:21 ` Josh Poimboeuf
@ 2025-10-29  9:56 ` David Laight
  2025-10-29 10:05   ` Peter Zijlstra
  2025-10-29 14:05   ` Peter Zijlstra
  2025-11-01  7:05 ` Ingo Molnar
  2 siblings, 2 replies; 14+ messages in thread
From: David Laight @ 2025-10-29  9:56 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Josh Poimboeuf, Peter Zijlstra, Linux Kernel Mailing List

On Tue, 28 Oct 2025 12:29:11 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> Josh, Peter,
>  due to another entirely unrelated discussion, I ended up resurrecting
> my "make asm readable" patch that I have had in my local tree when I
> want to look at the actual generated code for user accesses.
> 
> That is a local hack that just removes the alternative noise for the
> common ops, so that I actually see the fences and clac/stac
> instructions as such, instead of seeing them as nops in the object
> file or as horrible noise in the assembler output.

I've toyed with using explicit nop sequences that would be identifiable
as stac, clac and lfence.

At least that would tell you which is which.

Since the flags can be trashed there are plenty to choose from.
(eg all the cmpb $n,%reg if you don't mind a false dependency.)

	David

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: odd objtool 'unreachable instruction' warning
  2025-10-29  9:56 ` David Laight
@ 2025-10-29 10:05   ` Peter Zijlstra
  2025-10-29 13:00     ` David Laight
  2025-10-29 14:05   ` Peter Zijlstra
  1 sibling, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2025-10-29 10:05 UTC (permalink / raw)
  To: David Laight; +Cc: Linus Torvalds, Josh Poimboeuf, Linux Kernel Mailing List

On Wed, Oct 29, 2025 at 09:56:38AM +0000, David Laight wrote:
> On Tue, 28 Oct 2025 12:29:11 -0700
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> > Josh, Peter,
> >  due to another entirely unrelated discussion, I ended up resurrecting
> > my "make asm readable" patch that I have had in my local tree when I
> > want to look at the actual generated code for user accesses.
> > 
> > That is a local hack that just removes the alternative noise for the
> > common ops, so that I actually see the fences and clac/stac
> > instructions as such, instead of seeing them as nops in the object
> > file or as horrible noise in the assembler output.
> 
> I've toyed with using explicit nop sequences that would be identifiable
> as stac, clac and lfence.
> 
> At least that would tell you which is which.
> 
> Since the flags can be trashed there are plenty to choose from.
> (eg all the cmpb $n,%reg if you don't mind a false dependency.)

As long as you ensure that insn_is_nop() recognises it as such, they
won't actually ever get ran after alternative patching, since they'll be
rewritten in canonical nops by optimize_nops().

(be sure to use the tip/master branch, since I unified the various
is_nop implementations).

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: odd objtool 'unreachable instruction' warning
  2025-10-29 10:05   ` Peter Zijlstra
@ 2025-10-29 13:00     ` David Laight
  0 siblings, 0 replies; 14+ messages in thread
From: David Laight @ 2025-10-29 13:00 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Linus Torvalds, Josh Poimboeuf, Linux Kernel Mailing List

On Wed, 29 Oct 2025 11:05:33 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> On Wed, Oct 29, 2025 at 09:56:38AM +0000, David Laight wrote:
> > On Tue, 28 Oct 2025 12:29:11 -0700
> > Linus Torvalds <torvalds@linux-foundation.org> wrote:
> >   
> > > Josh, Peter,
> > >  due to another entirely unrelated discussion, I ended up resurrecting
> > > my "make asm readable" patch that I have had in my local tree when I
> > > want to look at the actual generated code for user accesses.
> > > 
> > > That is a local hack that just removes the alternative noise for the
> > > common ops, so that I actually see the fences and clac/stac
> > > instructions as such, instead of seeing them as nops in the object
> > > file or as horrible noise in the assembler output.  
> > 
> > I've toyed with using explicit nop sequences that would be identifiable
> > as stac, clac and lfence.
> > 
> > At least that would tell you which is which.
> > 
> > Since the flags can be trashed there are plenty to choose from.
> > (eg all the cmpb $n,%reg if you don't mind a false dependency.)  
> 
> As long as you ensure that insn_is_nop() recognises it as such, they
> won't actually ever get ran after alternative patching, since they'll be
> rewritten in canonical nops by optimize_nops().

Ah - I've wondered about the 'nop; nop; nop' being slightly sub-optimal.
But it is hard enough to see what happens to the alternative that is
selected never mind the ones that aren't.

Instructions that change the flags can't be changed to nops.
So possibly the best options are the 0f-1f-cx (for x 0..7).
So 'long_nop %reg' rather than 0f-1f-00 'long_nop [%rax]'.

(But I've not checked the full rules for the addressing modes of
'long_nop'.)

	David



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: odd objtool 'unreachable instruction' warning
  2025-10-29  9:56 ` David Laight
  2025-10-29 10:05   ` Peter Zijlstra
@ 2025-10-29 14:05   ` Peter Zijlstra
  1 sibling, 0 replies; 14+ messages in thread
From: Peter Zijlstra @ 2025-10-29 14:05 UTC (permalink / raw)
  To: David Laight; +Cc: Linus Torvalds, Josh Poimboeuf, Linux Kernel Mailing List

On Wed, Oct 29, 2025 at 09:56:38AM +0000, David Laight wrote:
> On Tue, 28 Oct 2025 12:29:11 -0700
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> > Josh, Peter,
> >  due to another entirely unrelated discussion, I ended up resurrecting
> > my "make asm readable" patch that I have had in my local tree when I
> > want to look at the actual generated code for user accesses.
> > 
> > That is a local hack that just removes the alternative noise for the
> > common ops, so that I actually see the fences and clac/stac
> > instructions as such, instead of seeing them as nops in the object
> > file or as horrible noise in the assembler output.
> 
> I've toyed with using explicit nop sequences that would be identifiable
> as stac, clac and lfence.
> 
> At least that would tell you which is which.
> 
> Since the flags can be trashed there are plenty to choose from.
> (eg all the cmpb $n,%reg if you don't mind a false dependency.)

things like:

	mov %reg, %reg

are 3 bytes on x86_64 and otherwise unused, since the canonical NOP for
>=3 bytes is NOPL.

So if you do something like:

	mov %rax, %rax -- stac
	mov %rcx, %rcx -- clac

that should be readily recognisable. Trouble is, this doesn't readily
work for 32bit, so you'll need to complicate the code to pick different
NOPs there. I suppose you can CS-prefix stuff the same MOVs, like:

	cs mov %eax, %eax
	cs mov %ecx, %ecx


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: odd objtool 'unreachable instruction' warning
  2025-10-29  1:50     ` Josh Poimboeuf
@ 2025-10-29 16:17       ` Linus Torvalds
  2025-10-30  9:51         ` Alexandre Chartre
  2025-11-05 19:51         ` David Sterba
  0 siblings, 2 replies; 14+ messages in thread
From: Linus Torvalds @ 2025-10-29 16:17 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Peter Zijlstra, Linux Kernel Mailing List, Alexandre Chartre

On Tue, 28 Oct 2025 at 18:50, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
>
> Just to clarify, Alexandre is working on objTOOL, not objDUMP :-)

Bah. Not the first time I misread - or miswrite - one for the other.

I actually have a stupid alias for disassembly, because I can't ever
remember the long argument names:

   alias disassemble='objdump --disassemble --no-show-raw-insn --no-addresses'

so one day when somebody has an improved script around this, I'll just
change the alias.

(Those flags, btw, are really bad flags in general, because you can't
figure out where the jumps go when you don't see addresses. But my
use-case is generally to look at the compiler-generated asm for
"details", and then I use that objdump disassembly to get a filtered
view of what the code looks like without all the section annotations
and without all the other noise. So that alias is more about my
workflow than about any kind of sane use).

Now, if objtool disassembly ever gives the kind of good disassembly
that 'perf report' does - with branches turned into arrows etc - that
would be lovely, and I'd replace my hacky alias in a heartbeat.
> FWIW, this can be slightly improved by adding '-w' (objdump -drw) which
> at least puts the function name on the same line:
>
>      call   34d <delayed_put_task_struct+0x1d>       349: R_X86_64_PLT32     rethook_flush_task-0x4

Yeah, that's better, and I think I could massage it with some
scripting even more.

But I've thought about writing some simple scripting to make objdump
output clearer many times, and I never really end up caring enough.

So I think I'll wait for Alexandre to do that for me ;)

         Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: odd objtool 'unreachable instruction' warning
  2025-10-29 16:17       ` Linus Torvalds
@ 2025-10-30  9:51         ` Alexandre Chartre
  2025-11-05 19:51         ` David Sterba
  1 sibling, 0 replies; 14+ messages in thread
From: Alexandre Chartre @ 2025-10-30  9:51 UTC (permalink / raw)
  To: Linus Torvalds, Josh Poimboeuf
  Cc: alexandre.chartre, Peter Zijlstra, Linux Kernel Mailing List


On 10/29/25 17:17, Linus Torvalds wrote:
> On Tue, 28 Oct 2025 at 18:50, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
>>
>> Just to clarify, Alexandre is working on objTOOL, not objDUMP :-)
> 
> Bah. Not the first time I misread - or miswrite - one for the other.
> 
> I actually have a stupid alias for disassembly, because I can't ever
> remember the long argument names:
> 
>     alias disassemble='objdump --disassemble --no-show-raw-insn --no-addresses'
> 
> so one day when somebody has an improved script around this, I'll just
> change the alias.
> 
> (Those flags, btw, are really bad flags in general, because you can't
> figure out where the jumps go when you don't see addresses. But my
> use-case is generally to look at the compiler-generated asm for
> "details", and then I use that objdump disassembly to get a filtered
> view of what the code looks like without all the section annotations
> and without all the other noise. So that alias is more about my
> workflow than about any kind of sane use).
> 
> Now, if objtool disassembly ever gives the kind of good disassembly
> that 'perf report' does - with branches turned into arrows etc - that
> would be lovely, and I'd replace my hacky alias in a heartbeat.
>> FWIW, this can be slightly improved by adding '-w' (objdump -drw) which
>> at least puts the function name on the same line:
>>
>>       call   34d <delayed_put_task_struct+0x1d>       349: R_X86_64_PLT32     rethook_flush_task-0x4
> 
> Yeah, that's better, and I think I could massage it with some
> scripting even more.
> 
> But I've thought about writing some simple scripting to make objdump
> output clearer many times, and I never really end up caring enough.
> 
> So I think I'll wait for Alexandre to do that for me ;)
> 

I am currently working on a new version (V3) which is mostly done.
Alternatives are now disassemble side-by-side and reloc is handled.
I am finishing cleaning up patches and I will hopefully send them
soon.

Here are some examples:

Example with stac/clac:

  4e6b6:  read_ldt+0xb6      test   %rsi,%rsi
  4e6b9:  read_ldt+0xb9      jne    0x4e6e6 <read_ldt+0xe6>
  4e6bb:  read_ldt+0xbb    | <alternative.4e6bb> | X86_FEATURE_SMAP
  4e6bb:  read_ldt+0xbb	  | NOP1                | stac
  4e6bc:  read_ldt+0xbc	  | NOP1                |
  4e6bd:  read_ldt+0xbd	  | NOP1                |
  4e6be:  read_ldt+0xbe	    xor    %eax,%eax
  4e6c0:  read_ldt+0xc0    | <alternative.4e6c0>     | EXCEPTION                    | !X86_FEATURE_FSRS
  4e6c0:  read_ldt+0xc0    | rep stos %al,%es:(%rdi) | jmp    4e6c5 <read_ldt+0xc5> | callq  0x1707 <rep_stos_alternative>
  4e6c2:  read_ldt+0xc2    | NOP1                    |                              |
  4e6c3:  read_ldt+0xc3    | NOP1                    |                              |
  4e6c4:  read_ldt+0xc4    | NOP1                    |                              |
  4e6c5:  read_ldt+0xc5	  | <alternative.4e6c5> | X86_FEATURE_SMAP
  4e6c5:  read_ldt+0xc5	  | NOP1                | clac
  4e6c6:  read_ldt+0xc6	  | NOP1                |
  4e6c7:  read_ldt+0xc7	  | NOP1                |
  4e6c8:  read_ldt+0xc8	    test   %rcx,%rcx
  4e6cb:  read_ldt+0xcb	    jne    0x4e6e6 <read_ldt+0xe6>
  4e6cd:  read_ldt+0xcd	    mov    %rbp,%rdi
  4e6d0:  read_ldt+0xd0	    callq  0x4e6d5 <up_read>

__switch_to_asm disassembly:

$ ./tools/objtool/objtool --disas=__switch_to_asm --link vmlinux.o
__switch_to_asm:
   82c0:  __switch_to_asm          push   %rbp
   82c1:  __switch_to_asm+0x1      push   %rbx
   82c2:  __switch_to_asm+0x2	  push   %r12
   82c4:  __switch_to_asm+0x4	  push   %r13
   82c6:  __switch_to_asm+0x6	  push   %r14
   82c8:  __switch_to_asm+0x8	  push   %r15
   82ca:  __switch_to_asm+0xa	  mov    %rsp,0x1670(%rdi)
   82d1:  __switch_to_asm+0x11	  mov    0x1670(%rsi),%rsp
   82d8:  __switch_to_asm+0x18	  mov    0xad8(%rsi),%rbx
   82df:  __switch_to_asm+0x1f	  mov    %rbx,%gs:0x0(%rip)        # 0x82e7 <__stack_chk_guard>
   82e7:  __switch_to_asm+0x27	| <alternative.82e7>                   | !X86_FEATURE_ALWAYS                  | X86_FEATURE_RSB_CTXSW
   82e7:  __switch_to_asm+0x27	| jmp    0x8312 <__switch_to_asm+0x52> | NOP1                                 | mov    $0x10,%r12
   82e8:  __switch_to_asm+0x28	|                                      | NOP1                                 |
   82e9:  __switch_to_asm+0x29	| NOP1                                 | callq  0x82ef <__switch_to_asm+0x2f> |
   82ea:  __switch_to_asm+0x2a	| NOP1                                 |                                      |
   82eb:  __switch_to_asm+0x2b	| NOP1                                 |                                      |
   82ec:  __switch_to_asm+0x2c	| NOP1                                 |                                      |
   82ed:  __switch_to_asm+0x2d	| NOP1                                 |                                      |
   82ee:  __switch_to_asm+0x2e	| NOP1                                 | int3                                 | callq  0x82f4 <__switch_to_asm+0x34>
   82ef:  __switch_to_asm+0x2f	| NOP1                                 | add    $0x8,%rsp                     |
   82f0:  __switch_to_asm+0x30	| NOP1                                 |                                      |
   82f1:  __switch_to_asm+0x31	| NOP1                                 |                                      |
   82f2:  __switch_to_asm+0x32	| NOP1                                 |                                      |
   82f3:  __switch_to_asm+0x33	| NOP1                                 | lfence                               | int3
   82f4:  __switch_to_asm+0x34	| NOP1                                 |                                      | callq  0x82fa <__switch_to_asm+0x3a>
   82f5:  __switch_to_asm+0x35	| NOP1                                 |                                      |
   82f6:  __switch_to_asm+0x36	| NOP1                                 |                                      |
   82f7:  __switch_to_asm+0x37	| NOP1                                 |                                      |
   82f8:  __switch_to_asm+0x38	| NOP1                                 |                                      |
   82f9:  __switch_to_asm+0x39	| NOP1                                 |                                      | int3
   82fa:  __switch_to_asm+0x3a	| NOP1                                 |                                      | add    $0x10,%rsp
   82fb:  __switch_to_asm+0x3b	| NOP1                                 |                                      |
   82fc:  __switch_to_asm+0x3c	| NOP1                                 |                                      |
   82fd:  __switch_to_asm+0x3d	| NOP1                                 |                                      |
   82fe:  __switch_to_asm+0x3e	| NOP1                                 |                                      | dec    %r12
   82ff:  __switch_to_asm+0x3f	| NOP1                                 |                                      |
   8300:  __switch_to_asm+0x40	| NOP1                                 |                                      |
   8301:  __switch_to_asm+0x41	| NOP1                                 |                                      | jne    0x82ee <__switch_to_asm+0x2e>
   8302:  __switch_to_asm+0x42	| NOP1                                 |                                      |
   8303:  __switch_to_asm+0x43	| NOP1                                 |                                      | lfence
   8304:  __switch_to_asm+0x44	| NOP1                                 |                                      |
   8305:  __switch_to_asm+0x45	| NOP1                                 |                                      |
   8306:  __switch_to_asm+0x46	| NOP1                                 |                                      | movq   $0xffffffffffffffff,%gs:0x0(%rip)  # 0x20b <__x86_call_depth>
   8307:  __switch_to_asm+0x47	| NOP1                                 |                                      |
   8308:  __switch_to_asm+0x48	| NOP1                                 |                                      |
   8309:  __switch_to_asm+0x49	| NOP1                                 |                                      |
   830a:  __switch_to_asm+0x4a	| NOP1                                 |                                      |
   830b:  __switch_to_asm+0x4b	| NOP1                                 |                                      |
   830c:  __switch_to_asm+0x4c	| NOP1                                 |                                      |
   830d:  __switch_to_asm+0x4d	| NOP1                                 |                                      |
   830e:  __switch_to_asm+0x4e	| NOP1                                 |                                      |
   830f:  __switch_to_asm+0x4f	| NOP1                                 |                                      |
   8310:  __switch_to_asm+0x50	| NOP1                                 |                                      |
   8311:  __switch_to_asm+0x51	| NOP1                                 |                                      |
   8312:  __switch_to_asm+0x52	  pop    %r15
   8314:  __switch_to_asm+0x54	  pop    %r14
   8316:  __switch_to_asm+0x56	  pop    %r13
   8318:  __switch_to_asm+0x58	  pop    %r12
   831a:  __switch_to_asm+0x5a	  pop    %rbx
   831b:  __switch_to_asm+0x5b	  pop    %rbp
   831c:  __switch_to_asm+0x5c	  jmpq   0x8321 <__switch_to>

alex.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: odd objtool 'unreachable instruction' warning
  2025-10-28 19:29 odd objtool 'unreachable instruction' warning Linus Torvalds
  2025-10-29  0:21 ` Josh Poimboeuf
  2025-10-29  9:56 ` David Laight
@ 2025-11-01  7:05 ` Ingo Molnar
  2025-11-01 16:59   ` Linus Torvalds
  2 siblings, 1 reply; 14+ messages in thread
From: Ingo Molnar @ 2025-11-01  7:05 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Josh Poimboeuf, Peter Zijlstra, Linux Kernel Mailing List


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> That is a local hack that just removes the alternative noise for the 
> common ops, so that I actually see the fences and clac/stac 
> instructions as such, instead of seeing them as nops in the object 
> file or as horrible noise in the assembler output.
> 
> So that patch is not something I'd ever commit in general, but it's
> really useful for checking code generation [...]

> Subject: [PATCH] Avoid alternative assembler noise for common ops
> 
> Just hardcode the ones I have for SMP and LFENCE
> ---
>  arch/x86/include/asm/alternative.h |  7 +------
>  arch/x86/include/asm/barrier.h     |  2 +-
>  arch/x86/include/asm/smap.h        | 18 ++++++------------
>  arch/x86/lib/getuser.S             |  2 +-
>  4 files changed, 9 insertions(+), 20 deletions(-)
> 
> diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
> index 15bc07a5ebb3..281c823a869e 100644
> --- a/arch/x86/include/asm/alternative.h
> +++ b/arch/x86/include/asm/alternative.h
> @@ -42,12 +42,7 @@
>   */
>  
>  #ifdef CONFIG_SMP
> -#define LOCK_PREFIX_HERE \
> -		".pushsection .smp_locks,\"a\"\n"	\
> -		".balign 4\n"				\
> -		".long 671f - .\n" /* offset */		\
> -		".popsection\n"				\
> -		"671:"
> +#define LOCK_PREFIX_HERE

>  
>  #define LOCK_PREFIX LOCK_PREFIX_HERE "\n\tlock "
>  
> diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
> index db70832232d4..2b911c88467b 100644
> --- a/arch/x86/include/asm/barrier.h
> +++ b/arch/x86/include/asm/barrier.h
> @@ -45,7 +45,7 @@
>  	__mask; })
>  
>  /* Prevent speculative execution past this barrier. */
> -#define barrier_nospec() alternative("", "lfence", X86_FEATURE_LFENCE_RDTSC)
> +#define barrier_nospec() asm volatile("lfence":::"memory")

So we could actually integrate much of this and avoid all the 
alternatives noise by keying it off CONFIG_X86_NATIVE_CPU and passing 
in two new config flags whether /proc/cpuinfo in the build environment 
has X86_FEATURE_LFENCE_RDTSC and X86_FEATURE_SMAP ("smap").

CONFIG_X86_NATIVE_CPU is not supposed to boot/work on any other CPU 
than the host CPU, so CPU-variant runtime patchery is pointless there.

Put differently: we could make alternative() use build-provided CPU 
flags if CONFIG_X86_NATIVE_CPU=y - or at least introduce an 
alternative() variant that can be used in the simpler cases.

We could even avoid .smp_locks if the build environment is SMP, or even 
just hardcode the LOCK prefix on CONFIG_X86_NATIVE_CPU, there's no 
modern UP processor in existence that we support and care about, and 
patching out the prefix should be of dubious value anyway on anything 
supported. LOCK used to have horrible overhead on some ancient CPUs but 
that's it - it should be ignored on any x86 CPU built in the last 20 
years?

Ie. I think we could integrate 100% of the functionality of your 
disassembly readability hack-patch. :-)

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: odd objtool 'unreachable instruction' warning
  2025-11-01  7:05 ` Ingo Molnar
@ 2025-11-01 16:59   ` Linus Torvalds
  0 siblings, 0 replies; 14+ messages in thread
From: Linus Torvalds @ 2025-11-01 16:59 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Josh Poimboeuf, Peter Zijlstra, Linux Kernel Mailing List

On Sat, 1 Nov 2025 at 00:05, Ingo Molnar <mingo@kernel.org> wrote:
>
> So we could actually integrate much of this and avoid all the
> alternatives noise by keying it off CONFIG_X86_NATIVE_CPU and passing
> in two new config flags whether /proc/cpuinfo in the build environment
> has X86_FEATURE_LFENCE_RDTSC and X86_FEATURE_SMAP ("smap").

So I think I mentioned in one other email (maybe another thread) that
I did at one point have a slightly bigger patch that then had more
#ifdef's etc to make it configurable and thus upstreamable.

The problem was that it just made the source uglier. The patch simply
looked pretty hacky, and it's not like it really improves anything for
anybody sane: the actual code at runtime ends up being identical.

(It might improve things for really old machines that don't support
SMAP at all, in that they'd avoid the nop instructions, but I don't
feel that is something worth optimizing for).

Now, if there are enough people that actually look at the assembler
code to make this worthwhile, that would be one thing, but I suspect
there's a handful of odd people who - like me - are then happy to have
local crud that doesn't affect anybody else.

But hey - if you think this would be good upstream, and you (and by
"you" I mean "the x86 maintainers" in general) don't mind having a
slightly more complicated ugly header file, I obviously will not
complain at all.

(I also at one point was hoping there was some way to make the
<asm/alternative.h> code itself have a "assume these flags are
set/clear" model, but that looks unworkable: you'd need a more capable
preprocessor than the regular stupid C preprocessor - something that
could do conditionals inside the macros).

               Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: odd objtool 'unreachable instruction' warning
  2025-10-29 16:17       ` Linus Torvalds
  2025-10-30  9:51         ` Alexandre Chartre
@ 2025-11-05 19:51         ` David Sterba
  2025-11-07  0:05           ` Linus Torvalds
  1 sibling, 1 reply; 14+ messages in thread
From: David Sterba @ 2025-11-05 19:51 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Josh Poimboeuf, Peter Zijlstra, Linux Kernel Mailing List,
	Alexandre Chartre

On Wed, Oct 29, 2025 at 09:17:36AM -0700, Linus Torvalds wrote:
> Now, if objtool disassembly ever gives the kind of good disassembly
> that 'perf report' does - with branches turned into arrows etc - that
> would be lovely, and I'd replace my hacky alias in a heartbeat.

FWIW, 'objdump --visualize-jumps' shows the arrows, with
'--visualize-jumps=extended-color' they're colored and it's readable.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: odd objtool 'unreachable instruction' warning
  2025-11-05 19:51         ` David Sterba
@ 2025-11-07  0:05           ` Linus Torvalds
  0 siblings, 0 replies; 14+ messages in thread
From: Linus Torvalds @ 2025-11-07  0:05 UTC (permalink / raw)
  To: dsterba
  Cc: Josh Poimboeuf, Peter Zijlstra, Linux Kernel Mailing List,
	Alexandre Chartre

On Wed, 5 Nov 2025 at 11:51, David Sterba <dsterba@suse.cz> wrote:
>
> FWIW, 'objdump --visualize-jumps' shows the arrows, with
> '--visualize-jumps=extended-color' they're colored and it's readable.

Hey, hey, hey - when did that happen? I've clearly not read the
man-page in ages.

Because that's _almost_ lovely. If objdump just handled relocations
more sanely, it would be really nice.

[ Looking around in the binutil sources, it's been there for five
years, and was in the 2.34 release ]

Because of the insanity of relocation handling, it shows regular
"call" and "jmp" instructions out of the function as branches to the
next instruction (because that's the non-relocated info).

And because it puts the visualized things in front of the
instructions, the end result has strange random indentation depending
on complexity of the function.

But it's tantalizingly close to great, and it's certainly an
improvement over not having that at all.

Added to my "disassemble" alias.

          Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2025-11-07  0:05 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-28 19:29 odd objtool 'unreachable instruction' warning Linus Torvalds
2025-10-29  0:21 ` Josh Poimboeuf
2025-10-29  0:59   ` Linus Torvalds
2025-10-29  1:50     ` Josh Poimboeuf
2025-10-29 16:17       ` Linus Torvalds
2025-10-30  9:51         ` Alexandre Chartre
2025-11-05 19:51         ` David Sterba
2025-11-07  0:05           ` Linus Torvalds
2025-10-29  9:56 ` David Laight
2025-10-29 10:05   ` Peter Zijlstra
2025-10-29 13:00     ` David Laight
2025-10-29 14:05   ` Peter Zijlstra
2025-11-01  7:05 ` Ingo Molnar
2025-11-01 16:59   ` Linus Torvalds

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox