* [PATCH] x86/hweight: Fix and improve __arch_hweight{32,64}() assembly
@ 2025-03-10 20:08 Uros Bizjak
2025-03-10 20:12 ` Borislav Petkov
2025-03-10 20:16 ` Ingo Molnar
0 siblings, 2 replies; 12+ messages in thread
From: Uros Bizjak @ 2025-03-10 20:08 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin
a) Use ASM_CALL_CONSTRAINT to prevent inline asm that includes call
instruction from being scheduled before the frame pointer gets set
up by the containing function, causing objtool to print a "call
without frame pointer save/setup" warning.
b) Use asm_inline to instruct the compiler that the size of asm()
is the minimum size of one instruction, ignoring how many instructions
the compiler thinks it is. ALTERNATIVE macro that expands to several
pseudo directives causes instruction length estimate to count
more than 20 instructions.
c) Use named operands in inline asm.
More inlining causes slight increase in the code size:
    text    data     bss      dec     hex filename
27261832 4640296  814660 32716788 1f337f4 vmlinux-new.o
27261222 4640320  814660 32716202 1f335aa vmlinux-old.o
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
---
arch/x86/include/asm/arch_hweight.h | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/arch/x86/include/asm/arch_hweight.h b/arch/x86/include/asm/arch_hweight.h
index ba88edd0d58b..20b0633744e4 100644
--- a/arch/x86/include/asm/arch_hweight.h
+++ b/arch/x86/include/asm/arch_hweight.h
@@ -16,10 +16,10 @@ static __always_inline unsigned int __arch_hweight32(unsigned int w)
{
unsigned int res;
- asm (ALTERNATIVE("call __sw_hweight32", "popcntl %1, %0", X86_FEATURE_POPCNT)
- : "="REG_OUT (res)
- : REG_IN (w));
-
+ asm_inline (ALTERNATIVE("call __sw_hweight32",
+ "popcntl %[val], %[cnt]", X86_FEATURE_POPCNT)
+ : [cnt] "="REG_OUT (res), ASM_CALL_CONSTRAINT
+ : [val] REG_IN (w));
return res;
}
@@ -44,10 +44,10 @@ static __always_inline unsigned long __arch_hweight64(__u64 w)
{
unsigned long res;
- asm (ALTERNATIVE("call __sw_hweight64", "popcntq %1, %0", X86_FEATURE_POPCNT)
- : "="REG_OUT (res)
- : REG_IN (w));
-
+ asm_inline (ALTERNATIVE("call __sw_hweight64",
+ "popcntq %[val], %[cnt]", X86_FEATURE_POPCNT)
+ : [cnt] "="REG_OUT (res), ASM_CALL_CONSTRAINT
+ : [val] REG_IN (w));
return res;
}
#endif /* CONFIG_X86_32 */
--
2.42.0
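
Editorial sketch for item c) of the patch description: the fragment below shows named asm operands ([val], [cnt]) in a user-space setting. It is not the kernel code. ALTERNATIVE() patches the instruction at boot, whereas this sketch assumes a plain runtime check via the compiler builtin __builtin_cpu_supports(). All sketch_* names are hypothetical, and the fallback is a generic bit-parallel population count used as a stand-in for the __sw_hweight32 call.

```c
#include <assert.h>
#include <stdint.h>

/* Portable fallback: bit-parallel population count (illustrative stand-in
 * for the out-of-line software routine; not copied from the kernel). */
static inline unsigned int sketch_sw_hweight32(uint32_t w)
{
	w = w - ((w >> 1) & 0x55555555u);                 /* 2-bit partial sums */
	w = (w & 0x33333333u) + ((w >> 2) & 0x33333333u); /* 4-bit partial sums */
	w = (w + (w >> 4)) & 0x0f0f0f0fu;                 /* 8-bit partial sums */
	return (w * 0x01010101u) >> 24;                   /* add the four bytes */
}

/* Uses POPCNT when the CPU reports it, otherwise the fallback above.
 * The operands are referenced by name instead of by position (%0, %1). */
static inline unsigned int sketch_hweight32(uint32_t w)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
	if (__builtin_cpu_supports("popcnt")) {
		unsigned int res;

		__asm__ ("popcntl %[val], %[cnt]"
			 : [cnt] "=r" (res)
			 : [val] "rm" (w)
			 : "cc");
		return res;
	}
#endif
	return sketch_sw_hweight32(w);
}

/* Both paths must agree on a few sample values. */
static inline void sketch_selfcheck(void)
{
	assert(sketch_hweight32(0u) == sketch_sw_hweight32(0u));
	assert(sketch_hweight32(0xf0f0u) == sketch_sw_hweight32(0xf0f0u));
	assert(sketch_hweight32(0xffffffffu) == sketch_sw_hweight32(0xffffffffu));
}
```

With named operands, adding a further output operand (as the patch does with ASM_CALL_CONSTRAINT) does not renumber the existing references in the template string.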
* Re: [PATCH] x86/hweight: Fix and improve __arch_hweight{32,64}() assembly
2025-03-10 20:08 [PATCH] x86/hweight: Fix and improve __arch_hweight{32,64}() assembly Uros Bizjak
@ 2025-03-10 20:12 ` Borislav Petkov
2025-03-10 20:35 ` Uros Bizjak
2025-03-10 20:16 ` Ingo Molnar
1 sibling, 1 reply; 12+ messages in thread
From: Borislav Petkov @ 2025-03-10 20:12 UTC (permalink / raw)
To: Uros Bizjak
Cc: x86, linux-kernel, Thomas Gleixner, Ingo Molnar, Dave Hansen,
H. Peter Anvin
On Mon, Mar 10, 2025 at 09:08:04PM +0100, Uros Bizjak wrote:
> a) Use ASM_CALL_CONSTRAINT to prevent inline asm that includes call
> instruction from being scheduled before the frame pointer gets set
> up by the containing function, causing objtool to print a "call
> without frame pointer save/setup" warning.
The other two are ok but this is new. How do you trigger this? I've never seen
it in my randconfig builds...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH] x86/hweight: Fix and improve __arch_hweight{32,64}() assembly
2025-03-10 20:08 [PATCH] x86/hweight: Fix and improve __arch_hweight{32,64}() assembly Uros Bizjak
2025-03-10 20:12 ` Borislav Petkov
@ 2025-03-10 20:16 ` Ingo Molnar
2025-03-10 21:25 ` Uros Bizjak
1 sibling, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2025-03-10 20:16 UTC (permalink / raw)
To: Uros Bizjak
Cc: x86, linux-kernel, Thomas Gleixner, Borislav Petkov, Dave Hansen,
H. Peter Anvin
* Uros Bizjak <ubizjak@gmail.com> wrote:
> a) Use ASM_CALL_CONSTRAINT to prevent inline asm that includes call
> instruction from being scheduled before the frame pointer gets set
> up by the containing function, causing objtool to print a "call
> without frame pointer save/setup" warning.
>
> b) Use asm_inline to instruct the compiler that the size of asm()
> is the minimum size of one instruction, ignoring how many instructions
> the compiler thinks it is. ALTERNATIVE macro that expands to several
> pseudo directives causes instruction length estimate to count
> more than 20 instructions.
>
> c) Use named operands in inline asm.
>
> More inlining causes slight increase in the code size:
>
>     text    data     bss      dec     hex filename
> 27261832 4640296  814660 32716788 1f337f4 vmlinux-new.o
> 27261222 4640320  814660 32716202 1f335aa vmlinux-old.o
What is the per call/inlining-instance change in code size, measured in
fast-path instruction bytes? Also, exception code or cold branches near
the epilogue of the function after the main RET don't fully count as a
size increase.
This kind of normalization and filtering of changes to relevant
generated instructions is a better metric than some rather meaningless
'+610 bytes of code' figure.
Also, please always specify the kind of config you used for building
the vmlinux.
Thanks,
Ingo
* Re: [PATCH] x86/hweight: Fix and improve __arch_hweight{32,64}() assembly
2025-03-10 20:12 ` Borislav Petkov
@ 2025-03-10 20:35 ` Uros Bizjak
2025-03-10 20:42 ` Ingo Molnar
2025-03-10 20:44 ` Borislav Petkov
0 siblings, 2 replies; 12+ messages in thread
From: Uros Bizjak @ 2025-03-10 20:35 UTC (permalink / raw)
To: Borislav Petkov
Cc: x86, linux-kernel, Thomas Gleixner, Ingo Molnar, Dave Hansen,
H. Peter Anvin
On Mon, Mar 10, 2025 at 9:12 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Mon, Mar 10, 2025 at 09:08:04PM +0100, Uros Bizjak wrote:
> > a) Use ASM_CALL_CONSTRAINT to prevent inline asm that includes call
> > instruction from being scheduled before the frame pointer gets set
> > up by the containing function, causing objtool to print a "call
> > without frame pointer save/setup" warning.
>
> The other two are ok but this is new. How do you trigger this? I've never seen
> it in my randconfig builds...
It is not triggered now, but without this constraint, nothing prevents
the compiler from scheduling the insn in front of frame creation.
Uros.
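
Editorial sketch of the mechanism under discussion: listing the stack pointer as an in/out operand gives the asm statement a dependency on the stack pointer, which is intended to keep the compiler from placing the statement before the frame setup. The fragment assumes x86-64 and a GNU-compatible compiler. sketch_stack_pointer and SKETCH_CALL_CONSTRAINT are hypothetical user-space stand-ins modelled on the kernel's ASM_CALL_CONSTRAINT idiom. The asm body is a plain register move rather than a call, because a call from user-space inline asm would also have to account for the red zone, which is outside the scope of this sketch.

```c
#include <assert.h>

#if defined(__x86_64__) && defined(__GNUC__)
/* Hypothetical stand-in: a global register variable bound to the stack
 * pointer, and a macro that adds it to an asm output operand list. */
register unsigned long sketch_stack_pointer __asm__("rsp");
#define SKETCH_CALL_CONSTRAINT "+r" (sketch_stack_pointer)
#endif

/* Copies w to the result through inline asm that also names the stack
 * pointer as an in/out operand; the template never modifies it. */
static inline unsigned int sketch_copy_with_sp_dependency(unsigned int w)
{
	unsigned int res;

#if defined(__x86_64__) && defined(__GNUC__)
	__asm__ ("movl %[val], %[out]"
		 : [out] "=r" (res), SKETCH_CALL_CONSTRAINT
		 : [val] "r" (w));
#else
	res = w;	/* other targets: plain C, no constraint to demonstrate */
#endif
	return res;
}

/* The extra operand must not change the computed value. */
static inline void sketch_sp_selfcheck(void)
{
	assert(sketch_copy_with_sp_dependency(0u) == 0u);
	assert(sketch_copy_with_sp_dependency(12345u) == 12345u);
}
```

The extra operand changes nothing in the emitted instruction; its only role is to tell the compiler that the statement reads and writes the stack pointer.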
* Re: [PATCH] x86/hweight: Fix and improve __arch_hweight{32,64}() assembly
2025-03-10 20:35 ` Uros Bizjak
@ 2025-03-10 20:42 ` Ingo Molnar
2025-03-10 20:44 ` Borislav Petkov
1 sibling, 0 replies; 12+ messages in thread
From: Ingo Molnar @ 2025-03-10 20:42 UTC (permalink / raw)
To: Uros Bizjak
Cc: Borislav Petkov, x86, linux-kernel, Thomas Gleixner, Dave Hansen,
H. Peter Anvin
* Uros Bizjak <ubizjak@gmail.com> wrote:
> On Mon, Mar 10, 2025 at 9:12 PM Borislav Petkov <bp@alien8.de> wrote:
> >
> > On Mon, Mar 10, 2025 at 09:08:04PM +0100, Uros Bizjak wrote:
> > > a) Use ASM_CALL_CONSTRAINT to prevent inline asm that includes call
> > > instruction from being scheduled before the frame pointer gets set
> > > up by the containing function, causing objtool to print a "call
> > > without frame pointer save/setup" warning.
> >
> > The other two are ok but this is new. How do you trigger this? I've never seen
> > it in my randconfig builds...
>
> It is not triggered now, but without this constraint, nothing prevents
> the compiler from scheduling the insn in front of frame creation.
Please add:
'Current versions of compilers don't seem to trigger this condition,
but without this constraint there's nothing to prevent the compiler
from scheduling the insn in front of frame creation.'
Thanks,
Ingo
* Re: [PATCH] x86/hweight: Fix and improve __arch_hweight{32,64}() assembly
2025-03-10 20:35 ` Uros Bizjak
2025-03-10 20:42 ` Ingo Molnar
@ 2025-03-10 20:44 ` Borislav Petkov
2025-03-10 20:54 ` Uros Bizjak
2025-03-10 21:00 ` Ingo Molnar
1 sibling, 2 replies; 12+ messages in thread
From: Borislav Petkov @ 2025-03-10 20:44 UTC (permalink / raw)
To: Uros Bizjak
Cc: x86, linux-kernel, Thomas Gleixner, Ingo Molnar, Dave Hansen,
H. Peter Anvin
On Mon, Mar 10, 2025 at 09:35:42PM +0100, Uros Bizjak wrote:
> On Mon, Mar 10, 2025 at 9:12 PM Borislav Petkov <bp@alien8.de> wrote:
> >
> > On Mon, Mar 10, 2025 at 09:08:04PM +0100, Uros Bizjak wrote:
> > > a) Use ASM_CALL_CONSTRAINT to prevent inline asm that includes call
> > > instruction from being scheduled before the frame pointer gets set
> > > up by the containing function, causing objtool to print a "call
> > > without frame pointer save/setup" warning.
> >
> > The other two are ok but this is new. How do you trigger this? I've never seen
> > it in my randconfig builds...
>
> It is not triggered now, but without this constraint, nothing prevents
> the compiler from scheduling the insn in front of frame creation.
Can you please stop with this silliness?
When we start doing git archeology months, years from now, it should be
perfectly clear why a commit was done. This one is not. So either the compiler
is doing the bad scheduling or it isn't. Things can't just work by chance.
Geez.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH] x86/hweight: Fix and improve __arch_hweight{32,64}() assembly
2025-03-10 20:44 ` Borislav Petkov
@ 2025-03-10 20:54 ` Uros Bizjak
2025-03-10 21:07 ` Borislav Petkov
2025-03-10 21:00 ` Ingo Molnar
1 sibling, 1 reply; 12+ messages in thread
From: Uros Bizjak @ 2025-03-10 20:54 UTC (permalink / raw)
To: Borislav Petkov
Cc: x86, linux-kernel, Thomas Gleixner, Ingo Molnar, Dave Hansen,
H. Peter Anvin
On Mon, Mar 10, 2025 at 9:45 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Mon, Mar 10, 2025 at 09:35:42PM +0100, Uros Bizjak wrote:
> > On Mon, Mar 10, 2025 at 9:12 PM Borislav Petkov <bp@alien8.de> wrote:
> > >
> > > On Mon, Mar 10, 2025 at 09:08:04PM +0100, Uros Bizjak wrote:
> > > > a) Use ASM_CALL_CONSTRAINT to prevent inline asm that includes call
> > > > instruction from being scheduled before the frame pointer gets set
> > > > up by the containing function, causing objtool to print a "call
> > > > without frame pointer save/setup" warning.
> > >
> > > The other two are ok but this is new. How do you trigger this? I've never seen
> > > it in my randconfig builds...
> >
> > It is not triggered now, but without this constraint, nothing prevents
> > the compiler from scheduling the insn in front of frame creation.
>
> Can you please stop with this silliness?
>
> When we start doing git archeology months, years from now, it should be
> perfectly clear why a commit was done. This one is not. So either the compiler
> is doing the bad scheduling or it isn't. Things can't just work by chance.
>
> Geez.
Ok, so let it be your way and let's just sweep the issue under the carpet.
BR,
Uros.
* Re: [PATCH] x86/hweight: Fix and improve __arch_hweight{32,64}() assembly
2025-03-10 20:44 ` Borislav Petkov
2025-03-10 20:54 ` Uros Bizjak
@ 2025-03-10 21:00 ` Ingo Molnar
1 sibling, 0 replies; 12+ messages in thread
From: Ingo Molnar @ 2025-03-10 21:00 UTC (permalink / raw)
To: Borislav Petkov
Cc: Uros Bizjak, x86, linux-kernel, Thomas Gleixner, Dave Hansen,
H. Peter Anvin
* Borislav Petkov <bp@alien8.de> wrote:
> On Mon, Mar 10, 2025 at 09:35:42PM +0100, Uros Bizjak wrote:
> > On Mon, Mar 10, 2025 at 9:12 PM Borislav Petkov <bp@alien8.de> wrote:
> > >
> > > On Mon, Mar 10, 2025 at 09:08:04PM +0100, Uros Bizjak wrote:
> > > > a) Use ASM_CALL_CONSTRAINT to prevent inline asm that includes call
> > > > instruction from being scheduled before the frame pointer gets set
> > > > up by the containing function, causing objtool to print a "call
> > > > without frame pointer save/setup" warning.
> > >
> > > The other two are ok but this is new. How do you trigger this? I've never seen
> > > it in my randconfig builds...
> >
> > It is not triggered now, but without this constraint, nothing prevents
> > the compiler from scheduling the insn in front of frame creation.
>
> Can you please stop with this silliness?
>
> When we start doing git archeology months, years from now, it should
> be perfectly clear why a commit was done. This one is not. So either
> the compiler is doing the bad scheduling or it isn't. Things can't
> just work by chance.
So this particular code generation aspect seems to be working by random
implementational chance right now: objtool is basically a second,
independent layer of tooling with its own assumptions and expectations,
which is why objtool warnings are not hard build failures.
But whether unexpected instruction scheduling is known to occur or not
with current compilers should be included in the changelog and is
relevant information.
Thanks,
Ingo
* Re: [PATCH] x86/hweight: Fix and improve __arch_hweight{32,64}() assembly
2025-03-10 20:54 ` Uros Bizjak
@ 2025-03-10 21:07 ` Borislav Petkov
2025-03-10 21:18 ` Uros Bizjak
0 siblings, 1 reply; 12+ messages in thread
From: Borislav Petkov @ 2025-03-10 21:07 UTC (permalink / raw)
To: Uros Bizjak
Cc: x86, linux-kernel, Thomas Gleixner, Ingo Molnar, Dave Hansen,
H. Peter Anvin
On Mon, Mar 10, 2025 at 09:54:25PM +0100, Uros Bizjak wrote:
> Ok, so let it be your way and let's just sweep the issue under the carpet.
Can you please read my mails more carefully? Where did I say we should sweep
the issue under the carpet?
The commit message should be *perfectly* clear what it is fixing. This
"a) Use ASM_CALL_CONSTRAINT to prevent inline asm that includes call
instruction from being scheduled before the frame pointer gets set
up by the containing function, causing objtool to print a "call
without frame pointer save/setup" warning."
says that objtool is printing a warning. When I ask, it is not really printing
a warning but it can potentially do so because the compiler is allowed to
schedule things wrongly.
Do you notice the difference?
Dammit, it is very important *why* a commit message is there - it is not
write-only and people look at it. So *again* *please* be precise when
explaining why your patch exists!
All that stuff has been documented at length:
https://kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes
Thanks.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH] x86/hweight: Fix and improve __arch_hweight{32,64}() assembly
2025-03-10 21:07 ` Borislav Petkov
@ 2025-03-10 21:18 ` Uros Bizjak
2025-03-10 21:34 ` Borislav Petkov
0 siblings, 1 reply; 12+ messages in thread
From: Uros Bizjak @ 2025-03-10 21:18 UTC (permalink / raw)
To: Borislav Petkov
Cc: x86, linux-kernel, Thomas Gleixner, Ingo Molnar, Dave Hansen,
H. Peter Anvin
On Mon, Mar 10, 2025 at 10:08 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Mon, Mar 10, 2025 at 09:54:25PM +0100, Uros Bizjak wrote:
> > Ok, so let it be your way and let's just sweep the issue under the carpet.
>
> Can you please read my mails more carefully? Where did I say we should sweep
> the issue under the carpet?
The "stop with this silliness" part? But let's put this at rest.
> The commit message should be *perfectly* clear what it is fixing. This
>
> "a) Use ASM_CALL_CONSTRAINT to prevent inline asm that includes call
> instruction from being scheduled before the frame pointer gets set
> up by the containing function, causing objtool to print a "call
> without frame pointer save/setup" warning."
>
> says that objtool is printing a warning. When I ask, it is not really printing
> a warning but it can potentially do so because the compiler is allowed to
> schedule things wrongly.
>
> Do you notice the difference?
So, rewording this part to:
a) Use ASM_CALL_CONSTRAINT to prevent inline asm that includes call
instruction from being scheduled by the compiler before the frame
pointer gets set up by the containing function. This unconstrained
scheduling might cause objtool to print a "call without frame pointer
save/setup" warning.
would be ok?
Thanks,
Uros.
* Re: [PATCH] x86/hweight: Fix and improve __arch_hweight{32,64}() assembly
2025-03-10 20:16 ` Ingo Molnar
@ 2025-03-10 21:25 ` Uros Bizjak
0 siblings, 0 replies; 12+ messages in thread
From: Uros Bizjak @ 2025-03-10 21:25 UTC (permalink / raw)
To: Ingo Molnar
Cc: x86, linux-kernel, Thomas Gleixner, Borislav Petkov, Dave Hansen,
H. Peter Anvin
On Mon, Mar 10, 2025 at 9:16 PM Ingo Molnar <mingo@kernel.org> wrote:
>
>
> * Uros Bizjak <ubizjak@gmail.com> wrote:
>
> > a) Use ASM_CALL_CONSTRAINT to prevent inline asm that includes call
> > instruction from being scheduled before the frame pointer gets set
> > up by the containing function, causing objtool to print a "call
> > without frame pointer save/setup" warning.
> >
> > b) Use asm_inline to instruct the compiler that the size of asm()
> > is the minimum size of one instruction, ignoring how many instructions
> > the compiler thinks it is. ALTERNATIVE macro that expands to several
> > pseudo directives causes instruction length estimate to count
> > more than 20 instructions.
> >
> > c) Use named operands in inline asm.
> >
> > More inlining causes slight increase in the code size:
> >
> >     text    data     bss      dec     hex filename
> > 27261832 4640296  814660 32716788 1f337f4 vmlinux-new.o
> > 27261222 4640320  814660 32716202 1f335aa vmlinux-old.o
>
> What is the per call/inlining-instance change in code size, measured in
> fast-path instruction bytes? Also, exception code or cold branches near
> the epilogue of the function after the main RET don't fully count as a
> size increase.
>
> This kind of normalization and filtering of changes to relevant
> generated instructions is a better metric than some rather meaningless
> '+610 bytes of code' figure.
>
> Also, please always specify the kind of config you used for building
> the vmlinux.
Sorry, this just slipped my mind. x86_64 defconfig - I'll note this in
the revised commit entry.
BTW: The difference between old and new number of inlined __sw_hweight
calls is: 367 -> 396. I'll try to analyze this some more.
Thanks,
Uros.
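
Editorial sketch of the arithmetic behind the figures quoted in this message: it recomputes the deltas from the size table and the 367 -> 396 call counts. The per-site average is a crude whole-image number, assuming the entire text growth is attributable to the additional call sites. It is not the fast-path, per-instance metric requested above, which would need disassembly of the affected functions. All sketch_* names are hypothetical.

```c
#include <assert.h>

/* One row of size(1) output, in bytes. */
struct sketch_size_row {
	long text, data, bss;
};

/* Figures from the patch description (x86_64 defconfig per the reply). */
static const struct sketch_size_row sketch_new = { 27261832, 4640296, 814660 };
static const struct sketch_size_row sketch_old = { 27261222, 4640320, 814660 };

/* Call counts reported in the reply: 367 before, 396 after. */
enum { SKETCH_SITES_OLD = 367, SKETCH_SITES_NEW = 396 };

/* The "dec" column is the sum of the three sections. */
static inline long sketch_dec(struct sketch_size_row r)
{
	return r.text + r.data + r.bss;
}

static inline long sketch_text_delta(void)	/* 610 bytes */
{
	return sketch_new.text - sketch_old.text;
}

static inline long sketch_data_delta(void)	/* -24 bytes */
{
	return sketch_new.data - sketch_old.data;
}

static inline long sketch_extra_sites(void)	/* 29 sites */
{
	return SKETCH_SITES_NEW - SKETCH_SITES_OLD;
}

/* Whole-image average, truncated: 610 / 29 = 21 bytes per extra site. */
static inline long sketch_avg_text_bytes_per_extra_site(void)
{
	assert(sketch_extra_sites() > 0);
	return sketch_text_delta() / sketch_extra_sites();
}
```

The text section grows by 610 bytes while data shrinks by 24, so the "dec" total grows by 586 bytes.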
* Re: [PATCH] x86/hweight: Fix and improve __arch_hweight{32,64}() assembly
2025-03-10 21:18 ` Uros Bizjak
@ 2025-03-10 21:34 ` Borislav Petkov
0 siblings, 0 replies; 12+ messages in thread
From: Borislav Petkov @ 2025-03-10 21:34 UTC (permalink / raw)
To: Uros Bizjak
Cc: x86, linux-kernel, Thomas Gleixner, Ingo Molnar, Dave Hansen,
H. Peter Anvin
On Mon, Mar 10, 2025 at 10:18:50PM +0100, Uros Bizjak wrote:
> a) Use ASM_CALL_CONSTRAINT to prevent inline asm that includes call
> instruction from being scheduled by the compiler before the frame
> pointer gets set up by the containing function. This unconstrained
> scheduling might cause objtool to print a "call without frame pointer
> save/setup" warning.
>
> would be ok?
Yes, and pls say something along the lines of: this is not a currently
triggered issue but it can potentially happen, so that it is perfectly clear
what this patch is addressing.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette