public inbox for linux-kernel@vger.kernel.org
* [PATCH v3 1/3] x86/hweight: Use named operands in inline asm()
@ 2025-03-12 12:38 Uros Bizjak
  2025-03-12 12:38 ` [PATCH v3 2/3] x86/hweight: Use ASM_CALL_CONSTRAINT " Uros Bizjak
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Uros Bizjak @ 2025-03-12 12:38 UTC
  To: x86, linux-kernel
  Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin

No functional change intended.
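
As an illustration of the positional-to-named operand conversion in this
patch, here is a minimal user-space sketch. It uses a plain addl instead of
the kernel's ALTERNATIVE() macro, and the helper names (add_positional,
add_named, check_named_matches_positional) are hypothetical and not part of
the patch. On non-x86 targets it falls back to plain C.

```c
#include <assert.h>

/* Hypothetical helpers for illustration only; not kernel code. */

/* Positional form: %0 is the first operand listed, %1 the second. */
static inline unsigned int add_positional(unsigned int a, unsigned int b)
{
	unsigned int sum = a;

#if defined(__x86_64__) || defined(__i386__)
	__asm__ ("addl %1, %0"
		 : "+r" (sum)
		 : "r" (b)
		 : "cc");
#else
	sum += b;	/* portable fallback on non-x86 targets */
#endif
	return sum;
}

/* Named form: the template refers to operands by symbolic name. */
static inline unsigned int add_named(unsigned int a, unsigned int b)
{
	unsigned int sum = a;

#if defined(__x86_64__) || defined(__i386__)
	__asm__ ("addl %[addend], %[sum]"
		 : [sum] "+r" (sum)
		 : [addend] "r" (b)
		 : "cc");
#else
	sum += b;	/* portable fallback on non-x86 targets */
#endif
	return sum;
}

/* Both spellings are expected to produce the same result. */
static inline void check_named_matches_positional(void)
{
	assert(add_named(2, 3) == add_positional(2, 3));
}
```

With named operands, adding another operand to the list (as the next patch
in this series does) cannot silently shift the meaning of %0 and %1.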

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
---
v3: Split patch into three separate patches.
---
 arch/x86/include/asm/arch_hweight.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/arch_hweight.h b/arch/x86/include/asm/arch_hweight.h
index ba88edd0d58b..a11bb841c434 100644
--- a/arch/x86/include/asm/arch_hweight.h
+++ b/arch/x86/include/asm/arch_hweight.h
@@ -16,9 +16,9 @@ static __always_inline unsigned int __arch_hweight32(unsigned int w)
 {
 	unsigned int res;
 
-	asm (ALTERNATIVE("call __sw_hweight32", "popcntl %1, %0", X86_FEATURE_POPCNT)
-			 : "="REG_OUT (res)
-			 : REG_IN (w));
+	asm (ALTERNATIVE("call __sw_hweight32", "popcntl %[val], %[cnt]", X86_FEATURE_POPCNT)
+			 : [cnt] "=" REG_OUT (res)
+			 : [val] REG_IN (w));
 
 	return res;
 }
@@ -44,9 +44,9 @@ static __always_inline unsigned long __arch_hweight64(__u64 w)
 {
 	unsigned long res;
 
-	asm (ALTERNATIVE("call __sw_hweight64", "popcntq %1, %0", X86_FEATURE_POPCNT)
-			 : "="REG_OUT (res)
-			 : REG_IN (w));
+	asm (ALTERNATIVE("call __sw_hweight64", "popcntq %[val], %[cnt]", X86_FEATURE_POPCNT)
+			 : [cnt] "=" REG_OUT (res)
+			 : [val] REG_IN (w));
 
 	return res;
 }
-- 
2.48.1


* [PATCH v3 2/3] x86/hweight: Use ASM_CALL_CONSTRAINT in inline asm()
  2025-03-12 12:38 [PATCH v3 1/3] x86/hweight: Use named operands in inline asm() Uros Bizjak
@ 2025-03-12 12:38 ` Uros Bizjak
  2025-03-12 19:33   ` [tip: x86/asm] " tip-bot2 for Uros Bizjak
  2025-03-19 11:03   ` [tip: x86/core] " tip-bot2 for Uros Bizjak
  2025-03-12 12:38 ` [PATCH v3 3/3] x86/hweight: Use asm_inline " Uros Bizjak
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 9+ messages in thread
From: Uros Bizjak @ 2025-03-12 12:38 UTC
  To: x86, linux-kernel
  Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin

Use ASM_CALL_CONSTRAINT to prevent inline asm that includes a call
instruction from being scheduled before the frame pointer gets set
up by the containing function. Such unconstrained scheduling might
cause objtool to print a "call without frame pointer save/setup"
warning. Current compiler versions don't seem to trigger this
condition, but without the constraint nothing prevents the
compiler from scheduling the insn in front of frame creation.
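
For readers outside the kernel tree, the following user-space sketch shows
how such a constraint is spliced into the output operand list. It assumes
the kernel macro is a "+r" operand on a register variable bound to the
stack pointer; demo_stack_pointer, DEMO_CALL_CONSTRAINT and demo_copy are
hypothetical names. The asm body is deliberately a plain movl rather than a
real call, so the sketch stays safe to run in user space; only the operand
plumbing is the point.

```c
#include <assert.h>

#if defined(__x86_64__)
/*
 * Assumed to mirror the shape of the kernel definition: a global register
 * variable tied to the stack pointer, used as an in/out asm operand so the
 * compiler sees the asm statement as depending on the stack pointer.
 */
register unsigned long demo_stack_pointer __asm__("rsp");
#define DEMO_CALL_CONSTRAINT "+r" (demo_stack_pointer)
#endif

/* Hypothetical helper: copies w through an asm statement. */
static inline unsigned int demo_copy(unsigned int w)
{
	unsigned int res;

#if defined(__x86_64__)
	/*
	 * Kernel code would place a "call" in the template. A movl is used
	 * here as a stand-in, which avoids red-zone and clobber concerns
	 * that a real call from user-space inline asm would raise.
	 */
	__asm__ ("movl %[val], %[cnt]"
		 : [cnt] "=r" (res), DEMO_CALL_CONSTRAINT
		 : [val] "r" (w));
#else
	res = w;	/* portable fallback on non-x86_64 targets */
#endif
	return res;
}
```

The extra output operand generates no instructions of its own; it only adds
a dependency on the stack pointer that the compiler has to honor when
placing the asm statement.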

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
---
v2: Expand ASM_CALL_CONSTRAINT commit message to mention that
    current compilers don't schedule insn before frame creation.
v3: Split patch into three separate patches.
---
 arch/x86/include/asm/arch_hweight.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/arch_hweight.h b/arch/x86/include/asm/arch_hweight.h
index a11bb841c434..f233eb00f41f 100644
--- a/arch/x86/include/asm/arch_hweight.h
+++ b/arch/x86/include/asm/arch_hweight.h
@@ -17,7 +17,7 @@ static __always_inline unsigned int __arch_hweight32(unsigned int w)
 	unsigned int res;
 
 	asm (ALTERNATIVE("call __sw_hweight32", "popcntl %[val], %[cnt]", X86_FEATURE_POPCNT)
-			 : [cnt] "=" REG_OUT (res)
+			 : [cnt] "=" REG_OUT (res), ASM_CALL_CONSTRAINT
 			 : [val] REG_IN (w));
 
 	return res;
@@ -45,7 +45,7 @@ static __always_inline unsigned long __arch_hweight64(__u64 w)
 	unsigned long res;
 
 	asm (ALTERNATIVE("call __sw_hweight64", "popcntq %[val], %[cnt]", X86_FEATURE_POPCNT)
-			 : [cnt] "=" REG_OUT (res)
+			 : [cnt] "=" REG_OUT (res), ASM_CALL_CONSTRAINT
 			 : [val] REG_IN (w));
 
 	return res;
-- 
2.48.1


* [PATCH v3 3/3] x86/hweight: Use asm_inline in inline asm()
  2025-03-12 12:38 [PATCH v3 1/3] x86/hweight: Use named operands in inline asm() Uros Bizjak
  2025-03-12 12:38 ` [PATCH v3 2/3] x86/hweight: Use ASM_CALL_CONSTRAINT " Uros Bizjak
@ 2025-03-12 12:38 ` Uros Bizjak
  2025-03-12 19:33   ` [tip: x86/asm] x86/hweight: Use asm_inline() instead of asm() tip-bot2 for Uros Bizjak
  2025-03-19 11:03   ` [tip: x86/core] " tip-bot2 for Uros Bizjak
  2025-03-12 19:33 ` [tip: x86/asm] x86/hweight: Use named operands in inline asm() tip-bot2 for Uros Bizjak
  2025-03-19 11:03 ` [tip: x86/core] " tip-bot2 for Uros Bizjak
  3 siblings, 2 replies; 9+ messages in thread
From: Uros Bizjak @ 2025-03-12 12:38 UTC
  To: x86, linux-kernel
  Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin

Use asm_inline to tell the compiler that the size of the asm()
is the minimum size of one instruction, regardless of how many
instructions the compiler estimates it contains. The ALTERNATIVE macro
expands to several pseudo directives, which inflates the compiler's
estimate to more than 20 instructions.
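
A minimal user-space sketch of the idea follows. GCC estimates the size of
an asm statement from the number of lines in its template, so a template
padded with non-emitting lines looks large to the inliner unless the inline
qualifier is present. The names demo_asm_inline and demo_move are
hypothetical, the version check is an assumption about where the qualifier
is available (GCC 9 and later), and a plain movl stands in for the
ALTERNATIVE() expansion.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the kernel's asm_inline macro. GCC 9 and later
 * accept the "inline" qualifier on asm; anything else falls back to a
 * plain asm statement in this sketch.
 */
#if defined(__GNUC__) && !defined(__clang__) && (__GNUC__ >= 9)
#define demo_asm_inline __asm__ __inline__
#else
#define demo_asm_inline __asm__
#endif

/* Hypothetical helper: copies w through a multi-line asm template. */
static inline unsigned int demo_move(unsigned int w)
{
	unsigned int res;

#if defined(__x86_64__) || defined(__i386__)
	/*
	 * One real instruction preceded by assembler comments. Without the
	 * qualifier, the four template lines are counted as four
	 * instructions for inlining purposes.
	 */
	demo_asm_inline (
		"# comment only, emits no code\n\t"
		"# comment only, emits no code\n\t"
		"# comment only, emits no code\n\t"
		"movl %[val], %[cnt]"
		: [cnt] "=r" (res)
		: [val] "r" (w));
#else
	res = w;	/* portable fallback on non-x86 targets */
#endif
	return res;
}
```

The qualifier does not change the generated instruction; its effect shows
up only in the compiler's inlining decisions for callers.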

bloat-o-meter reports a slight reduction in code size
for the x86_64 defconfig object file, compiled with gcc-14.2:

add/remove: 6/12 grow/shrink: 59/50 up/down: 3389/-3560 (-171)
Total: Before=22734393, After=22734222, chg -0.00%

where 29 instances of code blocks involving POPCNT now get inlined,
resulting in the removal of several functions:

format_is_yuv_semiplanar.part.isra            41       -     -41
cdclk_divider                                 69       -     -69
intel_joiner_adjust_timings                  140       -    -140
nl80211_send_wowlan_tcp_caps                 369       -    -369
nl80211_send_iftype_data                     579       -    -579
__do_sys_pidfd_send_signal                   809       -    -809

One noticeable change is:

pcpu_page_first_chunk                       1075    1060     -15

Where the compiler now inlines 4 more instances of POPCNT insns,
but still manages to compile to a function with smaller code size.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
---
v2: Use bloat-o-meter to assess code size changes.
v3: Split patch into three separate patches.
---
 arch/x86/include/asm/arch_hweight.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/arch_hweight.h b/arch/x86/include/asm/arch_hweight.h
index f233eb00f41f..b5982b94bdba 100644
--- a/arch/x86/include/asm/arch_hweight.h
+++ b/arch/x86/include/asm/arch_hweight.h
@@ -16,7 +16,8 @@ static __always_inline unsigned int __arch_hweight32(unsigned int w)
 {
 	unsigned int res;
 
-	asm (ALTERNATIVE("call __sw_hweight32", "popcntl %[val], %[cnt]", X86_FEATURE_POPCNT)
+	asm_inline (ALTERNATIVE("call __sw_hweight32",
+				"popcntl %[val], %[cnt]", X86_FEATURE_POPCNT)
 			 : [cnt] "=" REG_OUT (res), ASM_CALL_CONSTRAINT
 			 : [val] REG_IN (w));
 
@@ -44,7 +45,8 @@ static __always_inline unsigned long __arch_hweight64(__u64 w)
 {
 	unsigned long res;
 
-	asm (ALTERNATIVE("call __sw_hweight64", "popcntq %[val], %[cnt]", X86_FEATURE_POPCNT)
+	asm_inline (ALTERNATIVE("call __sw_hweight64",
+				"popcntq %[val], %[cnt]", X86_FEATURE_POPCNT)
 			 : [cnt] "=" REG_OUT (res), ASM_CALL_CONSTRAINT
 			 : [val] REG_IN (w));
 
-- 
2.48.1


* [tip: x86/asm] x86/hweight: Use asm_inline() instead of asm()
  2025-03-12 12:38 ` [PATCH v3 3/3] x86/hweight: Use asm_inline " Uros Bizjak
@ 2025-03-12 19:33   ` tip-bot2 for Uros Bizjak
  2025-03-19 11:03   ` [tip: x86/core] " tip-bot2 for Uros Bizjak
  1 sibling, 0 replies; 9+ messages in thread
From: tip-bot2 for Uros Bizjak @ 2025-03-12 19:33 UTC
  To: linux-tip-commits
  Cc: Uros Bizjak, Ingo Molnar, H. Peter Anvin, Nathan Chancellor,
	Nick Desaulniers, Linus Torvalds, x86, linux-kernel

The following commit has been merged into the x86/asm branch of tip:

Commit-ID:     3aeb02062eae312550be0d4344466d0bced8c8ad
Gitweb:        https://git.kernel.org/tip/3aeb02062eae312550be0d4344466d0bced8c8ad
Author:        Uros Bizjak <ubizjak@gmail.com>
AuthorDate:    Wed, 12 Mar 2025 13:38:45 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Wed, 12 Mar 2025 20:18:29 +01:00

x86/hweight: Use asm_inline() instead of asm()

Use asm_inline() to tell the compiler that the size of the asm()
is the minimum size of one instruction, regardless of how many
instructions the compiler estimates it contains. The ALTERNATIVE macro
expands to several pseudo directives, which inflates the compiler's
estimate to more than 20 instructions.

bloat-o-meter reports a slight reduction in code size
for the x86_64 defconfig object file, compiled with gcc-14.2:

  add/remove: 6/12 grow/shrink: 59/50 up/down: 3389/-3560 (-171)
  Total: Before=22734393, After=22734222, chg -0.00%

where 29 instances of code blocks involving POPCNT now get inlined,
resulting in the removal of several functions:

  format_is_yuv_semiplanar.part.isra            41       -     -41
  cdclk_divider                                 69       -     -69
  intel_joiner_adjust_timings                  140       -    -140
  nl80211_send_wowlan_tcp_caps                 369       -    -369
  nl80211_send_iftype_data                     579       -    -579
  __do_sys_pidfd_send_signal                   809       -    -809

One noticeable change is:

  pcpu_page_first_chunk                       1075    1060     -15

Where the compiler now inlines 4 more instances of POPCNT insns,
but still manages to compile to a function with smaller code size.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250312123905.149298-3-ubizjak@gmail.com
---
 arch/x86/include/asm/arch_hweight.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/arch_hweight.h b/arch/x86/include/asm/arch_hweight.h
index f233eb0..b5982b9 100644
--- a/arch/x86/include/asm/arch_hweight.h
+++ b/arch/x86/include/asm/arch_hweight.h
@@ -16,7 +16,8 @@ static __always_inline unsigned int __arch_hweight32(unsigned int w)
 {
 	unsigned int res;
 
-	asm (ALTERNATIVE("call __sw_hweight32", "popcntl %[val], %[cnt]", X86_FEATURE_POPCNT)
+	asm_inline (ALTERNATIVE("call __sw_hweight32",
+				"popcntl %[val], %[cnt]", X86_FEATURE_POPCNT)
 			 : [cnt] "=" REG_OUT (res), ASM_CALL_CONSTRAINT
 			 : [val] REG_IN (w));
 
@@ -44,7 +45,8 @@ static __always_inline unsigned long __arch_hweight64(__u64 w)
 {
 	unsigned long res;
 
-	asm (ALTERNATIVE("call __sw_hweight64", "popcntq %[val], %[cnt]", X86_FEATURE_POPCNT)
+	asm_inline (ALTERNATIVE("call __sw_hweight64",
+				"popcntq %[val], %[cnt]", X86_FEATURE_POPCNT)
 			 : [cnt] "=" REG_OUT (res), ASM_CALL_CONSTRAINT
 			 : [val] REG_IN (w));
 

* [tip: x86/asm] x86/hweight: Use ASM_CALL_CONSTRAINT in inline asm()
  2025-03-12 12:38 ` [PATCH v3 2/3] x86/hweight: Use ASM_CALL_CONSTRAINT " Uros Bizjak
@ 2025-03-12 19:33   ` tip-bot2 for Uros Bizjak
  2025-03-19 11:03   ` [tip: x86/core] " tip-bot2 for Uros Bizjak
  1 sibling, 0 replies; 9+ messages in thread
From: tip-bot2 for Uros Bizjak @ 2025-03-12 19:33 UTC
  To: linux-tip-commits
  Cc: Uros Bizjak, Ingo Molnar, H. Peter Anvin, Nathan Chancellor,
	Nick Desaulniers, Linus Torvalds, x86, linux-kernel

The following commit has been merged into the x86/asm branch of tip:

Commit-ID:     01ba23bf1b3f9a4035faedc2aa450e251bcc2c7c
Gitweb:        https://git.kernel.org/tip/01ba23bf1b3f9a4035faedc2aa450e251bcc2c7c
Author:        Uros Bizjak <ubizjak@gmail.com>
AuthorDate:    Wed, 12 Mar 2025 13:38:44 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Wed, 12 Mar 2025 20:18:29 +01:00

x86/hweight: Use ASM_CALL_CONSTRAINT in inline asm()

Use ASM_CALL_CONSTRAINT to prevent inline asm() that includes a call
instruction from being scheduled before the frame pointer gets set
up by the containing function. Such unconstrained scheduling might
cause objtool to print a "call without frame pointer save/setup"
warning. Current compiler versions don't seem to trigger this
condition, but without the constraint nothing prevents the
compiler from scheduling the insn in front of frame creation.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250312123905.149298-2-ubizjak@gmail.com
---
 arch/x86/include/asm/arch_hweight.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/arch_hweight.h b/arch/x86/include/asm/arch_hweight.h
index a11bb84..f233eb0 100644
--- a/arch/x86/include/asm/arch_hweight.h
+++ b/arch/x86/include/asm/arch_hweight.h
@@ -17,7 +17,7 @@ static __always_inline unsigned int __arch_hweight32(unsigned int w)
 	unsigned int res;
 
 	asm (ALTERNATIVE("call __sw_hweight32", "popcntl %[val], %[cnt]", X86_FEATURE_POPCNT)
-			 : [cnt] "=" REG_OUT (res)
+			 : [cnt] "=" REG_OUT (res), ASM_CALL_CONSTRAINT
 			 : [val] REG_IN (w));
 
 	return res;
@@ -45,7 +45,7 @@ static __always_inline unsigned long __arch_hweight64(__u64 w)
 	unsigned long res;
 
 	asm (ALTERNATIVE("call __sw_hweight64", "popcntq %[val], %[cnt]", X86_FEATURE_POPCNT)
-			 : [cnt] "=" REG_OUT (res)
+			 : [cnt] "=" REG_OUT (res), ASM_CALL_CONSTRAINT
 			 : [val] REG_IN (w));
 
 	return res;

* [tip: x86/asm] x86/hweight: Use named operands in inline asm()
  2025-03-12 12:38 [PATCH v3 1/3] x86/hweight: Use named operands in inline asm() Uros Bizjak
  2025-03-12 12:38 ` [PATCH v3 2/3] x86/hweight: Use ASM_CALL_CONSTRAINT " Uros Bizjak
  2025-03-12 12:38 ` [PATCH v3 3/3] x86/hweight: Use asm_inline " Uros Bizjak
@ 2025-03-12 19:33 ` tip-bot2 for Uros Bizjak
  2025-03-19 11:03 ` [tip: x86/core] " tip-bot2 for Uros Bizjak
  3 siblings, 0 replies; 9+ messages in thread
From: tip-bot2 for Uros Bizjak @ 2025-03-12 19:33 UTC
  To: linux-tip-commits
  Cc: Uros Bizjak, Ingo Molnar, H. Peter Anvin, Nathan Chancellor,
	Nick Desaulniers, Linus Torvalds, x86, linux-kernel

The following commit has been merged into the x86/asm branch of tip:

Commit-ID:     f9d73498e7bfe3dc47bb4c7ce37ad07286dd8d16
Gitweb:        https://git.kernel.org/tip/f9d73498e7bfe3dc47bb4c7ce37ad07286dd8d16
Author:        Uros Bizjak <ubizjak@gmail.com>
AuthorDate:    Wed, 12 Mar 2025 13:38:43 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Wed, 12 Mar 2025 20:18:28 +01:00

x86/hweight: Use named operands in inline asm()

No functional change intended.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250312123905.149298-1-ubizjak@gmail.com
---
 arch/x86/include/asm/arch_hweight.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/arch_hweight.h b/arch/x86/include/asm/arch_hweight.h
index ba88edd..a11bb84 100644
--- a/arch/x86/include/asm/arch_hweight.h
+++ b/arch/x86/include/asm/arch_hweight.h
@@ -16,9 +16,9 @@ static __always_inline unsigned int __arch_hweight32(unsigned int w)
 {
 	unsigned int res;
 
-	asm (ALTERNATIVE("call __sw_hweight32", "popcntl %1, %0", X86_FEATURE_POPCNT)
-			 : "="REG_OUT (res)
-			 : REG_IN (w));
+	asm (ALTERNATIVE("call __sw_hweight32", "popcntl %[val], %[cnt]", X86_FEATURE_POPCNT)
+			 : [cnt] "=" REG_OUT (res)
+			 : [val] REG_IN (w));
 
 	return res;
 }
@@ -44,9 +44,9 @@ static __always_inline unsigned long __arch_hweight64(__u64 w)
 {
 	unsigned long res;
 
-	asm (ALTERNATIVE("call __sw_hweight64", "popcntq %1, %0", X86_FEATURE_POPCNT)
-			 : "="REG_OUT (res)
-			 : REG_IN (w));
+	asm (ALTERNATIVE("call __sw_hweight64", "popcntq %[val], %[cnt]", X86_FEATURE_POPCNT)
+			 : [cnt] "=" REG_OUT (res)
+			 : [val] REG_IN (w));
 
 	return res;
 }

* [tip: x86/core] x86/hweight: Use asm_inline() instead of asm()
  2025-03-12 12:38 ` [PATCH v3 3/3] x86/hweight: Use asm_inline " Uros Bizjak
  2025-03-12 19:33   ` [tip: x86/asm] x86/hweight: Use asm_inline() instead of asm() tip-bot2 for Uros Bizjak
@ 2025-03-19 11:03   ` tip-bot2 for Uros Bizjak
  1 sibling, 0 replies; 9+ messages in thread
From: tip-bot2 for Uros Bizjak @ 2025-03-19 11:03 UTC
  To: linux-tip-commits
  Cc: Uros Bizjak, Ingo Molnar, H. Peter Anvin, Nathan Chancellor,
	Nick Desaulniers, Linus Torvalds, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     21fe2514849bb4de05fbd098e311a87de6a62d4b
Gitweb:        https://git.kernel.org/tip/21fe2514849bb4de05fbd098e311a87de6a62d4b
Author:        Uros Bizjak <ubizjak@gmail.com>
AuthorDate:    Wed, 12 Mar 2025 13:38:45 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Wed, 19 Mar 2025 11:26:58 +01:00

x86/hweight: Use asm_inline() instead of asm()

Use asm_inline() to tell the compiler that the size of the asm()
is the minimum size of one instruction, regardless of how many
instructions the compiler estimates it contains. The ALTERNATIVE macro
expands to several pseudo directives, which inflates the compiler's
estimate to more than 20 instructions.

bloat-o-meter reports a slight reduction in code size
for the x86_64 defconfig object file, compiled with gcc-14.2:

  add/remove: 6/12 grow/shrink: 59/50 up/down: 3389/-3560 (-171)
  Total: Before=22734393, After=22734222, chg -0.00%

where 29 instances of code blocks involving POPCNT now get inlined,
resulting in the removal of several functions:

  format_is_yuv_semiplanar.part.isra            41       -     -41
  cdclk_divider                                 69       -     -69
  intel_joiner_adjust_timings                  140       -    -140
  nl80211_send_wowlan_tcp_caps                 369       -    -369
  nl80211_send_iftype_data                     579       -    -579
  __do_sys_pidfd_send_signal                   809       -    -809

One noticeable change is:

  pcpu_page_first_chunk                       1075    1060     -15

Where the compiler now inlines 4 more instances of POPCNT insns,
but still manages to compile to a function with smaller code size.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250312123905.149298-3-ubizjak@gmail.com
---
 arch/x86/include/asm/arch_hweight.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/arch_hweight.h b/arch/x86/include/asm/arch_hweight.h
index f233eb0..b5982b9 100644
--- a/arch/x86/include/asm/arch_hweight.h
+++ b/arch/x86/include/asm/arch_hweight.h
@@ -16,7 +16,8 @@ static __always_inline unsigned int __arch_hweight32(unsigned int w)
 {
 	unsigned int res;
 
-	asm (ALTERNATIVE("call __sw_hweight32", "popcntl %[val], %[cnt]", X86_FEATURE_POPCNT)
+	asm_inline (ALTERNATIVE("call __sw_hweight32",
+				"popcntl %[val], %[cnt]", X86_FEATURE_POPCNT)
 			 : [cnt] "=" REG_OUT (res), ASM_CALL_CONSTRAINT
 			 : [val] REG_IN (w));
 
@@ -44,7 +45,8 @@ static __always_inline unsigned long __arch_hweight64(__u64 w)
 {
 	unsigned long res;
 
-	asm (ALTERNATIVE("call __sw_hweight64", "popcntq %[val], %[cnt]", X86_FEATURE_POPCNT)
+	asm_inline (ALTERNATIVE("call __sw_hweight64",
+				"popcntq %[val], %[cnt]", X86_FEATURE_POPCNT)
 			 : [cnt] "=" REG_OUT (res), ASM_CALL_CONSTRAINT
 			 : [val] REG_IN (w));
 

* [tip: x86/core] x86/hweight: Use ASM_CALL_CONSTRAINT in inline asm()
  2025-03-12 12:38 ` [PATCH v3 2/3] x86/hweight: Use ASM_CALL_CONSTRAINT " Uros Bizjak
  2025-03-12 19:33   ` [tip: x86/asm] " tip-bot2 for Uros Bizjak
@ 2025-03-19 11:03   ` tip-bot2 for Uros Bizjak
  1 sibling, 0 replies; 9+ messages in thread
From: tip-bot2 for Uros Bizjak @ 2025-03-19 11:03 UTC
  To: linux-tip-commits
  Cc: Uros Bizjak, Ingo Molnar, H. Peter Anvin, Nathan Chancellor,
	Nick Desaulniers, Linus Torvalds, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     194a613088a8c9dae300dfb08433287cee803e8d
Gitweb:        https://git.kernel.org/tip/194a613088a8c9dae300dfb08433287cee803e8d
Author:        Uros Bizjak <ubizjak@gmail.com>
AuthorDate:    Wed, 12 Mar 2025 13:38:44 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Wed, 19 Mar 2025 11:26:58 +01:00

x86/hweight: Use ASM_CALL_CONSTRAINT in inline asm()

Use ASM_CALL_CONSTRAINT to prevent inline asm() that includes a call
instruction from being scheduled before the frame pointer gets set
up by the containing function. Such unconstrained scheduling might
cause objtool to print a "call without frame pointer save/setup"
warning. Current compiler versions don't seem to trigger this
condition, but without the constraint nothing prevents the
compiler from scheduling the insn in front of frame creation.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250312123905.149298-2-ubizjak@gmail.com
---
 arch/x86/include/asm/arch_hweight.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/arch_hweight.h b/arch/x86/include/asm/arch_hweight.h
index a11bb84..f233eb0 100644
--- a/arch/x86/include/asm/arch_hweight.h
+++ b/arch/x86/include/asm/arch_hweight.h
@@ -17,7 +17,7 @@ static __always_inline unsigned int __arch_hweight32(unsigned int w)
 	unsigned int res;
 
 	asm (ALTERNATIVE("call __sw_hweight32", "popcntl %[val], %[cnt]", X86_FEATURE_POPCNT)
-			 : [cnt] "=" REG_OUT (res)
+			 : [cnt] "=" REG_OUT (res), ASM_CALL_CONSTRAINT
 			 : [val] REG_IN (w));
 
 	return res;
@@ -45,7 +45,7 @@ static __always_inline unsigned long __arch_hweight64(__u64 w)
 	unsigned long res;
 
 	asm (ALTERNATIVE("call __sw_hweight64", "popcntq %[val], %[cnt]", X86_FEATURE_POPCNT)
-			 : [cnt] "=" REG_OUT (res)
+			 : [cnt] "=" REG_OUT (res), ASM_CALL_CONSTRAINT
 			 : [val] REG_IN (w));
 
 	return res;

* [tip: x86/core] x86/hweight: Use named operands in inline asm()
  2025-03-12 12:38 [PATCH v3 1/3] x86/hweight: Use named operands in inline asm() Uros Bizjak
                   ` (2 preceding siblings ...)
  2025-03-12 19:33 ` [tip: x86/asm] x86/hweight: Use named operands in inline asm() tip-bot2 for Uros Bizjak
@ 2025-03-19 11:03 ` tip-bot2 for Uros Bizjak
  3 siblings, 0 replies; 9+ messages in thread
From: tip-bot2 for Uros Bizjak @ 2025-03-19 11:03 UTC
  To: linux-tip-commits
  Cc: Uros Bizjak, Ingo Molnar, H. Peter Anvin, Nathan Chancellor,
	Nick Desaulniers, Linus Torvalds, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     72899899e4f9de0b545218e66bf14cfa2579f2f8
Gitweb:        https://git.kernel.org/tip/72899899e4f9de0b545218e66bf14cfa2579f2f8
Author:        Uros Bizjak <ubizjak@gmail.com>
AuthorDate:    Wed, 12 Mar 2025 13:38:43 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Wed, 19 Mar 2025 11:26:58 +01:00

x86/hweight: Use named operands in inline asm()

No functional change intended.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250312123905.149298-1-ubizjak@gmail.com
---
 arch/x86/include/asm/arch_hweight.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/arch_hweight.h b/arch/x86/include/asm/arch_hweight.h
index ba88edd..a11bb84 100644
--- a/arch/x86/include/asm/arch_hweight.h
+++ b/arch/x86/include/asm/arch_hweight.h
@@ -16,9 +16,9 @@ static __always_inline unsigned int __arch_hweight32(unsigned int w)
 {
 	unsigned int res;
 
-	asm (ALTERNATIVE("call __sw_hweight32", "popcntl %1, %0", X86_FEATURE_POPCNT)
-			 : "="REG_OUT (res)
-			 : REG_IN (w));
+	asm (ALTERNATIVE("call __sw_hweight32", "popcntl %[val], %[cnt]", X86_FEATURE_POPCNT)
+			 : [cnt] "=" REG_OUT (res)
+			 : [val] REG_IN (w));
 
 	return res;
 }
@@ -44,9 +44,9 @@ static __always_inline unsigned long __arch_hweight64(__u64 w)
 {
 	unsigned long res;
 
-	asm (ALTERNATIVE("call __sw_hweight64", "popcntq %1, %0", X86_FEATURE_POPCNT)
-			 : "="REG_OUT (res)
-			 : REG_IN (w));
+	asm (ALTERNATIVE("call __sw_hweight64", "popcntq %[val], %[cnt]", X86_FEATURE_POPCNT)
+			 : [cnt] "=" REG_OUT (res)
+			 : [val] REG_IN (w));
 
 	return res;
 }

Thread overview: 9+ messages
2025-03-12 12:38 [PATCH v3 1/3] x86/hweight: Use named operands in inline asm() Uros Bizjak
2025-03-12 12:38 ` [PATCH v3 2/3] x86/hweight: Use ASM_CALL_CONSTRAINT " Uros Bizjak
2025-03-12 19:33   ` [tip: x86/asm] " tip-bot2 for Uros Bizjak
2025-03-19 11:03   ` [tip: x86/core] " tip-bot2 for Uros Bizjak
2025-03-12 12:38 ` [PATCH v3 3/3] x86/hweight: Use asm_inline " Uros Bizjak
2025-03-12 19:33   ` [tip: x86/asm] x86/hweight: Use asm_inline() instead of asm() tip-bot2 for Uros Bizjak
2025-03-19 11:03   ` [tip: x86/core] " tip-bot2 for Uros Bizjak
2025-03-12 19:33 ` [tip: x86/asm] x86/hweight: Use named operands in inline asm() tip-bot2 for Uros Bizjak
2025-03-19 11:03 ` [tip: x86/core] " tip-bot2 for Uros Bizjak
