* [PATCH] x86/processor.h: Force inlining of cpu_relax()
@ 2015-09-24 12:02 Denys Vlasenko
  2015-09-25 11:44 ` Borislav Petkov
  2015-09-25 12:22 ` [tip:x86/asm] x86/asm: " tip-bot for Denys Vlasenko
  0 siblings, 2 replies; 3+ messages in thread
From: Denys Vlasenko @ 2015-09-24 12:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Denys Vlasenko, Andy Lutomirski, H. Peter Anvin, Borislav Petkov,
	Brian Gerst, x86, linux-kernel

On x86, cpu_relax() simply calls rep_nop(), which generates one
instruction, PAUSE (aka REP NOP).

With this config:
http://busybox.net/~vda/kernel_config_OPTIMIZE_INLINING_and_Os
gcc-4.7.2 does not always inline rep_nop(): it generates
several copies of this:

<rep_nop> (16 copies, 194 calls):
       55                      push   %rbp
       48 89 e5                mov    %rsp,%rbp
       f3 90                   pause
       5d                      pop    %rbp
       c3                      retq

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66122

Fix this by changing inline to __always_inline on both
rep_nop() and cpu_relax().
(Forcing inlining only on rep_nop() causes gcc to deinline
cpu_relax() instead, with almost no change in generated code.)

    text     data      bss       dec     hex filename
88118971 19905208 36421632 144445811 89c1173 vmlinux.before
88118139 19905208 36421632 144444979 89c0e33 vmlinux
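
As a rough cross-check of the size numbers above, the saving can be estimated from the disassembly: each out-of-line rep_nop() copy is 8 bytes, and each call site shrinks from a near call to a 2-byte PAUSE. The sketch below is only back-of-the-envelope arithmetic. The helper name `estimate_text_savings` is made up for illustration, the 5-byte `call rel32` size is an assumption about how the call sites were encoded, and alignment padding and knock-on inlining changes are ignored.

```c
/* Byte sizes read off the disassembly in the commit message. */
enum {
	REP_NOP_BODY_BYTES = 1 + 3 + 2 + 1 + 1, /* push, mov, pause, pop, retq */
	PAUSE_BYTES        = 2,                 /* f3 90 */
	CALL_REL32_BYTES   = 5,                 /* assumption: e8 + 32-bit offset */
};

/* Hypothetical helper: estimated .text bytes saved by forced inlining. */
static long estimate_text_savings(long copies, long calls)
{
	/* Every out-of-line copy of rep_nop() disappears entirely. */
	long removed_bodies = copies * REP_NOP_BODY_BYTES;
	/* Every call instruction is replaced by an inline PAUSE. */
	long smaller_calls = calls * (CALL_REL32_BYTES - PAUSE_BYTES);

	return removed_bodies + smaller_calls;
}
```

With the 16 copies and 194 calls reported above, `estimate_text_savings(16, 194)` gives 710 bytes, in the same ballpark as the measured 88118971 - 88118139 = 832 bytes. The remainder is plausibly function alignment padding, though that is not verified here.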

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Ingo Molnar <mingo@kernel.org>
CC: Andy Lutomirski <luto@amacapital.net>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Borislav Petkov <bp@alien8.de>
CC: Brian Gerst <brgerst@gmail.com>
CC: x86@kernel.org
CC: linux-kernel@vger.kernel.org
---
 arch/x86/include/asm/processor.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 19577dd..b55f309 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -556,12 +556,12 @@ static inline unsigned int cpuid_edx(unsigned int op)
 }
 
 /* REP NOP (PAUSE) is a good thing to insert into busy-wait loops. */
-static inline void rep_nop(void)
+static __always_inline void rep_nop(void)
 {
 	asm volatile("rep; nop" ::: "memory");
 }
 
-static inline void cpu_relax(void)
+static __always_inline void cpu_relax(void)
 {
 	rep_nop();
 }
-- 
1.8.1.4
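
For readers outside the kernel tree, the effect of the patched definitions can be sketched in userspace C. This is an illustration under assumptions, not the kernel code: `force_inline` stands in for the kernel's `__always_inline`, and `spin_wait()` is a hypothetical busy-wait helper added only to show a typical call site. It assumes GCC or Clang; on non-x86 targets it falls back to a bare compiler barrier.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for the kernel's __always_inline (name is an assumption). */
#define force_inline inline __attribute__((__always_inline__))

/* REP NOP (PAUSE) hint for busy-wait loops, as in the patch. */
static force_inline void rep_nop(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__asm__ __volatile__("rep; nop" ::: "memory");
#else
	/* Not in the patch: compiler barrier only, so the sketch builds elsewhere. */
	__asm__ __volatile__("" ::: "memory");
#endif
}

static force_inline void cpu_relax(void)
{
	rep_nop();
}

/*
 * Hypothetical helper: spin until *flag is set or max_spins iterations
 * have passed. Returns the number of cpu_relax() calls made.
 */
static unsigned long spin_wait(atomic_bool *flag, unsigned long max_spins)
{
	unsigned long spins = 0;

	while (!atomic_load_explicit(flag, memory_order_acquire) &&
	       spins < max_spins) {
		cpu_relax();	/* expands to a single PAUSE at this site */
		spins++;
	}
	return spins;
}
```

Because both wrappers are force-inlined, the loop body should contain the PAUSE instruction directly instead of a call to a separate rep_nop() copy, even when building with -Os. Disassembling the object file is the way to confirm that for a given compiler.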



* Re: [PATCH] x86/processor.h: Force inlining of cpu_relax()
From: Borislav Petkov @ 2015-09-25 11:44 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Ingo Molnar, Andy Lutomirski, H. Peter Anvin, Brian Gerst, x86,
	linux-kernel

On Thu, Sep 24, 2015 at 02:02:29PM +0200, Denys Vlasenko wrote:
> On x86, cpu_relax() simply calls rep_nop(), which generates one
> instruction, PAUSE (aka REP NOP).
> 
> With this config:
> http://busybox.net/~vda/kernel_config_OPTIMIZE_INLINING_and_Os
> gcc-4.7.2 does not always inline rep_nop(): it generates
> several copies of this:
> 
> <rep_nop> (16 copies, 194 calls):
>        55                      push   %rbp
>        48 89 e5                mov    %rsp,%rbp
>        f3 90                   pause
>        5d                      pop    %rbp
>        c3                      retq
> 
> See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66122
> 
> This patch fixes this via s/inline/__always_inline/
> on rep_nop() and cpu_relax().
> (Forcing inlining only on rep_nop() causes gcc to
> deinline cpu_relax(), with almost no change in generated code).
> 
>     text     data      bss       dec     hex filename
> 88118971 19905208 36421632 144445811 89c1173 vmlinux.before
> 88118139 19905208 36421632 144444979 89c0e33 vmlinux

Looks OK to me; the text size even gets smaller.

Acked-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.


* [tip:x86/asm] x86/asm: Force inlining of cpu_relax()
From: tip-bot for Denys Vlasenko @ 2015-09-25 12:22 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, dvlasenk, tglx, brgerst, mingo, torvalds, hpa, luto,
	bp, peterz

Commit-ID:  0b101e62afe626ecae60173f92f1e0ec72151653
Gitweb:     http://git.kernel.org/tip/0b101e62afe626ecae60173f92f1e0ec72151653
Author:     Denys Vlasenko <dvlasenk@redhat.com>
AuthorDate: Thu, 24 Sep 2015 14:02:29 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 25 Sep 2015 09:44:34 +0200

x86/asm: Force inlining of cpu_relax()

On x86, cpu_relax() simply calls rep_nop(), which generates one
instruction, PAUSE (aka REP NOP).

With this config:

  http://busybox.net/~vda/kernel_config_OPTIMIZE_INLINING_and_Os

gcc-4.7.2 does not always inline rep_nop(): it generates several
copies of this:

  <rep_nop> (16 copies, 194 calls):
       55                      push   %rbp
       48 89 e5                mov    %rsp,%rbp
       f3 90                   pause
       5d                      pop    %rbp
       c3                      retq

See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66122

Fix this by changing inline to __always_inline on both
rep_nop() and cpu_relax().

( Forcing inlining only on rep_nop() causes GCC to deinline
  cpu_relax() instead, with almost no change in generated code. )

      text     data      bss       dec     hex filename
  88118971 19905208 36421632 144445811 89c1173 vmlinux.before
  88118139 19905208 36421632 144444979 89c0e33 vmlinux

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1443096149-27291-1-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/processor.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 19577dd..b55f309 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -556,12 +556,12 @@ static inline unsigned int cpuid_edx(unsigned int op)
 }
 
 /* REP NOP (PAUSE) is a good thing to insert into busy-wait loops. */
-static inline void rep_nop(void)
+static __always_inline void rep_nop(void)
 {
 	asm volatile("rep; nop" ::: "memory");
 }
 
-static inline void cpu_relax(void)
+static __always_inline void cpu_relax(void)
 {
 	rep_nop();
 }

