All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] x86/processor.h: Force inlining of cpu_relax()
@ 2015-09-24 12:02 Denys Vlasenko
  2015-09-25 11:44 ` Borislav Petkov
  2015-09-25 12:22 ` [tip:x86/asm] x86/asm: " tip-bot for Denys Vlasenko
  0 siblings, 2 replies; 3+ messages in thread
From: Denys Vlasenko @ 2015-09-24 12:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Denys Vlasenko, Andy Lutomirski, H. Peter Anvin, Borislav Petkov,
	Brian Gerst, x86, linux-kernel

On x86, cpu_relax() simply calls rep_nop(), which generates one
instruction, PAUSE (aka REP NOP).

With this config:
http://busybox.net/~vda/kernel_config_OPTIMIZE_INLINING_and_Os
gcc-4.7.2 does not always inline rep_nop(): it generates
several copies of this:

<rep_nop> (16 copies, 194 calls):
       55                      push   %rbp
       48 89 e5                mov    %rsp,%rbp
       f3 90                   pause
       5d                      pop    %rbp
       c3                      retq

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66122

This patch fixes this via s/inline/__always_inline/
on rep_nop() and cpu_relax().
(Forcing inlining only on rep_nop() causes gcc to
deinline cpu_relax(), with almost no change in generated code).

    text     data      bss       dec     hex filename
88118971 19905208 36421632 144445811 89c1173 vmlinux.before
88118139 19905208 36421632 144444979 89c0e33 vmlinux

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Ingo Molnar <mingo@kernel.org>
CC: Andy Lutomirski <luto@amacapital.net>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Borislav Petkov <bp@alien8.de>
CC: Brian Gerst <brgerst@gmail.com>
CC: x86@kernel.org
CC: linux-kernel@vger.kernel.org
---
 arch/x86/include/asm/processor.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 19577dd..b55f309 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -556,12 +556,12 @@ static inline unsigned int cpuid_edx(unsigned int op)
 }
 
 /* REP NOP (PAUSE) is a good thing to insert into busy-wait loops. */
-static inline void rep_nop(void)
+static __always_inline void rep_nop(void)
 {
 	asm volatile("rep; nop" ::: "memory");
 }
 
-static inline void cpu_relax(void)
+static __always_inline void cpu_relax(void)
 {
 	rep_nop();
 }
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2015-09-25 12:23 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-09-24 12:02 [PATCH] x86/processor.h: Force inlining of cpu_relax() Denys Vlasenko
2015-09-25 11:44 ` Borislav Petkov
2015-09-25 12:22 ` [tip:x86/asm] x86/asm: " tip-bot for Denys Vlasenko

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.