* [PATCH] x86/processor.h: Force inlining of cpu_relax()
@ 2015-09-24 12:02 Denys Vlasenko
2015-09-25 11:44 ` Borislav Petkov
2015-09-25 12:22 ` [tip:x86/asm] x86/asm: " tip-bot for Denys Vlasenko
0 siblings, 2 replies; 3+ messages in thread
From: Denys Vlasenko @ 2015-09-24 12:02 UTC
To: Ingo Molnar
Cc: Denys Vlasenko, Andy Lutomirski, H. Peter Anvin, Borislav Petkov,
Brian Gerst, x86, linux-kernel
On x86, cpu_relax() simply calls rep_nop(), which generates one
instruction, PAUSE (aka REP NOP).
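
For context, a minimal userspace sketch of the kind of busy-wait loop cpu_relax() is meant for. This is not the kernel header: the cpu_relax() below is a stand-in, the non-x86 fallback is an assumption, and spin_until_set() is a hypothetical helper introduced only for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's cpu_relax(); not the kernel header. */
static inline void cpu_relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
	/* "rep; nop" encodes as f3 90, which the CPU decodes as PAUSE. */
	__asm__ __volatile__("rep; nop" ::: "memory");
#else
	/* Assumption: on other targets, fall back to a compiler barrier. */
	__asm__ __volatile__("" ::: "memory");
#endif
}

/*
 * Hypothetical helper: spin until *flag becomes true or max_spins is
 * exhausted. Returns the number of cpu_relax() calls made.
 */
static unsigned long spin_until_set(atomic_bool *flag, unsigned long max_spins)
{
	unsigned long spins = 0;

	while (!atomic_load_explicit(flag, memory_order_acquire) &&
	       spins < max_spins) {
		cpu_relax();
		spins++;
	}
	return spins;
}
```

Because the loop body is a single instruction, an out-of-line call with its own prologue and epilogue costs more code than the instruction it wraps.
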
With this config:
http://busybox.net/~vda/kernel_config_OPTIMIZE_INLINING_and_Os
gcc-4.7.2 does not always inline rep_nop(): it generates
several copies of this:
<rep_nop> (16 copies, 194 calls):
       55                      push   %rbp
       48 89 e5                mov    %rsp,%rbp
       f3 90                   pause
       5d                      pop    %rbp
       c3                      retq
See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66122
This patch fixes this via s/inline/__always_inline/
on rep_nop() and cpu_relax().
(Forcing inlining only on rep_nop() causes gcc to
deinline cpu_relax(), with almost no change in generated code).
text      data      bss       dec        hex      filename
88118971  19905208  36421632  144445811  89c1173  vmlinux.before
88118139  19905208  36421632  144444979  89c0e33  vmlinux
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Ingo Molnar <mingo@kernel.org>
CC: Andy Lutomirski <luto@amacapital.net>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Borislav Petkov <bp@alien8.de>
CC: Brian Gerst <brgerst@gmail.com>
CC: x86@kernel.org
CC: linux-kernel@vger.kernel.org
---
arch/x86/include/asm/processor.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 19577dd..b55f309 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -556,12 +556,12 @@ static inline unsigned int cpuid_edx(unsigned int op)
 }

 /* REP NOP (PAUSE) is a good thing to insert into busy-wait loops. */
-static inline void rep_nop(void)
+static __always_inline void rep_nop(void)
 {
 	asm volatile("rep; nop" ::: "memory");
 }

-static inline void cpu_relax(void)
+static __always_inline void cpu_relax(void)
 {
 	rep_nop();
 }
--
1.8.1.4
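
A userspace sketch of the pattern the diff above applies, for experimenting outside the kernel tree. The fallback definition of __always_inline uses the usual GCC attribute spelling and is an assumption here, as are the non-x86 branch and the hypothetical caller relax_n_times(). With the attribute in place, GCC is expected to expand both helpers at the call site rather than emit out-of-line copies like the <rep_nop> listing in the description.

```c
/* Assumed fallback; the kernel supplies its own __always_inline macro. */
#ifndef __always_inline
#define __always_inline inline __attribute__((__always_inline__))
#endif

static __always_inline void rep_nop(void)
{
#if defined(__x86_64__) || defined(__i386__)
	/* Same instruction as in the patch: REP NOP, i.e. PAUSE. */
	__asm__ __volatile__("rep; nop" ::: "memory");
#else
	/* Assumption: compiler barrier only on non-x86 targets. */
	__asm__ __volatile__("" ::: "memory");
#endif
}

/* Both levels are forced, matching the patch. */
static __always_inline void cpu_relax(void)
{
	rep_nop();
}

/* Hypothetical caller: relax n times and return the iteration count. */
unsigned int relax_n_times(unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++)
		cpu_relax();
	return i;
}
```

Building this with something like `gcc -Os -c` and inspecting the object with `objdump -d` should show `pause` inside the loop of relax_n_times() rather than a call, although the exact output depends on compiler version and flags.
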
* Re: [PATCH] x86/processor.h: Force inlining of cpu_relax()
From: Borislav Petkov @ 2015-09-25 11:44 UTC
To: Denys Vlasenko
Cc: Ingo Molnar, Andy Lutomirski, H. Peter Anvin, Brian Gerst, x86,
linux-kernel
On Thu, Sep 24, 2015 at 02:02:29PM +0200, Denys Vlasenko wrote:
> On x86, cpu_relax() simply calls rep_nop(), which generates one
> instruction, PAUSE (aka REP NOP).
>
> With this config:
> http://busybox.net/~vda/kernel_config_OPTIMIZE_INLINING_and_Os
> gcc-4.7.2 does not always inline rep_nop(): it generates
> several copies of this:
>
> <rep_nop> (16 copies, 194 calls):
>        55                      push   %rbp
>        48 89 e5                mov    %rsp,%rbp
>        f3 90                   pause
>        5d                      pop    %rbp
>        c3                      retq
>
> See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66122
>
> This patch fixes this via s/inline/__always_inline/
> on rep_nop() and cpu_relax().
> (Forcing inlining only on rep_nop() causes gcc to
> deinline cpu_relax(), with almost no change in generated code).
>
> text      data      bss       dec        hex      filename
> 88118971  19905208  36421632  144445811  89c1173  vmlinux.before
> 88118139  19905208  36421632  144444979  89c0e33  vmlinux
Looks ok to me, text even grows smaller.
Acked-by: Borislav Petkov <bp@suse.de>
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
* [tip:x86/asm] x86/asm: Force inlining of cpu_relax()
From: tip-bot for Denys Vlasenko @ 2015-09-25 12:22 UTC
To: linux-tip-commits
Cc: linux-kernel, dvlasenk, tglx, brgerst, mingo, torvalds, hpa, luto,
bp, peterz
Commit-ID: 0b101e62afe626ecae60173f92f1e0ec72151653
Gitweb: http://git.kernel.org/tip/0b101e62afe626ecae60173f92f1e0ec72151653
Author: Denys Vlasenko <dvlasenk@redhat.com>
AuthorDate: Thu, 24 Sep 2015 14:02:29 +0200
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 25 Sep 2015 09:44:34 +0200
x86/asm: Force inlining of cpu_relax()
On x86, cpu_relax() simply calls rep_nop(), which generates one
instruction, PAUSE (aka REP NOP).
With this config:
http://busybox.net/~vda/kernel_config_OPTIMIZE_INLINING_and_Os
gcc-4.7.2 does not always inline rep_nop(): it generates several
copies of this:
<rep_nop> (16 copies, 194 calls):
       55                      push   %rbp
       48 89 e5                mov    %rsp,%rbp
       f3 90                   pause
       5d                      pop    %rbp
       c3                      retq
See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66122
This patch fixes this via s/inline/__always_inline/
on rep_nop() and cpu_relax().
(Forcing inlining only on rep_nop() causes GCC to
deinline cpu_relax(), with almost no change in generated code.)
text      data      bss       dec        hex      filename
88118971  19905208  36421632  144445811  89c1173  vmlinux.before
88118139  19905208  36421632  144444979  89c0e33  vmlinux
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1443096149-27291-1-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/include/asm/processor.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 19577dd..b55f309 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -556,12 +556,12 @@ static inline unsigned int cpuid_edx(unsigned int op)
 }

 /* REP NOP (PAUSE) is a good thing to insert into busy-wait loops. */
-static inline void rep_nop(void)
+static __always_inline void rep_nop(void)
 {
 	asm volatile("rep; nop" ::: "memory");
 }

-static inline void cpu_relax(void)
+static __always_inline void cpu_relax(void)
 {
 	rep_nop();
 }