* [PATCH v5 0/5] x86: faster smp_mb()+documentation tweaks
@ 2016-01-28 17:02 Michael S. Tsirkin
  2016-01-28 17:02 ` [PATCH v5 1/5] x86: add cc clobber for addl Michael S. Tsirkin
                   ` (9 more replies)
  0 siblings, 10 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2016-01-28 17:02 UTC (permalink / raw)
  To: linux-kernel, Linus Torvalds
  Cc: Davidlohr Bueso, Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
	Paul E. McKenney, the arch/x86 maintainers, Davidlohr Bueso,
	H. Peter Anvin, virtualization, Borislav Petkov

mb() typically uses mfence on modern x86, but a micro-benchmark shows that it's
2 to 3 times slower than the lock; addl that we use on older CPUs.

So we really should use the locked variant everywhere, except that the Intel
manual says that clflush is only ordered by mfence, so we can't.
Note: some callers of clflush seem to assume sfence will
order it, so there could be existing bugs around this code.

Fortunately no callers of clflush (except one) order it using smp_mb(), so
after fixing that one caller, it seems safe to override smp_mb straight away.

Down the road, it might make sense to introduce clflush_mb() and switch
to that for clflush callers.

While I was at it, I found some inconsistencies in comments in
arch/x86/include/asm/barrier.h

The documentation fixes are included first - I verified that
they do not change the generated code at all. Borislav Petkov
said they will appear in tip eventually, included here for
completeness.

The last patch changes __smp_mb() to lock addl. I was unable to
measure a speed difference on a macro benchmark,
but I noted that even doing
	#define mb() barrier()
seems to make no difference for most benchmarks
(it causes hangs sometimes, of course).

Lightly tested on my laptop.

HPA asked that the last patch be deferred until we hear back from
Intel, which makes sense of course. So it needs HPA's ack.

Changes from v4:
	Fix up the 64 bit version.

Changes from v3:
	Leave mb() alone for now since it's used to order
	clflush, which requires mfence. Optimize smp_mb instead.

Changes from v2:
	add patch adding cc clobber for addl
	tweak commit log for patch 2
	use addl at SP-4 (as opposed to SP) to reduce data dependencies

Michael S. Tsirkin (5):
  x86: add cc clobber for addl
  x86: drop a comment left over from X86_OOSTORE
  x86: tweak the comment about use of wmb for IO
  x86: use mb() around clflush
  x86: drop mfence in favor of lock+addl

 arch/x86/include/asm/barrier.h | 21 ++++++++++++---------
 arch/x86/kernel/process.c      |  4 ++--
 2 files changed, 14 insertions(+), 11 deletions(-)

-- 
MST

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v5 1/5] x86: add cc clobber for addl
  2016-01-28 17:02 [PATCH v5 0/5] x86: faster smp_mb()+documentation tweaks Michael S. Tsirkin
  2016-01-28 17:02 ` [PATCH v5 1/5] x86: add cc clobber for addl Michael S. Tsirkin
@ 2016-01-28 17:02 ` Michael S. Tsirkin
  2016-01-28 17:02 ` [PATCH v5 2/5] x86: drop a comment left over from X86_OOSTORE Michael S. Tsirkin
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2016-01-28 17:02 UTC (permalink / raw)
  To: linux-kernel, Linus Torvalds
  Cc: Davidlohr Bueso, Davidlohr Bueso, Peter Zijlstra,
	Andrey Konovalov, the arch/x86 maintainers, virtualization,
	Ingo Molnar, Borislav Petkov, Borislav Petkov, Andy Lutomirski,
	H. Peter Anvin, Thomas Gleixner, Paul E. McKenney, Ingo Molnar

addl clobbers flags (such as CF) but barrier.h didn't tell this to gcc.
Historically, gcc doesn't need one on x86, and always considers flags
clobbered. We are probably missing the cc clobber in a *lot* of places
for this reason.

But even if not necessary, it's probably a good thing to add for
documentation, and in case gcc semantics ever change.

Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 arch/x86/include/asm/barrier.h | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index a584e1c..a65bdb1 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -15,9 +15,12 @@
  * Some non-Intel clones support out of order store. wmb() ceases to be a
  * nop for these.
  */
-#define mb() alternative("lock; addl $0,0(%%esp)", "mfence", X86_FEATURE_XMM2)
-#define rmb() alternative("lock; addl $0,0(%%esp)", "lfence", X86_FEATURE_XMM2)
-#define wmb() alternative("lock; addl $0,0(%%esp)", "sfence", X86_FEATURE_XMM)
+#define mb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "mfence", \
+				      X86_FEATURE_XMM2) ::: "memory", "cc")
+#define rmb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "lfence", \
+				       X86_FEATURE_XMM2) ::: "memory", "cc")
+#define wmb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "sfence", \
+				       X86_FEATURE_XMM2) ::: "memory", "cc")
 #else
 #define mb() 	asm volatile("mfence":::"memory")
 #define rmb()	asm volatile("lfence":::"memory")
-- 
MST

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v5 2/5] x86: drop a comment left over from X86_OOSTORE
  2016-01-28 17:02 [PATCH v5 0/5] x86: faster smp_mb()+documentation tweaks Michael S. Tsirkin
  2016-01-28 17:02 ` [PATCH v5 1/5] x86: add cc clobber for addl Michael S. Tsirkin
  2016-01-28 17:02 ` [PATCH v5 1/5] x86: add cc clobber for addl Michael S. Tsirkin
@ 2016-01-28 17:02 ` Michael S. Tsirkin
  2016-01-28 17:02 ` Michael S. Tsirkin
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2016-01-28 17:02 UTC (permalink / raw)
  To: linux-kernel, Linus Torvalds
  Cc: Davidlohr Bueso, Davidlohr Bueso, Peter Zijlstra,
	Andrey Konovalov, the arch/x86 maintainers, virtualization,
	Ingo Molnar, Borislav Petkov, Borislav Petkov, Andy Lutomirski,
	H. Peter Anvin, Thomas Gleixner, Paul E. McKenney, Ingo Molnar

The comment about wmb being non-nop to deal with non-Intel CPUs is a
leftover from before commit 09df7c4c8097 ("x86: Remove
CONFIG_X86_OOSTORE").

It makes no sense now: in particular, wmb is not a nop even for regular
Intel CPUs because of weird use-cases, e.g. dealing with WC memory.

Drop this comment.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 arch/x86/include/asm/barrier.h | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index a65bdb1..a291745 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -11,10 +11,6 @@
  */
 
 #ifdef CONFIG_X86_32
-/*
- * Some non-Intel clones support out of order store. wmb() ceases to be a
- * nop for these.
- */
 #define mb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "mfence", \
 				      X86_FEATURE_XMM2) ::: "memory", "cc")
 #define rmb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "lfence", \
-- 
MST

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v5 3/5] x86: tweak the comment about use of wmb for IO
  2016-01-28 17:02 [PATCH v5 0/5] x86: faster smp_mb()+documentation tweaks Michael S. Tsirkin
                   ` (4 preceding siblings ...)
  2016-01-28 17:02 ` [PATCH v5 3/5] x86: tweak the comment about use of wmb for IO Michael S. Tsirkin
@ 2016-01-28 17:02 ` Michael S. Tsirkin
  2016-01-28 17:02 ` [PATCH v5 4/5] x86: use mb() around clflush Michael S. Tsirkin
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2016-01-28 17:02 UTC (permalink / raw)
  To: linux-kernel, Linus Torvalds
  Cc: Davidlohr Bueso, Davidlohr Bueso, Peter Zijlstra,
	Andrey Konovalov, the arch/x86 maintainers, virtualization,
	Ingo Molnar, Borislav Petkov, Borislav Petkov, Andy Lutomirski,
	H. Peter Anvin, Thomas Gleixner, Paul E. McKenney, Ingo Molnar

On x86, we *do* still use the non-nop rmb/wmb for IO barriers, but even
that is generally questionable.

Leave them around as historical unless somebody can point to a case where
they care about the performance, but tweak the comment so people
don't think they are strictly required in all cases.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 arch/x86/include/asm/barrier.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index a291745..bfb28ca 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -6,7 +6,7 @@
 
 /*
  * Force strict CPU ordering.
- * And yes, this is required on UP too when we're talking
+ * And yes, this might be required on UP too when we're talking
  * to devices.
  */
 
-- 
MST

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v5 4/5] x86: use mb() around clflush
  2016-01-28 17:02 [PATCH v5 0/5] x86: faster smp_mb()+documentation tweaks Michael S. Tsirkin
                   ` (6 preceding siblings ...)
  2016-01-28 17:02 ` [PATCH v5 4/5] x86: use mb() around clflush Michael S. Tsirkin
@ 2016-01-28 17:02 ` Michael S. Tsirkin
  2016-01-28 17:02 ` [PATCH v5 5/5] x86: drop mfence in favor of lock+addl Michael S. Tsirkin
  2016-01-28 17:02 ` Michael S. Tsirkin
  9 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2016-01-28 17:02 UTC (permalink / raw)
  To: linux-kernel, Linus Torvalds
  Cc: Len Brown, Davidlohr Bueso, Davidlohr Bueso, Peter Zijlstra,
	the arch/x86 maintainers, Oleg Nesterov, virtualization,
	Mike Galbraith, Ingo Molnar, Borislav Petkov, Andy Lutomirski,
	H. Peter Anvin, Thomas Gleixner, Paul E. McKenney, Ingo Molnar

commit f8e617f4582995f7c25ef25b4167213120ad122b ("sched/idle/x86:
Optimize unnecessary mwait_idle() resched IPIs") adds
memory barriers around clflush, but this seems wrong
for UP, where smp_mb() is just barrier(), which has no effect on clflush.
We really want mfence so switch to mb() instead.

Cc: Mike Galbraith <bitbucket@online.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 arch/x86/kernel/process.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 9f7c21c..9decee2 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -418,9 +418,9 @@ static void mwait_idle(void)
 	if (!current_set_polling_and_test()) {
 		trace_cpu_idle_rcuidle(1, smp_processor_id());
 		if (this_cpu_has(X86_BUG_CLFLUSH_MONITOR)) {
-			smp_mb(); /* quirk */
+			mb(); /* quirk */
 			clflush((void *)&current_thread_info()->flags);
-			smp_mb(); /* quirk */
+			mb(); /* quirk */
 		}
 
 		__monitor((void *)&current_thread_info()->flags, 0, 0);
-- 
MST

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v5 5/5] x86: drop mfence in favor of lock+addl
  2016-01-28 17:02 [PATCH v5 0/5] x86: faster smp_mb()+documentation tweaks Michael S. Tsirkin
                   ` (8 preceding siblings ...)
  2016-01-28 17:02 ` [PATCH v5 5/5] x86: drop mfence in favor of lock+addl Michael S. Tsirkin
@ 2016-01-28 17:02 ` Michael S. Tsirkin
  9 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2016-01-28 17:02 UTC (permalink / raw)
  To: linux-kernel, Linus Torvalds
  Cc: Davidlohr Bueso, Davidlohr Bueso, Peter Zijlstra,
	Andrey Konovalov, the arch/x86 maintainers, virtualization,
	Andy Lutomirski, Borislav Petkov, Borislav Petkov,
	Andy Lutomirski, H. Peter Anvin, Thomas Gleixner,
	Paul E. McKenney, Ingo Molnar, Ingo Molnar

mfence appears to be way slower than a locked instruction - let's use
lock+add unconditionally, as we always did on old 32-bit.

Just poking at SP would be the most natural, but if we
then read the value from SP, we get a false dependency
which will slow us down.

This was noted in this article:
http://shipilev.net/blog/2014/on-the-fence-with-dependencies/

And is easy to reproduce by sticking a barrier in a small non-inline
function.

So let's use a negative offset - which avoids this problem since we
build with the red zone disabled.

Unfortunately there's some code that wants to order clflush instructions
using mb(), so we can't replace that - but smp_mb should be safe
to replace.

Update mb/rmb/wmb on 32 bit to use the negative offset, too, for
consistency.

Suggested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 arch/x86/include/asm/barrier.h | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index bfb28ca..3c6ba1e 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -11,11 +11,11 @@
  */
 
 #ifdef CONFIG_X86_32
-#define mb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "mfence", \
+#define mb() asm volatile(ALTERNATIVE("lock; addl $0,-4(%%esp)", "mfence", \
 				      X86_FEATURE_XMM2) ::: "memory", "cc")
-#define rmb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "lfence", \
+#define rmb() asm volatile(ALTERNATIVE("lock; addl $0,-4(%%esp)", "lfence", \
 				       X86_FEATURE_XMM2) ::: "memory", "cc")
-#define wmb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "sfence", \
+#define wmb() asm volatile(ALTERNATIVE("lock; addl $0,-4(%%esp)", "sfence", \
 				       X86_FEATURE_XMM2) ::: "memory", "cc")
 #else
 #define mb() 	asm volatile("mfence":::"memory")
@@ -30,7 +30,11 @@
 #endif
 #define dma_wmb()	barrier()
 
-#define __smp_mb()	mb()
+#ifdef CONFIG_X86_32
+#define __smp_mb()	asm volatile("lock; addl $0,-4(%%esp)" ::: "memory", "cc")
+#else
+#define __smp_mb()	asm volatile("lock; addl $0,-4(%%rsp)" ::: "memory", "cc")
+#endif
 #define __smp_rmb()	dma_rmb()
 #define __smp_wmb()	barrier()
 #define __smp_store_mb(var, value) do { (void)xchg(&var, value); } while (0)
-- 
MST

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v5 4/5] x86: use mb() around clflush
  2016-01-28 17:02 ` [PATCH v5 4/5] x86: use mb() around clflush Michael S. Tsirkin
  2016-01-28 18:25   ` Peter Zijlstra
@ 2016-01-28 18:25   ` Peter Zijlstra
  2016-01-29 11:33     ` tip-bot for Michael S. Tsirkin
  2 siblings, 0 replies; 21+ messages in thread
From: Peter Zijlstra @ 2016-01-28 18:25 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Len Brown, Davidlohr Bueso, Davidlohr Bueso, Mike Galbraith,
	the arch/x86 maintainers, Oleg Nesterov, linux-kernel,
	virtualization, Ingo Molnar, Borislav Petkov, Andy Lutomirski,
	H. Peter Anvin, Thomas Gleixner, Paul E. McKenney, Linus Torvalds,
	Ingo Molnar

On Thu, Jan 28, 2016 at 07:02:51PM +0200, Michael S. Tsirkin wrote:
> commit f8e617f4582995f7c25ef25b4167213120ad122b ("sched/idle/x86:
> Optimize unnecessary mwait_idle() resched IPIs") adds
> memory barriers around clflush, but this seems wrong
> for UP since barrier() has no effect on clflush.
> We really want mfence so switch to mb() instead.
> 
> Cc: Mike Galbraith <bitbucket@online.de>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tip:locking/core] locking/x86: Add cc clobber for ADDL
  2016-01-28 17:02 ` [PATCH v5 1/5] x86: add cc clobber for addl Michael S. Tsirkin
@ 2016-01-29 11:32     ` tip-bot for Michael S. Tsirkin
  0 siblings, 0 replies; 21+ messages in thread
From: tip-bot for Michael S. Tsirkin @ 2016-01-29 11:32 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dave, dvlasenk, akpm, mst, peterz, andreyknvl, hpa,
	virtualization, linux-kernel, luto, dbueso, bp, luto, brgerst,
	paulmck, tglx, bp, torvalds, mingo

Commit-ID:  bd922477d9350a3006d73dabb241400e6c4181b0
Gitweb:     http://git.kernel.org/tip/bd922477d9350a3006d73dabb241400e6c4181b0
Author:     Michael S. Tsirkin <mst@redhat.com>
AuthorDate: Thu, 28 Jan 2016 19:02:29 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 29 Jan 2016 09:40:10 +0100

locking/x86: Add cc clobber for ADDL

ADDL clobbers flags (such as CF) but barrier.h didn't tell this
to GCC. Historically, GCC doesn't need one on x86, and always
considers flags clobbered. We are probably missing the cc
clobber in a *lot* of places for this reason.

But even if not necessary, it's probably a good thing to add for
documentation, and in case GCC semantics ever change.

Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: virtualization <virtualization@lists.linux-foundation.org>
Link: http://lkml.kernel.org/r/1453921746-16178-2-git-send-email-mst@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/barrier.h | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index a584e1c..a65bdb1 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -15,9 +15,12 @@
  * Some non-Intel clones support out of order store. wmb() ceases to be a
  * nop for these.
  */
-#define mb() alternative("lock; addl $0,0(%%esp)", "mfence", X86_FEATURE_XMM2)
-#define rmb() alternative("lock; addl $0,0(%%esp)", "lfence", X86_FEATURE_XMM2)
-#define wmb() alternative("lock; addl $0,0(%%esp)", "sfence", X86_FEATURE_XMM)
+#define mb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "mfence", \
+				      X86_FEATURE_XMM2) ::: "memory", "cc")
+#define rmb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "lfence", \
+				       X86_FEATURE_XMM2) ::: "memory", "cc")
+#define wmb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "sfence", \
+				       X86_FEATURE_XMM2) ::: "memory", "cc")
 #else
 #define mb() 	asm volatile("mfence":::"memory")
 #define rmb()	asm volatile("lfence":::"memory")

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [tip:locking/core] locking/x86: Add cc clobber for ADDL
@ 2016-01-29 11:32     ` tip-bot for Michael S. Tsirkin
  0 siblings, 0 replies; 21+ messages in thread
From: tip-bot for Michael S. Tsirkin @ 2016-01-29 11:32 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mst, andreyknvl, tglx, mingo, linux-kernel, peterz, bp, brgerst,
	dvlasenk, akpm, torvalds, luto, luto, dave, hpa, dbueso, bp,
	virtualization, paulmck

Commit-ID:  bd922477d9350a3006d73dabb241400e6c4181b0
Gitweb:     http://git.kernel.org/tip/bd922477d9350a3006d73dabb241400e6c4181b0
Author:     Michael S. Tsirkin <mst@redhat.com>
AuthorDate: Thu, 28 Jan 2016 19:02:29 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 29 Jan 2016 09:40:10 +0100

locking/x86: Add cc clobber for ADDL

ADDL clobbers flags (such as CF) but barrier.h didn't tell this
to GCC. Historically, GCC doesn't need one on x86, and always
considers flags clobbered. We are probably missing the cc
clobber in a *lot* of places for this reason.

But even if not necessary, it's probably a good thing to add for
documentation, and in case GCC semantics ever change.
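
As a rough user-space illustration of the clobber list discussed above (not
the kernel macro itself), the sketch below issues the same locked ADDL with
both "memory" and "cc" declared. The names locked_add_barrier(),
store_fence_load() and selfcheck() are hypothetical, and the non-x86 branch
exists only so that the sketch builds elsewhere.

```c
#include <assert.h>

/* Hypothetical user-space stand-in for the 32-bit mb() fallback. */
static inline void locked_add_barrier(void)
{
#if defined(__x86_64__)
	/* "cc": ADDL rewrites the flags; "memory": no caching across the asm. */
	__asm__ __volatile__("lock; addl $0,0(%%rsp)" ::: "memory", "cc");
#elif defined(__i386__)
	__asm__ __volatile__("lock; addl $0,0(%%esp)" ::: "memory", "cc");
#else
	__sync_synchronize();	/* non-x86 fallback so the sketch still builds */
#endif
}

/* Store, fence, reload: the "memory" clobber forces the reload from *slot. */
static int store_fence_load(int *slot, int value)
{
	*slot = value;
	locked_add_barrier();
	return *slot;
}

/* Minimal self-check: adding 0 to the stack slot must not disturb any data. */
static void selfcheck(void)
{
	int slot = 0;

	assert(store_fence_load(&slot, 42) == 42);
}
```

Adding "cc" should not change the generated code on x86 GCC, which matches
the observation above that GCC already treats the flags as clobbered.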

Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: virtualization <virtualization@lists.linux-foundation.org>
Link: http://lkml.kernel.org/r/1453921746-16178-2-git-send-email-mst@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/barrier.h | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index a584e1c..a65bdb1 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -15,9 +15,12 @@
  * Some non-Intel clones support out of order store. wmb() ceases to be a
  * nop for these.
  */
-#define mb() alternative("lock; addl $0,0(%%esp)", "mfence", X86_FEATURE_XMM2)
-#define rmb() alternative("lock; addl $0,0(%%esp)", "lfence", X86_FEATURE_XMM2)
-#define wmb() alternative("lock; addl $0,0(%%esp)", "sfence", X86_FEATURE_XMM)
+#define mb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "mfence", \
+				      X86_FEATURE_XMM2) ::: "memory", "cc")
+#define rmb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "lfence", \
+				       X86_FEATURE_XMM2) ::: "memory", "cc")
+#define wmb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "sfence", \
+				       X86_FEATURE_XMM2) ::: "memory", "cc")
 #else
 #define mb() 	asm volatile("mfence":::"memory")
 #define rmb()	asm volatile("lfence":::"memory")


* [tip:locking/core] locking/x86: Drop a comment left over from X86_OOSTORE
  2016-01-28 17:02 ` Michael S. Tsirkin
@ 2016-01-29 11:32     ` tip-bot for Michael S. Tsirkin
  0 siblings, 0 replies; 21+ messages in thread
From: tip-bot for Michael S. Tsirkin @ 2016-01-29 11:32 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: paulmck, dvlasenk, akpm, dbueso, peterz, andreyknvl, mst, hpa,
	linux-kernel, luto, dave, bp, bp, luto, brgerst, tglx,
	virtualization, torvalds, mingo

Commit-ID:  e37cee133c72c9529f74a20d9b7eb3b6dfb928b5
Gitweb:     http://git.kernel.org/tip/e37cee133c72c9529f74a20d9b7eb3b6dfb928b5
Author:     Michael S. Tsirkin <mst@redhat.com>
AuthorDate: Thu, 28 Jan 2016 19:02:37 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 29 Jan 2016 09:40:10 +0100

locking/x86: Drop a comment left over from X86_OOSTORE

The comment about wmb() being non-NOP to deal with non-Intel CPUs
is a leftover from before the following commit:

  09df7c4c8097 ("x86: Remove CONFIG_X86_OOSTORE")

It makes no sense now: in particular, wmb() is not a NOP even for
regular Intel CPUs because of weird use-cases e.g. dealing with
WC memory.

Drop this comment.
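
WC mappings are not available to ordinary user-space code, so the sketch
below uses a non-temporal store as a stand-in: it is likewise weakly ordered
and needs SFENCE before a later store may be taken as a "ready" signal. The
function name publish_streamed() is hypothetical; the intrinsics come from
<emmintrin.h>, and the non-SSE2 branch is only a build fallback.

```c
#if defined(__SSE2__)
#include <emmintrin.h>	/* _mm_stream_si32(), _mm_sfence() */
#endif

/* Write the payload with a weakly ordered store, fence, then raise the flag. */
static void publish_streamed(int *payload, volatile int *ready, int value)
{
#if defined(__SSE2__)
	_mm_stream_si32(payload, value);	/* non-temporal, weakly ordered */
	_mm_sfence();				/* order it before the flag store */
#else
	*payload = value;
	__sync_synchronize();			/* fallback for builds without SSE2 */
#endif
	*ready = 1;
}
```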

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: virtualization <virtualization@lists.linux-foundation.org>
Link: http://lkml.kernel.org/r/1453921746-16178-3-git-send-email-mst@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/barrier.h | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index a65bdb1..a291745 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -11,10 +11,6 @@
  */
 
 #ifdef CONFIG_X86_32
-/*
- * Some non-Intel clones support out of order store. wmb() ceases to be a
- * nop for these.
- */
 #define mb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "mfence", \
 				      X86_FEATURE_XMM2) ::: "memory", "cc")
 #define rmb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "lfence", \


* [tip:locking/core] locking/x86: Tweak the comment about use of wmb() for IO
  2016-01-28 17:02 ` [PATCH v5 3/5] x86: tweak the comment about use of wmb for IO Michael S. Tsirkin
@ 2016-01-29 11:32     ` tip-bot for Michael S. Tsirkin
  0 siblings, 0 replies; 21+ messages in thread
From: tip-bot for Michael S. Tsirkin @ 2016-01-29 11:32 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dave, dvlasenk, akpm, brgerst, dbueso, peterz, andreyknvl, mst,
	linux-kernel, virtualization, luto, bp, bp, luto, hpa, tglx,
	paulmck, torvalds, mingo

Commit-ID:  57d9b1b43433a6ba7267c80b87d8e8f6e86edceb
Gitweb:     http://git.kernel.org/tip/57d9b1b43433a6ba7267c80b87d8e8f6e86edceb
Author:     Michael S. Tsirkin <mst@redhat.com>
AuthorDate: Thu, 28 Jan 2016 19:02:44 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 29 Jan 2016 09:40:10 +0100

locking/x86: Tweak the comment about use of wmb() for IO

On x86, we *do* still use the non-NOP rmb()/wmb() for IO barriers,
but even that is generally questionable.

Leave them around as historical unless somebody can point to a
case where they care about the performance, but tweak the
comment so people don't think they are strictly required in all
cases.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: virtualization <virtualization@lists.linux-foundation.org>
Link: http://lkml.kernel.org/r/1453921746-16178-4-git-send-email-mst@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/barrier.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index a291745..bfb28ca 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -6,7 +6,7 @@
 
 /*
  * Force strict CPU ordering.
- * And yes, this is required on UP too when we're talking
+ * And yes, this might be required on UP too when we're talking
  * to devices.
  */


* [tip:locking/core] locking/x86: Use mb() around clflush()
  2016-01-28 17:02 ` [PATCH v5 4/5] x86: use mb() around clflush Michael S. Tsirkin
@ 2016-01-29 11:33     ` tip-bot for Michael S. Tsirkin
  2016-01-28 18:25   ` Peter Zijlstra
  2016-01-29 11:33     ` tip-bot for Michael S. Tsirkin
  2 siblings, 0 replies; 21+ messages in thread
From: tip-bot for Michael S. Tsirkin @ 2016-01-29 11:33 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: len.brown, dave, dvlasenk, akpm, mst, bitbucket, brgerst, dbueso,
	oleg, linux-kernel, virtualization, peterz, luto, bp, luto, hpa,
	tglx, paulmck, torvalds, mingo

Commit-ID:  ca59809ff6d572ae58fc6bedf7500f5a60fdbd64
Gitweb:     http://git.kernel.org/tip/ca59809ff6d572ae58fc6bedf7500f5a60fdbd64
Author:     Michael S. Tsirkin <mst@redhat.com>
AuthorDate: Thu, 28 Jan 2016 19:02:51 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 29 Jan 2016 09:40:10 +0100

locking/x86: Use mb() around clflush()

The following commit:

  f8e617f4582995f ("sched/idle/x86: Optimize unnecessary mwait_idle() resched IPIs")

adds memory barriers around clflush(), but this seems wrong for UP, where
smp_mb() reduces to barrier(), which has no effect on clflush().  We really
want MFENCE, so switch to mb() instead.
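
A minimal user-space sketch of the fenced flush, assuming SSE2 intrinsics
from <emmintrin.h>; flush_line_fenced() is a hypothetical name, and this is
not the mwait_idle() code itself. A compiler-only barrier at the two fence
sites would emit no instruction at all, which is the UP problem described
above.

```c
#include <assert.h>
#if defined(__SSE2__)
#include <emmintrin.h>	/* _mm_mfence(), _mm_clflush() */
#endif

/* Flush the cache line holding *p, with a real fence on both sides. */
static int flush_line_fenced(int *p)
{
#if defined(__SSE2__)
	_mm_mfence();		/* order earlier accesses before the flush */
	_mm_clflush(p);
	_mm_mfence();		/* order the flush before later accesses */
#else
	__sync_synchronize();	/* fallback for builds without SSE2 */
#endif
	return *p;		/* the flush must not change the stored value */
}

/* Minimal self-check: the value survives the flush. */
static void flush_selfcheck(void)
{
	int word = 5;

	assert(flush_line_fenced(&word) == 5);
}
```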

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: virtualization <virtualization@lists.linux-foundation.org>
Link: http://lkml.kernel.org/r/1453921746-16178-5-git-send-email-mst@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/process.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 9f7c21c..9decee2 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -418,9 +418,9 @@ static void mwait_idle(void)
 	if (!current_set_polling_and_test()) {
 		trace_cpu_idle_rcuidle(1, smp_processor_id());
 		if (this_cpu_has(X86_BUG_CLFLUSH_MONITOR)) {
-			smp_mb(); /* quirk */
+			mb(); /* quirk */
 			clflush((void *)&current_thread_info()->flags);
-			smp_mb(); /* quirk */
+			mb(); /* quirk */
 		}
 
 		__monitor((void *)&current_thread_info()->flags, 0, 0);


end of thread, other threads:[~2016-01-29 11:34 UTC | newest]

Thread overview: 21+ messages
2016-01-28 17:02 [PATCH v5 0/5] x86: faster smp_mb()+documentation tweaks Michael S. Tsirkin
2016-01-28 17:02 ` [PATCH v5 1/5] x86: add cc clobber for addl Michael S. Tsirkin
2016-01-29 11:32   ` [tip:locking/core] locking/x86: Add cc clobber for ADDL tip-bot for Michael S. Tsirkin
2016-01-29 11:32     ` tip-bot for Michael S. Tsirkin
2016-01-28 17:02 ` [PATCH v5 1/5] x86: add cc clobber for addl Michael S. Tsirkin
2016-01-28 17:02 ` [PATCH v5 2/5] x86: drop a comment left over from X86_OOSTORE Michael S. Tsirkin
2016-01-28 17:02 ` Michael S. Tsirkin
2016-01-29 11:32   ` [tip:locking/core] locking/x86: Drop " tip-bot for Michael S. Tsirkin
2016-01-29 11:32     ` tip-bot for Michael S. Tsirkin
2016-01-28 17:02 ` [PATCH v5 3/5] x86: tweak the comment about use of wmb for IO Michael S. Tsirkin
2016-01-29 11:32   ` [tip:locking/core] locking/x86: Tweak the comment about use of wmb() " tip-bot for Michael S. Tsirkin
2016-01-29 11:32     ` tip-bot for Michael S. Tsirkin
2016-01-28 17:02 ` [PATCH v5 3/5] x86: tweak the comment about use of wmb " Michael S. Tsirkin
2016-01-28 17:02 ` [PATCH v5 4/5] x86: use mb() around clflush Michael S. Tsirkin
2016-01-28 18:25   ` Peter Zijlstra
2016-01-28 18:25   ` Peter Zijlstra
2016-01-29 11:33   ` [tip:locking/core] locking/x86: Use mb() around clflush() tip-bot for Michael S. Tsirkin
2016-01-29 11:33     ` tip-bot for Michael S. Tsirkin
2016-01-28 17:02 ` [PATCH v5 4/5] x86: use mb() around clflush Michael S. Tsirkin
2016-01-28 17:02 ` [PATCH v5 5/5] x86: drop mfence in favor of lock+addl Michael S. Tsirkin
2016-01-28 17:02 ` Michael S. Tsirkin
