From: Nick Piggin <npiggin@suse.de>
To: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andi Kleen <ak@suse.de>
Subject: [rfc][patch 3/3] x86: optimise barriers
Date: Thu, 4 Oct 2007 07:23:48 +0200
Message-ID: <20071004052348.GC15131@wotan.suse.de>
In-Reply-To: <20071004052153.GA15131@wotan.suse.de>
According to the latest memory ordering specification documents from Intel and
AMD, both manufacturers are committed to in-order loads from cacheable memory
for the x86 architecture. Hence, smp_rmb() may be a simple compiler barrier,
i.e. barrier().
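To illustrate what relying on in-order loads looks like in practice, here is a
minimal userspace sketch (not kernel code) of the usual flag/payload handoff.
The names sketch_smp_rmb(), sketch_smp_wmb(), payload, ready and
run_message_passing() are made up for this example, and the non-x86 fallback
is an assumption added only so the sketch stays correct on other machines.

```c
#include <pthread.h>
#include <stddef.h>

/* Compiler-only barrier, the same idea as the kernel's barrier(). */
#define barrier() __asm__ __volatile__("" ::: "memory")

#if defined(__x86_64__) || defined(__i386__)
/* x86: rely on in-order cacheable loads and stores, as the patch argues. */
# define sketch_smp_rmb() barrier()
# define sketch_smp_wmb() barrier()
#else
/* Assumption: other architectures need real fences here. */
# define sketch_smp_rmb() __atomic_thread_fence(__ATOMIC_ACQUIRE)
# define sketch_smp_wmb() __atomic_thread_fence(__ATOMIC_RELEASE)
#endif

static volatile int payload;  /* data being published */
static volatile int ready;    /* flag stored after the data */

static void *producer(void *arg)
{
    (void)arg;
    payload = 42;
    sketch_smp_wmb();  /* keep the payload store ahead of the flag store */
    ready = 1;
    return NULL;
}

static void *consumer(void *arg)
{
    int *out = arg;

    while (!ready)
        ;              /* spin on the volatile flag */
    sketch_smp_rmb();  /* keep the flag load ahead of the payload load */
    *out = payload;
    return NULL;
}

/* Runs one producer/consumer pair; returns the value the consumer read,
 * or -1 if a thread could not be created. */
int run_message_passing(void)
{
    pthread_t p, c;
    int seen = 0;

    payload = 0;
    ready = 0;
    if (pthread_create(&p, NULL, producer, NULL) != 0)
        return -1;
    if (pthread_create(&c, NULL, consumer, &seen) != 0) {
        pthread_join(p, NULL);
        return -1;
    }
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

Build with -pthread and call run_message_passing(); on x86 both barriers
compile to no instructions at all, which is the point of the change.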
Also according to those documents, and according to existing practice in Linux
(e.g. spin_unlock doesn't enforce ordering), stores to cacheable memory are
visible in program order too. Special string stores are safe -- their
constituent stores may be out of order, but they must complete in order WRT
surrounding stores. Nontemporal stores to WB memory can go out of order, so
they should be fenced explicitly to make them appear in order WRT other stores.
Hence, smp_wmb() may be a simple compiler barrier as well.
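The nontemporal case above can be sketched in userspace as follows. This is
only an illustration under stated assumptions: fill_and_publish() and
buf_ready are hypothetical names, the SSE2 intrinsics stand in for whatever
nontemporal copy routine is actually in use, and the non-SSE2 branch is a
fallback added so the example builds elsewhere.

```c
#include <stddef.h>

#if defined(__SSE2__)
# include <emmintrin.h>  /* _mm_stream_si32; also provides _mm_sfence */
#endif

volatile int buf_ready;  /* flag a reader would poll (hypothetical) */

/* Fill buf[0..n) with value, then publish it by setting buf_ready. */
void fill_and_publish(int *buf, size_t n, int value)
{
    size_t i;

#if defined(__SSE2__)
    for (i = 0; i < n; i++)
        _mm_stream_si32(&buf[i], value);  /* nontemporal (movnti) store */
    /* Nontemporal stores may pass later ordinary stores, so fence them
     * explicitly before the flag store; a compiler barrier is not enough. */
    _mm_sfence();
#else
    for (i = 0; i < n; i++)
        buf[i] = value;                   /* ordinary stores */
    __atomic_thread_fence(__ATOMIC_RELEASE);
#endif
    buf_ready = 1;
}
```

With the sfence placed inside the routine that issues the nontemporal stores,
callers can keep using a barrier()-only smp_wmb() for everything else.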
http://developer.intel.com/products/processor/manuals/318147.pdf
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24593.pdf
In userspace microbenchmarks on a Core 2 system, fence instructions cost
anywhere from around 15 to 50 cycles, which may not be totally insignificant
in performance-critical paths (code size will go down too).
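A rough way to reproduce this kind of measurement is sketched below. It is not
the benchmark used for the numbers above: it reports wall-clock nanoseconds
per loop iteration rather than cycles, and ns_per_iteration(), enum fence_kind
and the fence macros are names invented for the example. Compare the FENCE_NONE
result against the others to estimate the cost of the fence itself.

```c
#ifndef _POSIX_C_SOURCE
# define _POSIX_C_SOURCE 200809L  /* for clock_gettime() */
#endif
#include <stdint.h>
#include <time.h>

#define barrier() __asm__ __volatile__("" ::: "memory")

#if defined(__x86_64__) || defined(__i386__)
# define read_fence()  __asm__ __volatile__("lfence" ::: "memory")
# define write_fence() __asm__ __volatile__("sfence" ::: "memory")
#else
/* Assumption: generic fences so the sketch builds on other machines. */
# define read_fence()  __atomic_thread_fence(__ATOMIC_ACQUIRE)
# define write_fence() __atomic_thread_fence(__ATOMIC_RELEASE)
#endif

enum fence_kind { FENCE_NONE, FENCE_READ, FENCE_WRITE };

static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Average nanoseconds per iteration of a small loop that executes the
 * chosen fence once per pass. Loop overhead is included in every variant. */
double ns_per_iteration(enum fence_kind kind, unsigned long iters)
{
    volatile unsigned long sink = 0;
    uint64_t start, end;
    unsigned long i;

    start = now_ns();
    for (i = 0; i < iters; i++) {
        sink += i;  /* a little memory traffic around the fence */
        if (kind == FENCE_READ) {
            read_fence();
        } else if (kind == FENCE_WRITE) {
            write_fence();
        } else {
            barrier();  /* compiler barrier only, emits no instruction */
        }
    }
    end = now_ns();
    return (double)(end - start) / (double)iters;
}
```

Results depend heavily on the CPU and on surrounding memory traffic, so treat
any single number as indicative only.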
However, the primary motivation for this is to have the canonical barrier
implementation for the x86 architecture.

smp_rmb() on buggy Pentium Pros remains a locked op, which is apparently
required.
Signed-off-by: Nick Piggin <npiggin@suse.de>
---
Index: linux-2.6/include/asm-i386/system.h
===================================================================
--- linux-2.6.orig/include/asm-i386/system.h
+++ linux-2.6/include/asm-i386/system.h
@@ -274,7 +274,11 @@ static inline unsigned long get_limit(un
#ifdef CONFIG_SMP
#define smp_mb() mb()
-#define smp_rmb() rmb()
+#ifdef CONFIG_X86_PPRO_FENCE
+# define smp_rmb() rmb()
+#else
+# define smp_rmb() barrier()
+#endif
#ifdef CONFIG_X86_OOSTORE
# define smp_wmb() wmb()
#else
Index: linux-2.6/include/asm-x86_64/system.h
===================================================================
--- linux-2.6.orig/include/asm-x86_64/system.h
+++ linux-2.6/include/asm-x86_64/system.h
@@ -141,8 +141,8 @@ static inline void write_cr8(unsigned lo
#ifdef CONFIG_SMP
#define smp_mb() mb()
-#define smp_rmb() rmb()
-#define smp_wmb() wmb()
+#define smp_rmb() barrier()
+#define smp_wmb() barrier()
#define smp_read_barrier_depends() do {} while(0)
#else
#define smp_mb() barrier()