linux-arch.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	Ingo Molnar <mingo@elte.hu>,
	Andrew Morton <akpm@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Waiman Long <waiman.long@hp.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Alex Shi <alex.shi@linaro.org>, Andi Kleen <andi@firstfloor.org>,
	Michel Lespinasse <walken@google.com>,
	Davidlohr Bueso <davidlohr.bueso@hp.com>,
	Matthew R Wilcox <matthew.r.wilcox@intel.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Rik van Riel <riel@redhat.com>,
	Peter Hurley <peter@hurleysoftware.com>,
	Raghavendra
Subject: Re: [PATCH v6 4/5] MCS Lock: Barrier corrections
Date: Thu, 21 Nov 2013 14:18:59 -0800	[thread overview]
Message-ID: <20131121221859.GH4138@linux.vnet.ibm.com> (raw)
In-Reply-To: <20131121215249.GZ16796@laptop.programming.kicks-ass.net>

On Thu, Nov 21, 2013 at 10:52:49PM +0100, Peter Zijlstra wrote:
> On Thu, Nov 21, 2013 at 09:25:58AM -0800, Paul E. McKenney wrote:
> > I am still thinking not, at least for x86, given Section 8.2.2 of
> > "Intel® 64 and IA-32 Architectures Developer's Manual: Vol. 3A"
> > dated March 2013 from:
> >
> > http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-vol-3a-part-1-manual.html
> >
> > Also from Sewell et al. "x86-TSO: A Rigorous and Usable Programmer's
> > Model for x86 Multiprocessors" in 2010 CACM.
> 
> Should be this one:
> 
>   http://www.cl.cam.ac.uk/~pes20/weakmemory/cacm.pdf
> 
> And the rules referenced below are on page 5; left-hand column.

Yep, that is the one!  (I was relying on my ACM Digital Library
subscription.)

> > Let's apply the Intel manual to the earlier example:
> >
> >	CPU 0		CPU 1			CPU 2
> >	-----		-----			-----
> >	x = 1;		r1 = SLA(lock);		y = 1;
> >	SSR(lock, 1);	r2 = y;			smp_mb();
> >						r3 = x;
> >
> >	assert(!(r1 == 1 && r2 == 0 && r3 == 0));
> >
> > Let's try applying this to x86:
> >
> > o	Stores from a given processor are ordered, so everyone
> >	agrees that CPU 0's store to x happens before the store-release
> >	to lock.
> >
> > o	Reads from a given processor are ordered, so everyone agrees
> >	that CPU 1's load from lock precedes its load from y.
> >
> > o	Because r1 == 1, we know that CPU 0's store to lock happened
> >	before CPU 1's load from lock.
> >
> > o	Causality (AKA transitive visibility) means that everyone
> >	agrees that CPU 0's store to x happens before CPU 1's load
> >	from y.  (This is a crucial point, so it would be good to
> >	have confirmation/debunking from someone who understands
> >	the x86 architecture better than I do.)
> >
> > o	CPU 2's memory barrier prohibits CPU 2's store to y from
> >	being reordered with its load from x.
> >
> > o	Because r2 == 0, we know that CPU 1's load from y happened
> >	before CPU 2's store to y.
> >
> > o	At this point, it looks to me that (r1 == 1 && r2 == 0)
> >	implies r3 == 1.
> >
> > Sewell's model plays out as follows:
> >
> > o	Rule 2 never applies in this example because no processor
> >	is reading its own write.
> >
> > o	Rules 3 and 4 force CPU 0's writes to be seen in order.
> >
> > o	Rule 1 combined with the ordered-instruction nature of
> >	the model force CPU 1's reads to happen in order.
> >
> > o	Rule 4 means that if r1 == 1, CPU 0's write to x is
> >	globally visible before CPU 1 loads from y.
> >
> > o	The fact that r2 == 0 combined with rules 1, 3, and 4
> >	mean that CPU 1's load from y happens before CPU 2 makes
> >	its store to y visible.
> >
> > o	Rule 5 means that CPU 1 cannot execute its load from x
> >	until it has made its store to y globally visible.
> >
> > o	Therefore, when CPU 2 executes its load from x, CPU 0's
> >	store to x must be visible, ruling out r3 == 0, and
> >	preventing the assertion from firing.
> >
> > The other three orderings would play out similarly.  (These are read
> > before lock release and read after subsequent lock acquisition, read
> > before lock release and write after subsequent lock acquisition, and
> > read before lock release and read after subsequent lock acquisition.)
> >
> > But before chasing those down, is the analysis above sound?
> 
> I _think_ so.. but its late. I'm also struggling to find where lwsync
> goes bad in this story, because as you say lwsync does all except flush
> the store buffer, which sounds like TSO.. but clearly is not quite the
> same.
> 
> For one TSO has multi-copy atomicity, whereas ARM/PPC do not have this.

At least PPC lwsync does not have multi-copy atomicity.  The heavier sync
instruction does.  Last I checked, ARM did not have a direct replacment
for lwsync, though it does have something sort of like eieio.

But yes, we were trying to use lwsync for both smp_load_acquire() and
smp_store_release(), which does not provide exactly the same guarantees
that TSO does.  Though it does come close in many cases.

> The explanation of lwsync given in 3.3 of 'A Tutorial Introduction to
> the ARM and POWER Relaxed Memory Models'
> 
>   http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf
> 
> Leaves me slightly puzzled as to the exact differences between the 2 WW
> variants.

The WW at the top of the page is discussing the ARM "dmb" instruction
and the PPC "sync" instruction, both of which are full barriers, and
both of which can therefore be thought of as flushing the store buffer.

In contrast, the WW at the bottom of the page is discussing the PPC
lwsync instruction, which most definitely is not guaranteed to flush
the write buffer.

> Anyway, hopefully a little sleep will cure some of my confusion,
> otherwise I'll try and confuse you more tomorrow ;-)

Fair enough!  ;-)

							Thanx, Paul

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2013-11-21 22:18 UTC|newest]

Thread overview: 123+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <cover.1384885312.git.tim.c.chen@linux.intel.com>
2013-11-20  1:37 ` [PATCH v6 0/5] MCS Lock: MCS lock code cleanup and optimizations Tim Chen
2013-11-20  1:37   ` Tim Chen
2013-11-20 10:19   ` Will Deacon
2013-11-20 12:50     ` Paul E. McKenney
2013-11-20 17:00       ` Will Deacon
2013-11-20 17:14         ` Paul E. McKenney
2013-11-20 17:00     ` Tim Chen
2013-11-20 17:16       ` Paul E. McKenney
2013-11-20  1:37 ` [PATCH v6 1/5] MCS Lock: Restructure the MCS lock defines and locking code into its own file Tim Chen
2013-11-20  1:37   ` Tim Chen
2013-11-20  1:37 ` [PATCH v6 2/5] MCS Lock: optimizations and extra comments Tim Chen
2013-11-20  1:37   ` Tim Chen
2013-11-20  1:37 ` [PATCH v6 3/5] MCS Lock: Move mcs_lock/unlock function into its own file Tim Chen
2013-11-20  1:37   ` Tim Chen
2013-11-20  1:37 ` [PATCH v6 4/5] MCS Lock: Barrier corrections Tim Chen
2013-11-20  1:37   ` Tim Chen
2013-11-20 15:31   ` Paul E. McKenney
2013-11-20 15:31     ` Paul E. McKenney
2013-11-20 15:46     ` Will Deacon
2013-11-20 17:14       ` Paul E. McKenney
2013-11-20 18:43         ` Tim Chen
2013-11-20 19:06           ` Paul E. McKenney
2013-11-20 20:36             ` Tim Chen
2013-11-20 21:44               ` Paul E. McKenney
2013-11-20 23:51                 ` Tim Chen
2013-11-21  4:53                   ` Paul E. McKenney
2013-11-21 10:17                     ` Will Deacon
2013-11-21 13:16                       ` Paul E. McKenney
2013-11-21 10:45                     ` Peter Zijlstra
2013-11-21 13:18                       ` Paul E. McKenney
2013-11-21 22:27                     ` Linus Torvalds
2013-11-21 22:52                       ` Paul E. McKenney
2013-11-22  0:09                         ` Linus Torvalds
2013-11-22  4:08                           ` Paul E. McKenney
2013-11-22  4:25                             ` Linus Torvalds
2013-11-22  6:23                               ` Paul E. McKenney
2013-11-22 15:16                                 ` Ingo Molnar
2013-11-22 18:49                                   ` Paul E. McKenney
2013-11-22 19:06                                     ` Linus Torvalds
2013-11-22 20:06                                       ` Paul E. McKenney
2013-11-22 20:09                                         ` Linus Torvalds
2013-11-22 20:37                                           ` Paul E. McKenney
2013-11-22 21:01                                             ` Linus Torvalds
2013-11-22 21:52                                               ` Paul E. McKenney
2013-11-22 22:19                                                 ` Linus Torvalds
2013-11-23  0:25                                                   ` Paul E. McKenney
2013-11-23  0:42                                                     ` Linus Torvalds
2013-11-23  1:36                                                       ` Paul E. McKenney
2013-11-23  2:11                                                         ` Linus Torvalds
2013-11-23  4:05                                                           ` Paul E. McKenney
2013-11-23 11:24                                                             ` Ingo Molnar
2013-11-23 17:06                                                               ` Paul E. McKenney
2013-11-26 12:02                                                                 ` Ingo Molnar
2013-11-26 19:28                                                                   ` Paul E. McKenney
2013-11-23 20:21                                                         ` Linus Torvalds
2013-11-23 20:39                                                           ` Linus Torvalds
2013-11-25 12:09                                                             ` Peter Zijlstra
2013-11-25 17:18                                                               ` Will Deacon
2013-11-25 17:56                                                                 ` Paul E. McKenney
2013-11-25 17:54                                                             ` Paul E. McKenney
2013-11-23 21:29                                                           ` Peter Zijlstra
2013-11-23 22:24                                                             ` Linus Torvalds
2013-11-25 17:53                                                           ` Paul E. McKenney
2013-11-25 18:21                                                             ` Peter Zijlstra
2013-11-21 11:03         ` Peter Zijlstra
2013-11-21 12:56           ` Peter Zijlstra
2013-11-21 13:20             ` Paul E. McKenney
2013-11-21 17:25               ` Paul E. McKenney
2013-11-21 21:52                 ` Peter Zijlstra
2013-11-21 22:18                   ` Paul E. McKenney [this message]
2013-11-22 15:58                     ` Peter Zijlstra
2013-11-22 18:26                       ` Paul E. McKenney
2013-11-22 18:51                         ` Peter Zijlstra
2013-11-22 18:59                           ` Paul E. McKenney
2013-11-25 17:35                           ` Peter Zijlstra
2013-11-25 18:02                             ` Paul E. McKenney
2013-11-25 18:24                               ` Peter Zijlstra
2013-11-25 18:34                                 ` Tim Chen
2013-11-25 18:27                               ` Peter Zijlstra
2013-11-25 23:52                                 ` Paul E. McKenney
2013-11-26  9:59                                   ` Peter Zijlstra
2013-11-26 17:11                                     ` Paul E. McKenney
2013-11-26 17:18                                       ` Peter Zijlstra
2013-11-26 19:00                                     ` Linus Torvalds
2013-11-26 19:20                                       ` Paul E. McKenney
2013-11-26 19:32                                         ` Linus Torvalds
2013-11-26 22:51                                           ` Paul E. McKenney
2013-11-26 23:58                                             ` Linus Torvalds
2013-11-27  0:21                                               ` Thomas Gleixner
2013-11-27  0:39                                               ` Paul E. McKenney
2013-11-27  1:05                                                 ` Linus Torvalds
2013-11-27  1:31                                                   ` Paul E. McKenney
2013-11-27 10:16                                             ` Will Deacon
2013-11-27 17:11                                               ` Paul E. McKenney
2013-11-28 11:40                                                 ` Will Deacon
2013-11-28 17:38                                                   ` Paul E. McKenney
2013-11-28 18:03                                                     ` Will Deacon
2013-11-28 18:27                                                       ` Paul E. McKenney
2013-11-28 18:53                                                         ` Will Deacon
2013-11-28 19:50                                                           ` Paul E. McKenney
2013-11-29 16:17                                                             ` Will Deacon
2013-11-29 16:44                                                               ` Linus Torvalds
2013-11-29 18:18                                                                 ` Will Deacon
2013-11-30 17:38                                                                 ` Paul E. McKenney
2013-11-26 19:21                                       ` Peter Zijlstra
2013-11-27 16:58                                         ` Oleg Nesterov
2013-11-26 23:08                                       ` Benjamin Herrenschmidt
2013-11-25 23:55                               ` H. Peter Anvin
2013-11-26  3:16                                 ` Paul E. McKenney
2013-11-27  0:46                                   ` H. Peter Anvin
2013-11-27  1:07                                     ` Linus Torvalds
2013-11-27  1:27                                     ` Paul E. McKenney
2013-11-27  2:59                                       ` H. Peter Anvin
2013-11-25 18:52                             ` H. Peter Anvin
2013-11-25 22:58                               ` Tim Chen
2013-11-25 23:28                                 ` H. Peter Anvin
2013-11-25 23:51                                   ` Paul E. McKenney
2013-11-25 23:36                               ` Paul E. McKenney
2013-12-04 21:26                 ` Andi Kleen
2013-12-04 22:07                   ` Paul E. McKenney
2013-11-21 13:19           ` Paul E. McKenney
2013-11-20  1:37 ` [PATCH v6 5/5] MCS Lock: Allows for architecture specific mcs lock and unlock Tim Chen
2013-11-20  1:37   ` Tim Chen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131121221859.GH4138@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex.shi@linaro.org \
    --cc=andi@firstfloor.org \
    --cc=dave.hansen@intel.com \
    --cc=davidlohr.bueso@hp.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=matthew.r.wilcox@intel.com \
    --cc=mingo@elte.hu \
    --cc=peter@hurleysoftware.com \
    --cc=peterz@infradead.org \
    --cc=riel@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=tim.c.chen@linux.intel.com \
    --cc=torvalds@linux-foundation.org \
    --cc=waiman.long@hp.com \
    --cc=walken@google.com \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).