Re: call_function_many: fix list delete vs add race

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Milton Miller <miltonm@bga.com>,
	akpm@linux-foundation.org, Anton Blanchard <anton@samba.org>,
	xiaoguangrong@cn.fujitsu.com, mingo@elte.hu, jaxboe@fusionio.com,
	npiggin@gmail.com, rusty@rustcorp.com.au,
	linux-kernel@vger.kernel.org
Subject: Re: call_function_many: fix list delete vs add race
Date: Mon, 31 Jan 2011 18:18:31 -0800
Message-ID: <20110201021831.GB2158@linux.vnet.ibm.com>
In-Reply-To: <AANLkTikEo4Bbqn27nCO_xjwEuy9rkoqBjoOb_HeGw18D@mail.gmail.com>

On Tue, Feb 01, 2011 at 11:39:13AM +1000, Linus Torvalds wrote:
> On Tue, Feb 1, 2011 at 10:22 AM, Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
> > On Mon, 2011-01-31 at 22:17 +0100, Peter Zijlstra wrote:
> >> That's wrong:
> >>
> >>  ->foo =
> >>  LOCK
> >>  UNLOCK
> >>  ->bar =
> >>
> >> can be re-ordered as:
> >>
> >>  LOCK
> >>  ->bar =
> >>  ->foo =
> >>  UNLOCK
> >
> > Can it ? I thought UNLOCK had a write barrier semantic ?
> 
> The re-ordering of ->foo and UNLOCK that Peter claims is definitely
> possible, and the unlock is not guaranteed to be a write barrier. The
> unlock just guarantees "release" consistency, which means that no
> previous reads or writes will be migrated down from within the locked
> region, but it is very much a one-way barrier, so a subsequent write -
> or more commonly a read - could easily migrate up past the unlock.
> 
> (Same for lock, except the permeability is obviously the other way
> around - "acquire" consistency)
> 
> > So yes, ->bar = can leak into the lock, as can ->foo =, but they can't
> > be re-ordered vs. each other because the implied barrier will keep ->foo
> > = in the same "domain" as the unlock itself.
> >
> > Or do other archs do something really nasty here that don't provide this
> > guarantee ?
> 
> I think we actually allow the accesses to ->bar and ->foo to be
> re-ordered wrt each other, exactly because they can *both* get
> re-ordered first into the locked region, and then after that they can
> get re-ordered wrt each other (because there is no other memory
> barrier).
> 
> So a "unlock+lock" is guaranteed to be equivalent to a full memory
> barrier (because an operation before the unlock cannot pass the
> unlock, and an access after the lock cannot percolate up before it).
> But the regular "lock+unlock" sequence is not, exactly because
> accesses outside of it are allowed to first leak in, and then not have
> ordering constraints within the locked region.
> 
> That said, we may have some confusion there, and I would *STRONGLY*
> suggest that architectures should have stronger lock consistency
> guarantees than the theoretical ones. Especially since x86 has such
> strong rules, and locking is a full memory barrier. Anybody with very
> weak ordering is likely to just hit more bugs.
> 
> And so while I'm not sure it's ever been documented, I do think it is
> likely a good idea to just make sure that "lock+unlock" is a full
> memory barrier, the same way "unlock+lock" is. I think it's
> practically true anyway on all architectures (with the possible
> exception of ia64, which I think actually implements real
> acquire/release semantics)

Documentation/memory-barriers.txt does specify this, and it says that a
LOCK followed by an UNLOCK is not a full barrier:

	Therefore, from (1), (2) and (4) an UNLOCK followed by an
	unconditional LOCK is equivalent to a full barrier, but a LOCK
	followed by an UNLOCK is not.
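
To make the rule concrete, here is a minimal userspace sketch. It uses C11
atomics rather than the kernel's own spinlock primitives, so it only
illustrates the one-way acquire/release behaviour discussed above and is not
the kernel implementation. The names struct shared, demo_lock,
demo_spin_lock(), demo_spin_unlock(), store_weak() and store_ordered() are
all hypothetical; foo and bar mirror the ->foo and ->bar stores in the
quoted example.

```c
#include <stdatomic.h>

/* Hypothetical shared data; the fields mirror ->foo and ->bar above. */
struct shared {
	atomic_int foo;
	atomic_int bar;
};

/* Illustrative spinlock that provides acquire/release ordering only. */
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;

static void demo_spin_lock(void)
{
	/* LOCK: acquire. Later accesses cannot move up before it. */
	while (atomic_flag_test_and_set_explicit(&demo_lock,
						 memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void demo_spin_unlock(void)
{
	/* UNLOCK: release. Earlier accesses cannot move down past it. */
	atomic_flag_clear_explicit(&demo_lock, memory_order_release);
}

/*
 * LOCK followed by UNLOCK: not a full barrier.  The store to foo may
 * sink into the critical section and the store to bar may rise into
 * it; once both are inside, nothing orders them against each other,
 * so another CPU may observe bar == 1 while foo is still 0.
 */
static void store_weak(struct shared *s)
{
	atomic_store_explicit(&s->foo, 1, memory_order_relaxed);
	demo_spin_lock();
	demo_spin_unlock();
	atomic_store_explicit(&s->bar, 1, memory_order_relaxed);
}

/*
 * If foo must be ordered before bar, this sketch uses an explicit
 * full fence instead of relying on the empty critical section.
 */
static void store_ordered(struct shared *s)
{
	atomic_store_explicit(&s->foo, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&s->bar, 1, memory_order_relaxed);
}
```

Both functions leave foo and bar set to 1; they differ only in which
intermediate states another CPU is allowed to observe. The sketch uses an
explicit fence for the ordered case because the C11 memory model is not the
kernel's, so the kernel-level UNLOCK+LOCK statement quoted above should not
be assumed to carry over to C11 atomics.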

							Thanx, Paul

> (Practically speaking, there really aren't many reasons to allow
> writes to be re-ordered wrt lock/unlock operations, and there is no
> reason to ever move reads later, only earlier. Writes are _not_
> performance-sensitive the way reads are, so there is no real reason to
> really take advantage of the one-way permeability for them. It's reads
> that you really want to speculate and do early, not writes, and so
> allowing a read after an UNLOCK to percolate into the critical region
> is really the only relevant case from a performance perspective).
> 
>                     Linus
