public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Milton Miller <miltonm@bga.com>
Cc: akpm@linux-foundation.org, Anton Blanchard <anton@samba.org>,
	xiaoguangrong@cn.fujitsu.com, mingo@elte.hu, jaxboe@fusionio.com,
	npiggin@gmail.com, rusty@rustcorp.com.au,
	torvalds@linux-foundation.org, paulmck@linux.vnet.ibm.com,
	benh@kernel.crashing.org, linux-kernel@vger.kernel.org
Subject: Re: call_function_many: fix list delete vs add race
Date: Mon, 31 Jan 2011 21:39:56 +0100	[thread overview]
Message-ID: <1296506396.26581.76.camel@laptop> (raw)
In-Reply-To: <smp-cfm-list-comment@mdm.bga.com>

On Mon, 2011-01-31 at 14:26 -0600, Milton Miller wrote:
> On Mon, 31 Jan 2011 about 11:27:45 +0100, Peter Zijlstra wrote: 
> > On Fri, 2011-01-28 at 18:20 -0600, Milton Miller wrote:
> > > Peter pointed out there was nothing preventing the list_del_rcu in
> > > smp_call_function_interrupt from running before the list_add_rcu in
> > > smp_call_function_many.   Fix this by not setting refs until we have put
> > > the entry on the list.  We can use the lock acquire and release instead
> > > of a wmb.
> > > 
> > > Reported-by: Peter Zijlstra <peterz@infradead.org>
> > > Signed-off-by: Milton Miller <miltonm@bga.com>
> > > Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > ---
> > > 
> > > I tried to force this race with a udelay before the lock & list_add and
> > > by mixing all 64 online cpus with just 3 random cpus in the mask, but
> > > was unsuccessful.  Still, it seems to be a valid race, and the fix
> > > is a simple change to the current code.
> > 
> > Yes, I think this will fix it, I think simply putting that assignment
> > under the lock is sufficient, because then the list removal will
> > serialize again the list add. But placing it after the list add does
> > also seem sufficient.
> > 
> > Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > 
> 
> I was worried some architectures would allow a write before the spinlock
> to drop into the spinlock region,

That is indeed allowed to happen

>  in which case the data or function
> pointer could be found stale with the cpu mask bit set. 

But that is ok, right? the atomic_read(->refs) test will fail and we'll
continue.

>  The unlock
> must flush all prior writes and 

and reads

> therefore the new function and data
> will be seen before refs is set.


Which again should be just fine, given the interrupt does:

if (!cpumask_test_cpu())
	continue

rmb

if (!atomic_read())
	continue

and thus we'll be on our happy merry way. If we do however observe the
new ->refs value we have already acquired the lock on the sending end
and the spinlock before the list_del_rcu() will serialize against it
such that we'll always finish the list_add_rcu() before executing the
del.

Or am I not quite understanding things?


  reply	other threads:[~2011-01-31 20:39 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-01-12  4:07 [PATCH] smp_call_function_many SMP race Anton Blanchard
2011-01-17 18:17 ` Peter Zijlstra
2011-01-18 21:05   ` Milton Miller
2011-01-18 21:06     ` [PATCH 2/2] consolidate writes in smp_call_funtion_interrupt Milton Miller
2011-01-27 16:22       ` Peter Zijlstra
2011-01-27 21:59         ` Milton Miller
2011-01-29  0:20           ` call_function_many: fix list delete vs add race Milton Miller
2011-01-31  7:21             ` Mike Galbraith
2011-01-31 20:26               ` [PATCH] smp_call_function_many: handle concurrent clearing of mask Milton Miller
2011-02-01  3:15                 ` Mike Galbraith
2011-01-31 10:27             ` call_function_many: fix list delete vs add race Peter Zijlstra
2011-01-31 20:26               ` Milton Miller
2011-01-31 20:39                 ` Peter Zijlstra [this message]
2011-01-31 21:17             ` Peter Zijlstra
2011-01-31 21:36               ` Milton Miller
2011-02-01  0:22               ` Benjamin Herrenschmidt
2011-02-01  1:39                 ` Linus Torvalds
2011-02-01  2:18                   ` Paul E. McKenney
2011-02-01  2:43                     ` Linus Torvalds
2011-02-01  4:45                       ` Paul E. McKenney
2011-02-01  5:46                         ` Linus Torvalds
2011-02-01  6:18                           ` Benjamin Herrenschmidt
2011-02-01 14:13                           ` Paul E. McKenney
2011-02-01  6:16                       ` Benjamin Herrenschmidt
     [not found]             ` <ipi-list-reply@mdm.bga.com>
2011-02-01  7:12               ` [PATCH 1/3 v2] " Milton Miller
2011-02-01 22:00                 ` Paul E. McKenney
2011-02-01 22:00                   ` Milton Miller
2011-02-02  4:17                     ` Paul E. McKenney
2011-02-06 23:51                       ` Paul E. McKenney
2011-03-15 19:27                         ` [PATCH 0/4 v3] smp_call_function_many issues from review Milton Miller
2011-03-15 20:22                           ` Luck, Tony
2011-03-15 20:32                             ` Dimitri Sivanich
2011-03-15 20:39                           ` Peter Zijlstra
2011-03-16 17:55                           ` Linus Torvalds
2011-03-16 18:13                             ` Peter Zijlstra
2011-03-17  3:15                           ` Mike Galbraith
2011-02-07  8:12                       ` [PATCH 1/3 v2] call_function_many: fix list delete vs add race Mike Galbraith
2011-02-08 19:36                         ` Paul E. McKenney
2011-08-21  6:17                           ` Mike Galbraith
2011-02-02  6:22                     ` Mike Galbraith
2011-02-01  7:12               ` [PATCH 2/3 v2] smp_call_function_many: handle concurrent clearing of mask Milton Miller
2011-03-15 19:27               ` [PATCH 2/4 v3] call_function_many: add missing ordering Milton Miller
2011-03-16 12:06                 ` Paul E. McKenney
2011-03-15 19:27               ` [PATCH 1/4 v3] call_function_many: fix list delete vs add race Milton Miller
2011-03-15 19:27               ` [PATCH 4/4 v3] smp_call_function_interrupt: use typedef and %pf Milton Miller
2011-03-15 19:27               ` [PATCH 3/4 v3] smp_call_function_many: handle concurrent clearing of mask Milton Miller
2011-03-15 22:32                 ` Catalin Marinas
2011-03-16  7:52                 ` Jan Beulich
2011-01-18 21:07     ` [PATCH 1/2] smp_call_function_many SMP race Milton Miller
2011-01-20  0:41       ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1296506396.26581.76.camel@laptop \
    --to=peterz@infradead.org \
    --cc=akpm@linux-foundation.org \
    --cc=anton@samba.org \
    --cc=benh@kernel.crashing.org \
    --cc=jaxboe@fusionio.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miltonm@bga.com \
    --cc=mingo@elte.hu \
    --cc=npiggin@gmail.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=rusty@rustcorp.com.au \
    --cc=torvalds@linux-foundation.org \
    --cc=xiaoguangrong@cn.fujitsu.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox