All of lore.kernel.org
 help / color / mirror / Atom feed
From: Milton Miller <miltonm@bga.com>
To: Peter Zijlstra <peterz@infradead.org>,
	Anton Blanchard <anton@samba.org>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: xiaoguangrong@cn.fujitsu.com, mingo@elte.hu, jaxboe@fusionio.com,
	npiggin@gmail.com, rusty@rustcorp.com.au,
	akpm@linux-foundation.org, torvalds@linux-foundation.org,
	miltonm@bga.com, benh@kernel.crashing.org,
	linux-kernel@vger.kernel.org
Subject: RE: [PATCH] smp_call_function_many SMP race
Date: Tue, 18 Jan 2011 15:05:39 -0600	[thread overview]
Message-ID: <smp-call-function-peter-reply@mdm.bga.com> (raw)
In-Reply-To: <1295288253.30950.280.camel@laptop>

On Mon, 17 Jan 2011 around 19:17:33 +0100, Peter Zijlstra wrote:
> On Wed, 2011-01-12 at 15:07 +1100, Anton Blanchard wrote:
> 
> > I managed to forget all about this bug, probably because of how much it
> > makes my brain hurt.
> 
> Agreed.
> 
> 
> > I tried to fix it by ordering the read and the write of ->cpumask and
> > ->refs. In doing so I missed a critical case but Paul McKenney was able
> > to spot my bug thankfully :) To ensure we arent viewing previous
> > iterations the interrupt handler needs to read ->refs then ->cpumask
> > then ->refs _again_.
> 
> > ---
> > 
> > Index: linux-2.6/kernel/smp.c
> > =====================================================================
> > --- linux-2.6.orig/kernel/smp.c 2010-12-22 17:19:11.262835785 +1100
> > +++ linux-2.6/kernel/smp.c      2011-01-12 15:03:08.793324402 +1100
> > @@ -194,6 +194,31 @@ void generic_smp_call_function_interrupt
> >         list_for_each_entry_rcu(data, &call_function.queue, csd.list) {
> >                 int refs;
> >  
> > +               /*
> > +                * Since we walk the list without any locks, we might
> > +                * see an entry that was completed, removed from the
> > +                * list and is in the process of being reused.
> > +                *
> > +                * Just checking data->refs then data->cpumask is not good
> > +                * enough because we could see a non zero data->refs from a
> > +                * previous iteration. We need to check data->refs, then
> > +                * data->cpumask then data->refs again. Talk about
> > +                * complicated!
> > +                */
> > +
> > +               if (atomic_read(&data->refs) == 0)
> > +                       continue;
> > +
> 
> So here we might see the old ref
> 
> > +               smp_rmb();
> > +
> > +               if (!cpumask_test_cpu(cpu, data->cpumask))
> > +                       continue;
> 
> Here we might see the new cpumask
> 
> > +               smp_rmb();
> > +
> > +               if (atomic_read(&data->refs) == 0)
> > +                       continue;
> > +
> 
> But then still see a 0 ref, at which point we skip this entry and rely
> on the fact that arch_send_call_function_ipi_mask() will simply latch
> our IPI line and cause a back-to-back IPI such that we can process the
> data on the second go-round?
> 
> >                 if (!cpumask_test_and_clear_cpu(cpu, data->cpumask))
> >                         continue;
> 
> And finally, once we observe a valid ->refs, do we test the ->cpumask
> again so we cross with the store order (->cpumask first, then ->refs).
> 
> > @@ -458,6 +483,14 @@ void smp_call_function_many(const struct
> >         data->csd.info = info;
> >         cpumask_and(data->cpumask, mask, cpu_online_mask);
> >         cpumask_clear_cpu(this_cpu, data->cpumask);
> > +
> > +       /*
> > +        * To ensure the interrupt handler gets an up to date view
> > +        * we order the cpumask and refs writes and order the
> > +        * read of them in the interrupt handler.
> > +        */
> > +       smp_wmb();
> > +
> >         atomic_set(&data->refs, cpumask_weight(data->cpumask));
> >  
> >         raw_spin_lock_irqsave(&call_function.lock, flags); 
> 
> Read side:			Write side:
> 
> list_for_each_rcu()
>   !->refs, continue		  ->cpumask = 
> rmb				wmb
>   !->cpumask, continue		  ->refs = 
> rmb				wmb
>   !->refs, continue		  list_add_rcu()
> mb
>   !->cpumask, continue
> 
> 
> 
> Wouldn't something like:
> 
> list_for_each_rcu()
>   !->cpumask, continue		  ->refs =
> rmb				wmb
>   !->refs, continue		  ->cpumask =
> mb				wmb
>   !->cpumask, continue		  list_add_rcu()
> 

Yes, I believe it does.   Paul found the race case after I went home,
and I found the resulting patch with the extra calls posted the next
morning.   When I tried to rase the issue, Paul said he wanted me to
try it on hardware before he did the analysis again.   I finally got a
machine to do that yesterday afternoon.


> 
> Suffice? There we can observe the old ->cpumask, new ->refs and new
> ->cpumask in crossed order, so we filter out the old, and cross the new,
> and have one rmb and conditional less.
> 
> Or am I totally missing something here,.. like said, this stuff hurts
> brains.
> 
> 

In fact, if we assert that the called function is not allowed to enable
interrupts, then we can consolidate both writes to be after the function
is executed instead of doing one atomic read/write before (for the
cpumask bit) and a second one after (for the ref count).   I ran the 
timings on Anton's test case on a 4-node 256 thread power7 box.

What follows is the a two patch set.   I am keeping the second
seperate in case some wiere funtion is enabling interrupts in the
middle of this.

milton

  reply	other threads:[~2011-01-18 21:05 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-01-12  4:07 [PATCH] smp_call_function_many SMP race Anton Blanchard
2011-01-17 18:17 ` Peter Zijlstra
2011-01-18 21:05   ` Milton Miller [this message]
2011-01-18 21:06     ` [PATCH 2/2] consolidate writes in smp_call_funtion_interrupt Milton Miller
2011-01-27 16:22       ` Peter Zijlstra
2011-01-27 21:59         ` Milton Miller
2011-01-29  0:20           ` call_function_many: fix list delete vs add race Milton Miller
2011-01-31  7:21             ` Mike Galbraith
2011-01-31 20:26               ` [PATCH] smp_call_function_many: handle concurrent clearing of mask Milton Miller
2011-02-01  3:15                 ` Mike Galbraith
2011-01-31 10:27             ` call_function_many: fix list delete vs add race Peter Zijlstra
2011-01-31 20:26               ` Milton Miller
2011-01-31 20:39                 ` Peter Zijlstra
2011-01-31 21:17             ` Peter Zijlstra
2011-01-31 21:36               ` Milton Miller
2011-02-01  0:22               ` Benjamin Herrenschmidt
2011-02-01  1:39                 ` Linus Torvalds
2011-02-01  2:18                   ` Paul E. McKenney
2011-02-01  2:43                     ` Linus Torvalds
2011-02-01  4:45                       ` Paul E. McKenney
2011-02-01  5:46                         ` Linus Torvalds
2011-02-01  6:18                           ` Benjamin Herrenschmidt
2011-02-01 14:13                           ` Paul E. McKenney
2011-02-01  6:16                       ` Benjamin Herrenschmidt
     [not found]             ` <ipi-list-reply@mdm.bga.com>
2011-02-01  7:12               ` [PATCH 1/3 v2] " Milton Miller
2011-02-01 22:00                 ` Paul E. McKenney
2011-02-01 22:00                   ` Milton Miller
2011-02-02  4:17                     ` Paul E. McKenney
2011-02-06 23:51                       ` Paul E. McKenney
2011-03-15 19:27                         ` [PATCH 0/4 v3] smp_call_function_many issues from review Milton Miller
2011-03-15 20:22                           ` Luck, Tony
2011-03-15 20:32                             ` Dimitri Sivanich
2011-03-15 20:39                           ` Peter Zijlstra
2011-03-16 17:55                           ` Linus Torvalds
2011-03-16 18:13                             ` Peter Zijlstra
2011-03-17  3:15                           ` Mike Galbraith
2011-02-07  8:12                       ` [PATCH 1/3 v2] call_function_many: fix list delete vs add race Mike Galbraith
2011-02-08 19:36                         ` Paul E. McKenney
2011-08-21  6:17                           ` Mike Galbraith
2011-02-02  6:22                     ` Mike Galbraith
2011-02-01  7:12               ` [PATCH 2/3 v2] smp_call_function_many: handle concurrent clearing of mask Milton Miller
2011-03-15 19:27               ` [PATCH 1/4 v3] call_function_many: fix list delete vs add race Milton Miller
2011-03-15 19:27               ` [PATCH 2/4 v3] call_function_many: add missing ordering Milton Miller
2011-03-16 12:06                 ` Paul E. McKenney
2011-03-15 19:27               ` [PATCH 4/4 v3] smp_call_function_interrupt: use typedef and %pf Milton Miller
2011-03-15 19:27               ` [PATCH 3/4 v3] smp_call_function_many: handle concurrent clearing of mask Milton Miller
2011-03-15 22:32                 ` Catalin Marinas
2011-03-16  7:52                 ` Jan Beulich
2011-01-18 21:07     ` [PATCH 1/2] smp_call_function_many SMP race Milton Miller
2011-01-20  0:41       ` Andrew Morton
  -- strict thread matches above, loose matches on Subject: below --
2010-03-23 11:15 [PATCH] " Anton Blanchard
2010-03-23 12:26 ` Peter Zijlstra
2010-03-23 15:33   ` Paul E. McKenney
2010-03-23 15:49     ` Peter Zijlstra
2010-03-23 21:31   ` Anton Blanchard
2010-03-23 16:41 ` Paul E. McKenney
2010-05-03 14:24 ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=smp-call-function-peter-reply@mdm.bga.com \
    --to=miltonm@bga.com \
    --cc=akpm@linux-foundation.org \
    --cc=anton@samba.org \
    --cc=benh@kernel.crashing.org \
    --cc=jaxboe@fusionio.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=npiggin@gmail.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=rusty@rustcorp.com.au \
    --cc=torvalds@linux-foundation.org \
    --cc=xiaoguangrong@cn.fujitsu.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.