From: Milton Miller <miltonm@bga.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Anton Blanchard <anton@samba.org>,
	xiaoguangrong@cn.fujitsu.com, mingo@elte.hu, jaxboe@fusionio.com,
	npiggin@gmail.com, rusty@rustcorp.com.au,
	akpm@linux-foundation.org, torvalds@linux-foundation.org,
	paulmck@linux.vnet.ibm.com, benh@kernel.crashing.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] consolidate writes in smp_call_funtion_interrupt
Date: Thu, 27 Jan 2011 15:59:31 -0600
Message-ID: <smp-call-function-list-race@mdm.bga.com>
In-Reply-To: <1296145360.15234.234.camel@laptop>

On Thu, 27 Jan 2011 about 17:22:40 +0100, Peter Zijlstra wrote:
> On Tue, 2011-01-18 at 15:06 -0600, Milton Miller wrote:
> > Index: common/kernel/smp.c
> > =====================================================================
> > --- common.orig/kernel/smp.c	2011-01-17 20:16:18.000000000 -0600
> > +++ common/kernel/smp.c	2011-01-17 20:17:50.000000000 -0600
> > @@ -193,6 +193,7 @@ void generic_smp_call_function_interrupt
> >  	 */
> >  	list_for_each_entry_rcu(data, &call_function.queue, csd.list) {
> >  		int refs;
> > +		void (*func) (void *info);
> >  
> >  		/*
> >  		 * Since we walk the list without any locks, we might
> > @@ -212,24 +213,32 @@ void generic_smp_call_function_interrupt
> >  		if (atomic_read(&data->refs) == 0)
> >  			continue;
> >  
> > +		func = data->csd.func;			/* for later warn */
> >  		data->csd.func(data->csd.info);
> >  
> > +		/*
> > +		 * If the cpu mask is not still set then it enabled interrupts,
> > +		 * we took another smp interrupt, and executed the function
> > +		 * twice on this cpu.  In theory that copy decremented refs.
> > +		 */
> > +		if (!cpumask_test_and_clear_cpu(cpu, data->cpumask)) {
> > +			WARN(1, "%pS enabled interrupts and double executed\n",
> > +			     func);
> > +			continue;
> > +		}
> > +
> >  		refs = atomic_dec_return(&data->refs);
> >  		WARN_ON(refs < 0);
> >  
> >  		if (refs)
> >  			continue;
> >  
> > +		WARN_ON(!cpumask_empty(data->cpumask));
> > +
> > +		raw_spin_lock(&call_function.lock);
> > +		list_del_rcu(&data->csd.list);
> > +		raw_spin_unlock(&call_function.lock);
> > +
> >  		csd_unlock(&data->csd);
> >  	}
> >  
> 
> 
> So after this we have:
> 
> list_for_each_entry_rcu()
> rbd
>   !->cpumask			  ->cpumask = 
> rmb				wmb
>   !->refs			  ->refs =
>   ->func()			wmb
> mb				  list_add_rcu()
>   ->refs--
>   if (!->refs)
>     list_del_rcu()
> 
> So even if we see it as an old-ref, when we see a valid cpumask, valid
> ref, we execute the function clear our cpumask bit and decrement the ref
> and delete the entry, even though it might not yet be added?
> 
> 
> (old-ref)
> 				->cpumask =
>   if (!->cpumask)
> 				->refs = 
>   if (!->refs)
> 
>   ->func()
>   ->refs--
>   if (!->refs)
>     list_del_rcu()
> 				list_add_rcu()
> 
> Then what happens?

The item remains on the queue, unlocked.  If we are lucky, all entries
after it are serviced before we try to reuse it.  If not, we do the
list_add_rcu while the item is still on the list, and all entries
after it are lost from the service queue and will hang.
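
To make the lost-entries case concrete, here is a rough user-space
sketch, not the kernel code.  It assumes a plain circular doubly
linked list with head insertion, using the same pointer updates as
list_add() but without the RCU publication.  The names (struct node,
list_add_head, reachable, demo_lost_entries) are made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for struct list_head. */
struct node {
	struct node *next, *prev;
};

static void list_init(struct node *head)
{
	head->next = head->prev = head;
}

/* Insert entry right after head: same four stores as list_add(). */
static void list_add_head(struct node *entry, struct node *head)
{
	struct node *next = head->next;

	next->prev = entry;
	entry->next = next;
	entry->prev = head;
	head->next = entry;
}

/* Bounded forward walk, so a corrupted (cyclic) list cannot spin forever. */
static int reachable(struct node *head, struct node *target, int limit)
{
	struct node *pos = head->next;

	while (limit-- > 0 && pos != head) {
		if (pos == target)
			return 1;
		pos = pos->next;
	}
	return 0;
}

/* Returns 1 if the older entries are lost after the double add. */
static int demo_lost_entries(void)
{
	struct node head, a, b, stale;

	list_init(&head);
	list_add_head(&a, &head);	/* head -> a */
	list_add_head(&b, &head);	/* head -> b -> a */
	list_add_head(&stale, &head);	/* head -> stale -> b -> a */
	assert(reachable(&head, &a, 16) && reachable(&head, &b, 16));

	/* stale was never unlinked, yet it is added a second time. */
	list_add_head(&stale, &head);	/* stale.next now points at stale */

	return !reachable(&head, &a, 16) && !reachable(&head, &b, 16);
}
```

After the second add, the stale entry's next pointer refers to itself,
so a forward walk from the head never reaches a or b.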

This is a further scenario exposed by the lock removal.  Good spotting.

A simple fix would be to add to the queue before setting refs.  I can
try to work on that tomorrow; hopefully the test machine is available.
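
The effect of that reordering can be sketched with a small
single-threaded model.  It is an assumption-laden illustration, not
kernel code: it ignores memory barriers and real concurrency, models
list membership as a flag, and fires the interrupt side once at a
chosen point between the sender's three stores.  All names here
(struct call_data, handler, bad_deletes, enum order) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-cpu call data; not the kernel struct. */
struct call_data {
	unsigned int cpumask;	/* one bit per target cpu */
	int refs;		/* 0 means "not live", the handler skips it */
	bool on_queue;		/* models list membership */
	int bad_deletes;	/* deletes done while not on the queue */
};

/* Interrupt side, reaching d through a stale (old) reference. */
static void handler(struct call_data *d, int cpu)
{
	if (!(d->cpumask & (1u << cpu)) || d->refs == 0)
		return;			/* entry looks idle: skip it */

	/* ->func() would run here */
	d->cpumask &= ~(1u << cpu);
	if (--d->refs == 0) {
		if (!d->on_queue)
			d->bad_deletes++;	/* delete ran before the add */
		d->on_queue = false;
	}
}

enum order { REFS_THEN_ADD, ADD_THEN_REFS };

/* Run the three sender steps; fire the handler once after step fire_after. */
static int bad_deletes(enum order o, int fire_after)
{
	struct call_data d = { 0 };
	const int cpu = 1;

	assert(fire_after >= 0 && fire_after < 3);
	for (int step = 0; step < 3; step++) {
		if (step == 0) {
			d.cpumask = 1u << cpu;
		} else if (step == 1) {
			if (o == REFS_THEN_ADD)
				d.refs = 1;
			else
				d.on_queue = true;
		} else {
			if (o == REFS_THEN_ADD)
				d.on_queue = true;
			else
				d.refs = 1;
		}
		if (step == fire_after)
			handler(&d, cpu);
	}
	return d.bad_deletes;
}
```

In this model, bad_deletes(REFS_THEN_ADD, 1) reports one delete of an
entry that is not yet queued, while ADD_THEN_REFS reports none at any
of the three firing points, because the handler skips the entry until
refs becomes non-zero.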

The next question is whether we can (or should) use the lock and unlock
that put the entry on the queue in place of the wmb() between setting
the cpu mask and setting refs.  We would still need the rmb() in
_interrupt.

milton
