From: Peter Zijlstra <peterz@infradead.org>
To: Anton Blanchard <anton@samba.org>
Cc: xiaoguangrong@cn.fujitsu.com, mingo@elte.hu, jaxboe@fusionio.com,
npiggin@gmail.com, rusty@rustcorp.com.au,
akpm@linux-foundation.org, torvalds@linux-foundation.org,
paulmck@linux.vnet.ibm.com, miltonm@bga.com,
benh@kernel.crashing.org, linux-kernel@vger.kernel.org
Subject: RE: [PATCH] smp_call_function_many SMP race
Date: Mon, 17 Jan 2011 19:17:33 +0100 [thread overview]
Message-ID: <1295288253.30950.280.camel@laptop> (raw)
In-Reply-To: <20110112150740.77dde58c@kryten>
On Wed, 2011-01-12 at 15:07 +1100, Anton Blanchard wrote:
> I managed to forget all about this bug, probably because of how much it
> makes my brain hurt.
Agreed.
> I tried to fix it by ordering the read and the write of ->cpumask and
> ->refs. In doing so I missed a critical case but Paul McKenney was able
> to spot my bug thankfully :) To ensure we arent viewing previous
> iterations the interrupt handler needs to read ->refs then ->cpumask
> then ->refs _again_.
> ---
>
> Index: linux-2.6/kernel/smp.c
> ===================================================================
> --- linux-2.6.orig/kernel/smp.c 2010-12-22 17:19:11.262835785 +1100
> +++ linux-2.6/kernel/smp.c 2011-01-12 15:03:08.793324402 +1100
> @@ -194,6 +194,31 @@ void generic_smp_call_function_interrupt
> list_for_each_entry_rcu(data, &call_function.queue, csd.list) {
> int refs;
>
> + /*
> + * Since we walk the list without any locks, we might
> + * see an entry that was completed, removed from the
> + * list and is in the process of being reused.
> + *
> + * Just checking data->refs then data->cpumask is not good
> + * enough because we could see a non zero data->refs from a
> + * previous iteration. We need to check data->refs, then
> + * data->cpumask then data->refs again. Talk about
> + * complicated!
> + */
> +
> + if (atomic_read(&data->refs) == 0)
> + continue;
> +
So here we might see the old ref
> + smp_rmb();
> +
> + if (!cpumask_test_cpu(cpu, data->cpumask))
> + continue;
Here we might see the new cpumask
> + smp_rmb();
> +
> + if (atomic_read(&data->refs) == 0)
> + continue;
> +
But then still see a 0 ref, at which point we skip this entry and rely
on the fact that arch_send_call_function_ipi_mask() will simply latch
our IPI line and cause a back-to-back IPI such that we can process the
data on the second go-round?
> if (!cpumask_test_and_clear_cpu(cpu, data->cpumask))
> continue;
And finally, once we observe a valid ->refs, do we test the ->cpumask
again so we cross with the store order (->cpumask first, then ->refs).
> @@ -458,6 +483,14 @@ void smp_call_function_many(const struct
> data->csd.info = info;
> cpumask_and(data->cpumask, mask, cpu_online_mask);
> cpumask_clear_cpu(this_cpu, data->cpumask);
> +
> + /*
> + * To ensure the interrupt handler gets an up to date view
> + * we order the cpumask and refs writes and order the
> + * read of them in the interrupt handler.
> + */
> + smp_wmb();
> +
> atomic_set(&data->refs, cpumask_weight(data->cpumask));
>
> raw_spin_lock_irqsave(&call_function.lock, flags);
Read side: Write side:
list_for_each_rcu()
!->refs, continue ->cpumask =
rmb wmb
!->cpumask, continue ->refs =
rmb wmb
!->refs, continue list_add_rcu()
mb
!->cpumask, continue
Wouldn't something like:
list_for_each_rcu()
!->cpumask, continue ->refs =
rmb wmb
!->refs, continue ->cpumask =
mb wmb
!->cpumask, continue list_add_rcu()
Suffice? There we can observe the old ->cpumask, new ->refs and new
->cpumask in crossed order, so we filter out the old, and cross the new,
and have one rmb and conditional less.
Or am I totally missing something here,.. like said, this stuff hurts
brains.
next prev parent reply other threads:[~2011-01-17 18:17 UTC|newest]
Thread overview: 57+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-01-12 4:07 [PATCH] smp_call_function_many SMP race Anton Blanchard
2011-01-17 18:17 ` Peter Zijlstra [this message]
2011-01-18 21:05 ` Milton Miller
2011-01-18 21:06 ` [PATCH 2/2] consolidate writes in smp_call_funtion_interrupt Milton Miller
2011-01-27 16:22 ` Peter Zijlstra
2011-01-27 21:59 ` Milton Miller
2011-01-29 0:20 ` call_function_many: fix list delete vs add race Milton Miller
2011-01-31 7:21 ` Mike Galbraith
2011-01-31 20:26 ` [PATCH] smp_call_function_many: handle concurrent clearing of mask Milton Miller
2011-02-01 3:15 ` Mike Galbraith
2011-01-31 10:27 ` call_function_many: fix list delete vs add race Peter Zijlstra
2011-01-31 20:26 ` Milton Miller
2011-01-31 20:39 ` Peter Zijlstra
2011-01-31 21:17 ` Peter Zijlstra
2011-01-31 21:36 ` Milton Miller
2011-02-01 0:22 ` Benjamin Herrenschmidt
2011-02-01 1:39 ` Linus Torvalds
2011-02-01 2:18 ` Paul E. McKenney
2011-02-01 2:43 ` Linus Torvalds
2011-02-01 4:45 ` Paul E. McKenney
2011-02-01 5:46 ` Linus Torvalds
2011-02-01 6:18 ` Benjamin Herrenschmidt
2011-02-01 14:13 ` Paul E. McKenney
2011-02-01 6:16 ` Benjamin Herrenschmidt
[not found] ` <ipi-list-reply@mdm.bga.com>
2011-02-01 7:12 ` [PATCH 1/3 v2] " Milton Miller
2011-02-01 22:00 ` Paul E. McKenney
2011-02-01 22:00 ` Milton Miller
2011-02-02 4:17 ` Paul E. McKenney
2011-02-06 23:51 ` Paul E. McKenney
2011-03-15 19:27 ` [PATCH 0/4 v3] smp_call_function_many issues from review Milton Miller
2011-03-15 20:22 ` Luck, Tony
2011-03-15 20:32 ` Dimitri Sivanich
2011-03-15 20:39 ` Peter Zijlstra
2011-03-16 17:55 ` Linus Torvalds
2011-03-16 18:13 ` Peter Zijlstra
2011-03-17 3:15 ` Mike Galbraith
2011-02-07 8:12 ` [PATCH 1/3 v2] call_function_many: fix list delete vs add race Mike Galbraith
2011-02-08 19:36 ` Paul E. McKenney
2011-08-21 6:17 ` Mike Galbraith
2011-02-02 6:22 ` Mike Galbraith
2011-02-01 7:12 ` [PATCH 2/3 v2] smp_call_function_many: handle concurrent clearing of mask Milton Miller
2011-03-15 19:27 ` [PATCH 2/4 v3] call_function_many: add missing ordering Milton Miller
2011-03-16 12:06 ` Paul E. McKenney
2011-03-15 19:27 ` [PATCH 1/4 v3] call_function_many: fix list delete vs add race Milton Miller
2011-03-15 19:27 ` [PATCH 4/4 v3] smp_call_function_interrupt: use typedef and %pf Milton Miller
2011-03-15 19:27 ` [PATCH 3/4 v3] smp_call_function_many: handle concurrent clearing of mask Milton Miller
2011-03-15 22:32 ` Catalin Marinas
2011-03-16 7:52 ` Jan Beulich
2011-01-18 21:07 ` [PATCH 1/2] smp_call_function_many SMP race Milton Miller
2011-01-20 0:41 ` Andrew Morton
-- strict thread matches above, loose matches on Subject: below --
2010-03-23 11:15 [PATCH] " Anton Blanchard
2010-03-23 12:26 ` Peter Zijlstra
2010-03-23 15:33 ` Paul E. McKenney
2010-03-23 15:49 ` Peter Zijlstra
2010-03-23 21:31 ` Anton Blanchard
2010-03-23 16:41 ` Paul E. McKenney
2010-05-03 14:24 ` Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1295288253.30950.280.camel@laptop \
--to=peterz@infradead.org \
--cc=akpm@linux-foundation.org \
--cc=anton@samba.org \
--cc=benh@kernel.crashing.org \
--cc=jaxboe@fusionio.com \
--cc=linux-kernel@vger.kernel.org \
--cc=miltonm@bga.com \
--cc=mingo@elte.hu \
--cc=npiggin@gmail.com \
--cc=paulmck@linux.vnet.ibm.com \
--cc=rusty@rustcorp.com.au \
--cc=torvalds@linux-foundation.org \
--cc=xiaoguangrong@cn.fujitsu.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.