All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Milton Miller <miltonm@bga.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	akpm@linux-foundation.org, Anton Blanchard <anton@samba.org>,
	xiaoguangrong@cn.fujitsu.com, mingo@elte.hu, jaxboe@fusionio.com,
	npiggin@gmail.com, rusty@rustcorp.com.au, efault@gmx.de,
	Jan Beulich <JBeulich@novell.com>,
	Dimitri Sivanich <sivanich@sgi.com>,
	Tony Luck <tony.luck@intel.com>,
	torvalds@linux-foundation.org, benh@kernel.crashing.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/4 v3] call_function_many: add missing ordering
Date: Wed, 16 Mar 2011 05:06:06 -0700	[thread overview]
Message-ID: <20110316120606.GN2273@linux.vnet.ibm.com> (raw)
In-Reply-To: <smp-call-function-barriers-v3@mdm.bga.com>

On Tue, Mar 15, 2011 at 01:27:16PM -0600, Milton Miller wrote:
> Paul McKenney's review pointed out two problems with the barriers
> in the 2.6.38 update to the smp call function many code.
> 
> First, a barrier that would force the func and info members of data
> to be visible before their consumption in the interrupt handler
> was missing.  This can be solved by adding a smp_wmb between
> setting the func and info members and setting setting the cpumask;
> this will pair with the existing and required smp_rmb ordering
> the cpumask read before the read of refs.  This placement avoids
> the need a second smp_rmb in the interrupt handler which would
> be executed on each of the N cpus executing the call request.
> (I was thinking this barrier was present but was not).
> 
> Second, the previous write to refs (establishing the zero that
> we the interrupt handler was testing from all cpus) was performed
> by a third party cpu.  This would invoke transitivity which, as
> a recient or concurrent addition to memory-barriers.txt now
> explicitly states, would require a full smp_mb().
> 
> However, we know the cpumask will only be set by one cpu (the
> data owner) and any preivous iteration of the mask would have
> cleared by the reading cpu.  By redundantly writing refs to 0 on
> the owning cpu before the smp_wmb, the write to refs will follow
> the same path as the writes that set the cpumask, which in turn
> allows us to keep the barrier in the interrupt handler a smp_rmb
> instead of promoting it to a smp_mb (which will be be executed
> by N cpus for each of the possible M elements on the list).
> 
> I moved and expanded the comment about our (ab)use of the rcu list
> primitives for the concurrent walk earlier into this function.
> I considered moving the first two paragraphs to the queue list
> head and lock, but felt it would have been too disconected from
> the code.
> 
> Cc: Paul McKinney <paulmck@linux.vnet.ibm.com>
> Cc: stable (2.6.32 and later)
> Signed-off-by: Milton Miller <miltonm@bga.com>
> 
> Paul, please review this alternative to your patch at either of

Hello, Milton,

At first glance, this looks promising, but it will take me a few
more days to wrap my head fully around it.  Fair enough?

						Thanx, Paul

> http://marc.info/?l=linux-kernel&m=129662029916241&w=2
> https://patchwork.kernel.org/patch/525891/
> 
> In contrast to your patch, this proposal keeps the additional
> barriers in the (interrupt enabled) requester instead of the
> execution interrupt, where they would be executed by all N cpus
> in the system triggered concurrently for each of the N possible
> elements in the list.
> 
> Index: common/kernel/smp.c
> ===================================================================
> --- common.orig/kernel/smp.c	2011-03-15 05:21:41.000000000 -0500
> +++ common/kernel/smp.c	2011-03-15 05:22:26.000000000 -0500
> @@ -483,23 +483,42 @@ void smp_call_function_many(const struct
> 
>  	data = &__get_cpu_var(cfd_data);
>  	csd_lock(&data->csd);
> -	BUG_ON(atomic_read(&data->refs) || !cpumask_empty(data->cpumask));
> 
> -	data->csd.func = func;
> -	data->csd.info = info;
> -	cpumask_and(data->cpumask, mask, cpu_online_mask);
> -	cpumask_clear_cpu(this_cpu, data->cpumask);
> +	/* This BUG_ON verifies our reuse assertions and can be removed */
> +	BUG_ON(atomic_read(&data->refs) || !cpumask_empty(data->cpumask));
> 
>  	/*
> +	 * The global call function queue list add and delete are protected
> +	 * by a lock, but the list is traversed without any lock, relying
> +	 * on the rcu list add and delete to allow safe concurrent traversal.
>  	 * We reuse the call function data without waiting for any grace
>  	 * period after some other cpu removes it from the global queue.
> -	 * This means a cpu might find our data block as it is writen.
> -	 * The interrupt handler waits until it sees refs filled out
> -	 * while its cpu mask bit is set; here we may only clear our
> -	 * own cpu mask bit, and must wait to set refs until we are sure
> -	 * previous writes are complete and we have obtained the lock to
> -	 * add the element to the queue.
> +	 * This means a cpu might find our data block as it is being
> +	 * filled out.
> +	 *
> +	 * We hold off the interrupt handler on the other cpu by
> +	 * ordering our writes to the cpu mask vs our setting of the
> +	 * refs counter.  We assert only the cpu owning the data block
> +	 * will set a bit in cpumask, and each bit will only be cleared
> +	 * by the subject cpu.  Each cpu must first find its bit is
> +	 * set and then check that refs is set indicating the element is
> +	 * ready to be processed, otherwise it must skip the entry.
> +	 *
> +	 * On the previous iteration refs was set to 0 by another cpu.
> +	 * To avoid the use of transitivity, set the counter to 0 here
> +	 * so the wmb will pair with the rmb in the interrupt handler.
>  	 */
> +	atomic_set(&data->refs, 0);	/* convert 3rd to 1st party write */
> +
> +	data->csd.func = func;
> +	data->csd.info = info;
> +
> +	/* Ensure 0 refs is visible before mask.  Also orders func and info */
> +	smp_wmb();
> +
> +	/* We rely on the "and" being processed before the store */
> +	cpumask_and(data->cpumask, mask, cpu_online_mask);
> +	cpumask_clear_cpu(this_cpu, data->cpumask);
> 
>  	raw_spin_lock_irqsave(&call_function.lock, flags);
>  	/*
> @@ -509,8 +528,9 @@ void smp_call_function_many(const struct
>  	 */
>  	list_add_rcu(&data->csd.list, &call_function.queue);
>  	/*
> -	 * We rely on the wmb() in list_add_rcu to order the writes
> -	 * to func, data, and cpumask before this write to refs.
> +	 * We rely on the wmb() in list_add_rcu to complete our writes
> +	 * to the cpumask before this write to refs, which indicates
> +	 * data is on the list and is ready to be processed.
>  	 */
>  	atomic_set(&data->refs, cpumask_weight(data->cpumask));
>  	raw_spin_unlock_irqrestore(&call_function.lock, flags);
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

  reply	other threads:[~2011-03-16 12:06 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-01-12  4:07 [PATCH] smp_call_function_many SMP race Anton Blanchard
2011-01-17 18:17 ` Peter Zijlstra
2011-01-18 21:05   ` Milton Miller
2011-01-18 21:06     ` [PATCH 2/2] consolidate writes in smp_call_funtion_interrupt Milton Miller
2011-01-27 16:22       ` Peter Zijlstra
2011-01-27 21:59         ` Milton Miller
2011-01-29  0:20           ` call_function_many: fix list delete vs add race Milton Miller
2011-01-31  7:21             ` Mike Galbraith
2011-01-31 20:26               ` [PATCH] smp_call_function_many: handle concurrent clearing of mask Milton Miller
2011-02-01  3:15                 ` Mike Galbraith
2011-01-31 10:27             ` call_function_many: fix list delete vs add race Peter Zijlstra
2011-01-31 20:26               ` Milton Miller
2011-01-31 20:39                 ` Peter Zijlstra
2011-01-31 21:17             ` Peter Zijlstra
2011-01-31 21:36               ` Milton Miller
2011-02-01  0:22               ` Benjamin Herrenschmidt
2011-02-01  1:39                 ` Linus Torvalds
2011-02-01  2:18                   ` Paul E. McKenney
2011-02-01  2:43                     ` Linus Torvalds
2011-02-01  4:45                       ` Paul E. McKenney
2011-02-01  5:46                         ` Linus Torvalds
2011-02-01  6:18                           ` Benjamin Herrenschmidt
2011-02-01 14:13                           ` Paul E. McKenney
2011-02-01  6:16                       ` Benjamin Herrenschmidt
     [not found]             ` <ipi-list-reply@mdm.bga.com>
2011-02-01  7:12               ` [PATCH 1/3 v2] " Milton Miller
2011-02-01 22:00                 ` Paul E. McKenney
2011-02-01 22:00                   ` Milton Miller
2011-02-02  4:17                     ` Paul E. McKenney
2011-02-06 23:51                       ` Paul E. McKenney
2011-03-15 19:27                         ` [PATCH 0/4 v3] smp_call_function_many issues from review Milton Miller
2011-03-15 20:22                           ` Luck, Tony
2011-03-15 20:32                             ` Dimitri Sivanich
2011-03-15 20:39                           ` Peter Zijlstra
2011-03-16 17:55                           ` Linus Torvalds
2011-03-16 18:13                             ` Peter Zijlstra
2011-03-17  3:15                           ` Mike Galbraith
2011-02-07  8:12                       ` [PATCH 1/3 v2] call_function_many: fix list delete vs add race Mike Galbraith
2011-02-08 19:36                         ` Paul E. McKenney
2011-08-21  6:17                           ` Mike Galbraith
2011-02-02  6:22                     ` Mike Galbraith
2011-02-01  7:12               ` [PATCH 2/3 v2] smp_call_function_many: handle concurrent clearing of mask Milton Miller
2011-03-15 19:27               ` [PATCH 1/4 v3] call_function_many: fix list delete vs add race Milton Miller
2011-03-15 19:27               ` [PATCH 2/4 v3] call_function_many: add missing ordering Milton Miller
2011-03-16 12:06                 ` Paul E. McKenney [this message]
2011-03-15 19:27               ` [PATCH 3/4 v3] smp_call_function_many: handle concurrent clearing of mask Milton Miller
2011-03-15 22:32                 ` Catalin Marinas
2011-03-16  7:52                 ` Jan Beulich
2011-03-15 19:27               ` [PATCH 4/4 v3] smp_call_function_interrupt: use typedef and %pf Milton Miller
2011-01-18 21:07     ` [PATCH 1/2] smp_call_function_many SMP race Milton Miller
2011-01-20  0:41       ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110316120606.GN2273@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=JBeulich@novell.com \
    --cc=akpm@linux-foundation.org \
    --cc=anton@samba.org \
    --cc=benh@kernel.crashing.org \
    --cc=efault@gmx.de \
    --cc=jaxboe@fusionio.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miltonm@bga.com \
    --cc=mingo@elte.hu \
    --cc=npiggin@gmail.com \
    --cc=peterz@infradead.org \
    --cc=rusty@rustcorp.com.au \
    --cc=sivanich@sgi.com \
    --cc=tony.luck@intel.com \
    --cc=torvalds@linux-foundation.org \
    --cc=xiaoguangrong@cn.fujitsu.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.