linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Nick Piggin <npiggin@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Oleg Nesterov <oleg@redhat.com>,
	Jens Axboe <jens.axboe@oracle.com>,
	Suresh Siddha <suresh.b.siddha@intel.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Ingo Molnar <mingo@elte.hu>,
	Rusty Russell <rusty@rustcorp.com.au>,
	Steven Rostedt <rostedt@goodmis.org>,
	linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org
Subject: Re: Q: smp.c && barriers (Was: [PATCH 1/4] generic-smp: remove single ipi fallback for smp_call_function_many())
Date: Tue, 17 Feb 2009 07:51:47 -0800	[thread overview]
Message-ID: <20090217155147.GE6761@linux.vnet.ibm.com> (raw)
In-Reply-To: <20090217112657.GE26402@wotan.suse.de>

On Tue, Feb 17, 2009 at 12:26:57PM +0100, Nick Piggin wrote:
> cc linux-arch
> 
> On Tue, Feb 17, 2009 at 11:27:33AM +0100, Peter Zijlstra wrote:
> > > But hmm, why even bother with all this complexity? Why not just
> > > remove the outer loop completely? Do the lock and the list_replace_init
> > > unconditionally. It would turn tricky lockless code into simple locked
> > > code... we've already taken an interrupt anyway, so chances are pretty
> > > high that we have work here to do, right?
> > 
> > Well, that's a practical suggestion, and I agree.
> 
> How's this?
> 
> --
> Simplify the barriers in generic remote function call interrupt code.
> 
> Firstly, just unconditionally take the lock and check the list in the
> generic_call_function_single_interrupt IPI handler. As we've just taken
> an IPI here, the chances are fairly high that there will be work on the
> list for us, so do the locking unconditionally. This removes the tricky
> lockless list_empty check and dubious barriers. The change looks bigger
> than it is because it is just removing an outer loop.
> 
> Secondly, clarify architecture specific IPI locking rules. Generic code
> has no tools to impose any sane ordering on IPIs if they go outside
> normal cache coherency, ergo the arch code must make them appear to
> obey cache coherency as a "memory operation" to initiate an IPI, and
> a "memory operation" to receive one. This way at least they can be
> reasoned about in generic code, and smp_mb used to provide ordering. 
> 
> The combination of these two changes means that explict barriers can
> be taken out of queue handling for the single case -- shared data is
> explicitly locked, and ipi ordering must conform to that, so no
> barriers needed. An extra barrier is needed in the many handler, so
> as to ensure we load the list element after the IPI is received.
> 
> Does any architecture actually needs barriers? For the initiator I
> could see it, but for the handler I would be surprised. The other
> thing we could do for simplicity is just to require that a full
> barrier is required before generating an IPI, and after receiving an
> IPI. We can't just do that in generic code without auditing
> architectures. There have been subtle hangs here on some archs in
> the past.
> 
> Signed-off-by: Nick Piggin <npiggin@suse.de>
> 
> ---
>  kernel/smp.c |   83 +++++++++++++++++++++++++++++++----------------------------
>  1 file changed, 44 insertions(+), 39 deletions(-)
> 
> Index: linux-2.6/kernel/smp.c
> ===================================================================
> --- linux-2.6.orig/kernel/smp.c
> +++ linux-2.6/kernel/smp.c
> @@ -74,9 +74,16 @@ static void generic_exec_single(int cpu,
>  	spin_unlock_irqrestore(&dst->lock, flags);
> 
>  	/*
> -	 * Make the list addition visible before sending the ipi.
> +	 * The list addition should be visible before sending the IPI
> +	 * handler locks the list to pull the entry off it because of
> +	 * normal cache coherency rules implied by spinlocks.
> +	 *
> +	 * If IPIs can go out of order to the cache coherency protocol
> +	 * in an architecture, sufficient synchronisation should be added
> +	 * to arch code to make it appear to obey cache coherency WRT
> +	 * locking and barrier primitives. Generic code isn't really equipped
> +	 * to do the right thing...
>  	 */
> -	smp_mb();
> 
>  	if (ipi)
>  		arch_send_call_function_single_ipi(cpu);
> @@ -104,6 +111,14 @@ void generic_smp_call_function_interrupt
>  	int cpu = get_cpu();
> 
>  	/*
> +	 * Ensure entry is visible on call_function_queue after we have
> +	 * entered the IPI. See comment in smp_call_function_many.
> +	 * If we don't have this, then we may miss an entry on the list
> +	 * and never get another IPI to process it.
> +	 */
> +	smp_mb();
> +
> +	/*
>  	 * It's ok to use list_for_each_rcu() here even though we may delete
>  	 * 'pos', since list_del_rcu() doesn't clear ->next
>  	 */
> @@ -154,49 +169,37 @@ void generic_smp_call_function_single_in
>  {
>  	struct call_single_queue *q = &__get_cpu_var(call_single_queue);
>  	LIST_HEAD(list);
> +	unsigned int data_flags;
> 
> -	/*
> -	 * Need to see other stores to list head for checking whether
> -	 * list is empty without holding q->lock
> -	 */
> -	smp_read_barrier_depends();
> -	while (!list_empty(&q->list)) {
> -		unsigned int data_flags;
> -
> -		spin_lock(&q->lock);
> -		list_replace_init(&q->list, &list);
> -		spin_unlock(&q->lock);
> -
> -		while (!list_empty(&list)) {
> -			struct call_single_data *data;
> -
> -			data = list_entry(list.next, struct call_single_data,
> -						list);
> -			list_del(&data->list);
> +	spin_lock(&q->lock);
> +	list_replace_init(&q->list, &list);
> +	spin_unlock(&q->lock);

OK, I'll bite...

How do we avoid deadlock in the case where a pair of CPUs send to each
other concurrently?

							Thanx, Paul

> 
> -			/*
> -			 * 'data' can be invalid after this call if
> -			 * flags == 0 (when called through
> -			 * generic_exec_single(), so save them away before
> -			 * making the call.
> -			 */
> -			data_flags = data->flags;
> +	while (!list_empty(&list)) {
> +		struct call_single_data *data;
> 
> -			data->func(data->info);
> +		data = list_entry(list.next, struct call_single_data,
> +					list);
> +		list_del(&data->list);
> 
> -			if (data_flags & CSD_FLAG_WAIT) {
> -				smp_wmb();
> -				data->flags &= ~CSD_FLAG_WAIT;
> -			} else if (data_flags & CSD_FLAG_LOCK) {
> -				smp_wmb();
> -				data->flags &= ~CSD_FLAG_LOCK;
> -			} else if (data_flags & CSD_FLAG_ALLOC)
> -				kfree(data);
> -		}
>  		/*
> -		 * See comment on outer loop
> +		 * 'data' can be invalid after this call if
> +		 * flags == 0 (when called through
> +		 * generic_exec_single(), so save them away before
> +		 * making the call.
>  		 */
> -		smp_read_barrier_depends();
> +		data_flags = data->flags;
> +
> +		data->func(data->info);
> +
> +		if (data_flags & CSD_FLAG_WAIT) {
> +			smp_wmb();
> +			data->flags &= ~CSD_FLAG_WAIT;
> +		} else if (data_flags & CSD_FLAG_LOCK) {
> +			smp_wmb();
> +			data->flags &= ~CSD_FLAG_LOCK;
> +		} else if (data_flags & CSD_FLAG_ALLOC)
> +			kfree(data);
>  	}
>  }
> 
> @@ -375,6 +378,8 @@ void smp_call_function_many(const struct
> 
>  	/*
>  	 * Make the list addition visible before sending the ipi.
> +	 * (IPIs must obey or appear to obey normal Linux cache coherency
> +	 * rules -- see comment in generic_exec_single).
>  	 */
>  	smp_mb();
> 

  parent reply	other threads:[~2009-02-17 15:52 UTC|newest]

Thread overview: 101+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-02-16 16:38 [PATCH 0/4] generic smp helpers vs kmalloc Peter Zijlstra
2009-02-16 16:38 ` [PATCH 1/4] generic-smp: remove single ipi fallback for smp_call_function_many() Peter Zijlstra
2009-02-16 19:10   ` Oleg Nesterov
2009-02-16 19:41     ` Peter Zijlstra
2009-02-16 20:30       ` Oleg Nesterov
2009-02-16 20:55         ` Peter Zijlstra
2009-02-16 21:22           ` Oleg Nesterov
2009-02-17 12:25     ` Oleg Nesterov
2009-02-16 20:49   ` Q: smp.c && barriers (Was: [PATCH 1/4] generic-smp: remove single ipi fallback for smp_call_function_many()) Oleg Nesterov
2009-02-16 21:03     ` Peter Zijlstra
2009-02-16 21:32       ` Oleg Nesterov
2009-02-16 21:45         ` Peter Zijlstra
2009-02-16 22:02           ` Oleg Nesterov
2009-02-16 22:24             ` Peter Zijlstra
2009-02-16 23:19               ` Oleg Nesterov
2009-02-17  9:29                 ` Peter Zijlstra
2009-02-17 10:11                   ` Nick Piggin
2009-02-17 10:27                     ` Peter Zijlstra
2009-02-17 10:39                       ` Nick Piggin
2009-02-17 11:26                       ` Nick Piggin
2009-02-17 11:48                         ` Peter Zijlstra
2009-02-17 15:51                         ` Paul E. McKenney [this message]
2009-02-18  2:15                           ` Suresh Siddha
2009-02-18  2:40                             ` Paul E. McKenney
2009-02-17 19:28                         ` Q: " Oleg Nesterov
2009-02-17 21:32                           ` Paul E. McKenney
2009-02-17 21:45                             ` Oleg Nesterov
2009-02-17 22:39                               ` Paul E. McKenney
2009-02-18 13:52                                 ` Nick Piggin
2009-02-18 16:09                                   ` Linus Torvalds
2009-02-18 16:21                                     ` Ingo Molnar
2009-02-18 16:33                                       ` Linus Torvalds
2009-02-18 16:58                                         ` Ingo Molnar
2009-02-18 17:05                                           ` Ingo Molnar
2009-02-18 17:10                                             ` Ingo Molnar
2009-02-18 17:17                                               ` Linus Torvalds
2009-02-18 17:23                                                 ` Ingo Molnar
2009-02-18 17:14                                             ` Linus Torvalds
2009-02-18 17:47                                               ` Ingo Molnar
2009-02-18 18:33                                               ` Suresh Siddha
2009-02-18 16:37                                       ` Gleb Natapov
2009-02-19  0:12                                     ` Nick Piggin
2009-02-19  6:47                                     ` Benjamin Herrenschmidt
2009-02-19 13:11                                       ` Nick Piggin
2009-02-19 15:06                                         ` Ingo Molnar
2009-02-19 21:49                                           ` Benjamin Herrenschmidt
2009-02-18  2:21                         ` Suresh Siddha
2009-02-18 13:59                           ` Nick Piggin
2009-02-18 16:19                             ` Linus Torvalds
2009-02-18 16:23                               ` Ingo Molnar
2009-02-18 18:43                             ` Suresh Siddha
2009-02-18 19:17                               ` Ingo Molnar
2009-02-18 23:55                                 ` Suresh Siddha
2009-02-19 12:20                                   ` Ingo Molnar
2009-02-19 12:29                                     ` Nick Piggin
2009-02-19 12:45                                       ` Ingo Molnar
2009-02-19 22:00                                     ` Suresh Siddha
2009-02-20 10:56                                       ` Ingo Molnar
2009-02-20 18:56                                         ` Suresh Siddha
2009-02-20 19:40                                           ` Ingo Molnar
2009-02-20 23:28                                           ` Jack Steiner
2009-02-25  3:32                                           ` Nick Piggin
2009-02-25 12:47                                             ` Ingo Molnar
2009-02-25 18:25                                             ` Luck, Tony
2009-03-17 18:16                                             ` Suresh Siddha
2009-03-18  8:51                                               ` [tip:x86/x2apic] x86: add x2apic_wrmsr_fence() to x2apic flush tlb paths Suresh Siddha
2009-02-17 12:40                   ` Q: smp.c && barriers (Was: [PATCH 1/4] generic-smp: remove single ipi fallback for smp_call_function_many()) Peter Zijlstra
2009-02-17 15:43                   ` Paul E. McKenney
2009-02-17 15:40   ` [PATCH] generic-smp: remove kmalloc() Peter Zijlstra
2009-02-17 17:21     ` Oleg Nesterov
2009-02-17 17:40       ` Peter Zijlstra
2009-02-17 17:46         ` Peter Zijlstra
2009-02-17 18:30           ` Oleg Nesterov
2009-02-17 19:29         ` [PATCH -v4] generic-ipi: " Peter Zijlstra
2009-02-17 20:02           ` Oleg Nesterov
2009-02-17 20:11             ` Peter Zijlstra
2009-02-17 20:16               ` Peter Zijlstra
2009-02-17 20:44                 ` Oleg Nesterov
2009-02-17 20:49                 ` Peter Zijlstra
2009-02-17 22:09                   ` Oleg Nesterov
2009-02-17 22:15                     ` Peter Zijlstra
2009-02-17 21:30           ` Paul E. McKenney
2009-02-17 21:38             ` Peter Zijlstra
2009-02-16 16:38 ` [PATCH 2/4] generic-smp: remove kmalloc usage Peter Zijlstra
2009-02-17  0:40   ` Linus Torvalds
2009-02-17  8:24     ` Peter Zijlstra
2009-02-17  9:43       ` Ingo Molnar
2009-02-17  9:49         ` Peter Zijlstra
2009-02-17 10:56           ` Ingo Molnar
2009-02-18  4:50         ` Rusty Russell
2009-02-18 16:05           ` Ingo Molnar
2009-02-19  0:00             ` Jeremy Fitzhardinge
2009-02-19 12:21               ` Ingo Molnar
2009-02-19  4:31             ` Rusty Russell
2009-02-19  9:10               ` Peter Zijlstra
2009-02-19 11:04                 ` Jens Axboe
2009-02-19 16:52               ` Linus Torvalds
2009-02-17 15:44       ` Linus Torvalds
2009-02-16 16:38 ` [PATCH 3/4] generic-smp: properly allocate the cpumasks Peter Zijlstra
2009-02-16 23:17   ` Rusty Russell
2009-02-16 16:38 ` [PATCH 4/4] generic-smp: clean up some of the csd->flags fiddling Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090217155147.GE6761@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=jens.axboe@oracle.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=npiggin@suse.de \
    --cc=oleg@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=rusty@rustcorp.com.au \
    --cc=suresh.b.siddha@intel.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).