From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Nick Piggin <npiggin@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
Oleg Nesterov <oleg@redhat.com>,
Jens Axboe <jens.axboe@oracle.com>,
Suresh Siddha <suresh.b.siddha@intel.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Ingo Molnar <mingo@elte.hu>,
Rusty Russell <rusty@rustcorp.com.au>,
Steven Rostedt <rostedt@goodmis.org>,
linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org
Subject: Re: Q: smp.c && barriers (Was: [PATCH 1/4] generic-smp: remove single ipi fallback for smp_call_function_many())
Date: Tue, 17 Feb 2009 07:51:47 -0800 [thread overview]
Message-ID: <20090217155147.GE6761@linux.vnet.ibm.com> (raw)
In-Reply-To: <20090217112657.GE26402@wotan.suse.de>
On Tue, Feb 17, 2009 at 12:26:57PM +0100, Nick Piggin wrote:
> cc linux-arch
>
> On Tue, Feb 17, 2009 at 11:27:33AM +0100, Peter Zijlstra wrote:
> > > But hmm, why even bother with all this complexity? Why not just
> > > remove the outer loop completely? Do the lock and the list_replace_init
> > > unconditionally. It would turn tricky lockless code into simple locked
> > > code... we've already taken an interrupt anyway, so chances are pretty
> > > high that we have work here to do, right?
> >
> > Well, that's a practical suggestion, and I agree.
>
> How's this?
>
> --
> Simplify the barriers in generic remote function call interrupt code.
>
> Firstly, just unconditionally take the lock and check the list in the
> generic_call_function_single_interrupt IPI handler. As we've just taken
> an IPI here, the chances are fairly high that there will be work on the
> list for us, so do the locking unconditionally. This removes the tricky
> lockless list_empty check and dubious barriers. The change looks bigger
> than it is because it is just removing an outer loop.
>
> Secondly, clarify architecture specific IPI locking rules. Generic code
> has no tools to impose any sane ordering on IPIs if they go outside
> normal cache coherency, ergo the arch code must make them appear to
> obey cache coherency as a "memory operation" to initiate an IPI, and
> a "memory operation" to receive one. This way at least they can be
> reasoned about in generic code, and smp_mb used to provide ordering.
>
> The combination of these two changes means that explict barriers can
> be taken out of queue handling for the single case -- shared data is
> explicitly locked, and ipi ordering must conform to that, so no
> barriers needed. An extra barrier is needed in the many handler, so
> as to ensure we load the list element after the IPI is received.
>
> Does any architecture actually needs barriers? For the initiator I
> could see it, but for the handler I would be surprised. The other
> thing we could do for simplicity is just to require that a full
> barrier is required before generating an IPI, and after receiving an
> IPI. We can't just do that in generic code without auditing
> architectures. There have been subtle hangs here on some archs in
> the past.
>
> Signed-off-by: Nick Piggin <npiggin@suse.de>
>
> ---
> kernel/smp.c | 83 +++++++++++++++++++++++++++++++----------------------------
> 1 file changed, 44 insertions(+), 39 deletions(-)
>
> Index: linux-2.6/kernel/smp.c
> ===================================================================
> --- linux-2.6.orig/kernel/smp.c
> +++ linux-2.6/kernel/smp.c
> @@ -74,9 +74,16 @@ static void generic_exec_single(int cpu,
> spin_unlock_irqrestore(&dst->lock, flags);
>
> /*
> - * Make the list addition visible before sending the ipi.
> + * The list addition should be visible before sending the IPI
> + * handler locks the list to pull the entry off it because of
> + * normal cache coherency rules implied by spinlocks.
> + *
> + * If IPIs can go out of order to the cache coherency protocol
> + * in an architecture, sufficient synchronisation should be added
> + * to arch code to make it appear to obey cache coherency WRT
> + * locking and barrier primitives. Generic code isn't really equipped
> + * to do the right thing...
> */
> - smp_mb();
>
> if (ipi)
> arch_send_call_function_single_ipi(cpu);
> @@ -104,6 +111,14 @@ void generic_smp_call_function_interrupt
> int cpu = get_cpu();
>
> /*
> + * Ensure entry is visible on call_function_queue after we have
> + * entered the IPI. See comment in smp_call_function_many.
> + * If we don't have this, then we may miss an entry on the list
> + * and never get another IPI to process it.
> + */
> + smp_mb();
> +
> + /*
> * It's ok to use list_for_each_rcu() here even though we may delete
> * 'pos', since list_del_rcu() doesn't clear ->next
> */
> @@ -154,49 +169,37 @@ void generic_smp_call_function_single_in
> {
> struct call_single_queue *q = &__get_cpu_var(call_single_queue);
> LIST_HEAD(list);
> + unsigned int data_flags;
>
> - /*
> - * Need to see other stores to list head for checking whether
> - * list is empty without holding q->lock
> - */
> - smp_read_barrier_depends();
> - while (!list_empty(&q->list)) {
> - unsigned int data_flags;
> -
> - spin_lock(&q->lock);
> - list_replace_init(&q->list, &list);
> - spin_unlock(&q->lock);
> -
> - while (!list_empty(&list)) {
> - struct call_single_data *data;
> -
> - data = list_entry(list.next, struct call_single_data,
> - list);
> - list_del(&data->list);
> + spin_lock(&q->lock);
> + list_replace_init(&q->list, &list);
> + spin_unlock(&q->lock);
OK, I'll bite...
How do we avoid deadlock in the case where a pair of CPUs send to each
other concurrently?
Thanx, Paul
>
> - /*
> - * 'data' can be invalid after this call if
> - * flags == 0 (when called through
> - * generic_exec_single(), so save them away before
> - * making the call.
> - */
> - data_flags = data->flags;
> + while (!list_empty(&list)) {
> + struct call_single_data *data;
>
> - data->func(data->info);
> + data = list_entry(list.next, struct call_single_data,
> + list);
> + list_del(&data->list);
>
> - if (data_flags & CSD_FLAG_WAIT) {
> - smp_wmb();
> - data->flags &= ~CSD_FLAG_WAIT;
> - } else if (data_flags & CSD_FLAG_LOCK) {
> - smp_wmb();
> - data->flags &= ~CSD_FLAG_LOCK;
> - } else if (data_flags & CSD_FLAG_ALLOC)
> - kfree(data);
> - }
> /*
> - * See comment on outer loop
> + * 'data' can be invalid after this call if
> + * flags == 0 (when called through
> + * generic_exec_single(), so save them away before
> + * making the call.
> */
> - smp_read_barrier_depends();
> + data_flags = data->flags;
> +
> + data->func(data->info);
> +
> + if (data_flags & CSD_FLAG_WAIT) {
> + smp_wmb();
> + data->flags &= ~CSD_FLAG_WAIT;
> + } else if (data_flags & CSD_FLAG_LOCK) {
> + smp_wmb();
> + data->flags &= ~CSD_FLAG_LOCK;
> + } else if (data_flags & CSD_FLAG_ALLOC)
> + kfree(data);
> }
> }
>
> @@ -375,6 +378,8 @@ void smp_call_function_many(const struct
>
> /*
> * Make the list addition visible before sending the ipi.
> + * (IPIs must obey or appear to obey normal Linux cache coherency
> + * rules -- see comment in generic_exec_single).
> */
> smp_mb();
>
next prev parent reply other threads:[~2009-02-17 15:52 UTC|newest]
Thread overview: 101+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-02-16 16:38 [PATCH 0/4] generic smp helpers vs kmalloc Peter Zijlstra
2009-02-16 16:38 ` [PATCH 1/4] generic-smp: remove single ipi fallback for smp_call_function_many() Peter Zijlstra
2009-02-16 19:10 ` Oleg Nesterov
2009-02-16 19:41 ` Peter Zijlstra
2009-02-16 20:30 ` Oleg Nesterov
2009-02-16 20:55 ` Peter Zijlstra
2009-02-16 21:22 ` Oleg Nesterov
2009-02-17 12:25 ` Oleg Nesterov
2009-02-16 20:49 ` Q: smp.c && barriers (Was: [PATCH 1/4] generic-smp: remove single ipi fallback for smp_call_function_many()) Oleg Nesterov
2009-02-16 21:03 ` Peter Zijlstra
2009-02-16 21:32 ` Oleg Nesterov
2009-02-16 21:45 ` Peter Zijlstra
2009-02-16 22:02 ` Oleg Nesterov
2009-02-16 22:24 ` Peter Zijlstra
2009-02-16 23:19 ` Oleg Nesterov
2009-02-17 9:29 ` Peter Zijlstra
2009-02-17 10:11 ` Nick Piggin
2009-02-17 10:27 ` Peter Zijlstra
2009-02-17 10:39 ` Nick Piggin
2009-02-17 11:26 ` Nick Piggin
2009-02-17 11:48 ` Peter Zijlstra
2009-02-17 15:51 ` Paul E. McKenney [this message]
2009-02-18 2:15 ` Suresh Siddha
2009-02-18 2:40 ` Paul E. McKenney
2009-02-17 19:28 ` Q: " Oleg Nesterov
2009-02-17 21:32 ` Paul E. McKenney
2009-02-17 21:45 ` Oleg Nesterov
2009-02-17 22:39 ` Paul E. McKenney
2009-02-18 13:52 ` Nick Piggin
2009-02-18 16:09 ` Linus Torvalds
2009-02-18 16:21 ` Ingo Molnar
2009-02-18 16:33 ` Linus Torvalds
2009-02-18 16:58 ` Ingo Molnar
2009-02-18 17:05 ` Ingo Molnar
2009-02-18 17:10 ` Ingo Molnar
2009-02-18 17:17 ` Linus Torvalds
2009-02-18 17:23 ` Ingo Molnar
2009-02-18 17:14 ` Linus Torvalds
2009-02-18 17:47 ` Ingo Molnar
2009-02-18 18:33 ` Suresh Siddha
2009-02-18 16:37 ` Gleb Natapov
2009-02-19 0:12 ` Nick Piggin
2009-02-19 6:47 ` Benjamin Herrenschmidt
2009-02-19 13:11 ` Nick Piggin
2009-02-19 15:06 ` Ingo Molnar
2009-02-19 21:49 ` Benjamin Herrenschmidt
2009-02-18 2:21 ` Suresh Siddha
2009-02-18 13:59 ` Nick Piggin
2009-02-18 16:19 ` Linus Torvalds
2009-02-18 16:23 ` Ingo Molnar
2009-02-18 18:43 ` Suresh Siddha
2009-02-18 19:17 ` Ingo Molnar
2009-02-18 23:55 ` Suresh Siddha
2009-02-19 12:20 ` Ingo Molnar
2009-02-19 12:29 ` Nick Piggin
2009-02-19 12:45 ` Ingo Molnar
2009-02-19 22:00 ` Suresh Siddha
2009-02-20 10:56 ` Ingo Molnar
2009-02-20 18:56 ` Suresh Siddha
2009-02-20 19:40 ` Ingo Molnar
2009-02-20 23:28 ` Jack Steiner
2009-02-25 3:32 ` Nick Piggin
2009-02-25 12:47 ` Ingo Molnar
2009-02-25 18:25 ` Luck, Tony
2009-03-17 18:16 ` Suresh Siddha
2009-03-18 8:51 ` [tip:x86/x2apic] x86: add x2apic_wrmsr_fence() to x2apic flush tlb paths Suresh Siddha
2009-02-17 12:40 ` Q: smp.c && barriers (Was: [PATCH 1/4] generic-smp: remove single ipi fallback for smp_call_function_many()) Peter Zijlstra
2009-02-17 15:43 ` Paul E. McKenney
2009-02-17 15:40 ` [PATCH] generic-smp: remove kmalloc() Peter Zijlstra
2009-02-17 17:21 ` Oleg Nesterov
2009-02-17 17:40 ` Peter Zijlstra
2009-02-17 17:46 ` Peter Zijlstra
2009-02-17 18:30 ` Oleg Nesterov
2009-02-17 19:29 ` [PATCH -v4] generic-ipi: " Peter Zijlstra
2009-02-17 20:02 ` Oleg Nesterov
2009-02-17 20:11 ` Peter Zijlstra
2009-02-17 20:16 ` Peter Zijlstra
2009-02-17 20:44 ` Oleg Nesterov
2009-02-17 20:49 ` Peter Zijlstra
2009-02-17 22:09 ` Oleg Nesterov
2009-02-17 22:15 ` Peter Zijlstra
2009-02-17 21:30 ` Paul E. McKenney
2009-02-17 21:38 ` Peter Zijlstra
2009-02-16 16:38 ` [PATCH 2/4] generic-smp: remove kmalloc usage Peter Zijlstra
2009-02-17 0:40 ` Linus Torvalds
2009-02-17 8:24 ` Peter Zijlstra
2009-02-17 9:43 ` Ingo Molnar
2009-02-17 9:49 ` Peter Zijlstra
2009-02-17 10:56 ` Ingo Molnar
2009-02-18 4:50 ` Rusty Russell
2009-02-18 16:05 ` Ingo Molnar
2009-02-19 0:00 ` Jeremy Fitzhardinge
2009-02-19 12:21 ` Ingo Molnar
2009-02-19 4:31 ` Rusty Russell
2009-02-19 9:10 ` Peter Zijlstra
2009-02-19 11:04 ` Jens Axboe
2009-02-19 16:52 ` Linus Torvalds
2009-02-17 15:44 ` Linus Torvalds
2009-02-16 16:38 ` [PATCH 3/4] generic-smp: properly allocate the cpumasks Peter Zijlstra
2009-02-16 23:17 ` Rusty Russell
2009-02-16 16:38 ` [PATCH 4/4] generic-smp: clean up some of the csd->flags fiddling Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090217155147.GE6761@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=a.p.zijlstra@chello.nl \
--cc=jens.axboe@oracle.com \
--cc=linux-arch@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=npiggin@suse.de \
--cc=oleg@redhat.com \
--cc=rostedt@goodmis.org \
--cc=rusty@rustcorp.com.au \
--cc=suresh.b.siddha@intel.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).