From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Nick Piggin <npiggin@suse.de>, Jens Axboe <jens.axboe@oracle.com>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Ingo Molnar <mingo@elte.hu>,
Rusty Russell <rusty@rustcorp.com.au>,
Steven Rostedt <rostedt@goodmis.org>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/4] generic-smp: remove single ipi fallback for smp_call_function_many()
Date: Mon, 16 Feb 2009 20:41:32 +0100
Message-ID: <1234813292.30178.327.camel@laptop>
In-Reply-To: <20090216191008.GA1521@redhat.com>
On Mon, 2009-02-16 at 20:10 +0100, Oleg Nesterov wrote:
> On 02/16, Peter Zijlstra wrote:
> >
> > @@ -347,31 +384,32 @@ void smp_call_function_many(const struct
> > return;
> > }
> >
> > - data = kmalloc(sizeof(*data) + cpumask_size(), GFP_ATOMIC);
> > - if (unlikely(!data)) {
> > - /* Slow path. */
> > - for_each_online_cpu(cpu) {
> > - if (cpu == smp_processor_id())
> > - continue;
> > - if (cpumask_test_cpu(cpu, mask))
> > - smp_call_function_single(cpu, func, info, wait);
> > - }
> > - return;
> > + data = kmalloc(sizeof(*data), GFP_ATOMIC);
> > + if (data)
> > + data->csd.flags = CSD_FLAG_ALLOC;
> > + else {
> > + data = &per_cpu(cfd_data, me);
> > + /*
> > + * We need to wait for all previous users to go away.
> > + */
> > + while (data->csd.flags & CSD_FLAG_LOCK)
> > + cpu_relax();
> > + data->csd.flags = CSD_FLAG_LOCK;
> > }
> >
> > spin_lock_init(&data->lock);
> > - data->csd.flags = CSD_FLAG_ALLOC;
> > if (wait)
> > data->csd.flags |= CSD_FLAG_WAIT;
> > data->csd.func = func;
> > data->csd.info = info;
> > - cpumask_and(to_cpumask(data->cpumask_bits), mask, cpu_online_mask);
> > - cpumask_clear_cpu(smp_processor_id(), to_cpumask(data->cpumask_bits));
> > - data->refs = cpumask_weight(to_cpumask(data->cpumask_bits));
> > -
> > - spin_lock_irqsave(&call_function_lock, flags);
> > - list_add_tail_rcu(&data->csd.list, &call_function_queue);
> > - spin_unlock_irqrestore(&call_function_lock, flags);
> > + cpumask_and(&data->cpumask, mask, cpu_online_mask);
> > + cpumask_clear_cpu(smp_processor_id(), &data->cpumask);
>
> (perhaps it makes sense to use "me" instead of smp_processor_id())
Ah, missed one it seems, thanks ;-)
> > + data->refs = cpumask_weight(&data->cpumask);
> > +
> > + spin_lock_irqsave(&call_function.lock, flags);
> > + call_function.counter++;
> > + list_add_tail_rcu(&data->csd.list, &call_function.queue);
> > + spin_unlock_irqrestore(&call_function.lock, flags);
>
> What if the initialization above leaks into the critical section?
>
> I mean, generic_smp_call_function_interrupt() running on another CPU
> can see the result of list_add_tail_rcu() and cpumask_and(data->cpumask)
> but not (say) "data->refs = ...".
Ho humm, nice :-)

So best would be to put that data initialization under data->lock. This
would be a bug in the original code too.
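
To make the idea concrete, here is a rough userspace sketch, not the kernel
code. It assumes a C11 atomic_flag as a stand-in for the spinlock, and every
name in it (cfd_sketch and its helpers) is made up for illustration. The
writer fills in all fields while holding the entry's lock, and the reader
takes the same lock before touching them, so it cannot observe a half
initialised entry.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for call_function_data; not the kernel struct. */
struct cfd_sketch {
	atomic_flag lock;	/* plays the role of data->lock */
	void (*func)(void *);
	void *info;
	int refs;
};

static void cfd_sketch_lock(struct cfd_sketch *d)
{
	/* Spin until we own the flag; acquire orders later reads after it. */
	while (atomic_flag_test_and_set_explicit(&d->lock, memory_order_acquire))
		;
}

static void cfd_sketch_unlock(struct cfd_sketch *d)
{
	/* Release makes the stores done under the lock visible to the next owner. */
	atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

/* One-time setup: only the lock itself is initialised here. */
static void cfd_sketch_boot_init(struct cfd_sketch *d)
{
	atomic_flag_clear(&d->lock);
}

/* Writer side: all per-call fields are set while holding the lock. */
static void cfd_sketch_publish(struct cfd_sketch *d, void (*func)(void *),
			       void *info, int refs)
{
	assert(refs > 0);
	cfd_sketch_lock(d);
	d->func = func;
	d->info = info;
	d->refs = refs;
	cfd_sketch_unlock(d);
}

/* Reader side: drop one reference under the same lock, return what is left. */
static int cfd_sketch_put(struct cfd_sketch *d)
{
	int left;

	cfd_sketch_lock(d);
	left = --d->refs;
	cfd_sketch_unlock(d);
	return left;
}
```

Whether the reader can afford to take the lock for every entry it walks past
is a separate question; the sketch only shows the ordering argument.
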
> Actually I don't understand the counter/free_list logic.
>
> generic_smp_call_function_interrupt:
>
> /*
> * When the global queue is empty, its guaranteed that no cpu
> * is still observing any entry on the free_list, therefore
> * we can go ahead and unlock them.
> */
> if (!--call_function.counter)
> list_splice_init(&call_function.free_list, &free_list);
>
> I can't see why "its guaranteed that no cpu ...". Let's suppose CPU 0
> "hangs" for some reason in generic_smp_call_function_interrupt() right
> before "if (!cpumask_test_cpu(cpu, data->cpumask))" test. Then it is
> possible that another CPU removes the single entry (which doesn't have
> CPU 0 in data->cpumask) from call_function.queue.
Then call_function.counter wouldn't be 0, and we would not release the
list entries.
> Now, if that entry is re-added, we can have a problem if CPU 0 sees
> the partly initialized entry.
Right, so the scenario is that a cpu observes the list entry after we do
list_del_rcu() -- most likely a cpu not in the original mask, traversing
the list for a next entry -- so we have to wait for some quiescent state to
ensure nobody is still referencing it.

We cannot use regular RCU, because its quiescent state takes forever to
happen, therefore this implements a simple counting RCU for the queue
domain only.

When the list is empty, there's nobody seeing any elements, ergo we can
release the entries and re-use them.
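
As a rough model of that counting scheme (a single-threaded userspace
sketch, not the patch itself): the names queue_sketch, entry_sketch,
queue_sketch_add() and queue_sketch_retire() are invented here, and the
sketch assumes everything runs under what would be call_function.lock in the
kernel. Retired entries are parked on a free list and only unlocked for
re-use once the counter of queued entries drops to zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical entry: "locked" means it must not be re-used yet. */
struct entry_sketch {
	bool locked;
	struct entry_sketch *next_free;
};

/* Hypothetical queue domain: count of queued entries plus deferred frees. */
struct queue_sketch {
	unsigned int counter;
	struct entry_sketch *free_list;
};

/* Queue an entry: it stays locked until the whole queue has drained. */
static void queue_sketch_add(struct queue_sketch *q, struct entry_sketch *e)
{
	e->locked = true;
	e->next_free = NULL;
	q->counter++;
}

/*
 * Retire an entry whose last reference was dropped. It goes on the free
 * list first; only when the counter hits zero (queue empty) are all parked
 * entries unlocked. Returns how many entries were released by this call.
 */
static unsigned int queue_sketch_retire(struct queue_sketch *q,
					struct entry_sketch *e)
{
	unsigned int released = 0;

	assert(q->counter > 0);
	e->next_free = q->free_list;
	q->free_list = e;

	if (--q->counter)
		return 0;	/* queue not empty, keep entries parked */

	while (q->free_list) {
		struct entry_sketch *done = q->free_list;

		q->free_list = done->next_free;
		done->next_free = NULL;
		done->locked = false;
		released++;
	}
	return released;
}
```

With two entries queued, retiring the first releases nothing and retiring
the second releases both, which is the "wait until the list is empty"
behaviour described above.
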
> But why do we need counter/free_list at all?
> Can't we do the following:
>
> - initialize call_function_data.lock at boot time
>
> - change smp_call_function_many() to initialize cfd_data
> under ->lock
>
> - change generic_smp_call_function_interrupt() to do
>
> list_for_each_entry_rcu(data) {
>
> if (!cpumask_test_cpu(cpu, data->cpumask))
> continue;
>
> spin_lock(&data->lock);
> if (!cpumask_test_cpu(cpu, data->cpumask)) {
> spin_unlock(&data->lock);
> continue;
> }
>
> cpumask_clear_cpu(cpu, data->cpumask);
> refs = --data->refs;
>
> func = data->func;
> info = data->info;
> spin_unlock(&data->lock);
>
> func(info);
>
> if (refs)
> continue;
>
> ...
> }
>
> Afaics, it is OK if smp_call_function_many() sees the "unneeded" entry on
> list, the only thing we must ensure is that we can't "misunderstand" this
> entry.
>
> No?
Hmm, could that not lead to loops and/or missed entries in the
list iteration?

The safest approach seemed to be to wait until we are sure nobody observes
the entry before re-using it.
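
For reference, the check / lock / re-check step of the proposal quoted above
could be modelled in userspace roughly like this. It is only a sketch under
assumptions: an unsigned long bitmask stands in for the cpumask, a C11
atomic_flag stands in for the spinlock, and call_sketch and its helpers are
hypothetical names. It covers a single entry and says nothing about the list
iteration concern.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical single entry; mask is atomic because it is peeked unlocked. */
struct call_sketch {
	atomic_flag lock;
	atomic_ulong mask;
	int refs;
	void (*func)(void *);
	void *info;
};

static void call_sketch_init(struct call_sketch *d, unsigned long mask,
			     int refs, void (*func)(void *), void *info)
{
	atomic_flag_clear(&d->lock);
	atomic_init(&d->mask, mask);
	d->refs = refs;
	d->func = func;
	d->info = info;
}

/* Example callback used to count invocations. */
static void call_sketch_count(void *info)
{
	++*(int *)info;
}

/* Returns true if "cpu" was a target and ran the callback exactly once. */
static bool call_sketch_run(struct call_sketch *d, unsigned int cpu)
{
	unsigned long bit = 1UL << cpu;
	void (*func)(void *);
	void *info;

	/* Cheap unlocked test: skip entries that are not meant for us. */
	if (!(atomic_load(&d->mask) & bit))
		return false;

	while (atomic_flag_test_and_set_explicit(&d->lock, memory_order_acquire))
		;

	/* Re-check under the lock: the entry may have changed meanwhile. */
	if (!(atomic_load(&d->mask) & bit)) {
		atomic_flag_clear_explicit(&d->lock, memory_order_release);
		return false;
	}

	atomic_fetch_and(&d->mask, ~bit);
	assert(d->refs > 0);
	d->refs--;
	func = d->func;
	info = d->info;
	atomic_flag_clear_explicit(&d->lock, memory_order_release);

	/* Call outside the lock, using the copies taken under it. */
	func(info);
	return true;
}
```
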
>
> Off-topic question, looks like smp_call_function_single() must not be
> called from interrupt/bh handler, right? But the comment says nothing.
>
> And,
> smp_call_function_single:
>
> /* Can deadlock when called with interrupts disabled */
> WARN_ON(irqs_disabled());
>
> Just curious, we can only deadlock if wait = T, right?
Right.