linux-kernel.vger.kernel.org archive mirror
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Nick Piggin <npiggin@suse.de>, Jens Axboe <jens.axboe@oracle.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@elte.hu>,
	Rusty Russell <rusty@rustcorp.com.au>,
	Steven Rostedt <rostedt@goodmis.org>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/4] generic-smp: remove single ipi fallback for smp_call_function_many()
Date: Mon, 16 Feb 2009 20:41:32 +0100
Message-ID: <1234813292.30178.327.camel@laptop>
In-Reply-To: <20090216191008.GA1521@redhat.com>

On Mon, 2009-02-16 at 20:10 +0100, Oleg Nesterov wrote:
> On 02/16, Peter Zijlstra wrote:
> >
> > @@ -347,31 +384,32 @@ void smp_call_function_many(const struct
> >  		return;
> >  	}
> >
> > -	data = kmalloc(sizeof(*data) + cpumask_size(), GFP_ATOMIC);
> > -	if (unlikely(!data)) {
> > -		/* Slow path. */
> > -		for_each_online_cpu(cpu) {
> > -			if (cpu == smp_processor_id())
> > -				continue;
> > -			if (cpumask_test_cpu(cpu, mask))
> > -				smp_call_function_single(cpu, func, info, wait);
> > -		}
> > -		return;
> > +	data = kmalloc(sizeof(*data), GFP_ATOMIC);
> > +	if (data)
> > +		data->csd.flags = CSD_FLAG_ALLOC;
> > +	else {
> > +		data = &per_cpu(cfd_data, me);
> > +		/*
> > +		 * We need to wait for all previous users to go away.
> > +		 */
> > +		while (data->csd.flags & CSD_FLAG_LOCK)
> > +			cpu_relax();
> > +		data->csd.flags = CSD_FLAG_LOCK;
> >  	}
> >
> >  	spin_lock_init(&data->lock);
> > -	data->csd.flags = CSD_FLAG_ALLOC;
> >  	if (wait)
> >  		data->csd.flags |= CSD_FLAG_WAIT;
> >  	data->csd.func = func;
> >  	data->csd.info = info;
> > -	cpumask_and(to_cpumask(data->cpumask_bits), mask, cpu_online_mask);
> > -	cpumask_clear_cpu(smp_processor_id(), to_cpumask(data->cpumask_bits));
> > -	data->refs = cpumask_weight(to_cpumask(data->cpumask_bits));
> > -
> > -	spin_lock_irqsave(&call_function_lock, flags);
> > -	list_add_tail_rcu(&data->csd.list, &call_function_queue);
> > -	spin_unlock_irqrestore(&call_function_lock, flags);
> > +	cpumask_and(&data->cpumask, mask, cpu_online_mask);
> > +	cpumask_clear_cpu(smp_processor_id(), &data->cpumask);
> 
> (perhaps it makes sense to use "me" instead of smp_processor_id())

Ah, missed one it seems, thanks ;-)

> > +	data->refs = cpumask_weight(&data->cpumask);
> > +
> > +	spin_lock_irqsave(&call_function.lock, flags);
> > +	call_function.counter++;
> > +	list_add_tail_rcu(&data->csd.list, &call_function.queue);
> > +	spin_unlock_irqrestore(&call_function.lock, flags);
> 
> What if the initialization above leaks into the critical section?
> 
> I mean, generic_smp_call_function_interrupt() running on another CPU
> can see the result of list_add_tail_rcu() and cpumask_and(data->cpumask)
> but not (say) "data->refs = ...".

Ho humm, nice :-)

So best would be to put that data initialization under data->lock. This
would be a bug in the original code too.
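To make the idea concrete, here is a minimal userspace sketch, not the kernel code: every field the receiver reads is written while holding the per-entry lock, and the receiver takes the same lock before looking at them. All sketch_* names are hypothetical, a C11 atomic_flag stands in for the spinlock, and the cpumask is reduced to a plain bitmask.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace stand-in for a kernel spinlock (hypothetical, illustration only). */
struct sketch_lock { atomic_flag held; };

static void sketch_lock_acquire(struct sketch_lock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		; /* spin until the flag was previously clear */
}

static void sketch_lock_release(struct sketch_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Simplified stand-in for call_function_data; cpumask is a plain bitmask. */
struct sketch_cfd {
	struct sketch_lock lock;
	unsigned long cpumask;
	int refs;
	void (*func)(void *);
	void *info;
};

/* Count the set bits in mask (stand-in for cpumask_weight()). */
static int sketch_weight(unsigned long mask)
{
	int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* Sender side: all fields a receiver reads are written under ->lock. */
static void sketch_prepare(struct sketch_cfd *d, unsigned long mask,
			   void (*func)(void *), void *info)
{
	sketch_lock_acquire(&d->lock);
	d->func = func;
	d->info = info;
	d->cpumask = mask;
	d->refs = sketch_weight(mask);
	sketch_lock_release(&d->lock);
	/* Only after this point would the entry be added to the queue. */
}

/*
 * Receiver side: runs func and returns the remaining refs, or returns -1
 * without running func if this cpu is not (or no longer) in the mask.
 */
static int sketch_handle(struct sketch_cfd *d, int cpu)
{
	void (*func)(void *);
	void *info;
	int refs;

	assert(cpu >= 0 && cpu < (int)(sizeof(unsigned long) * 8));

	sketch_lock_acquire(&d->lock);
	if (!(d->cpumask & (1UL << cpu))) {
		sketch_lock_release(&d->lock);
		return -1;
	}
	d->cpumask &= ~(1UL << cpu);
	refs = --d->refs;
	func = d->func;
	info = d->info;
	sketch_lock_release(&d->lock);

	func(info);
	return refs;
}

/* Example callback: bump the int that info points at. */
static void sketch_count(void *info)
{
	(*(int *)info)++;
}
```

The release in sketch_prepare() pairs with the acquire in sketch_handle(), so a receiver that gets the lock cannot see the mask without also seeing refs, func and info. This only helps if the receiver really does take the lock before trusting the entry.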

> Actually I don't understand the counter/free_list logic.
> 
> 	generic_smp_call_function_interrupt:
> 
> 			/*
> 			 * When the global queue is empty, its guaranteed that no cpu
> 			 * is still observing any entry on the free_list, therefore
> 			 * we can go ahead and unlock them.
> 			 */
> 			if (!--call_function.counter)
> 				list_splice_init(&call_function.free_list, &free_list);
> 
> I can't see why "its guaranteed that no cpu ...". Let's suppose CPU 0
> "hangs" for some reason in generic_smp_call_function_interrupt() right
> before "if (!cpumask_test_cpu(cpu, data->cpumask))" test. Then it is
> possible that another CPU removes the single entry (which doesn't have
> CPU 0 in data->cpumask) from call_function.queue.

Then call_function.counter wouldn't be 0, and we would not release the
list entries.

> Now, if that entry is re-added, we can have a problem if CPU 0 sees
> the partly initialized entry.

Right, so the scenario is that a cpu observes the list entry after we do
list_del_rcu() -- most likely a cpu not in the original mask, traversing
the list for a next entry -- so we have to wait for some quiescent state
to ensure nobody is still referencing it.

We cannot use regular RCU, because its quiescent state takes forever to
happen, therefore this implements a simple counting RCU for the queue
domain only.

When the list is empty, there's nobody seeing any elements, ergo we can
release the entries and re-use them.
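The bookkeeping described above can be sketched roughly as follows. This is a single-threaded illustration under my own assumptions, not the patch itself: the sketch_* names are made up, a fixed array stands in for free_list, a bool stands in for CSD_FLAG_LOCK, and the call_function.lock that would protect all of this is left out. It shows only the counting, not whether an empty queue is a sufficient quiescent state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX 8

struct sketch_entry {
	bool locked;	/* stands in for CSD_FLAG_LOCK */
};

struct sketch_queue {
	int counter;				/* entries currently queued */
	struct sketch_entry *parked[SKETCH_MAX];	/* stands in for free_list */
	int nparked;
};

/* Queue an entry: it is locked for as long as someone might observe it. */
static void sketch_enqueue(struct sketch_queue *q, struct sketch_entry *e)
{
	e->locked = true;
	q->counter++;
}

/*
 * Remove an entry from the queue. It is parked rather than unlocked,
 * because some cpu may still be looking at it. Only when the queue
 * drains completely are all parked entries unlocked for re-use.
 * Returns the number of entries released by this call.
 */
static int sketch_dequeue(struct sketch_queue *q, struct sketch_entry *e)
{
	int released = 0;

	assert(q->nparked < SKETCH_MAX);
	q->parked[q->nparked++] = e;

	if (--q->counter == 0) {
		while (q->nparked > 0) {
			q->parked[--q->nparked]->locked = false;
			released++;
		}
	}
	return released;
}
```

With two entries queued, removing the first releases nothing; removing the second empties the queue and releases both at once.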

> But why do we need counter/free_list at all?
> Can't we do the following:
> 
> 	- initialize call_function_data.lock at boot time
> 
> 	- change smp_call_function_many() to initialize cfd_data
> 	  under ->lock
> 
> 	- change generic_smp_call_function_interrupt() to do
> 
> 		list_for_each_entry_rcu(data) {
> 
> 			if (!cpumask_test_cpu(cpu, data->cpumask))
> 				continue;
> 
> 			spin_lock(&data->lock);
> 			if (!cpumask_test_cpu(cpu, data->cpumask)) {
> 				spin_unlock(&data->lock);
> 				continue;
> 			}
> 
> 			cpumask_clear_cpu(cpu, data->cpumask);
> 			refs = --data->refs;
> 
> 			func = data->func;
> 			info = data->info;
> 			spin_unlock(&data->lock);
> 
> 			func(info);
> 
> 			if (refs)
> 				continue;
> 
> 			...
> 		}
> 
> Afaics, it is OK if smp_call_function_many() sees the "unneeded" entry on
> list, the only thing we must ensure is that we can't "misunderstand" this
> entry.
> 
> No?

Hmm, could that not lead to loops and/or missed entries in the
list iteration?

The safest approach seemed to be to wait until we're sure nobody
observes the entry before re-using it.
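One way the missed-entry worry could play out, as a single-threaded sketch with made-up sketch_* names and a singly linked circular list instead of the kernel's list_head: a reader is paused on entry b, the writer unlinks b (leaving b->next intact, as list_del_rcu() does) and immediately queues it again at the tail. The reader then follows the rewritten b->next straight to the list head and never visits c.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_node {
	struct sketch_node *next;
	int visited;
};

/* Unlink n but leave n->next intact, as list_del_rcu() does for readers. */
static void sketch_del(struct sketch_node *head, struct sketch_node *n)
{
	struct sketch_node *p = head;

	while (p->next != n)
		p = p->next;
	p->next = n->next;
}

/* Append n at the tail; this overwrites n->next. */
static void sketch_add_tail(struct sketch_node *head, struct sketch_node *n)
{
	struct sketch_node *p = head;

	while (p->next != head)
		p = p->next;
	n->next = head;
	p->next = n;
}

/* Continue a traversal that is currently standing on pos. */
static int sketch_resume(struct sketch_node *head, struct sketch_node *pos)
{
	int seen = 0;

	for (pos = pos->next; pos != head; pos = pos->next) {
		pos->visited = 1;
		seen++;
	}
	return seen;
}

/* Returns 1 if a reader paused on b still gets to visit c, else 0. */
static int sketch_reuse_demo(int reuse)
{
	struct sketch_node head = { &head, 0 };
	struct sketch_node a = { NULL, 0 }, b = { NULL, 0 }, c = { NULL, 0 };

	sketch_add_tail(&head, &a);
	sketch_add_tail(&head, &b);
	sketch_add_tail(&head, &c);
	assert(a.next == &b && b.next == &c);

	/* A reader is paused on b; meanwhile the writer removes b ... */
	sketch_del(&head, &b);
	/* ... and, in the re-use case, queues it again right away. */
	if (reuse)
		sketch_add_tail(&head, &b);

	sketch_resume(&head, &b);
	return c.visited;
}
```

Without the immediate re-add the stale b->next still leads to c, so the reader finishes the walk correctly; with it, c is skipped. Whether the per-entry lock in the proposal makes such a skip harmless is a separate question.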

> 
> Off-topic question, looks like smp_call_function_single() must not be
> called from interrupt/bh handler, right? But the comment says nothing.
> 
> And,
> 	smp_call_function_single:
> 
> 		/* Can deadlock when called with interrupts disabled */
> 		WARN_ON(irqs_disabled());
> 	
> Just curious, we can only deadlock if wait = T, right?

Right.

