public inbox for linux-kernel@vger.kernel.org
From: Oleg Nesterov <oleg@redhat.com>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Nick Piggin <npiggin@suse.de>, Jens Axboe <jens.axboe@oracle.com>,
	Ingo Molnar <mingo@elte.hu>,
	Rusty Russell <rusty@rustcorp.com.au>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/3] generic-ipi: remove kmalloc()
Date: Wed, 18 Feb 2009 17:15:15 +0100
Message-ID: <20090218161515.GA24895@redhat.com>
In-Reply-To: <20090218002823.GA25408@linux.vnet.ibm.com>

On 02/17, Paul E. McKenney wrote:
>
> On Tue, Feb 17, 2009 at 10:59:06PM +0100, Peter Zijlstra wrote:
> > +static void csd_lock(struct call_single_data *data)
> >  {
> > -	/* Wait for response */
> > -	do {
> > -		if (!(data->flags & CSD_FLAG_WAIT))
> > -			break;
> > +	while (data->flags & CSD_FLAG_LOCK)
> >  		cpu_relax();
> > -	} while (1);
> > +	data->flags = CSD_FLAG_LOCK;
>
> We do need an smp_mb() here, otherwise, the call from
> smp_call_function_single() could be reordered by either CPU or compiler
> as follows:
>
> 	data->func = func;
> 	data->info = info;
> 	csd_lock(data);
>
> This might come as a bit of a surprise to the other CPU still trying to
> use the old values for data->func and data->info.

Could you explain a bit more here?

The compiler can't re-order this code because cpu_relax() acts as a
compiler barrier. The CPU can re-order, but this doesn't matter because
both the sender and the IPI handler take call_single_queue->lock.

And, given that csd_unlock() does mb() before clearing CSD_FLAG_LOCK,
how is it possible that the other CPU (the IPI handler) still uses the
old values in *data after we see !CSD_FLAG_LOCK?
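
To make the question concrete, here is a minimal userspace sketch of the
protocol as I read it. It is not the patch's code: C11 atomics stand in
for the kernel primitives, and the names csd_sketch, csd_lock_sketch(),
csd_unlock_sketch() and csd_demo() are made up for illustration. The
fence at the end of csd_lock_sketch() is the barrier under discussion;
the fence in csd_unlock_sketch() is the mb() mentioned above.

```c
#include <assert.h>
#include <stdatomic.h>

#define CSD_FLAG_LOCK 0x01u

/* Hypothetical stand-in for struct call_single_data. */
struct csd_sketch {
	void (*func)(void *info);
	void *info;
	atomic_uint flags;
};

/* Sender: wait until the previous user released the entry, then take it. */
static void csd_lock_sketch(struct csd_sketch *data)
{
	while (atomic_load_explicit(&data->flags, memory_order_relaxed) & CSD_FLAG_LOCK)
		; /* the kernel code calls cpu_relax() here */
	atomic_store_explicit(&data->flags, CSD_FLAG_LOCK, memory_order_relaxed);
	/* Barrier in question: keep the later ->func/->info stores after this point. */
	atomic_thread_fence(memory_order_seq_cst);
}

/* Handler: full barrier first, so all uses of *data finish before the release. */
static void csd_unlock_sketch(struct csd_sketch *data)
{
	atomic_thread_fence(memory_order_seq_cst);
	atomic_fetch_and_explicit(&data->flags, ~CSD_FLAG_LOCK, memory_order_relaxed);
}

static int seen;

static void record(void *info)
{
	seen = *(int *)info;
}

/* Single-threaded walk through one lock / fill / call / unlock cycle. */
static int csd_demo(void)
{
	struct csd_sketch d = { .flags = 0 };
	int arg = 42;

	csd_lock_sketch(&d);
	d.func = record;
	d.info = &arg;
	d.func(d.info);		/* stands in for the IPI handler */
	csd_unlock_sketch(&d);
	assert(!(atomic_load(&d.flags) & CSD_FLAG_LOCK));
	return seen;
}
```

The demo only exercises the call order on one thread. It says nothing
about which re-orderings a real SMP machine allows; that is exactly the
open question.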

> Note that this smp_mb() is required even if cpu_relax() contains a
> memory barrier, as it is possible to execute csd_lock_wait() without
> executing the cpu_relax(), if you get there at just the right time.

I can't follow this... Nobody can do csd_lock_wait() on this per-cpu
entry, but I guess you meant something else.

> OK...  What prevents the following sequence of events?
>
> o	CPU 0 calls smp_call_function_many(), targeting numerous CPUs,
> 	including CPU 2.  CPU 0 therefore enqueues this on the global
> 	call_function.queue.  "wait" is not specified, so CPU 0 returns
> 	immediately after sending the IPIs.
>
> 	It decrements the ->refs field, but, finding the result
> 	non-zero, refrains from removing the element that CPU 0
> 	enqueued, and also refrains from invoking csd_unlock().
>
> o	CPU 3 also receives the IPI, and also calls the needed function.
> 	Now, only CPU 1 need receive the IPI for the element to be
> 	removed.

so we have a single entry E0 on the list,

> o	CPU 3 calls smp_call_function_many(), targeting numerous CPUs,
> 	but -not- including CPU 2.  CPU 3 therefore also enqueues this on the
> 	global call_function.queue and sends the IPIs, but no IPI for
> 	CPU 2.	Your choice as to whether CPU 3 waits or not.

now we have E3 -> E0

> o	CPU 2 receives CPU 3's IPI, but CPU 0's element is first on the
> 	list.  CPU 2 fetches the pointer (via list_for_each_entry_rcu()),
> 	and then...

it actually sees E3, not E0

> o	CPU 1 finally re-enables irqs and receives the IPIs!!!  It
> 	finds CPU 0's element on the queue, calls the function,
> 	decrements the ->refs field, and finds that it is zero.
> 	So, CPU 1 invokes list_del_rcu() to remove the element
> 	(OK so far, as list_del_rcu() doesn't overwrite the next
> 	pointer), then invokes csd_unlock() to release the element.
>
> o	CPU 0 then invokes another smp_call_function_many(), also
> 	multiple CPUs, but -not- to CPU 2.  It requeues the element
> 	that was just csd_unlock()ed above, carrying CPU 2 with it.
> 	It IPIs CPUs 1 and 3, but not CPU 2.

and inserts the element E0 at the head of the list again,

>
> o	CPU 2 continues, and falls off the bottom of the list.

afaics, it doesn't.

Every time smp_call_function_many() reuses the element, it sets its
->next pointer to the head of the list. If we race with another CPU
which has fetched this pointer, that CPU has to re-scan the whole list.
But since we always modify/read the data under data->lock this should
be safe: that CPU must notice !cpumask_test_cpu(cpu, data->cpumask)
and skip the element.

No?
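
The re-scan argument can be sketched in userspace as follows. This is an
illustration under assumptions, not the kernel code: cfd_sketch,
scan_sketch() and rescan_demo() are hypothetical names, an atomic_flag
spinlock stands in for data->lock, a plain unsigned long stands in for
the cpumask, and the list is a simple NULL-terminated chain instead of
an RCU list.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued call_function element. */
struct cfd_sketch {
	struct cfd_sketch *next;
	atomic_flag lock;	/* plays the role of data->lock */
	unsigned long cpumask;	/* bit n set: CPU n still has to run func */
	void (*func)(void *info);
	void *info;
};

static void sketch_lock(atomic_flag *l)
{
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;
}

static void sketch_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/* IPI handler for @cpu: run only the elements whose mask still names it. */
static int scan_sketch(struct cfd_sketch *head, int cpu)
{
	int ran = 0;

	for (struct cfd_sketch *e = head; e; e = e->next) {
		sketch_lock(&e->lock);
		if (!(e->cpumask & (1UL << cpu))) {
			/* Not for us, or already handled: skip it. */
			sketch_unlock(&e->lock);
			continue;
		}
		e->cpumask &= ~(1UL << cpu);
		sketch_unlock(&e->lock);
		e->func(e->info);
		ran++;
	}
	return ran;
}

static int calls;

static void count_call(void *info)
{
	(void)info;
	calls++;
}

/* CPU 2 scans E3 -> E0, E0 is requeued at the head, CPU 2 scans again. */
static int rescan_demo(void)
{
	struct cfd_sketch e0 = { .lock = ATOMIC_FLAG_INIT,
				 .cpumask = 1UL << 1, .func = count_call };
	struct cfd_sketch e3 = { .next = &e0, .lock = ATOMIC_FLAG_INIT,
				 .cpumask = 1UL << 2, .func = count_call };

	calls = 0;
	scan_sketch(&e3, 2);	/* runs E3's function, skips E0 */
	e3.next = NULL;		/* E0 was removed from the tail ... */
	e0.next = &e3;		/* ... and reinserted at the head */
	scan_sketch(&e0, 2);	/* the re-scan finds nothing left for CPU 2 */
	return calls;
}
```

In this model the second pass costs an extra walk of the list, but the
function for CPU 2 runs once, because the mask test under the lock
filters out both the requeued E0 and the already handled E3.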

Oleg.


