From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Tejun Heo <tj@kernel.org>
Cc: mingo@elte.hu, peterz@infradead.org,
	linux-kernel@vger.kernel.org,
	Dipankar Sarma <dipankar@in.ibm.com>,
	Josh Triplett <josh@freedesktop.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Dimitri Sivanich <sivanich@sgi.com>
Subject: Re: [PATCH 3/4] scheduler: replace migration_thread with cpu_stop
Date: Wed, 5 May 2010 10:47:25 -0700
Message-ID: <20100505174725.GA6783@linux.vnet.ibm.com>
In-Reply-To: <4BE11E29.6040106@kernel.org>

On Wed, May 05, 2010 at 09:28:41AM +0200, Tejun Heo wrote:
> Hello,
> 
> On 05/05/2010 03:33 AM, Paul E. McKenney wrote:
> > o	Therefore, when CPU 0 queues the work for CPU 1, CPU 1
> > 	loops right around and processes it.  There will be no
> > 	context switch on CPU 1.
> 
> Yes, that can happen.
> 
> > 	At first glance, this looks safe because:
> > 
> > 	1.	Although there is no context switch, there (presumably)
> > 		can be no RCU read-side critical sections on CPU 1 that
> > 		span this sequence of events.  (As far as I can see,
> > 		any such RCU read-side critical section would be due
> > 		to abuse of rcu_read_lock() and friends.)
> 
> AFAICS, this must hold; otherwise, synchronize_sched_expedited()
> wouldn't have worked in the first place.  On entry to any cpu_stop
> function, there can be no RCU read-side critical section in progress.

Makes sense to me!

The actual requirement is that, on each CPU, there must have been a
context switch between the end of the last RCU read-side critical
section and the return from a successful call to try_stop_cpus().

For CONFIG_TREE_PREEMPT_RCU, the guarantee required is a bit different:
on each CPU, either that CPU must not have been in an RCU read-side
critical section, or, if it was, there must have been a context switch
between the time that CPU entered its RCU read-side critical section
and the memory barrier executed within a successful try_stop_cpus().

As near as I can tell, the current implementation does meet these
requirements (but I do like your suggested change below).
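
The two requirements above can be written down as a pair of predicates
over a per-CPU event log.  This is only a toy userspace model, not kernel
code: struct cpu_events, its fields, and both function names are
hypothetical, the timestamps are logical counters, and last_ctx_switch is
assumed to hold the most recent context switch at or before the point
being checked.

```c
#include <stdbool.h>

/* Hypothetical per-CPU event log; logical timestamps, 0 means "none". */
struct cpu_events {
	unsigned long last_reader_exit; /* end of last RCU read-side critical section */
	unsigned long reader_entry;     /* entry to a reader still in progress, 0 if none */
	unsigned long last_ctx_switch;  /* most recent context switch */
	unsigned long stop_barrier;     /* memory barrier within try_stop_cpus() */
	unsigned long stop_return;      /* successful return from try_stop_cpus() */
};

/*
 * Non-preemptible case: a context switch must fall between the end of
 * the last read-side critical section and the successful return.
 */
static bool sched_rcu_requirement_met(const struct cpu_events *e)
{
	return e->last_reader_exit <= e->last_ctx_switch &&
	       e->last_ctx_switch <= e->stop_return;
}

/*
 * Preemptible case: either the CPU was not in a read-side critical
 * section, or a context switch must fall between entry to that critical
 * section and the memory barrier.
 */
static bool preempt_rcu_requirement_met(const struct cpu_events *e)
{
	if (e->reader_entry == 0)
		return true;
	return e->reader_entry <= e->last_ctx_switch &&
	       e->last_ctx_switch <= e->stop_barrier;
}
```

In this model, a CPU that loops straight around to process the queued
work with no context switch since its last reader fails the first
predicate, which is the case the discussion above is checking for.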

> > 	2.	CPU 1 will acquire and release stopper->lock, and
> > 		furthermore will do an atomic_dec_and_test() in
> > 		cpu_stop_signal_done().  The former is a weak
> > 		guarantee, but the latter guarantees a memory
> > 		barrier, so that any subsequent code on CPU 1 will
> > 		be guaranteed to see changes on CPU 0 prior to the
> > 		call to synchronize_sched_expedited().
> > 
> > 		The guarantee required is that there will be a
> > 		full memory barrier on each affected CPU between
> > 		the time that try_stop_cpus() is called and the
> > 		time that it returns.
> 
> Ah, right.  I think it would be dangerous to depend on the implicit
> barriers there.  It might work today but it can easily break with
> later implementation detail changes which seem completely unrelated.
> Adding smp_mb() in the cpu_stop function should suffice, right?  It's
> not like the cost of smp_mb() there would mean anything anyway.

If I understand the code correctly, this would be very good!
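
A minimal userspace sketch of that suggestion, assuming the cpu_stop
callback has nothing to do apart from the barrier.  The names
smp_mb_sketch(), expedited_stop_fn_sketch(), and barriers_executed are
hypothetical, and the C11 sequentially consistent fence is only a
stand-in for the kernel's smp_mb(); this is not the code from the patch.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's smp_mb(): a full memory barrier. */
static inline void smp_mb_sketch(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

/* Demonstration only: counts how many times the callback has run. */
static atomic_int barriers_executed;

/*
 * Hypothetical cpu_stop callback.  The barrier is explicit, so the
 * ordering guarantee no longer depends on whatever locking or atomic
 * operations the stopper machinery happens to perform internally.
 */
static int expedited_stop_fn_sketch(void *unused)
{
	(void)unused;
	smp_mb_sketch();
	atomic_fetch_add_explicit(&barriers_executed, 1, memory_order_relaxed);
	return 0;
}
```

The point of the design is that the full barrier required on each
affected CPU between the call to try_stop_cpus() and its return is stated
in the callback itself rather than inherited from implementation details.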

							Thanx, Paul
