From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Tejun Heo <tj@kernel.org>
Cc: mingo@elte.hu, peterz@infradead.org,
linux-kernel@vger.kernel.org,
Dipankar Sarma <dipankar@in.ibm.com>,
Josh Triplett <josh@freedesktop.org>,
Oleg Nesterov <oleg@redhat.com>,
Dimitri Sivanich <sivanich@sgi.com>
Subject: Re: [PATCH 3/4] scheduler: replace migration_thread with cpu_stop
Date: Wed, 5 May 2010 10:47:25 -0700
Message-ID: <20100505174725.GA6783@linux.vnet.ibm.com>
In-Reply-To: <4BE11E29.6040106@kernel.org>
On Wed, May 05, 2010 at 09:28:41AM +0200, Tejun Heo wrote:
> Hello,
>
> On 05/05/2010 03:33 AM, Paul E. McKenney wrote:
> > o Therefore, when CPU 0 queues the work for CPU 1, CPU 1
> > loops right around and processes it. There will be no
> > context switch on CPU 1.
>
> Yes, that can happen.
>
> > At first glance, this looks safe because:
> >
> > 1. Although there is no context switch, there (presumably)
> > can be no RCU read-side critical sections on CPU 1 that
> > span this sequence of events. (As far as I can see,
> > any such RCU read-side critical section would be due
> > to abuse of rcu_read_lock() and friends.)
>
> AFAICS, this must hold; otherwise, synchronize_sched_expedited()
> wouldn't have worked in the first place. On entry to any cpu_stop
> function, there can be no RCU read-side critical section in progress.

Makes sense to me!

The actual requirement is that, on each CPU, there must have been a
context switch between the end of the last RCU read-side critical
section and the end of a successful return from try_stop_cpus().

For CONFIG_TREE_PREEMPT_RCU, the guarantee required is a bit different:
on each CPU, either that CPU must not have been in an RCU read-side
critical section, or, if it was, there must have been a context switch
between the time that CPU entered its RCU read-side critical section
and the memory barrier executed within a successful try_stop_cpus().

As near as I can tell, the current implementation does meet these
requirements (but I do like your suggested change below).

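For concreteness, the two requirements above can be restated as predicates
over logical timestamps. This is only a toy model, not kernel code: the
struct cpu_events record, its fields, and both function names are
hypothetical and exist purely for illustration.

```c
#include <stdbool.h>

/*
 * Hypothetical per-CPU event record (illustration only, not a kernel
 * structure).  Fields are logical timestamps; larger means later.
 */
struct cpu_events {
	bool in_reader;     /* CPU was in an RCU read-side critical section */
	long reader_enter;  /* when that critical section was entered */
	long reader_exit;   /* when the last critical section ended */
	long ctx_switch;    /* most recent context switch on this CPU */
	long stop_barrier;  /* memory barrier within try_stop_cpus() */
	long stop_return;   /* successful return from try_stop_cpus() */
};

/*
 * Non-preemptible case: a context switch must fall between the end of
 * the last read-side critical section and the successful return.
 */
static bool sched_rcu_ok(const struct cpu_events *e)
{
	return e->ctx_switch >= e->reader_exit &&
	       e->ctx_switch <= e->stop_return;
}

/*
 * CONFIG_TREE_PREEMPT_RCU case: either the CPU was not in a reader, or
 * a context switch falls between reader entry and the memory barrier.
 */
static bool preempt_rcu_ok(const struct cpu_events *e)
{
	if (!e->in_reader)
		return true;
	return e->ctx_switch >= e->reader_enter &&
	       e->ctx_switch <= e->stop_barrier;
}
```

In this model, the case where CPU 1 "loops right around" still passes as
long as the switch to the stopper thread happened after the last reader
ended on that CPU.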
> > 2. CPU 1 will acquire and release stopper->lock, and
> > further more will do an atomic_dec_and_test() in
> > cpu_stop_signal_done(). The former is a weak
> > guarantee, but the latter guarantees a memory
> > barrier, so that any subsequent code on CPU 1 will
> > be guaranteed to see changes on CPU 0 prior to the
> > call to synchronize_sched_expedited().
> >
> > The guarantee required is that there will be a
> > full memory barrier on each affected CPU between
> > the time that try_stop_cpus() is called and the
> > time that it returns.
>
> Ah, right. I think it would be dangerous to depend on the implicit
> barriers there. It might work today but it can easily break with
> later implementation detail changes which seem completely unrelated.
> Adding smp_mb() in the cpu_stop function should suffice, right? It's
> not like the cost of smp_mb() there would mean anything anyway.
If I understand the code correctly, this would be very good!
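
A rough userspace sketch of that suggestion, assuming the cpu_stop
callback does nothing except execute the explicit barrier. The names
smp_mb_model(), expedited_cpu_stop_model() and run_stop_on_cpus_model()
are made up for this illustration, and a C11 sequentially consistent
fence merely stands in for the kernel's smp_mb():

```c
#include <stdatomic.h>

/* Userspace stand-in for smp_mb(); an assumption for illustration. */
static inline void smp_mb_model(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

/*
 * Hypothetical cpu_stop callback.  The explicit full barrier means the
 * guarantee no longer rests on barriers implied by the stopper's
 * locking or by atomic_dec_and_test() in the completion path.
 */
static int expedited_cpu_stop_model(void *data)
{
	(void)data;	/* unused in this sketch */
	smp_mb_model();
	return 0;
}

/*
 * Single-threaded driver that invokes the callback once per modeled
 * CPU and returns how many invocations succeeded.  It does not actually
 * run anything on other CPUs.
 */
static int run_stop_on_cpus_model(int ncpus)
{
	int done = 0;

	for (int cpu = 0; cpu < ncpus; cpu++)
		if (expedited_cpu_stop_model((void *)0) == 0)
			done++;
	return done;
}
```

The point of spelling the barrier out is the one you raise: the ordering
is then stated where it is needed rather than inherited from unrelated
implementation details.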

							Thanx, Paul