From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Frederic Weisbecker <fweisbec@gmail.com>
Cc: "Udo A. Steinberg" <udo@hypervisor.org>,
Joe Korty <joe.korty@ccur.com>,
mathieu.desnoyers@efficios.com, dhowells@redhat.com,
loic.minier@linaro.org, dhaval.giani@gmail.com,
tglx@linutronix.de, peterz@infradead.org,
linux-kernel@vger.kernel.org, josh@joshtriplett.org
Subject: Re: [PATCH] a local-timer-free version of RCU
Date: Mon, 8 Nov 2010 11:38:32 -0800
Message-ID: <20101108193832.GB4032@linux.vnet.ibm.com>
In-Reply-To: <20101108153214.GC5466@nowhere>
On Mon, Nov 08, 2010 at 04:32:17PM +0100, Frederic Weisbecker wrote:
> On Sun, Nov 07, 2010 at 06:54:00PM -0800, Paul E. McKenney wrote:
> > On Mon, Nov 08, 2010 at 03:19:36AM +0100, Udo A. Steinberg wrote:
> > > On Mon, 8 Nov 2010 03:11:36 +0100 Udo A. Steinberg (UAS) wrote:
> > >
> > > UAS> On Sat, 6 Nov 2010 12:28:12 -0700 Paul E. McKenney (PEM) wrote:
> > > UAS>
> > > UAS> PEM> > + * rcu_quiescent() is called from rcu_read_unlock() when an
> > > UAS> PEM> > + * RCU batch was started while the rcu_read_lock/rcu_read_unlock
> > > UAS> PEM> > + * critical section was executing.
> > > UAS> PEM> > + */
> > > UAS> PEM> > +
> > > UAS> PEM> > +void rcu_quiescent(int cpu)
> > > UAS> PEM> > +{
> > > UAS> PEM>
> > > UAS> PEM> What prevents two different CPUs from calling this concurrently?
> > > UAS> PEM> Ah, apparently nothing -- the idea being that
> > > UAS> PEM> rcu_grace_period_complete() sorts it out. Though if the second
> > > UAS> PEM> CPU was delayed, it seems like it might incorrectly end a
> > > UAS> PEM> subsequent grace period as follows:
> > > UAS> PEM>
> > > UAS> PEM> o CPU 0 clears the second-to-last bit.
> > > UAS> PEM>
> > > UAS> PEM> o CPU 1 clears the last bit.
> > > UAS> PEM>
> > > UAS> PEM> o CPU 1 sees that the mask is empty, so invokes
> > > UAS> PEM> rcu_grace_period_complete(), but is delayed in the function
> > > UAS> PEM> preamble.
> > > UAS> PEM>
> > > UAS> PEM> o CPU 0 sees that the mask is empty, so invokes
> > > UAS> PEM> rcu_grace_period_complete(), ending the grace period.
> > > UAS> PEM> Because the RCU_NEXT_PENDING bit is set, it also starts
> > > UAS> PEM> a new grace period.
> > > UAS> PEM>
> > > UAS> PEM> o CPU 1 continues in rcu_grace_period_complete(),
> > > UAS> PEM> incorrectly ending the new grace period.
> > > UAS> PEM>
> > > UAS> PEM> Or am I missing something here?
> > > UAS>
> > > UAS> The scenario you describe seems possible. However, it should be easily
> > > UAS> fixed by passing the perceived batch number as another parameter to
> > > UAS> rcu_set_state() and making it part of the cmpxchg. So if the caller
> > > UAS> tries to set state bits on a stale batch number (e.g., batch !=
> > > UAS> rcu_batch), it can be detected.
> > > UAS>
> > > UAS> There is a similar, although harmless, issue in call_rcu(): Two CPUs can
> > > UAS> concurrently add callbacks to their respective nxt list and compute the
> > > UAS> same value for nxtbatch. One CPU succeeds in setting the PENDING bit
> > > UAS> while observing COMPLETE to be clear, so it starts a new batch.
> > >
> > > Correction: while observing COMPLETE to be set!
> > >
> > > UAS> Afterwards, the other CPU also sets the PENDING bit, but this time for
> > > UAS> the next batch. So it ends up requesting nxtbatch+1, although there is
> > > UAS> no need to. This also would be fixed by making the batch number part of
> > > UAS> the cmpxchg.
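For concreteness, here is a sketch of the fix Udo describes -- fold the
perceived batch number into the value being cmpxchg'ed, so that an attempt
to set state bits against a stale batch fails.  The names below
(rcu_set_state_for_batch, RCU_STATE_SHIFT) are illustrative only, not taken
from the posted patch:

	static bool rcu_set_state_for_batch(unsigned long *state, long batch,
					    unsigned long old_bits,
					    unsigned long new_bits)
	{
		/* Pack the perceived batch number above the low-order state bits. */
		unsigned long old = ((unsigned long)batch << RCU_STATE_SHIFT) | old_bits;
		unsigned long new = ((unsigned long)batch << RCU_STATE_SHIFT) | new_bits;

		/* Fails if either the state bits or the batch number moved on. */
		return cmpxchg(state, old, new) == old;
	}

A CPU that was delayed while holding a stale batch number would then fail
the cmpxchg rather than ending a grace period it never observed.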
> >
> > Another approach is to map the underlying algorithm onto the TREE_RCU
> > data structures. And make preempt_disable(), local_irq_save(), and
> > friends invoke rcu_read_lock() -- irq and nmi handlers already have
> > the dyntick calls into RCU, so should be easy to handle as well.
> > Famous last words. ;-)
>
>
> So, adding rcu_read_lock() to preempt_disable() and local_irq_save() looks
> very scary performance-wise, and even then it won't handle the "raw"
> rcu sched implicit path.
Ah -- I would arrange for the rcu_read_lock() to be added only in the
dyntick-hpc case.  So there would be no effect on normal builds; the
overhead appears only in the dyntick-hpc case.
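Something like the following wrapper, as a sketch only -- the Kconfig
symbol and wrapper names are made up for illustration, not part of any
posted patch:

	#ifdef CONFIG_RCU_DYNTICK_HPC		/* hypothetical Kconfig symbol */
	static inline void preempt_disable_rcu(void)
	{
		preempt_disable();		/* existing behavior unchanged */
		rcu_read_lock();		/* extra read-side marking, hpc case only */
	}

	static inline void preempt_enable_rcu(void)
	{
		rcu_read_unlock();
		preempt_enable();
	}
	#else
	#define preempt_disable_rcu()	preempt_disable()
	#define preempt_enable_rcu()	preempt_enable()
	#endif

In a normal build these collapse back to the plain primitives, so the
common case pays nothing.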
> We should check all rcu_dereference_sched()
> users to ensure none of them are on such a raw path.
Indeed! ;-)
> There is also my idea from the other discussion: change rcu_read_lock_sched()
> semantics and map it to rcu_read_lock() in this rcu config (it would be a nop
> in other configs). So every user of rcu_dereference_sched() would now need
> to protect their critical sections with it.
> Would it be too late to change these semantics?
I was expecting that we would fold RCU, RCU bh, and RCU sched into
the same set of primitives (as Jim Houston did), but again only in the
dyntick-hpc case. However, rcu_read_lock_bh() would still disable BH,
and similarly, rcu_read_lock_sched() would still disable preemption.
> What is scary about this is that it also changes rcu sched semantics: users
> of call_rcu_sched() and synchronize_sched() who rely on those to do more
> tricky things than just waiting out rcu_dereference_sched() pointer grace
> periods, like really waiting for preempt_disable and local_irq_save/disable
> sections to end, will be screwed... :-( ...unless we also add the relevant
> rcu_read_lock_sched() for them...
So rcu_read_lock() would be the underlying primitive. The implementation
of rcu_read_lock_sched() would disable preemption and then invoke
rcu_read_lock(). The implementation of rcu_read_lock_bh() would
disable BH and then invoke rcu_read_lock(). This would allow
synchronize_rcu_sched() and synchronize_rcu_bh() to simply invoke
synchronize_rcu().
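In other words, something like this, as a sketch only (dyntick-hpc
configuration; these are illustrative definitions, not the stock kernel
ones):

	static inline void rcu_read_lock_sched(void)
	{
		preempt_disable();	/* callers keep the preempt-off guarantee */
		rcu_read_lock();	/* ...mapped onto the one underlying flavor */
	}

	static inline void rcu_read_unlock_sched(void)
	{
		rcu_read_unlock();
		preempt_enable();
	}

	static inline void rcu_read_lock_bh(void)
	{
		local_bh_disable();
		rcu_read_lock();
	}

	static inline void rcu_read_unlock_bh(void)
	{
		rcu_read_unlock();
		local_bh_enable();
	}

	static inline void synchronize_rcu_sched(void)
	{
		synchronize_rcu();	/* one grace period covers all flavors */
	}

	static inline void synchronize_rcu_bh(void)
	{
		synchronize_rcu();
	}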
Seem reasonable?
Thanx, Paul