From: Frederic Weisbecker <fweisbec@gmail.com>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>,
linux-kernel@vger.kernel.org,
Dipankar Sarma <dipankar@in.ibm.com>,
Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Lai Jiangshan <laijs@cn.fujitsu.com>
Subject: Re: linux-next-20110923: warning kernel/rcutree.c:1833
Date: Thu, 29 Sep 2011 14:30:44 +0200 [thread overview]
Message-ID: <20110929123040.GB3537@somewhere> (raw)
In-Reply-To: <20110929005545.GT2383@linux.vnet.ibm.com>
On Wed, Sep 28, 2011 at 05:55:45PM -0700, Paul E. McKenney wrote:
> On Thu, Sep 29, 2011 at 01:46:36AM +0200, Frederic Weisbecker wrote:
> > Not sure what you mean. You want to split that specific patch or
> > others?
>
> It looks to me that having my pair of patches on top of yours is
> really ugly. If we are going to introduce the per-CPU idle variable,
> we should make a patch stack that uses that from the start. This allows
> me to bisect to track down the failures I am seeing on Power.
Yeah right. My patches fix the use on extended qs in idle. But if
idle itself is considered as a quiescent state all along, that's about
useless. So it sounds indeed better in that order.
> If you are too busy, I can take this on, but we might get better results
> if you did it. (And I certainly cannot complain about the large amount
> of time and energy that you have put into this -- plus the reduction in
> OS jitter will be really cool to have!)
No problem, I can take it.
> > > > Although idle and rcu/nohz are still close notions, it sounds
> > > > more logical the other way around in the ordering:
> > > >
> > > > tick_nohz_idle_enter() {
> > > > rcu_idle_enter() {
> > > > rcu_enter_nohz();
> > > > }
> > > > }
> > > >
> > > > tick_nohz_irq_exit() {
> > > > rcu_idle_enter() {
> > > > rcu_enter_nohz();
> > > > }
> > > > }
> > > >
> > > > Because rcu ext qs is something used by idle, not the opposite.
>
> Re-reading this makes me realize that I would instead say that idle
> is an example of an RCU extended quiescent state, or that the rcu_ext_qs
> argument to the various functions is used to indicate whether or not
> we are immediately entering/leaving idle from RCU's viewpoint.
>
> So what were you really trying to say here? ;-)
I was thinking about the fact that idle is a caller of rcu_enter_nohz().
And there may be more callers of it in the future. So I thought it may
be better to keep rcu_enter_nohz() idle-agnostic.
But it's fine, there are other ways to call rcu_idle_enter()/rcu_idle_exit()
from the right places other than from rcu_enter/exit_nohz().
We have tick_check_idle() on irq entry and tick_nohz_irq_exit(), both are called
on the first interrupt level in idle.
So I can change that easily for the nohz cpusets.
> > > The problem I have with this is that it is rcu_enter_nohz() that tracks
> > > the irq nesting required to correctly decide whether or not we are going
> > > to really go to idle state. Furthermore, there are cases where we
> > > do enter idle but do not enter nohz, and that has to be handled correctly
> > > as well.
> > >
> > > Now, it is quite possible that I am suffering a senior moment and just
> > > failing to see how to structure this in the design where rcu_idle_enter()
> > > invokes rcu_enter_nohz(), but regardless, I am failing to see how to
> > > structure this so that it works correctly.
> > >
> > > Please feel free to enlighten me!
> >
> > Ah I realize that you want to call rcu_idle_exit() when we enter
> > the first level interrupt and rcu_idle_enter() when we exit it
> > to return to idle loop.
> >
> > But we use that check:
> >
> > if (user ||
> > (rcu_is_cpu_idle() &&
> > !in_softirq() &&
> > hardirq_count() <= (1 << HARDIRQ_SHIFT)))
> > rcu_sched_qs(cpu);
> >
> > So we ensure that by the time we call rcu_check_callbacks(), we are not nesting
> > in another interrupt.
>
> But I would like to enable checks for entering/exiting idle while
> within an RCU read-side critical section. The idea is to move
> the checks from their currently somewhat problematic location in
> rcu_needs_cpu_quick_check() to somewhere more sensible. My current
> thought is to move them rcu_enter_nohz() and rcu_exit_nohz() near the
> calls to rcu_idle_enter() and rcu_idle_exit(), respectively.
So, checking if we are calling rcu_idle_enter() while in an RCU
read side critical section?
But we already have checks that RCU read side API are not called in
extended quiescent state.
> This would mean that they operated only in NO_HZ kernels with lockdep
> enabled, but I am good with that because to do otherwise would require
> adding nesting-level counters to the non-NO_HZ case, which I would like
> to avoid, expecially for TINY_RCU.
There can be a secondary check in rcu_read_lock_held() and friends to
ensures that rcu_is_idle_cpu(). In the non-NO_HZ case it's useful to
find similar issues.
In fact we could remove the check for rcu_extended_qs() in read side
APIs and check instead rcu_is_idle_cpu(). That would work in any
config and not only NO_HZ.
But I hope we can actually keep the check for RCU extended quiescent
state so that when rcu_enter_nohz() is called from other places than
idle, we are ready for it.
I believe it's fine to have both checks in PROVE_RCU.
>
> > That said we found RCU uses after we decrement the hardirq offset and until
> > we reach rcu_irq_exit(). So rcu_check_callbacks() may miss these places
> > and account spurious quiescent states.
> >
> > But between sub_preempt_count() and rcu_irq_exit(), irqs are disabled
> > AFAIK so we can't be interrupted by rcu_check_callbacks(), except during the
> > softirqs processing. But we have that ordering:
> >
> > add_preempt_count(SOTFIRQ_OFFSET)
> > local_irq_enable()
> >
> > do softirqs
> >
> > local_irq_disable()
> > sub_preempt_count(SOTFIRQ_OFFSET)
> >
> > So the !in_softirq() check covers us during the time we process softirqs.
> >
> > The only assumption we need is that there is no place between
> > sub_preempt_count(IRQ_EXIT_OFFSET) and rcu_irq_ext() that has
> > irqs enabled and that is an rcu read side critical section.
> >
> > I'm not aware of any automatic check to ensure that though.
>
> Nor am I, which is why I am looking to the checks in
> rcu_enter_nohz() and rcu_exit_nohz() called out above.
Yep.
> > Anyway, the delta patch looks good.
>
> OK, my current plans are to start forward-porting to -rc8, and I would
> like to have this pair of delta patches or something like them pulled
> into your stack.
Sure I can take your patches (I'm going to merge the delta into the first).
But if you want a rebase against -rc8, it's going to be easier if you
do that rebase on the branch you want me to work on. Then I work on top
of it.
For example we can take your rcu/dynticks, rewind to
"rcu: Make synchronize_sched_expedited() better at work sharing"
771c326f20029a9f30b9a58237c9a5d5ddc1763d, rebase on top of -rc8
and I rebase my patches (yours included) on top of it and I repost.
Right?
> > Just a little thing:
> >
> > > -void tick_nohz_idle_exit(void)
> > > +void tick_nohz_idle_exit(bool rcu_ext_qs)
> >
> > It becomes weird to have both idle_enter/idle_exit having
> > that parameter.
> >
> > Would it make sense to have tick_nohz_idle_[exit|enter]_norcu()
> > and a version without norcu?
>
> Given that we need to make this work in CONFIG_NO_HZ=n kernels, I believe
> that the current API is OK. But if you would like to change the API
> during the forward-port to -rc8, I am also OK with the alternative API
> you suggest.
Fine. I'll do that rename.
Thanks.
next prev parent reply other threads:[~2011-09-29 12:30 UTC|newest]
Thread overview: 57+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-09-25 0:24 linux-next-20110923: warning kernel/rcutree.c:1833 Kirill A. Shutemov
2011-09-25 5:08 ` Paul E. McKenney
2011-09-25 11:26 ` Kirill A. Shutemov
2011-09-25 13:06 ` Frederic Weisbecker
2011-09-25 14:19 ` Kirill A. Shutemov
2011-09-25 16:48 ` Paul E. McKenney
2011-09-26 1:04 ` Frederic Weisbecker
2011-09-26 1:10 ` Frederic Weisbecker
2011-09-26 1:26 ` Paul E. McKenney
2011-09-26 1:41 ` Paul E. McKenney
2011-09-26 9:39 ` Frederic Weisbecker
2011-09-26 22:34 ` Paul E. McKenney
2011-09-27 12:07 ` Frederic Weisbecker
2011-09-26 9:42 ` Frederic Weisbecker
2011-09-26 22:35 ` Paul E. McKenney
2011-09-26 9:20 ` Frederic Weisbecker
2011-09-26 22:50 ` Paul E. McKenney
2011-09-27 12:16 ` Frederic Weisbecker
2011-09-27 18:01 ` Paul E. McKenney
2011-09-28 12:31 ` Frederic Weisbecker
2011-09-28 18:40 ` Paul E. McKenney
2011-09-28 23:46 ` Frederic Weisbecker
2011-09-29 0:55 ` Paul E. McKenney
2011-09-29 4:49 ` Paul E. McKenney
2011-09-29 12:30 ` Frederic Weisbecker [this message]
2011-09-29 17:12 ` Paul E. McKenney
2011-09-29 17:19 ` Paul E. McKenney
2011-09-29 23:18 ` Paul E. McKenney
2011-09-30 13:11 ` Frederic Weisbecker
2011-09-30 15:29 ` Paul E. McKenney
2011-09-30 19:24 ` Paul E. McKenney
2011-10-01 4:34 ` Paul E. McKenney
2011-10-01 12:24 ` Frederic Weisbecker
2011-10-01 12:28 ` Frederic Weisbecker
2011-10-01 16:35 ` Paul E. McKenney
2011-10-01 17:07 ` Paul E. McKenney
2011-10-02 3:23 ` Paul E. McKenney
2011-10-02 11:45 ` Frederic Weisbecker
2011-10-02 22:50 ` Frederic Weisbecker
2011-10-03 0:28 ` Paul E. McKenney
2011-10-03 12:59 ` Frederic Weisbecker
2011-10-03 16:22 ` Paul E. McKenney
2011-10-03 17:11 ` Frederic Weisbecker
2011-10-02 23:07 ` Frederic Weisbecker
2011-10-03 0:32 ` Paul E. McKenney
2011-10-03 13:03 ` Frederic Weisbecker
2011-10-03 16:30 ` Paul E. McKenney
2011-10-06 0:58 ` Paul E. McKenney
2011-10-06 1:59 ` Paul E. McKenney
2011-10-06 12:11 ` Frederic Weisbecker
2011-10-06 18:44 ` Paul E. McKenney
2011-10-06 23:44 ` Paul E. McKenney
2011-09-26 1:25 ` Paul E. McKenney
2011-09-26 8:48 ` Frederic Weisbecker
2011-09-26 8:49 ` Frederic Weisbecker
2011-09-26 22:30 ` Paul E. McKenney
2011-09-27 11:55 ` Frederic Weisbecker
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110929123040.GB3537@somewhere \
--to=fweisbec@gmail.com \
--cc=a.p.zijlstra@chello.nl \
--cc=dipankar@in.ibm.com \
--cc=kirill@shutemov.name \
--cc=laijs@cn.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=paulmck@linux.vnet.ibm.com \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.