From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Subject: Re: [PATCH] rcu: Only pin GP kthread when full dynticks is actually used
Date: Fri, 13 Jun 2014 22:06:06 -0700
Message-ID: <20140614050606.GD4581@linux.vnet.ibm.com>
In-Reply-To: <20140613233933.GT6635@localhost.localdomain>

On Sat, Jun 14, 2014 at 01:39:36AM +0200, Frederic Weisbecker wrote:
> On Fri, Jun 13, 2014 at 04:27:15PM -0700, Paul E. McKenney wrote:
> > On Sat, Jun 14, 2014 at 01:10:35AM +0200, Frederic Weisbecker wrote:
> > > On Fri, Jun 13, 2014 at 03:49:26PM -0700, Paul E. McKenney wrote:
> > > > On Fri, Jun 13, 2014 at 02:10:35PM -0700, Josh Triplett wrote:
> > > > > On Fri, Jun 13, 2014 at 01:48:22PM -0700, Paul E. McKenney wrote:
> > > > > > On Fri, Jun 13, 2014 at 09:44:41AM -0700, Josh Triplett wrote:
> > > > > > > On Fri, Jun 13, 2014 at 06:21:32PM +0200, Frederic Weisbecker wrote:
> > > > > > > > On Fri, Jun 13, 2014 at 09:16:30AM -0700, Paul E. McKenney wrote:
> > > > > > > > > > Is it because we have dynticks CPUs staying too long in the kernel without
> > > > > > > > > > taking any quiescent states? Are we perhaps missing some rcu_user_enter() or
> > > > > > > > > > things?
> > > > > > > > > 
> > > > > > > > > Sort of the former, but combined with the fact that in-kernel CPUs still
> > > > > > > > > need scheduling-clock interrupts for RCU to make progress.  I could
> > > > > > > > > move this to RCU's context-switch hook, but that could be very bad for
> > > > > > > > > workloads that do lots of context switching.
> > > > > > > > 
> > > > > > > > Or I can restart the tick if the CPU stays in the kernel for too long without
> > > > > > > > a tick. I think that's what we were doing before but we removed that because
> > > > > > > > we never implemented it correctly (we sent scheduler IPI that did nothing...)
> > > > > > > 
> > > > > > > I wonder if timer slack would make sense here: when you have at least
> > > > > > > one RCU callback pending, set a timer with a huge amount of timer slack,
> > > > > > > and cancel it if you end up handling the callback via a trip through the
> > > > > > > scheduler.
> > > > > > 
> > > > > > But in this case, we need the tick even if the current CPU has no callbacks
> > > > > > because it might be in an RCU read-side critical section.
> > > > > 
> > > > > Don't we handle that case via the slowpath of rcu_read_unlock, and a
> > > > > flag set via IPI?  ("Oh, that CPU has taken too long to note a quiescent
> > > > > state; send it an IPI to set the special flag that makes unlock do the
> > > > > work.")
> > > > 
> > > > There was once such logic on the force-quiescent-state path, and making
> > > > that handle this new case was my first proposal.  As Frederic pointed
> > > > out, that change requires rcu_needs_cpu()'s cooperation, because otherwise
> > > > the CPU will take the IPI, see that it still has but one runnable task,
> > > > and then keep its scheduling-clock interrupt off.
> > > 
> > > Exactly. So that's what happens currently, we call rcu_kick_nohz_cpu()
> > > on extended grace periods but the IPI doesn't reconsider the tick.
> > > 
> > > In fact it doesn't do anything at all because the scheduler IPI,
> > > when invoked without a reason, doesn't even call irq_enter()/irq_exit(),
> > > so rcu_needs_cpu() isn't quite called from there.
> > > 
> > > Now that's going to change with https://lwn.net/Articles/601836/ if
> > > we convert rcu_kick_nohz_cpu() to tick_nohz_full_kick_cpu().
> > > 
> > > Then we have the choice between two options:
> > > 
> > > * We can add a check in tick_nohz_full_check() and restart the tick if
> > > necessary.
> > > 
> > > * Extend rcu_needs_cpu() to restore a similar periodic mode until the
> > > grace periods get some progress.
> > 
> > If I was to extend rcu_needs_cpu(), I would add a flag and another counter
> > to the rcu_data structure.  If rcu_needs_cpu() saw the flag set and the
> > counter equal to the current ->completed value, it would return true.
> > 
> > I already have the rcu_kick_nohz_cpu() in rcu_implicit_dynticks_qs(),
> > so it is just a matter of also setting the flag and copying ->completed
> > to the new counter at that point.  I currently get to this point if the
> > CPU has managed to run for more than one jiffy without hitting either
> > idle or userspace execution.  Fair enough?
> 
> Perfect for me!

One complication...  If the grace period has gone on for a long time
and you are returning to kernel mode, RCU will need the scheduling-clock
tick.  However, in that very same situation, if you are returning to
idle or to NO_HZ_FULL userspace execution, RCU does -not- need the
scheduling-clock tick.

One way I could do this is to have rcu_needs_cpu() return three values:
zero if RCU doesn't need a scheduling-clock tick for any reason,
one if RCU needs a scheduling-clock tick only when returning to kernel
mode, and two if RCU unconditionally needs the scheduling-clock tick.
Would that work, or is there a better approach?

							Thanx, Paul
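
For illustration only, here is a minimal user-space sketch of the
three-valued return proposed above, combined with the flag-plus-counter
check from earlier in the thread.  It is not kernel code and not taken
from any patch.  Every name in it (tick_need, cpu_rcu_state, needs_tick,
keep_tick, and the fields) is hypothetical, and treating pending
callbacks as the "unconditional" case is an assumption made for the
example.

```c
#include <stdbool.h>

/* Hypothetical three-valued result; names are illustrative only. */
enum tick_need {
	TICK_NOT_NEEDED = 0,       /* RCU does not need the tick for any reason */
	TICK_NEEDED_IN_KERNEL = 1, /* needed only when returning to kernel mode */
	TICK_NEEDED_ALWAYS = 2,    /* needed unconditionally */
};

/* Simplified stand-in for the per-CPU state discussed in the thread. */
struct cpu_rcu_state {
	bool has_callbacks;        /* assumption: callbacks => unconditional need */
	bool gp_kick_flag;         /* set at the point where the CPU is kicked */
	unsigned long gp_snapshot; /* copy of ->completed taken at that point */
};

/* Where the CPU is about to resume execution. */
enum cpu_target { TARGET_KERNEL, TARGET_IDLE, TARGET_NOHZ_USER };

/* Sketch of an rcu_needs_cpu()-like query returning three values. */
enum tick_need needs_tick(const struct cpu_rcu_state *s,
			  unsigned long completed)
{
	if (s->has_callbacks)
		return TICK_NEEDED_ALWAYS;
	/* Flag set and no grace period completed since the snapshot. */
	if (s->gp_kick_flag && s->gp_snapshot == completed)
		return TICK_NEEDED_IN_KERNEL;
	return TICK_NOT_NEEDED;
}

/* How a caller might combine the result with the return target. */
bool keep_tick(enum tick_need need, enum cpu_target target)
{
	switch (need) {
	case TICK_NEEDED_ALWAYS:
		return true;
	case TICK_NEEDED_IN_KERNEL:
		return target == TARGET_KERNEL;
	default:
		return false;
	}
}
```

With this shape, a CPU whose flag is set and whose snapshot still equals
the current ->completed value keeps its tick when resuming in the kernel,
but not when heading to idle or NO_HZ_FULL userspace.  Once ->completed
advances past the snapshot, the query falls back to zero.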

