public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Frederic Weisbecker <fweisbec@gmail.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	John Stultz <john.stultz@linaro.org>,
	Alex Shi <alex.shi@linaro.org>, Kevin Hilman <khilman@linaro.org>
Subject: Re: [PATCH 09/13] nohz: Allow timekeeper's tick to stop when all full dynticks CPUs are idle
Date: Wed, 18 Dec 2013 07:29:01 -0800	[thread overview]
Message-ID: <20131218152901.GQ19211@linux.vnet.ibm.com> (raw)
In-Reply-To: <20131218143609.GC18464@localhost.localdomain>

On Wed, Dec 18, 2013 at 03:36:11PM +0100, Frederic Weisbecker wrote:
> On Tue, Dec 17, 2013 at 03:51:17PM -0800, Paul E. McKenney wrote:
> > On Tue, Dec 17, 2013 at 11:51:28PM +0100, Frederic Weisbecker wrote:
> > > +/*
> > > + * Fetch max deferment for the current clockevent source until it overflows.
> > > + * Also in full dynticks environment, make sure the current timekeeper
> > > + * stays periodic until some other CPU can take its timekeeping duty
> > > + * or until all full dynticks go to sleep.
> > > + */
> > > +static u64 tick_timekeeping_max_deferment(struct tick_sched *ts)
> > > +{
> > > +	int cpu;
> > > +	u64 ret = KTIME_MAX;
> > > +
> > > +	/*
> > > +	 * Fast path for full dynticks off-case: skip to
> > > +	 * clockevent max deferment
> > > +	 */
> > > +	if (!tick_nohz_full_enabled())
> > > +		return timekeeping_max_deferment();
> > > +
> > > +	cpu = smp_processor_id();
> > > +
> > > +	/* Full dynticks CPU don't take timekeeping duty */
> > > +	if (!tick_timekeeping_cpu(cpu))
> > > +		return timekeeping_max_deferment();
> > > +
> > > +	/*
> > > +	 * If we are the timekeeper and all full dynticks CPUs are idle,
> > > +	 * then we can finally sleep.
> > > +	 */
> > > +	if (tick_do_timer_cpu == cpu ||
> > > +	    (tick_do_timer_cpu == TICK_DO_TIMER_NONE &&	ts->do_timer_last == 1)) {
> > > +		if (!rcu_sys_is_idle()) {
> > 
> > So multiple CPUs could call rcu_sys_is_idle()?  Seems like this could
> > happen if tick_do_timer_cpu == TICK_DO_TIMER_NONE.  This would be OK only
> > if tick_timekeeping_cpu() returns true for one and only one of the CPUs
> > at any given range of time -- and also that no one calls rcu_sys_is_idle()
> > during a timekeeping CPU handoff.
> 
> Hmm yeah I fear we can have concurrent callers of this at a same time range.

OK, we do need to fix that.

> > If two different CPUs call rcu_sys_is_idle() anywhere nearly concurrently
> > on a small system (CONFIG_NO_HZ_FULL_SYSIDLE_SMALL), rcu_sys_is_idle()
> > will break and you will have voided your warranty.  ;-)
> 
> So it breaks because of concurrent state machine stepping on each other toes, right?
> Like one CPU has reached RCU_SYSIDLE_SHORT and another comes and see only
> RCU_SYSIDLE_NONE, so it can for example overwite to SHORT while the other CPU
> can be already far further the cmpxchg() sequence?

The ugliest possibility is in the CONFIG_NO_HZ_FULL_SYSIDLE_SMALL case.
Three CPUs are independently scanning the CPUs, and by luck, each of these
three CPUs see all the CPUs idle, even though CPUs are going idle and
non-idle continuously throughout.  After all three complete, they each
advance the state.  The state thus becomes RCU_SYSIDLE_FULL even though
there are non-idle CPUs.  The timekeeping tick therefore gets shut off
even though there are non-idle CPUs.  Very bad.

> Aye, I need to think further on how to cope with that...

One way would be to cmpxchg the ownership, and only the winner does
the call.  After that, the winner is tick_do_timer_cpu and no one else
does the call.  I guess you would want to skip the cmpxchg during early
boot, but can't claim to understand your constraints.

> > Also, if tick_timekeeping_cpu() doesn't think that there is a timekeeping
> > CPU, rcu_sys_is_idle() will always return false.  I think that this is
> > what you want to happen, just checking.
> 
> Ah right but that should be fine. tick_timekeeping_cpu() works for all potential
> timekeepers. Basically it's !tick_nohz_full_cpu(cpu).

OK, so we do have to worry about multiple callers.  Like you said above. ;-)

> > > +			/*
> > > +			 * Stop tick for 1 jiffy. In practice we stay periodic
> > > +			 * but that let us possibly delegate our timekeeping duty
> > > +			 * to stop the tick for real in the future.
> > > +			 */
> > > +			ret = tick_period.tv64;
> > > +		}
> > 
> > Do we need to set tick_do_timer_cpu to cpu?  Or is that handled elsewhere?
> > (If this is the boot-safety feature deleted below, could we please have
> > the comment back here?)
> 
> This is done in the patch that calls ..kick_timekeeping()  from sysidle_exit().
> 
> Do you have another case in mind?

Mostly just that if we did that determination here, we could avoid having
multiple callers into the state machine.

Alternatively, I might be able to use a lock to serialize callers, but
that would result in CPUs spinning needlessly in rcu_sys_is_idle().
Hence my suggestion of a cmpxchg() to elect a timekeeping CPU in this
code.

						Thanx, Paul


  reply	other threads:[~2013-12-18 15:29 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-17 22:51 [RFC PATCH 00/13] nohz: Use sysidle detection to let the timekeeper sleep Frederic Weisbecker
2013-12-17 22:51 ` [PATCH 01/13] tick: Rename tick_check_idle() to tick_irq_enter() Frederic Weisbecker
2014-01-25 14:22   ` [tip:timers/urgent] " tip-bot for Frederic Weisbecker
2013-12-17 22:51 ` [PATCH 02/13] time: New helper to check CPU eligibility to handle timekeeping Frederic Weisbecker
2013-12-17 22:51 ` [PATCH 03/13] rcu: Exclude all potential timekeepers from sysidle detection Frederic Weisbecker
2013-12-17 23:27   ` Paul E. McKenney
2013-12-17 23:49     ` Frederic Weisbecker
2013-12-18 11:43       ` Peter Zijlstra
2013-12-18 11:46         ` Peter Zijlstra
2013-12-18 14:15         ` Paul E. McKenney
2013-12-18 16:24         ` Frederic Weisbecker
2013-12-17 22:51 ` [PATCH 04/13] tick: Use timekeeping_cpu() to elect the CPU handling timekeeping duty Frederic Weisbecker
2013-12-17 23:55   ` Paul E. McKenney
2013-12-17 22:51 ` [PATCH 05/13] rcu: Fix unraised IPI to timekeeping CPU Frederic Weisbecker
2013-12-17 23:21   ` Paul E. McKenney
2013-12-18 14:13     ` Frederic Weisbecker
2013-12-18 14:22       ` Paul E. McKenney
2013-12-18 14:56         ` Frederic Weisbecker
2013-12-18 15:11           ` Peter Zijlstra
2013-12-18 15:58             ` Frederic Weisbecker
2013-12-18 12:12   ` Peter Zijlstra
2013-12-18 15:38     ` Frederic Weisbecker
2013-12-18 15:45       ` Christoph Hellwig
2013-12-18 17:10       ` Peter Zijlstra
2013-12-17 22:51 ` [PATCH 06/13] nohz: Introduce full dynticks' default timekeeping target Frederic Weisbecker
2013-12-17 23:54   ` Paul E. McKenney
2013-12-17 22:51 ` [PATCH 07/13] sched: Enable IPI reception on timekeeper under nohz full system Frederic Weisbecker
2013-12-17 23:52   ` Paul E. McKenney
2013-12-18 14:49     ` Frederic Weisbecker
2013-12-18 15:50       ` Paul E. McKenney
2013-12-18 10:06   ` Peter Zijlstra
2013-12-17 22:51 ` [PATCH 08/13] nohz: Get timekeeping max deferment outside jiffies_lock Frederic Weisbecker
2014-01-25 14:22   ` [tip:timers/urgent] " tip-bot for Frederic Weisbecker
2013-12-17 22:51 ` [PATCH 09/13] nohz: Allow timekeeper's tick to stop when all full dynticks CPUs are idle Frederic Weisbecker
2013-12-17 23:51   ` Paul E. McKenney
2013-12-18 14:36     ` Frederic Weisbecker
2013-12-18 15:29       ` Paul E. McKenney [this message]
2013-12-17 22:51 ` [PATCH 10/13] nohz: Hand over timekeeping duty on cpu offlining Frederic Weisbecker
2013-12-17 23:40   ` Paul E. McKenney
2013-12-18 14:19     ` Frederic Weisbecker
2013-12-18 12:30   ` Peter Zijlstra
2013-12-18 16:43     ` Frederic Weisbecker
2013-12-17 22:51 ` [PATCH 11/13] nohz: Wake up timekeeper on exit from sysidle state Frederic Weisbecker
2013-12-17 23:34   ` Paul E. McKenney
2013-12-17 23:52     ` Frederic Weisbecker
2013-12-17 22:51 ` [PATCH 12/13] nohz: Allow all CPUs outside nohz_full range to do timekeeping Frederic Weisbecker
2013-12-17 23:32   ` Paul E. McKenney
2013-12-17 22:51 ` [PATCH 13/13] nohz_full: fix code style issue of tick_nohz_full_stop_tick Frederic Weisbecker
2013-12-18  2:04 ` [RFC PATCH 00/13] nohz: Use sysidle detection to let the timekeeper sleep Alex Shi
2013-12-18 10:19   ` Peter Zijlstra
2013-12-18 14:18     ` Paul E. McKenney
2013-12-18 17:43   ` Frederic Weisbecker
2013-12-18 21:29     ` Andy Lutomirski
2013-12-18 21:49       ` Paul E. McKenney
2013-12-18 21:53         ` Andy Lutomirski
2013-12-18 21:57           ` Paul E. McKenney
2013-12-18 22:55             ` Andy Lutomirski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131218152901.GQ19211@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=alex.shi@linaro.org \
    --cc=fweisbec@gmail.com \
    --cc=john.stultz@linaro.org \
    --cc=khilman@linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox