From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org, mingo@elte.hu,
laijs@cn.fujitsu.com, dipankar@in.ibm.com,
akpm@linux-foundation.org, mathieu.desnoyers@efficios.com,
josh@joshtriplett.org, niv@us.ibm.com, tglx@linutronix.de,
rostedt@goodmis.org, dhowells@redhat.com, edumazet@google.com,
darren@dvhart.com, fweisbec@gmail.com, sbw@mit.edu
Subject: Re: [PATCH RFC nohz_full 0/8] Provide infrastructure for full-system idle
Date: Wed, 26 Jun 2013 15:24:42 -0700
Message-ID: <20130626222442.GU3828@linux.vnet.ibm.com>
In-Reply-To: <20130626122022.GI28407@twins.programming.kicks-ass.net>

On Wed, Jun 26, 2013 at 02:20:22PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 25, 2013 at 02:37:21PM -0700, Paul E. McKenney wrote:
> > Whenever there is at least one non-idle CPU, it is necessary to
> > periodically update timekeeping information. Before NO_HZ_FULL, this
> > updating was carried out by the scheduling-clock tick, which ran on
> > every non-idle CPU. With the advent of NO_HZ_FULL, it is possible
> > to have non-idle CPUs that are not receiving scheduling-clock ticks.
> > This possibility is handled by assigning a timekeeping CPU that continues
> > taking scheduling-clock ticks.
> >
> > Unfortunately, the timekeeping CPU continues taking scheduling-clock
> > interrupts even when all other CPUs are completely idle, which is
> > not so good for energy efficiency and battery lifetime. Clearly, it
> > would be good to turn off the timekeeping CPU's scheduling-clock tick
> > when all CPUs are completely idle. This is conceptually simple, but
> > we also need good performance and scalability on large systems, which
> > rules out implementations based on frequently updated global counts of
> > non-idle CPUs as well as implementations that frequently scan all CPUs.
> > Nevertheless, we need a single global indicator in order to keep the
> > overhead of checking acceptably low.
> >
> > The chosen approach is to enforce hysteresis on the non-idle to
> > full-system-idle transition, with the amount of hysteresis increasing
> > linearly with the number of CPUs, thus keeping contention acceptably low.
> > This approach piggybacks on RCU's existing force-quiescent-state scanning
> > of idle CPUs, which has the advantage of avoiding the scan entirely on
> > busy systems that have high levels of multiprogramming. This scan
> > takes per-CPU idleness information and feeds it into a state machine
> > that applies the level of hysteresis required to arrive at a single
> > full-system-idle indicator.
> >
> > Note that this version pays attention to CPUs that have taken an NMI
> > from idle. It is not clear to me that NMI handlers can safely access
> > the time on a system that is long-term idle. Unless someone tells me
> > that it is somehow safe to access time from an NMI from idle, I will
> > remove NMI support in the next version.
>
> Using perf it is 'possible' to come near; we use local_clock() from NMI
> context. It will do a TSC read.
>
> On systems where the TSC is usable we'll end up with a sane timestamp;
> on systems where we need the whole kernel/sched/clock.c song and dance
> routine we'll return a stable time-stamp when called from long idle.
>
> I don't think there's anything we can do better there.

Just to make sure I understand... You are saying that it is OK for
NO_HZ_FULL to shut down timekeeping if all CPUs are idle, even if some
of them are taking NMIs from time to time, right?

							Thanx, Paul
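
For readers following the thread, the hysteresis idea in the quoted cover
letter can be sketched roughly as below. This is a user-space illustration
only, not the code from the patch series: the state names, struct sysidle,
sysidle_init(), sysidle_scan(), sysidle_is_full(), and the "one time unit
of delay per CPU" tuning are all hypothetical, and the sketch ignores NMIs,
memory ordering, and concurrency entirely. Each scan feeds the per-CPU idle
flags into a small state machine that advances one step only after the
system has stayed idle for the required delay, and falls back to the
non-idle state as soon as any CPU is seen running.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical states; the names are illustrative, not from the patches. */
enum sysidle_state {
	SYSIDLE_NOT,	/* some CPU was recently seen non-idle */
	SYSIDLE_SHORT,	/* all CPUs idle for one hysteresis period */
	SYSIDLE_LONG,	/* all CPUs idle for two hysteresis periods */
	SYSIDLE_FULL,	/* full-system idle: the single global indicator */
};

struct sysidle {
	enum sysidle_state state;
	unsigned long since;	/* time at which the current state was entered */
	unsigned long delay;	/* hysteresis per step, linear in the CPU count */
};

/* Assumed tuning: one time unit of hysteresis per CPU. */
static void sysidle_init(struct sysidle *s, unsigned int ncpus,
			 unsigned long now)
{
	assert(ncpus > 0);
	s->state = SYSIDLE_NOT;
	s->since = now;
	s->delay = ncpus;
}

/* Feed the result of one scan of the per-CPU idle flags into the machine. */
static void sysidle_scan(struct sysidle *s, const bool *cpu_idle,
			 unsigned int ncpus, unsigned long now)
{
	for (unsigned int i = 0; i < ncpus; i++) {
		if (!cpu_idle[i]) {
			/* Any non-idle CPU resets the machine immediately. */
			s->state = SYSIDLE_NOT;
			s->since = now;
			return;
		}
	}
	/* All CPUs idle: advance one step once the delay has elapsed. */
	if (s->state != SYSIDLE_FULL && now - s->since >= s->delay) {
		s->state = (enum sysidle_state)(s->state + 1);
		s->since = now;
	}
}

/* Cheap check of the single global indicator. */
static bool sysidle_is_full(const struct sysidle *s)
{
	return s->state == SYSIDLE_FULL;
}
```

Because the delay grows with the number of CPUs, a larger system takes
proportionally longer to be declared fully idle, so the shared state is
updated less often per CPU. The exit from idle is immediate.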