From: Frederic Weisbecker <fweisbec@gmail.com>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Anton Blanchard <anton@au1.ibm.com>, Avi Kivity <avi@redhat.com>,
Ingo Molnar <mingo@elte.hu>, Lai Jiangshan <laijs@cn.fujitsu.com>,
Paul Menage <menage@google.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Stephen Hemminger <shemminger@vyatta.com>,
Thomas Gleixner <tglx@linutronix.de>,
Tim Pepper <lnxninja@linux.vnet.ibm.com>
Subject: Re: [PATCH 24/32] nohz/cpuset: Handle kernel entry/exit to account cputime
Date: Wed, 17 Aug 2011 04:30:02 +0200
Message-ID: <20110817023000.GD32132@somewhere.redhat.com>
In-Reply-To: <20110816203820.GI2404@linux.vnet.ibm.com>

On Tue, Aug 16, 2011 at 01:38:20PM -0700, Paul E. McKenney wrote:
> On Mon, Aug 15, 2011 at 05:52:21PM +0200, Frederic Weisbecker wrote:
> > Provide a few APIs that archs can call to tell they are entering
> > or exiting the kernel so that when we are in nohz adaptive mode
> > we know precisely where we need to account the cputime.
> >
> > The new APIs are:
> >
> > - tick_nohz_enter_kernel() (called when we enter a syscall)
> > - tick_nohz_exit_kernel() (called when we exit a syscall)
> > - tick_nohz_enter_exception() (called when we enter any
> > exception, trap, faults...but not irqs)
> > - tick_nohz_exit_exception() (called when we exit any exception)
> >
> > Hooks into syscalls are typically driven by the TIF_NOHZ thread
> > flag.
> >
> > In addition, we use the value returned by user_mode(regs) from
> > the timer interrupt to know where we are.
> > Nonetheless, we can rely on user_mode(regs) != 0 to know
> > we are in userspace, but we can't rely on user_mode(regs) == 0
> > to know we are in the system.
> >
> > Consider the following scenario: we stop the tick after syscall
> > return, so we set TIF_NOHZ but the syscall exit hook is behind us.
> > If we haven't yet returned to userspace, then we have
> > user_mode(regs) == 0. If on top of that we consider we are in
> > system mode, and later we issue a syscall but restart the tick
> > right before reaching the syscall entry hook, then we have no clue
> > that the whole elapsed cputime was not in the system but in the
> > userspace.
> >
> > The only way to fix this is to only start entering nohz mode once
> > we know we are in userspace a first time, like when we reach the
> > kernel exit hook or when a timer tick with user_mode(regs) == 1
> > fires. Kernel threads don't have this worry.
> >
> > This sucks but for now I have no better solution. Let's hope we
> > can find better.
> >
> > TODO: wrap operation on jiffies?
>
> Hmmm... Does the RCU dyntick-idle code need to know about exception
> entry and exit?
>
> Thanx, Paul

At this point in the series it doesn't, because we don't yet call
rcu_enter_nohz() when switching to userspace. Instead we shut down the
tick and restart it on demand when a remote CPU sends us an IPI to
complete a grace period.

The patch that switches to extended quiescent states is 31/32, and it
handles syscalls and exceptions as well.

I wanted support for RCU extended quiescent states to come late in the
patchset so that it is treated as an incremental feature and not as a
core piece of adaptive nohz (i.e. it's not mandatory, just an
optimization). This way we can use cpuset nohz without the RCU extended
quiescent state feature, which keeps that small part bisectable.

Patch 30 activates support for cpuset nohz (x86 support).
Patch 31 activates RCU extended quiescent state support in userspace
as a bonus.
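
For anyone following along, here is a rough userspace model of the
accounting rule from the changelog quoted above. It is only a sketch
under stated assumptions: struct nohz_acct and the nohz_*() helpers
below are names made up for illustration, not the code from the patch
(the real hooks are the tick_nohz_*() calls listed in the changelog),
and time is a plain counter standing in for jiffies. It shows two
things: the tick is only stopped once we know we are in userspace, and
each kernel entry/exit flushes the elapsed time to the side we were on.

```c
#include <stdbool.h>

/* Hypothetical per-CPU accounting state; not taken from the patch. */
struct nohz_acct {
	bool tick_stopped;	/* adaptive nohz mode is active */
	bool in_user;		/* context we were in since the last snapshot */
	unsigned long snapshot;	/* time of the last accounting point */
	unsigned long utime;	/* accumulated user time */
	unsigned long stime;	/* accumulated system time */
};

/* Charge the time elapsed since the last snapshot to user or system. */
static void nohz_flush(struct nohz_acct *a, unsigned long now)
{
	/* Unsigned subtraction stays correct across one counter wrap. */
	unsigned long delta = now - a->snapshot;

	if (a->in_user)
		a->utime += delta;
	else
		a->stime += delta;
	a->snapshot = now;
}

/*
 * Stop the tick only when we know for sure that we are in userspace,
 * e.g. from the kernel exit hook or from a timer tick that saw
 * user_mode(regs) != 0. Returns true if the tick was stopped.
 */
static bool nohz_try_stop_tick(struct nohz_acct *a, unsigned long now,
			       bool known_user)
{
	if (!known_user)
		return false;	/* user_mode(regs) == 0 proves nothing */

	a->tick_stopped = true;
	a->in_user = true;
	a->snapshot = now;
	return true;
}

/* Syscall or exception entry: what elapsed so far was user time. */
static void nohz_enter_kernel(struct nohz_acct *a, unsigned long now)
{
	if (!a->tick_stopped)
		return;
	nohz_flush(a, now);
	a->in_user = false;
}

/* Syscall or exception exit: what elapsed so far was system time. */
static void nohz_exit_kernel(struct nohz_acct *a, unsigned long now)
{
	if (!a->tick_stopped)
		return;
	nohz_flush(a, now);
	a->in_user = true;
}

/* Restarting the tick flushes the pending delta and leaves nohz mode. */
static void nohz_restart_tick(struct nohz_acct *a, unsigned long now)
{
	if (!a->tick_stopped)
		return;
	nohz_flush(a, now);
	a->tick_stopped = false;
}
```

With the tick stopped at time 100 from a known userspace point, a
kernel entry at 130, an exit at 140 and a tick restart at 190, this
model charges 80 units to user time and 10 to system time. A stop
request that cannot prove it is in userspace is simply refused, which
is the limitation the changelog complains about.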