From: Frederic Weisbecker <fweisbec@gmail.com>
To: LKML <linux-kernel@vger.kernel.org>, linaro-sched-sig@lists.linaro.org
Cc: Frederic Weisbecker <fweisbec@gmail.com>,
Alessio Igor Bogani <abogani@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Avi Kivity <avi@redhat.com>, Chris Metcalf <cmetcalf@tilera.com>,
Christoph Lameter <cl@linux.com>,
Daniel Lezcano <daniel.lezcano@linaro.org>,
Geoff Levand <geoff@infradead.org>,
Gilad Ben Yossef <gilad@benyossef.com>,
Ingo Molnar <mingo@kernel.org>,
Max Krasnyansky <maxk@qualcomm.com>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Peter Zijlstra <peterz@infradead.org>,
Stephen Hemminger <shemminger@vyatta.com>,
Steven Rostedt <rostedt@goodmis.org>,
Sven-Thorsten Dietrich <thebigcorporation@gmail.com>,
Thomas Gleixner <tglx@linutronix.de>,
Zen Lin <zen@openhuawei.org>
Subject: [PATCH 19/32] nohz/cpuset: Account user and system times in adaptive nohz mode
Date: Wed, 21 Mar 2012 14:58:25 +0100
Message-ID: <1332338318-5958-21-git-send-email-fweisbec@gmail.com>
In-Reply-To: <1332338318-5958-1-git-send-email-fweisbec@gmail.com>
When the tick is stopped, the user/system cputime is no longer
accounted on every jiffy.
To solve this, save a snapshot of jiffies when we stop the tick and
record the context in which it was taken: user or system. On top of
this, account the cputime elapsed since the snapshot each time we cross
the kernel entry/exit boundaries and when we restart the tick.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Max Krasnyansky <maxk@qualcomm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Zen Lin <zen@openhuawei.org>
---
include/linux/tick.h | 12 ++++
kernel/sched/core.c | 1 +
kernel/time/tick-sched.c | 131 +++++++++++++++++++++++++++++++++++++++++++++-
3 files changed, 142 insertions(+), 2 deletions(-)
diff --git a/include/linux/tick.h b/include/linux/tick.h
index 03b6edd..598b492 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -153,11 +153,23 @@ static inline u64 get_cpu_iowait_time_us(int cpu, u64 *unused) { return -1; }
# endif /* !NO_HZ */
#ifdef CONFIG_CPUSETS_NO_HZ
+extern void tick_nohz_enter_kernel(void);
+extern void tick_nohz_exit_kernel(void);
+extern void tick_nohz_enter_exception(struct pt_regs *regs);
+extern void tick_nohz_exit_exception(struct pt_regs *regs);
extern void tick_nohz_check_adaptive(void);
+extern void tick_nohz_pre_schedule(void);
extern void tick_nohz_post_schedule(void);
+extern bool tick_nohz_account_tick(void);
#else /* !CPUSETS_NO_HZ */
+static inline void tick_nohz_enter_kernel(void) { }
+static inline void tick_nohz_exit_kernel(void) { }
+static inline void tick_nohz_enter_exception(struct pt_regs *regs) { }
+static inline void tick_nohz_exit_exception(struct pt_regs *regs) { }
static inline void tick_nohz_check_adaptive(void) { }
+static inline void tick_nohz_pre_schedule(void) { }
static inline void tick_nohz_post_schedule(void) { }
+static inline bool tick_nohz_account_tick(void) { return false; }
#endif /* CPUSETS_NO_HZ */
#endif
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index eca842e..5debfd7 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1923,6 +1923,7 @@ static inline void
 prepare_task_switch(struct rq *rq, struct task_struct *prev,
 		    struct task_struct *next)
 {
+	tick_nohz_pre_schedule();
 	sched_info_switch(prev, next);
 	perf_event_task_sched_out(prev, next);
 	fire_sched_out_preempt_notifiers(prev, next);
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 9359e6c..ff78126 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -526,7 +526,13 @@ static bool can_stop_adaptive_tick(void)
 static void tick_nohz_cpuset_stop_tick(struct tick_sched *ts)
 {
+	struct pt_regs *regs = get_irq_regs();
 	int cpu = smp_processor_id();
+	int was_stopped;
+	int user = 0;
+
+	if (regs)
+		user = user_mode(regs);
 	if (!cpuset_adaptive_nohz() || is_idle_task(current))
 		return;
@@ -537,7 +543,36 @@ static void tick_nohz_cpuset_stop_tick(struct tick_sched *ts)
 	if (!can_stop_adaptive_tick())
 		return;
+	/*
+	 * If we stop the tick between the syscall exit hook and the actual
+	 * return to userspace, we'll think we are in system space (due to
+	 * user_mode() thinking so). And since we passed the syscall exit hook
+	 * already we won't realize we are in userspace. So the time spent
+	 * tickless would be spuriously accounted as belonging to system.
+	 *
+	 * To avoid this kind of problem, we only stop the tick from userspace
+	 * (until we find a better solution).
+	 * We can later enter the kernel and keep the tick stopped. But the place
+	 * where we stop the tick must be userspace.
+	 * We make an exception for kernel threads since they always execute in
+	 * kernel space.
+	 */
+	if (!user && current->mm)
+		return;
+
+	was_stopped = ts->tick_stopped;
 	tick_nohz_stop_sched_tick(ts, ktime_get(), cpu);
+
+	if (!was_stopped && ts->tick_stopped) {
+		WARN_ON_ONCE(ts->saved_jiffies_whence != JIFFIES_SAVED_NONE);
+		if (user)
+			ts->saved_jiffies_whence = JIFFIES_SAVED_USER;
+		else if (!current->mm)
+			ts->saved_jiffies_whence = JIFFIES_SAVED_SYS;
+
+		ts->saved_jiffies = jiffies;
+		set_thread_flag(TIF_NOHZ);
+	}
 }
#else
static void tick_nohz_cpuset_stop_tick(struct tick_sched *ts) { }
@@ -862,6 +897,70 @@ void tick_check_idle(int cpu)
}
#ifdef CONFIG_CPUSETS_NO_HZ
+void tick_nohz_exit_kernel(void)
+{
+	unsigned long flags;
+	struct tick_sched *ts;
+	unsigned long delta_jiffies;
+
+	local_irq_save(flags);
+
+	ts = &__get_cpu_var(tick_cpu_sched);
+
+	if (!ts->tick_stopped) {
+		local_irq_restore(flags);
+		return;
+	}
+
+	WARN_ON_ONCE(ts->saved_jiffies_whence != JIFFIES_SAVED_SYS);
+
+	delta_jiffies = jiffies - ts->saved_jiffies;
+	account_system_ticks(current, delta_jiffies);
+
+	ts->saved_jiffies = jiffies;
+	ts->saved_jiffies_whence = JIFFIES_SAVED_USER;
+
+	local_irq_restore(flags);
+}
+
+void tick_nohz_enter_kernel(void)
+{
+	unsigned long flags;
+	struct tick_sched *ts;
+	unsigned long delta_jiffies;
+
+	local_irq_save(flags);
+
+	ts = &__get_cpu_var(tick_cpu_sched);
+
+	if (!ts->tick_stopped) {
+		local_irq_restore(flags);
+		return;
+	}
+
+	WARN_ON_ONCE(ts->saved_jiffies_whence != JIFFIES_SAVED_USER);
+
+	delta_jiffies = jiffies - ts->saved_jiffies;
+	account_user_ticks(current, delta_jiffies);
+
+	ts->saved_jiffies = jiffies;
+	ts->saved_jiffies_whence = JIFFIES_SAVED_SYS;
+
+	local_irq_restore(flags);
+}
+
+void tick_nohz_enter_exception(struct pt_regs *regs)
+{
+	if (user_mode(regs))
+		tick_nohz_enter_kernel();
+}
+
+void tick_nohz_exit_exception(struct pt_regs *regs)
+{
+	if (user_mode(regs))
+		tick_nohz_exit_kernel();
+}
+
+
/*
* Take the timer duty if nobody is taking care of it.
* If a CPU already does and and it's in a nohz cpuset,
@@ -880,13 +979,22 @@ static void tick_do_timer_check_handler(int cpu)
}
}
+static void tick_nohz_restart_adaptive(void)
+{
+	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+
+	tick_nohz_account_ticks(ts);
+	tick_nohz_restart_sched_tick();
+	clear_thread_flag(TIF_NOHZ);
+}
+
 void tick_nohz_check_adaptive(void)
 {
 	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
 	if (ts->tick_stopped && !is_idle_task(current)) {
 		if (!can_stop_adaptive_tick())
-			tick_nohz_restart_sched_tick();
+			tick_nohz_restart_adaptive();
 	}
 }
@@ -898,6 +1006,26 @@ void cpuset_exit_nohz_interrupt(void *unused)
tick_nohz_restart_adaptive();
}
+/*
+ * Flush cputime and clear hooks before context switch in case we
+ * haven't yet received the IPI that should take care of that.
+ */
+void tick_nohz_pre_schedule(void)
+{
+	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+
+	/*
+	 * We are holding the rq lock and if we restart the tick now
+	 * we could deadlock by acquiring the lock twice. Instead
+	 * we do that on post schedule time. For now do the cleanups
+	 * on the prev task.
+	 */
+	if (ts->tick_stopped) {
+		tick_nohz_account_ticks(ts);
+		clear_thread_flag(TIF_NOHZ);
+	}
+}
+
 void tick_nohz_post_schedule(void)
 {
 	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
@@ -910,7 +1038,6 @@ void tick_nohz_post_schedule(void)
 	if (ts->tick_stopped)
 		tick_nohz_restart_sched_tick();
 }
-
#else
static void tick_do_timer_check_handler(int cpu)
--
1.7.5.4