linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/4] nohz: Only update sleeptime stats locally
@ 2014-04-24 18:45 Denys Vlasenko
  2014-04-24 18:45 ` [PATCH 2/4] nohz: Synchronize sleep time stats with seqlock Denys Vlasenko
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Denys Vlasenko @ 2014-04-24 18:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Frederic Weisbecker, Denys Vlasenko, Hidetoshi Seto,
	Fernando Luis Vazquez Cao, Tetsuo Handa, Thomas Gleixner,
	Ingo Molnar, Peter Zijlstra, Andrew Morton, Arjan van de Ven,
	Oleg Nesterov

From: Frederic Weisbecker <fweisbec@gmail.com>

The idle and io sleeptime stats can be updated concurrently from callers
of get_cpu_idle_time_us(), get_cpu_iowait_time_us() and
tick_nohz_stop_idle().

Updaters can easily race and mess up with internal datas coherency,
for example when a governor calls a get_cpu_*_time_us() API and the
target CPU exits idle at the same time, because no locking or whatsoever
is there to protect against this concurrency.

To fix this, lets only update the sleeptime stats locally when the CPU
exits from idle. This is the only place where a real update is truly
needed. The callers of get_cpu_*_time_us() can simply add up the pending
sleep time delta to the last sleeptime snapshot in order to get a coherent
result. There is no need for them to also update the stats.

Rebased to Linus' tree by Denys Vlasenko.

Reported-by: Fernando Luis Vazquez Cao <fernando_b1@lab.ntt.co.jp>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Fernando Luis Vazquez Cao <fernando_b1@lab.ntt.co.jp>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
---
 kernel/time/tick-sched.c | 63 ++++++++++++++++--------------------------------
 1 file changed, 21 insertions(+), 42 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 6558b7a..7d86a54 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -406,31 +406,16 @@ static void tick_nohz_update_jiffies(ktime_t now)
 	touch_softlockup_watchdog();
 }
 
-/*
- * Updates the per cpu time idle statistics counters
- */
-static void
-update_ts_time_stats(int cpu, struct tick_sched *ts, ktime_t now, u64 *last_update_time)
+static void tick_nohz_stop_idle(struct tick_sched *ts, ktime_t now)
 {
 	ktime_t delta;
 
-	if (ts->idle_active) {
-		delta = ktime_sub(now, ts->idle_entrytime);
-		if (nr_iowait_cpu(cpu) > 0)
-			ts->iowait_sleeptime = ktime_add(ts->iowait_sleeptime, delta);
-		else
-			ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta);
-		ts->idle_entrytime = now;
-	}
-
-	if (last_update_time)
-		*last_update_time = ktime_to_us(now);
-
-}
-
-static void tick_nohz_stop_idle(struct tick_sched *ts, ktime_t now)
-{
-	update_ts_time_stats(smp_processor_id(), ts, now, NULL);
+	/* Updates the per cpu time idle statistics counters */
+	delta = ktime_sub(now, ts->idle_entrytime);
+	if (nr_iowait_cpu(smp_processor_id()) > 0)
+		ts->iowait_sleeptime = ktime_add(ts->iowait_sleeptime, delta);
+	else
+		ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta);
 	ts->idle_active = 0;
 
 	sched_clock_idle_wakeup_event(0);
@@ -469,17 +454,14 @@ u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time)
 		return -1;
 
 	now = ktime_get();
-	if (last_update_time) {
-		update_ts_time_stats(cpu, ts, now, last_update_time);
-		idle = ts->idle_sleeptime;
-	} else {
-		if (ts->idle_active && !nr_iowait_cpu(cpu)) {
-			ktime_t delta = ktime_sub(now, ts->idle_entrytime);
+	if (last_update_time)
+		*last_update_time = ktime_to_us(now);
 
-			idle = ktime_add(ts->idle_sleeptime, delta);
-		} else {
-			idle = ts->idle_sleeptime;
-		}
+	if (ts->idle_active && !nr_iowait_cpu(cpu)) {
+		ktime_t delta = ktime_sub(now, ts->idle_entrytime);
+		idle = ktime_add(ts->idle_sleeptime, delta);
+	} else {
+		idle = ts->idle_sleeptime;
 	}
 
 	return ktime_to_us(idle);
@@ -510,17 +492,14 @@ u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time)
 		return -1;
 
 	now = ktime_get();
-	if (last_update_time) {
-		update_ts_time_stats(cpu, ts, now, last_update_time);
-		iowait = ts->iowait_sleeptime;
-	} else {
-		if (ts->idle_active && nr_iowait_cpu(cpu) > 0) {
-			ktime_t delta = ktime_sub(now, ts->idle_entrytime);
+	if (last_update_time)
+		*last_update_time = ktime_to_us(now);
 
-			iowait = ktime_add(ts->iowait_sleeptime, delta);
-		} else {
-			iowait = ts->iowait_sleeptime;
-		}
+	if (ts->idle_active && nr_iowait_cpu(cpu) > 0) {
+		ktime_t delta = ktime_sub(now, ts->idle_entrytime);
+		iowait = ktime_add(ts->iowait_sleeptime, delta);
+	} else {
+		iowait = ts->iowait_sleeptime;
 	}
 
 	return ktime_to_us(iowait);
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 18+ messages in thread
* [PATCH 1/4] nohz: Only update sleeptime stats locally
@ 2014-05-07 13:41 Denys Vlasenko
  0 siblings, 0 replies; 18+ messages in thread
From: Denys Vlasenko @ 2014-05-07 13:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Frederic Weisbecker, Denys Vlasenko, Hidetoshi Seto,
	Fernando Luis Vazquez Cao, Tetsuo Handa, Thomas Gleixner,
	Ingo Molnar, Peter Zijlstra, Andrew Morton, Arjan van de Ven,
	Oleg Nesterov

From: Frederic Weisbecker <fweisbec@gmail.com>

The idle and io sleeptime stats can be updated concurrently from callers
of get_cpu_idle_time_us(), get_cpu_iowait_time_us() and
tick_nohz_stop_idle().

Updaters can easily race and mess up with internal datas coherency,
for example when a governor calls a get_cpu_*_time_us() API and the
target CPU exits idle at the same time, because no locking or whatsoever
is there to protect against this concurrency.

To fix this, lets only update the sleeptime stats locally when the CPU
exits from idle. This is the only place where a real update is truly
needed. The callers of get_cpu_*_time_us() can simply add up the pending
sleep time delta to the last sleeptime snapshot in order to get a coherent
result. There is no need for them to also update the stats.

Rebased to Linus' tree by Denys Vlasenko.

Reported-by: Fernando Luis Vazquez Cao <fernando_b1@lab.ntt.co.jp>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Fernando Luis Vazquez Cao <fernando_b1@lab.ntt.co.jp>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
---
 kernel/time/tick-sched.c | 63 ++++++++++++++++--------------------------------
 1 file changed, 21 insertions(+), 42 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 6558b7a..7d86a54 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -406,31 +406,16 @@ static void tick_nohz_update_jiffies(ktime_t now)
 	touch_softlockup_watchdog();
 }
 
-/*
- * Updates the per cpu time idle statistics counters
- */
-static void
-update_ts_time_stats(int cpu, struct tick_sched *ts, ktime_t now, u64 *last_update_time)
+static void tick_nohz_stop_idle(struct tick_sched *ts, ktime_t now)
 {
 	ktime_t delta;
 
-	if (ts->idle_active) {
-		delta = ktime_sub(now, ts->idle_entrytime);
-		if (nr_iowait_cpu(cpu) > 0)
-			ts->iowait_sleeptime = ktime_add(ts->iowait_sleeptime, delta);
-		else
-			ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta);
-		ts->idle_entrytime = now;
-	}
-
-	if (last_update_time)
-		*last_update_time = ktime_to_us(now);
-
-}
-
-static void tick_nohz_stop_idle(struct tick_sched *ts, ktime_t now)
-{
-	update_ts_time_stats(smp_processor_id(), ts, now, NULL);
+	/* Updates the per cpu time idle statistics counters */
+	delta = ktime_sub(now, ts->idle_entrytime);
+	if (nr_iowait_cpu(smp_processor_id()) > 0)
+		ts->iowait_sleeptime = ktime_add(ts->iowait_sleeptime, delta);
+	else
+		ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta);
 	ts->idle_active = 0;
 
 	sched_clock_idle_wakeup_event(0);
@@ -469,17 +454,14 @@ u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time)
 		return -1;
 
 	now = ktime_get();
-	if (last_update_time) {
-		update_ts_time_stats(cpu, ts, now, last_update_time);
-		idle = ts->idle_sleeptime;
-	} else {
-		if (ts->idle_active && !nr_iowait_cpu(cpu)) {
-			ktime_t delta = ktime_sub(now, ts->idle_entrytime);
+	if (last_update_time)
+		*last_update_time = ktime_to_us(now);
 
-			idle = ktime_add(ts->idle_sleeptime, delta);
-		} else {
-			idle = ts->idle_sleeptime;
-		}
+	if (ts->idle_active && !nr_iowait_cpu(cpu)) {
+		ktime_t delta = ktime_sub(now, ts->idle_entrytime);
+		idle = ktime_add(ts->idle_sleeptime, delta);
+	} else {
+		idle = ts->idle_sleeptime;
 	}
 
 	return ktime_to_us(idle);
@@ -510,17 +492,14 @@ u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time)
 		return -1;
 
 	now = ktime_get();
-	if (last_update_time) {
-		update_ts_time_stats(cpu, ts, now, last_update_time);
-		iowait = ts->iowait_sleeptime;
-	} else {
-		if (ts->idle_active && nr_iowait_cpu(cpu) > 0) {
-			ktime_t delta = ktime_sub(now, ts->idle_entrytime);
+	if (last_update_time)
+		*last_update_time = ktime_to_us(now);
 
-			iowait = ktime_add(ts->iowait_sleeptime, delta);
-		} else {
-			iowait = ts->iowait_sleeptime;
-		}
+	if (ts->idle_active && nr_iowait_cpu(cpu) > 0) {
+		ktime_t delta = ktime_sub(now, ts->idle_entrytime);
+		iowait = ktime_add(ts->iowait_sleeptime, delta);
+	} else {
+		iowait = ts->iowait_sleeptime;
 	}
 
 	return ktime_to_us(iowait);
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 18+ messages in thread
* [PATCH RESEND 0/4] nohz: Fix racy sleeptime stats
@ 2013-08-16 15:42 Frederic Weisbecker
  2013-08-16 15:42 ` [PATCH 1/4] nohz: Only update sleeptime stats locally Frederic Weisbecker
  0 siblings, 1 reply; 18+ messages in thread
From: Frederic Weisbecker @ 2013-08-16 15:42 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: LKML, Frederic Weisbecker, Fernando Luis Vazquez Cao,
	Tetsuo Handa, Thomas Gleixner, Ingo Molnar, Peter Zijlstra,
	Andrew Morton, Arjan van de Ven

Hi,

This is a resend to include Oleg in Cc, no modification since last version
https://lkml.org/lkml/2013/8/8/638.

Tree is still:

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
	timers/nohz-sleeptime-fix

Thanks.

Frederic Weisbecker (4):
  nohz: Only update sleeptime stats locally
  nohz: Synchronize sleep time stats with seqlock
  nohz: Consolidate sleep time stats read code
  nohz: Convert a few places to use local per cpu accesses

 include/linux/tick.h     |    2 +
 kernel/time/tick-sched.c |  124 +++++++++++++++++++--------------------------
 2 files changed, 54 insertions(+), 72 deletions(-)

-- 
1.7.5.4


^ permalink raw reply	[flat|nested] 18+ messages in thread
* [PATCH 0/4] nohz: Fix racy sleeptime stats
@ 2013-08-09  0:54 Frederic Weisbecker
  2013-08-09  0:54 ` [PATCH 1/4] nohz: Only update sleeptime stats locally Frederic Weisbecker
  0 siblings, 1 reply; 18+ messages in thread
From: Frederic Weisbecker @ 2013-08-09  0:54 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Fernando Luis Vazquez Cao, Tetsuo Handa,
	Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Andrew Morton,
	Arjan van de Ven

Hi,

Here is a propsition to fix the bug described here:

http://lkml.kernel.org/r/1363660703.4993.3.camel@nexus
http://lkml.kernel.org/r/201304012205.DFC60784.HVOMJSFFLFtOOQ@I-love.SAKURA.ne.jp

You can try the following branch:

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
	timers/nohz-sleeptime-fix

Thanks,
	Frederic
---

Frederic Weisbecker (4):
      nohz: Only update sleeptime stats locally
      nohz: Synchronize sleep time stats with seqlock
      nohz: Consolidate sleep time stats read code
      nohz: Convert a few places to use local per cpu accesses


 include/linux/tick.h     |    2 +
 kernel/time/tick-sched.c |  124 +++++++++++++++++++--------------------------
 2 files changed, 54 insertions(+), 72 deletions(-)

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2014-05-07 13:42 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-04-24 18:45 [PATCH 1/4] nohz: Only update sleeptime stats locally Denys Vlasenko
2014-04-24 18:45 ` [PATCH 2/4] nohz: Synchronize sleep time stats with seqlock Denys Vlasenko
2014-04-24 18:45 ` [PATCH 3/4] nohz: Fix idle/iowait counts going backwards Denys Vlasenko
2014-04-24 18:45 ` [PATCH 4/4] nohz: Fix iowait overcounting if iowait task migrates Denys Vlasenko
2014-04-24 19:18   ` Peter Zijlstra
2014-04-25 18:56     ` Denys Vlasenko
2014-04-29 14:47   ` Frederic Weisbecker
2014-05-05 18:06     ` Denys Vlasenko
2014-05-07 13:38       ` Denys Vlasenko
2014-04-29 16:20   ` Frederic Weisbecker
2014-05-05 18:14     ` Denys Vlasenko
  -- strict thread matches above, loose matches on Subject: below --
2014-05-07 13:41 [PATCH 1/4] nohz: Only update sleeptime stats locally Denys Vlasenko
2013-08-16 15:42 [PATCH RESEND 0/4] nohz: Fix racy sleeptime stats Frederic Weisbecker
2013-08-16 15:42 ` [PATCH 1/4] nohz: Only update sleeptime stats locally Frederic Weisbecker
2013-08-18 16:49   ` Oleg Nesterov
2013-08-18 21:38     ` Frederic Weisbecker
2013-08-18 17:04   ` Oleg Nesterov
2013-08-19 18:05     ` Stratos Karafotis
2013-08-09  0:54 [PATCH 0/4] nohz: Fix racy sleeptime stats Frederic Weisbecker
2013-08-09  0:54 ` [PATCH 1/4] nohz: Only update sleeptime stats locally Frederic Weisbecker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).