* Patch "sched/cputime: Fix steal time accounting vs. CPU hotplug" has been added to the 4.4-stable tree
From: gregkh @ 2016-04-10 18:18 UTC (permalink / raw)
To: tglx, fweisbec, glommer, gregkh, mingo, peterz, riel, torvalds
Cc: stable, stable-commits
This is a note to let you know that I've just added the patch titled
    sched/cputime: Fix steal time accounting vs. CPU hotplug
to the 4.4-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
The filename of the patch is:
    sched-cputime-fix-steal-time-accounting-vs.-cpu-hotplug.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
From e9532e69b8d1d1284e8ecf8d2586de34aec61244 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx@linutronix.de>
Date: Fri, 4 Mar 2016 15:59:42 +0100
Subject: sched/cputime: Fix steal time accounting vs. CPU hotplug
From: Thomas Gleixner <tglx@linutronix.de>
commit e9532e69b8d1d1284e8ecf8d2586de34aec61244 upstream.
On CPU hotplug the steal time accounting can keep a stale rq->prev_steal_time
value across CPU down and up. So after the CPU comes up again the delta
calculation in steal_account_process_tick() wrecks itself due to the
unsigned math:
	u64 steal = paravirt_steal_clock(smp_processor_id());
	steal -= this_rq()->prev_steal_time;
So if steal is smaller than rq->prev_steal_time we end up with an insanely large
value which then gets added to rq->prev_steal_time, permanently wrecking
the accounting. As a consequence the per-CPU stats in /proc/stat
become stale.
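For readers who want to see the wrap in isolation, here is a minimal userspace
sketch of the subtraction quoted above. It is not kernel code: steal_delta()
and the sample numbers are made up for illustration, and uint64_t stands in
for the kernel's u64.

```c
#include <stdint.h>

/*
 * Hypothetical userspace model of the delta computation quoted in the
 * changelog. steal_clock plays the role of paravirt_steal_clock() and
 * prev_steal_time the role of rq->prev_steal_time.
 */
uint64_t steal_delta(uint64_t steal_clock, uint64_t prev_steal_time)
{
	/* Unsigned subtraction: wraps modulo 2^64 when steal_clock is smaller. */
	return steal_clock - prev_steal_time;
}
```

With a remembered value of 1000, a reading of 1500 gives a delta of 500, but
a reading of 100 gives 2^64 - 900 rather than a negative number.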
Nice trick to tell the world how idle the system is (100%) while the CPU is
100% busy running tasks. Though we prefer realistic numbers.
None of the accounting values which use a previous value to account for
fractions is reset at CPU hotplug time. update_rq_clock_task() has a sanity
check for prev_irq_time and prev_steal_time_rq, but that sanity check solely
deals with clock warps and limits the /proc/stat visible wreckage. The
prev_time values are still wrong.
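A rough userspace model of such a sanity check, under the assumption that it
simply caps the computed steal at the elapsed clock delta, shows why the
remembered value stays wrong. clamped_steal() and its parameters are
hypothetical names for this sketch, not the code in update_rq_clock_task().

```c
#include <stdint.h>

/*
 * Hypothetical model: compute the steal delta against the remembered
 * value, cap it at the elapsed time, and advance the remembered value
 * by the capped amount only.
 */
uint64_t clamped_steal(uint64_t steal_clock, uint64_t *prev_steal_time_rq,
		       uint64_t delta)
{
	uint64_t steal = steal_clock - *prev_steal_time_rq;

	if (steal > delta)	/* also catches the wrapped, huge value */
		steal = delta;

	*prev_steal_time_rq += steal;
	return steal;
}
```

Starting from a stale remembered value of 1000, a reading of 100 and an
elapsed delta of 50, the function reports 50 and moves the remembered value
to 1050, which is even further from the actual reading. The cap bounds what
gets reported per update but does nothing to resynchronize the stored value.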
The solution is simple: reset rq->prev_*_time when the CPU is plugged in again.
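The effect of the reset can be sketched in userspace as follows. struct
rq_model and both helpers are illustrative stand-ins, assuming only the three
prev_*_time fields named in this patch; the config-dependent #ifdefs of the
real helper are left out.

```c
#include <stdint.h>

/* Userspace stand-in for the three runqueue fields touched by this patch. */
struct rq_model {
	uint64_t prev_irq_time;
	uint64_t prev_steal_time;
	uint64_t prev_steal_time_rq;
};

/* Model of the reset done when the CPU is plugged in again. */
void rq_model_reset(struct rq_model *rq)
{
	rq->prev_irq_time = 0;
	rq->prev_steal_time = 0;
	rq->prev_steal_time_rq = 0;
}

/* Delta as in the changelog snippet: plain unsigned subtraction. */
uint64_t rq_model_steal_delta(const struct rq_model *rq, uint64_t steal_clock)
{
	return steal_clock - rq->prev_steal_time;
}
```

With a stale prev_steal_time of 1000 and a reading of 100, the delta wraps to
a huge value. After rq_model_reset() the same reading yields a delta of 100.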
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: commit 095c0aa83e52 "sched: adjust scheduler cpu power for stolen time"
Fixes: commit aa483808516c "sched: Remove irq time from available CPU power"
Fixes: commit e6e6685accfa "KVM guest: Steal time accounting"
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1603041539490.3686@nanos
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/sched/core.c  |    1 +
 kernel/sched/sched.h |   13 +++++++++++++
 2 files changed, 14 insertions(+)
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5525,6 +5525,7 @@ migration_call(struct notifier_block *nf
 
 	case CPU_UP_PREPARE:
 		rq->calc_load_update = calc_load_update;
+		account_reset_rq(rq);
 		break;
 
 	case CPU_ONLINE:
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1770,3 +1770,16 @@ static inline u64 irq_time_read(int cpu)
 }
 #endif /* CONFIG_64BIT */
 #endif /* CONFIG_IRQ_TIME_ACCOUNTING */
+
+static inline void account_reset_rq(struct rq *rq)
+{
+#ifdef CONFIG_IRQ_TIME_ACCOUNTING
+	rq->prev_irq_time = 0;
+#endif
+#ifdef CONFIG_PARAVIRT
+	rq->prev_steal_time = 0;
+#endif
+#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
+	rq->prev_steal_time_rq = 0;
+#endif
+}
Patches currently in stable-queue which might be from tglx@linutronix.de are
queue-4.4/edac-sb_edac-fix-computation-of-channel-address.patch
queue-4.4/x86-iopl-64-properly-context-switch-iopl-on-xen-pv.patch
queue-4.4/sched-cputime-fix-steal_account_process_tick-to-always-return-jiffies.patch
queue-4.4/bitops-do-not-default-to-__clear_bit-for-__clear_bit_unlock.patch
queue-4.4/sched-cputime-fix-steal-time-accounting-vs.-cpu-hotplug.patch
queue-4.4/x86-iopl-fix-iopl-capability-check-on-xen-pv.patch
queue-4.4/x86-apic-fix-suspicious-rcu-usage-in-smp_trace_call_function_interrupt.patch
queue-4.4/perf-x86-intel-add-definition-for-pt-pmi-bit.patch
queue-4.4/x86-microcode-intel-make-early-loader-look-for-builtin-microcode-too.patch
queue-4.4/perf-core-fix-perf_sched_count-derailment.patch
queue-4.4/x86-microcode-untangle-from-blk_dev_initrd.patch
queue-4.4/x86-irq-cure-live-lock-in-fixup_irqs.patch
queue-4.4/x86-entry-compat-keep-ts_compat-set-during-signal-delivery.patch