From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: from mail.linuxfoundation.org ([140.211.169.12]:50899 "EHLO
	mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751153AbcDJSSP (ORCPT );
	Sun, 10 Apr 2016 14:18:15 -0400
Subject: Patch "sched/cputime: Fix steal time accounting vs. CPU hotplug" has been added to the 4.4-stable tree
To: tglx@linutronix.de, fweisbec@gmail.com, glommer@parallels.com,
	gregkh@linuxfoundation.org, mingo@kernel.org, peterz@infradead.org,
	riel@redhat.com, torvalds@linux-foundation.org
Cc: ,
From:
Date: Sun, 10 Apr 2016 11:18:14 -0700
Message-ID: <1460312294224125@kroah.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=ANSI_X3.4-1968
Content-Transfer-Encoding: 8bit
Sender: stable-owner@vger.kernel.org
List-ID:

This is a note to let you know that I've just added the patch titled

    sched/cputime: Fix steal time accounting vs. CPU hotplug

to the 4.4-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     sched-cputime-fix-steal-time-accounting-vs.-cpu-hotplug.patch
and it can be found in the queue-4.4 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let know about it.


>>From e9532e69b8d1d1284e8ecf8d2586de34aec61244 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner
Date: Fri, 4 Mar 2016 15:59:42 +0100
Subject: sched/cputime: Fix steal time accounting vs. CPU hotplug

From: Thomas Gleixner

commit e9532e69b8d1d1284e8ecf8d2586de34aec61244 upstream.

On CPU hotplug the steal time accounting can keep a stale rq->prev_steal_time
value over CPU down and up.
So after the CPU comes up again the delta calculation in
steal_account_process_tick() wreckages itself due to the unsigned math:

	 u64 steal = paravirt_steal_clock(smp_processor_id());

	 steal -= this_rq()->prev_steal_time;

So if steal is smaller than rq->prev_steal_time we end up with an insane large
value which then gets added to rq->prev_steal_time, resulting in a permanent
wreckage of the accounting. As a consequence the per CPU stats in /proc/stat
become stale.

Nice trick to tell the world how idle the system is (100%) while the CPU is
100% busy running tasks. Though we prefer realistic numbers.

None of the accounting values which use a previous value to account for
fractions is reset at CPU hotplug time. update_rq_clock_task() has a sanity
check for prev_irq_time and prev_steal_time_rq, but that sanity check solely
deals with clock warps and limits the /proc/stat visible wreckage. The
prev_time values are still wrong.

Solution is simple: Reset rq->prev_*_time when the CPU is plugged in again.

Signed-off-by: Thomas Gleixner
Acked-by: Rik van Riel
Cc: Frederic Weisbecker
Cc: Glauber Costa
Cc: Linus Torvalds
Cc: Peter Zijlstra
Fixes: commit 095c0aa83e52 "sched: adjust scheduler cpu power for stolen time"
Fixes: commit aa483808516c "sched: Remove irq time from available CPU power"
Fixes: commit e6e6685accfa "KVM guest: Steal time accounting"
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1603041539490.3686@nanos
Signed-off-by: Ingo Molnar
Signed-off-by: Greg Kroah-Hartman

---
 kernel/sched/core.c  |    1 +
 kernel/sched/sched.h |   13 +++++++++++++
 2 files changed, 14 insertions(+)

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5525,6 +5525,7 @@ migration_call(struct notifier_block *nf
 
 	case CPU_UP_PREPARE:
 		rq->calc_load_update = calc_load_update;
+		account_reset_rq(rq);
 		break;
 
 	case CPU_ONLINE:
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1770,3 +1770,16 @@ static inline u64 irq_time_read(int cpu)
 }
 #endif /* CONFIG_64BIT */
 #endif /* CONFIG_IRQ_TIME_ACCOUNTING */
+
+static inline void account_reset_rq(struct rq *rq)
+{
+#ifdef CONFIG_IRQ_TIME_ACCOUNTING
+	rq->prev_irq_time = 0;
+#endif
+#ifdef CONFIG_PARAVIRT
+	rq->prev_steal_time = 0;
+#endif
+#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
+	rq->prev_steal_time_rq = 0;
+#endif
+}


Patches currently in stable-queue which might be from tglx@linutronix.de are

queue-4.4/edac-sb_edac-fix-computation-of-channel-address.patch
queue-4.4/x86-iopl-64-properly-context-switch-iopl-on-xen-pv.patch
queue-4.4/sched-cputime-fix-steal_account_process_tick-to-always-return-jiffies.patch
queue-4.4/bitops-do-not-default-to-__clear_bit-for-__clear_bit_unlock.patch
queue-4.4/sched-cputime-fix-steal-time-accounting-vs.-cpu-hotplug.patch
queue-4.4/x86-iopl-fix-iopl-capability-check-on-xen-pv.patch
queue-4.4/x86-apic-fix-suspicious-rcu-usage-in-smp_trace_call_function_interrupt.patch
queue-4.4/perf-x86-intel-add-definition-for-pt-pmi-bit.patch
queue-4.4/x86-microcode-intel-make-early-loader-look-for-builtin-microcode-too.patch
queue-4.4/perf-core-fix-perf_sched_count-derailment.patch
queue-4.4/x86-microcode-untangle-from-blk_dev_initrd.patch
queue-4.4/x86-irq-cure-live-lock-in-fixup_irqs.patch
queue-4.4/x86-entry-compat-keep-ts_compat-set-during-signal-delivery.patch