From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1760101AbcAUSzk (ORCPT ); Thu, 21 Jan 2016 13:55:40 -0500
Received: from terminus.zytor.com ([198.137.202.10]:57369 "EHLO terminus.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1758455AbcAUSzg (ORCPT ); Thu, 21 Jan 2016 13:55:36 -0500
Date: Thu, 21 Jan 2016 10:54:45 -0800
From: tip-bot for Vik Heyndrickx
Message-ID:
Cc: efault@gmx.de, tglx@linutronix.de, linux-kernel@vger.kernel.org, hpa@zytor.com, vik.heyndrickx@veribox.net, torvalds@linux-foundation.org, dsmythies@telus.net, mingo@kernel.org, peterz@infradead.org
Reply-To: hpa@zytor.com, linux-kernel@vger.kernel.org, efault@gmx.de, tglx@linutronix.de, peterz@infradead.org, torvalds@linux-foundation.org, vik.heyndrickx@veribox.net, mingo@kernel.org, dsmythies@telus.net
In-Reply-To: <56A0A38D.4040900@veribox.net>
References: <56A0A38D.4040900@veribox.net>
To: linux-tip-commits@vger.kernel.org
Subject: [tip:sched/urgent] sched: Fix non-zero idle loadavg
Git-Commit-ID: 1f9649ef6aa1bac53fb478d9e641b22d67f8423c
X-Mailer: tip-git-log-daemon
Robot-ID:
Robot-Unsubscribe: Contact to get blacklisted from these emails
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=UTF-8
Content-Disposition: inline
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Commit-ID:  1f9649ef6aa1bac53fb478d9e641b22d67f8423c
Gitweb:     http://git.kernel.org/tip/1f9649ef6aa1bac53fb478d9e641b22d67f8423c
Author:     Vik Heyndrickx
AuthorDate: Thu, 21 Jan 2016 10:23:25 +0100
Committer:  Ingo Molnar
CommitDate: Thu, 21 Jan 2016 18:55:23 +0100

sched: Fix non-zero idle loadavg

Systems show a minimal load average of 0.00, 0.01, 0.05 even when they
have no load at all.

On all systems running kernels released during the last five years, up
to and including kernel version 4.4, uptime and /proc/loadavg show a
minimum 5- and 15-minute loadavg of 0.01 and 0.05 respectively. This
should be 0.00 on idle systems, but the way the kernel calculates the
value prevents it from dropping below those figures.

Likewise, though less noticeably, a fully loaded system with no
processes waiting shows a maximum 1/5/15 loadavg of 1.00, 0.99, 0.95
(multiplied by the number of cores).

Removing the single line of code that rounded the internally kept load
value, which effectively returns calc_load() to its earlier state,
fixes the display problem completely.

Once the (old) load reaches 93 or higher, it mathematically can never
drop below 93 again, even when the active load remains 0 forever. This
results in the strange 0.00, 0.01, 0.05 uptime values on idle systems.
Note: 93/2048 = 0.0454..., which rounds up to 0.05.

It is not correct to add a 0.5 rounding (=1024/2048) here, because the
result of this function is fed back into the next iteration. The +0.5
rounding value then gets multiplied by (2048-2037) and rounded again,
which creates a virtual "ghost" load next to the old and active load
terms.

The modified code was tested on nohz=off and nohz kernels. It was
tested on vanilla kernel 4.4 and on the CentOS 7.1 kernel 3.10.0-327.
It was tested on single-, dual-, and eight-core systems, and on both
virtual hosts and bare hardware. No unwanted effects were observed, and
the problems the patch was intended to fix were indeed gone.

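The floor described above can be reproduced in user space. The sketch
below is an illustration only, not kernel code: calc_load_step() copies
the body of calc_load() with a flag selecting the rounded or unrounded
form, while idle_floor() and shown_hundredths() are hypothetical helpers
introduced here. EXP_1 and EXP_5 are taken from the kernel's load-average
constants (only 2037 appears in this changelog), and the display helper
assumes /proc/loadavg adds FIXED_1/200 before truncating to two decimals.

```c
/* Fixed-point constants matching the kernel's load-average definitions. */
#define FSHIFT   11                 /* bits of fractional precision */
#define FIXED_1  (1UL << FSHIFT)    /* 1.0 in fixed point (2048) */
#define EXP_1    1884UL             /* 1-minute decay factor */
#define EXP_5    2014UL             /* 5-minute decay factor */
#define EXP_15   2037UL             /* 15-minute decay factor */

/*
 * One calc_load() iteration. do_round != 0 keeps the rounding line that
 * this patch removes; do_round == 0 is the patched behaviour.
 */
static unsigned long calc_load_step(unsigned long load, unsigned long exp,
                                    unsigned long active, int do_round)
{
    load *= exp;
    load += active * (FIXED_1 - exp);
    if (do_round)
        load += 1UL << (FSHIFT - 1);
    return load >> FSHIFT;
}

/*
 * Hypothetical helper: feed the result back with active == 0 until it
 * stops changing, and return the value an idle system settles at.
 */
static unsigned long idle_floor(unsigned long load, unsigned long exp,
                                int do_round)
{
    for (;;) {
        unsigned long next = calc_load_step(load, exp, 0, do_round);

        if (next == load)
            return load;
        load = next;
    }
}

/*
 * Hypothetical helper: the two decimals a reader would see, assuming the
 * display path adds FIXED_1/200 and then truncates.
 */
static unsigned long shown_hundredths(unsigned long load)
{
    load += FIXED_1 / 200;
    return ((load & (FIXED_1 - 1)) * 100) >> FSHIFT;
}
```

Starting from a load of FIXED_1, idle_floor() with rounding enabled
settles at 6, 30 and 93 for EXP_1, EXP_5 and EXP_15, which
shown_hundredths() maps to .00, .01 and .05. With rounding disabled all
three settle at 0.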
Signed-off-by: Vik Heyndrickx
[ Changelog edits ]
Signed-off-by: Peter Zijlstra (Intel)
Cc: Doug Smythies
Cc: Linus Torvalds
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: stable@vger.kernel.org
Fixes: 0f004f5a696a ("sched: Cure more NO_HZ load average woes")
Link: http://lkml.kernel.org/r/56A0A38D.4040900@veribox.net
Signed-off-by: Ingo Molnar
---
 kernel/sched/loadavg.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/kernel/sched/loadavg.c b/kernel/sched/loadavg.c
index ef71590..eb83b93 100644
--- a/kernel/sched/loadavg.c
+++ b/kernel/sched/loadavg.c
@@ -101,7 +101,6 @@ calc_load(unsigned long load, unsigned long exp, unsigned long active)
 {
 	load *= exp;
 	load += active * (FIXED_1 - exp);
-	load += 1UL << (FSHIFT - 1);
 	return load >> FSHIFT;
 }