All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Frederic Weisbecker <fweisbec@gmail.com>,
	Rik van Riel <riel@redhat.com>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Stanislaw Gruszka <sgruszka@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Wanpeng Li <wanpeng.li@hotmail.com>,
	Ingo Molnar <mingo@kernel.org>,
	Ivan Delalande <colona@arista.com>
Subject: [PATCH 4.9 29/35] sched/cputime: Fix ksoftirqd cputime accounting regression
Date: Thu, 11 Oct 2018 17:35:31 +0200	[thread overview]
Message-ID: <20181011152521.346719028@linuxfoundation.org> (raw)
In-Reply-To: <20181011152520.174949126@linuxfoundation.org>

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Frederic Weisbecker <fweisbec@gmail.com>

commit 25e2d8c1b9e327ed260edd13169cc22bc7a78bc6 upstream.

irq_time_read() returns the irqtime minus the ksoftirqd time. This
is necessary because irq_time_read() is used to substract the IRQ time
from the sum_exec_runtime of a task. If we were to include the softirq
time of ksoftirqd, this task would substract its own CPU time everytime
it updates ksoftirqd->sum_exec_runtime which would therefore never
progress.

But this behaviour got broken by:

  a499a5a14db ("sched/cputime: Increment kcpustat directly on irqtime account")

... which now includes ksoftirqd softirq time in the time returned by
irq_time_read().

This has resulted in wrong ksoftirqd cputime reported to userspace
through /proc/stat and thus "top" not showing ksoftirqd when it should
after intense networking load.

ksoftirqd->stime happens to be correct but it gets scaled down by
sum_exec_runtime through task_cputime_adjusted().

To fix this, just account the strict IRQ time in a separate counter and
use it to report the IRQ time.

Reported-and-tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1493129448-5356-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ivan Delalande <colona@arista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/sched/cputime.c |   27 ++++++++++++++++-----------
 kernel/sched/sched.h   |    9 +++++++--
 2 files changed, 23 insertions(+), 13 deletions(-)

--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -37,6 +37,18 @@ void disable_sched_clock_irqtime(void)
 	sched_clock_irqtime = 0;
 }
 
+static void irqtime_account_delta(struct irqtime *irqtime, u64 delta,
+				  enum cpu_usage_stat idx)
+{
+	u64 *cpustat = kcpustat_this_cpu->cpustat;
+
+	u64_stats_update_begin(&irqtime->sync);
+	cpustat[idx] += delta;
+	irqtime->total += delta;
+	irqtime->tick_delta += delta;
+	u64_stats_update_end(&irqtime->sync);
+}
+
 /*
  * Called before incrementing preempt_count on {soft,}irq_enter
  * and before decrementing preempt_count on {soft,}irq_exit.
@@ -44,7 +56,6 @@ void disable_sched_clock_irqtime(void)
 void irqtime_account_irq(struct task_struct *curr)
 {
 	struct irqtime *irqtime = this_cpu_ptr(&cpu_irqtime);
-	u64 *cpustat = kcpustat_this_cpu->cpustat;
 	s64 delta;
 	int cpu;
 
@@ -55,22 +66,16 @@ void irqtime_account_irq(struct task_str
 	delta = sched_clock_cpu(cpu) - irqtime->irq_start_time;
 	irqtime->irq_start_time += delta;
 
-	u64_stats_update_begin(&irqtime->sync);
 	/*
 	 * We do not account for softirq time from ksoftirqd here.
 	 * We want to continue accounting softirq time to ksoftirqd thread
 	 * in that case, so as not to confuse scheduler with a special task
 	 * that do not consume any time, but still wants to run.
 	 */
-	if (hardirq_count()) {
-		cpustat[CPUTIME_IRQ] += delta;
-		irqtime->tick_delta += delta;
-	} else if (in_serving_softirq() && curr != this_cpu_ksoftirqd()) {
-		cpustat[CPUTIME_SOFTIRQ] += delta;
-		irqtime->tick_delta += delta;
-	}
-
-	u64_stats_update_end(&irqtime->sync);
+	if (hardirq_count())
+		irqtime_account_delta(irqtime, delta, CPUTIME_IRQ);
+	else if (in_serving_softirq() && curr != this_cpu_ksoftirqd())
+		irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
 }
 EXPORT_SYMBOL_GPL(irqtime_account_irq);
 
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1743,6 +1743,7 @@ static inline void nohz_balance_exit_idl
 
 #ifdef CONFIG_IRQ_TIME_ACCOUNTING
 struct irqtime {
+	u64			total;
 	u64			tick_delta;
 	u64			irq_start_time;
 	struct u64_stats_sync	sync;
@@ -1750,16 +1751,20 @@ struct irqtime {
 
 DECLARE_PER_CPU(struct irqtime, cpu_irqtime);
 
+/*
+ * Returns the irqtime minus the softirq time computed by ksoftirqd.
+ * Otherwise ksoftirqd's sum_exec_runtime is substracted its own runtime
+ * and never move forward.
+ */
 static inline u64 irq_time_read(int cpu)
 {
 	struct irqtime *irqtime = &per_cpu(cpu_irqtime, cpu);
-	u64 *cpustat = kcpustat_cpu(cpu).cpustat;
 	unsigned int seq;
 	u64 total;
 
 	do {
 		seq = __u64_stats_fetch_begin(&irqtime->sync);
-		total = cpustat[CPUTIME_SOFTIRQ] + cpustat[CPUTIME_IRQ];
+		total = irqtime->total;
 	} while (__u64_stats_fetch_retry(&irqtime->sync, seq));
 
 	return total;



  parent reply	other threads:[~2018-10-11 15:43 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-11 15:35 [PATCH 4.9 00/35] 4.9.133-stable review Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 01/35] mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 02/35] fbdev/omapfb: fix omapfb_memory_read infoleak Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 03/35] xen-netback: fix input validation in xenvif_set_hash_mapping() Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 04/35] x86/vdso: Fix asm constraints on vDSO syscall fallbacks Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 05/35] x86/vdso: Fix vDSO syscall fallback asm constraint regression Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 06/35] PCI: Reprogram bridge prefetch registers on resume Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 07/35] mac80211: fix setting IEEE80211_KEY_FLAG_RX_MGMT for AP mode keys Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 08/35] PM / core: Clear the direct_complete flag on errors Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 09/35] dm cache metadata: ignore hints array being too small during resize Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 10/35] dm cache: fix resize crash if user doesnt reload cache table Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 11/35] xhci: Add missing CAS workaround for Intel Sunrise Point xHCI Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 12/35] usb: xhci-mtk: resume USB3 roothub first Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 13/35] USB: serial: simple: add Motorola Tetra MTP6550 id Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 14/35] tty: Drop tty->count on tty_reopen() failure Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 15/35] of: unittest: Disable interrupt node tests for old world MAC systems Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 16/35] ext4: add corruption check in ext4_xattr_set_entry() Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 17/35] ext4: always verify the magic number in xattr blocks Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 18/35] cgroup: Fix deadlock in cpu hotplug path Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 19/35] ath10k: fix use-after-free in ath10k_wmi_cmd_send_nowait Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 20/35] ath10k: fix kernel panic issue during pci probe Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 21/35] powerpc/fadump: Return error when fadump registration fails Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 22/35] ARC: clone syscall to setp r25 as thread pointer Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 23/35] x86/mm: Expand static page table for fixmap space Greg Kroah-Hartman
2018-11-01 22:25   ` Ben Hutchings
2018-11-02  3:38     ` Feng Tang
2018-11-02 13:56       ` Sasha Levin
2018-10-11 15:35 ` [PATCH 4.9 24/35] f2fs: fix invalid memory access Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 25/35] ucma: fix a use-after-free in ucma_resolve_ip() Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 26/35] ubifs: Check for name being NULL while mounting Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 27/35] sched/cputime: Convert kcpustat to nsecs Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 28/35] sched/cputime: Increment kcpustat directly on irqtime account Greg Kroah-Hartman
2018-10-11 15:35 ` Greg Kroah-Hartman [this message]
2018-10-11 15:35 ` [PATCH 4.9 30/35] ath10k: fix scan crash due to incorrect length calculation Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 31/35] ebtables: arpreply: Add the standard target sanity check Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 32/35] x86/fpu: Remove use_eager_fpu() Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 33/35] x86/fpu: Remove struct fpu::counter Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 34/35] Revert "perf: sync up x86/.../cpufeatures.h" Greg Kroah-Hartman
2018-10-11 15:35 ` [PATCH 4.9 35/35] x86/fpu: Finish excising eagerfpu Greg Kroah-Hartman
2018-10-11 22:40 ` [PATCH 4.9 00/35] 4.9.133-stable review Shuah Khan
2018-10-12  4:33 ` Naresh Kamboju
2018-10-12 12:20 ` Guenter Roeck
2018-10-12 13:52   ` Greg Kroah-Hartman
2018-10-12 14:44     ` Greg Kroah-Hartman
2018-10-12 17:07 ` Nathan Chancellor
2018-10-12 17:50 ` Guenter Roeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181011152521.346719028@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=brouer@redhat.com \
    --cc=colona@arista.com \
    --cc=fweisbec@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=riel@redhat.com \
    --cc=sgruszka@redhat.com \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=wanpeng.li@hotmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.