All of lore.kernel.org
 help / color / mirror / Atom feed
From: Venkatesh Pallipadi <venki@google.com>
To: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@elte.hu>, "H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: linux-kernel@vger.kernel.org, Paul Turner <pjt@google.com>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	Venkatesh Pallipadi <venki@google.com>
Subject: [PATCH 7/7] Export per cpu hardirq and softirq time in proc -v3
Date: Wed, 29 Sep 2010 12:21:36 -0700	[thread overview]
Message-ID: <1285788096-29471-8-git-send-email-venki@google.com> (raw)
In-Reply-To: <1285788096-29471-1-git-send-email-venki@google.com>

I can predict this change being debated.

There is already per CPU and system level irq time in /proc/stat, which
on arch like x86 is based on sampled data. Earlier patches in this series
adds a fine grained irq time option for such archs. And exporting this
fine grained irq time to userspace seems helpful.

How should it be exported though? I considered:
(1) Changing the currently exported info in /proc/stat to directly use this
    new fine grained irq times. Doing that though will likely break the sum
    view to the user as user/system/ and other times there are still sample
    based and only irq time will be fine grained. So, user will almost always
    see sum time != 100% in top etc.
(2) Changing the currently exported info in /proc/stat to indirectly use this
    new fine grained irq times. By still doing the cpustat updating on ticks,
    but looking at the fine grained stats to figure out whether the time should
    be hardirq/softirq/user/system/idle. Doing that will be a lot of code
    churn in kernel/sched.c:account_*_tick code which is already sort of
    complicated.
(3) Add a new interface in /proc. Implied an additional file read and buffer
    allocation, etc which I want to avoid if possible.
(4) Don't export this info at all. I am ok with this as a alternative. But,
    I needed this to be exported somewhere for my testing atleast.
(5) piggyback on /proc/interrupts and /proc/softirqs. Assuming users interested
    in this kind of info are already looking into those files, we wont have
    overhead of additional file read. There is still a likely hood of breaking
    some apps which only expect interrupt count in those files.

So, here is the patch that does (5)

Signed-off-by: Venkatesh Pallipadi <venki@google.com>
---
 Documentation/filesystems/proc.txt |    9 +++++++++
 fs/proc/interrupts.c               |   11 ++++++++++-
 fs/proc/softirqs.c                 |    8 ++++++++
 include/linux/sched.h              |    3 +++
 kernel/sched.c                     |   27 +++++++++++++++++++++++++++
 5 files changed, 57 insertions(+), 1 deletions(-)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index a6aca87..4456011 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -536,6 +536,11 @@ the threshold vector does not exist on x86_64 platforms.  Others are
 suppressed when the system is a uniprocessor.  As of this writing, only
 i386 and x86_64 platforms support the new IRQ vector displays.
 
+Another addition to /proc/interrupt is "Time:" line at the end which
+displays time spent by corresponding CPU processing interrupts in USER_HZ units.
+This time is based on fine grained accouting when CONFIG_VIRT_CPU_ACCOUNTING
+or CONFIG_IRQ_TIME_ACCOUNTING is active, otherwise it is tick sample based.
+
 Of some interest is the introduction of the /proc/irq directory to 2.4.
 It could be used to set IRQ to CPU affinity, this means that you can "hook" an
 IRQ to only one CPU, or to exclude a CPU of handling IRQs. The contents of the
@@ -824,6 +829,10 @@ Provides counts of softirq handlers serviced since boot time, for each cpu.
  HRTIMER:          0          0          0          0
      RCU:       1678       1769       2178       2250
 
+Addition to /proc/softirqs is "Time:" line at the end which
+displays time spent by corresponding CPU processing softirqs in USER_HZ units.
+This time is based on fine grained accouting when CONFIG_VIRT_CPU_ACCOUNTING
+or CONFIG_IRQ_TIME_ACCOUNTING is active, otherwise it is tick sample based.
 
 1.3 IDE devices in /proc/ide
 ----------------------------
diff --git a/fs/proc/interrupts.c b/fs/proc/interrupts.c
index 05029c0..66d913a 100644
--- a/fs/proc/interrupts.c
+++ b/fs/proc/interrupts.c
@@ -3,6 +3,7 @@
 #include <linux/interrupt.h>
 #include <linux/irqnr.h>
 #include <linux/proc_fs.h>
+#include <linux/sched.h>
 #include <linux/seq_file.h>
 
 /*
@@ -23,7 +24,15 @@ static void *int_seq_next(struct seq_file *f, void *v, loff_t *pos)
 
 static void int_seq_stop(struct seq_file *f, void *v)
 {
-	/* Nothing to do */
+	int j;
+
+	seq_printf(f, "\n");
+	seq_printf(f, "Time:");
+	for_each_possible_cpu(j)
+		seq_printf(f, " %10lu", (unsigned long)get_cpu_hardirq_time(j));
+	seq_printf(f, "  Interrupt Processing Time\n");
+	seq_printf(f, "\n");
+
 }
 
 static const struct seq_operations int_seq_ops = {
diff --git a/fs/proc/softirqs.c b/fs/proc/softirqs.c
index 1807c24..f028329 100644
--- a/fs/proc/softirqs.c
+++ b/fs/proc/softirqs.c
@@ -1,6 +1,7 @@
 #include <linux/init.h>
 #include <linux/kernel_stat.h>
 #include <linux/proc_fs.h>
+#include <linux/sched.h>
 #include <linux/seq_file.h>
 
 /*
@@ -21,6 +22,13 @@ static int show_softirqs(struct seq_file *p, void *v)
 			seq_printf(p, " %10u", kstat_softirqs_cpu(i, j));
 		seq_printf(p, "\n");
 	}
+
+	seq_printf(p, "\n");
+	seq_printf(p, "    Time:");
+	for_each_possible_cpu(j)
+		seq_printf(p, " %10lu", (unsigned long)get_cpu_softirq_time(j));
+	seq_printf(p, "\n");
+
 	return 0;
 }
 
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 8adf166..6562daf 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1826,6 +1826,9 @@ extern void sched_clock_idle_sleep_event(void);
 extern void sched_clock_idle_wakeup_event(u64 delta_ns);
 #endif
 
+extern clock_t get_cpu_hardirq_time(int cpu);
+extern clock_t get_cpu_softirq_time(int cpu);
+
 #ifdef CONFIG_IRQ_TIME_ACCOUNTING
 /*
  * An i/f to runtime opt-in for irq time accounting based off of sched_clock.
diff --git a/kernel/sched.c b/kernel/sched.c
index bfbe064..a171647 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -73,6 +73,7 @@
 #include <linux/ftrace.h>
 #include <linux/slab.h>
 
+#include <asm/cputime.h>
 #include <asm/tlb.h>
 #include <asm/irq_regs.h>
 
@@ -1999,6 +2000,22 @@ static void sched_irq_time_avg_update(struct rq *rq, u64 curr_irq_time)
 	}
 }
 
+clock_t get_cpu_hardirq_time(int cpu)
+{
+	if (!sched_clock_irqtime)
+		return cputime64_to_clock_t(kstat_cpu(cpu).cpustat.irq);
+
+	return nsec_to_clock_t(per_cpu(cpu_hardirq_time, cpu));
+}
+
+clock_t get_cpu_softirq_time(int cpu)
+{
+	if (!sched_clock_irqtime)
+		return cputime64_to_clock_t(kstat_cpu(cpu).cpustat.softirq);
+
+	return nsec_to_clock_t(per_cpu(cpu_softirq_time, cpu));
+}
+
 #else
 
 static u64 irq_time_cpu(int cpu)
@@ -2008,6 +2025,16 @@ static u64 irq_time_cpu(int cpu)
 
 static void sched_irq_time_avg_update(struct rq *rq, u64 curr_irq_time) { }
 
+clock_t get_cpu_hardirq_time(int cpu)
+{
+	return cputime64_to_clock_t(kstat_cpu(cpu).cpustat.irq);
+}
+
+clock_t get_cpu_softirq_time(int cpu)
+{
+	return cputime64_to_clock_t(kstat_cpu(cpu).cpustat.softirq);
+}
+
 #endif
 
 #include "sched_idletask.c"
-- 
1.7.1


  parent reply	other threads:[~2010-09-29 19:22 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-09-29 19:21 Proper kernel irq time accounting -v3 Venkatesh Pallipadi
2010-09-29 19:21 ` [PATCH 1/7] si time accounting accounts bh_disable'd time to si -v3 Venkatesh Pallipadi
2010-09-30 11:04   ` Peter Zijlstra
2010-09-30 16:26     ` Venkatesh Pallipadi
2010-10-01 23:16       ` Peter Zijlstra
2010-10-02 15:42         ` Venkatesh Pallipadi
2010-10-03  0:34           ` Peter Zijlstra
2010-10-04 16:54             ` Venkatesh Pallipadi
2010-09-29 19:21 ` [PATCH 2/7] Consolidate account_system_vtime extern declaration -v3 Venkatesh Pallipadi
2010-09-29 19:21 ` [PATCH 3/7] Add IRQ_TIME_ACCOUNTING, finer accounting of irq time -v3 Venkatesh Pallipadi
2010-09-30 11:06   ` Peter Zijlstra
2010-09-30 16:29     ` Venkatesh Pallipadi
2010-09-30 20:38       ` Venkatesh Pallipadi
2010-10-01 11:46         ` Peter Zijlstra
2010-10-01 16:51           ` Venkatesh Pallipadi
2010-10-01 17:29             ` Venkatesh Pallipadi
2010-10-01 23:14               ` Peter Zijlstra
2010-10-01 23:32                 ` Venkatesh Pallipadi
2010-10-02 10:53                   ` Peter Zijlstra
2010-10-02 15:26                     ` Venkatesh Pallipadi
2010-10-03  0:26                       ` Peter Zijlstra
2010-10-01 11:45       ` Peter Zijlstra
2010-09-29 19:21 ` [PATCH 4/7] x86: Add IRQ_TIME_ACCOUNTING in x86 -v3 Venkatesh Pallipadi
2010-09-29 19:21 ` [PATCH 5/7] sched: Do not account irq time to current task -v3 Venkatesh Pallipadi
2010-09-29 19:21 ` [PATCH 6/7] sched: Remove irq time from available CPU power -v3 Venkatesh Pallipadi
2010-09-29 19:21 ` Venkatesh Pallipadi [this message]
2010-09-30  7:59 ` Proper kernel irq time accounting -v3 Andi Kleen
2010-09-30 16:37   ` Venkatesh Pallipadi
2010-09-30 17:36     ` Andi Kleen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1285788096-29471-8-git-send-email-venki@google.com \
    --to=venki@google.com \
    --cc=balbir@linux.vnet.ibm.com \
    --cc=eric.dumazet@gmail.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=schwidefsky@de.ibm.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.