* [patch] Avoid too many remote cpu references due to /proc/stat
2007-07-13 0:06 [patch] x86_64: Avoid too many remote cpu references due to /proc/stat Ravikiran G Thirumalai
@ 2007-07-13 0:09 ` Ravikiran G Thirumalai
2007-07-13 2:13 ` [patch] x86_64: " Andrew Morton
2007-07-13 10:03 ` Christoph Hellwig
2 siblings, 0 replies; 5+ messages in thread
From: Ravikiran G Thirumalai @ 2007-07-13 0:09 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, linux-kernel, Shai Fultheim (Shai@scalex86.org)
On Thu, Jul 12, 2007 at 05:06:15PM -0700, Ravikiran G Thirumalai wrote:
> Too many remote cpu references due to /proc/stat.
>
> On x86_64, with newer kernel versions, kstat_irqs is a bit of a problem.
> On every call to kstat_irqs, the process brings in per-cpu data from all
> online cpus. Doing this for NR_IRQS, which is now 256 + 32 * NR_CPUS
> results in (256+32*63) * 63 remote cpu references on a 64 cpu config.
> /proc/stat is parsed by common commands like top, who etc, causing
> lots of cacheline transfers
>
> This statistic seems useless. Other 'big iron' arches disable this.
> Can we disable computing/reporting this statistic? This piece of
> statistic is not human readable on x86_64 anymore,
>
> If not, can we optimize computing this statistic so as to avoid
> too many remote references (patch to follow)
Optimize show_stat to collect per-irq information just once.
On x86_64, with newer kernel versions, kstat_irqs is a bit of a problem.
On every call to kstat_irqs, the process brings in per-cpu data from all
online cpus. Doing this for NR_IRQS, which is now 256 + 32 * NR_CPUS
results in (256+32*63) * 63 remote cpu references on a 64 cpu config.
Considering the fact that we already compute this value per-cpu, we can
save on the remote references as below.
Signed-off-by: Alok N Kataria <alok.kataria@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Index: linux-2.6.22/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.22.orig/fs/proc/proc_misc.c 2007-07-11 14:32:33.013197741 -0700
+++ linux-2.6.22/fs/proc/proc_misc.c 2007-07-12 16:28:24.871389279 -0700
@@ -443,6 +443,11 @@ static int show_stat(struct seq_file *p,
unsigned long jif;
cputime64_t user, nice, system, idle, iowait, irq, softirq, steal;
u64 sum = 0;
+ unsigned int *per_irq_sum;
+
+ per_irq_sum = kzalloc(sizeof(unsigned int)*NR_IRQS, GFP_KERNEL);
+ if (!per_irq_sum)
+ return -ENOMEM;
user = nice = system = idle = iowait =
irq = softirq = steal = cputime64_zero;
@@ -461,8 +466,11 @@ static int show_stat(struct seq_file *p,
irq = cputime64_add(irq, kstat_cpu(i).cpustat.irq);
softirq = cputime64_add(softirq, kstat_cpu(i).cpustat.softirq);
steal = cputime64_add(steal, kstat_cpu(i).cpustat.steal);
- for (j = 0 ; j < NR_IRQS ; j++)
- sum += kstat_cpu(i).irqs[j];
+ for (j = 0 ; j < NR_IRQS ; j++) {
+ unsigned int temp = kstat_cpu(i).irqs[j];
+ sum += temp;
+ per_irq_sum[j] += temp;
+ }
}
seq_printf(p, "cpu %llu %llu %llu %llu %llu %llu %llu %llu\n",
@@ -500,7 +508,7 @@ static int show_stat(struct seq_file *p,
#if !defined(CONFIG_PPC64) && !defined(CONFIG_ALPHA) && !defined(CONFIG_IA64)
for (i = 0; i < NR_IRQS; i++)
- seq_printf(p, " %u", kstat_irqs(i));
+ seq_printf(p, " %u", per_irq_sum[i]);
#endif
seq_printf(p,
@@ -515,6 +523,7 @@ static int show_stat(struct seq_file *p,
nr_running(),
nr_iowait());
+ kfree(per_irq_sum);
return 0;
}
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [patch] x86_64: Avoid too many remote cpu references due to /proc/stat
2007-07-13 0:06 [patch] x86_64: Avoid too many remote cpu references due to /proc/stat Ravikiran G Thirumalai
2007-07-13 0:09 ` [patch] " Ravikiran G Thirumalai
@ 2007-07-13 2:13 ` Andrew Morton
2007-07-13 6:50 ` Ravikiran G Thirumalai
2007-07-13 10:03 ` Christoph Hellwig
2 siblings, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2007-07-13 2:13 UTC (permalink / raw)
To: Ravikiran G Thirumalai
Cc: Andi Kleen, linux-kernel, Shai Fultheim (Shai@scalex86.org)
On Thu, 12 Jul 2007 17:06:16 -0700 Ravikiran G Thirumalai <kiran@scalex86.org> wrote:
> Too many remote cpu references due to /proc/stat.
>
> On x86_64, with newer kernel versions, kstat_irqs is a bit of a problem.
> On every call to kstat_irqs, the process brings in per-cpu data from all
> online cpus. Doing this for NR_IRQS, which is now 256 + 32 * NR_CPUS
> results in (256+32*63) * 63 remote cpu references on a 64 cpu config.
> /proc/stat is parsed by common commands like top, who etc, causing
> lots of cacheline transfers
(256+32*63) * 63 = 143136
Do you have any actual numbers for how much this hurts?
> This statistic seems useless. Other 'big iron' arches disable this.
> Can we disable computing/reporting this statistic? This piece of
> statistic is not human readable on x86_64 anymore,
Did you consider using percpu_counters (or such) at interrupt-time?
(warning: percpu_counters aren't presently interrupt safe).
> If not, can we optimize computing this statistic so as to avoid
> too many remote references (patch to follow)
You other patch is a straightforward optimisation and should just be merged.
But afaict it will only provide a 2x speedup which I doubt is sufficient?
Another thought is: how many of the NR_IRQS counters are actually non-zero?
Because a pretty obvious optimisation would be to have a global
bitmap[NR_IRQS] and do
if (!bitmap[irq])
bitmap[irq] = 1;
at interrupt-time, then just print a "0" for the interrupts which have
never occurred within show_stats().
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [patch] x86_64: Avoid too many remote cpu references due to /proc/stat
2007-07-13 0:06 [patch] x86_64: Avoid too many remote cpu references due to /proc/stat Ravikiran G Thirumalai
2007-07-13 0:09 ` [patch] " Ravikiran G Thirumalai
2007-07-13 2:13 ` [patch] x86_64: " Andrew Morton
@ 2007-07-13 10:03 ` Christoph Hellwig
2 siblings, 0 replies; 5+ messages in thread
From: Christoph Hellwig @ 2007-07-13 10:03 UTC (permalink / raw)
To: Ravikiran G Thirumalai
Cc: Andi Kleen, Andrew Morton, linux-kernel,
Shai Fultheim (Shai@scalex86.org)
On Thu, Jul 12, 2007 at 05:06:16PM -0700, Ravikiran G Thirumalai wrote:
> Too many remote cpu references due to /proc/stat.
>
> On x86_64, with newer kernel versions, kstat_irqs is a bit of a problem.
> On every call to kstat_irqs, the process brings in per-cpu data from all
> online cpus. Doing this for NR_IRQS, which is now 256 + 32 * NR_CPUS
> results in (256+32*63) * 63 remote cpu references on a 64 cpu config.
> /proc/stat is parsed by common commands like top, who etc, causing
> lots of cacheline transfers
>
> This statistic seems useless. Other 'big iron' arches disable this.
> Can we disable computing/reporting this statistic? This piece of
> statistic is not human readable on x86_64 anymore,
At which point we might just remove it completely..
^ permalink raw reply [flat|nested] 5+ messages in thread