linuxppc-dev.lists.ozlabs.org archive mirror
* CONFIG_NO_HZ added too much idle time in /proc/stat during throughput test.
@ 2011-12-13 20:42 Fushen Chen
  2011-12-13 23:24 ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 7+ messages in thread
From: Fushen Chen @ 2011-12-13 20:42 UTC (permalink / raw)
  To: Linuxppc-dev Development

On APM82181, "vmstat" (which reads /proc/stat) does not show the correct
idle percentage when the kernel is built with "CONFIG_NO_HZ" (Tickless
System / Dynamic Tick).

When I run a wireless throughput test with heavy traffic, "vmstat" shows a
very high idle percentage while "oprofile" shows a very low one. During the
test, the system is otherwise idle, but the network traffic uses a lot of
hard IRQ and soft-irq time. "vmstat" would have the correct stats if
account_idle_ticks(ticks) in kernel/time/tick-sched.c did not add more idle
time. In the same test, if I disable "CONFIG_NO_HZ" in the kernel, the idle
percentages in "vmstat" and "oprofile" match.

My APM82181 kernel configuration has "CONFIG_NO_HZ", "CONFIG_HZ_250=y",
"CONFIG_HZ=250", and "CONFIG_HIGH_RES_TIMERS" enabled.

My question: with "CONFIG_NO_HZ" enabled, how can the kernel report
correct stats?
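
For reference, the idle percentage discussed above is derived from the
aggregate "cpu" line of /proc/stat. The Python sketch below shows that
arithmetic for two samples. The function names are hypothetical, and the
field order (user, nice, system, idle, iowait, irq, softirq, steal) is the
one documented in proc(5). It only illustrates the calculation and is not
vmstat's actual code.

```python
# Field order of the aggregate "cpu" line in /proc/stat, per proc(5).
# Trailing guest columns are ignored here because guest time is already
# included in the user/nice columns.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(stat_text):
    """Return {field: ticks} for the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":  # "cpu0", "cpu1", ... do not match
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            # Older kernels report fewer columns; treat missing ones as 0.
            values += [0] * (len(FIELDS) - len(values))
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def percentages(before_text, after_text):
    """Return each field's share (in percent) of the ticks between two samples."""
    before = parse_cpu_line(before_text)
    after = parse_cpu_line(after_text)
    delta = {name: after[name] - before[name] for name in FIELDS}
    total = sum(delta.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return {name: 100.0 * ticks / total for name, ticks in delta.items()}
```

On a live system, read /proc/stat twice a few seconds apart and pass both
strings to percentages(). Comparing the "idle", "irq" and "softirq" shares
against a profiler during the throughput test makes the mismatch visible.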

Thanks,
Fushen


* Re: CONFIG_NO_HZ added too much idle time in /proc/stat during throughput test.
  2011-12-13 20:42 Fushen Chen
@ 2011-12-13 23:24 ` Benjamin Herrenschmidt
  2011-12-13 23:28   ` Thomas Gleixner
  0 siblings, 1 reply; 7+ messages in thread
From: Benjamin Herrenschmidt @ 2011-12-13 23:24 UTC (permalink / raw)
  To: Fushen Chen; +Cc: Linuxppc-dev Development, Thomas Gleixner

On Tue, 2011-12-13 at 12:42 -0800, Fushen Chen wrote:
> On APM82181,  "vmstat" (/proc/stat)  doesn't show correct idle
> percent, if kernel enables "CONFIG_NO_HZ" (Tickless System / Dynamic
> Tick).
> 
> When I run wireless throughput test with heavy traffic, "vmstat" shows
> very high idle percent while "oprofile" shows very low idle percent.
> During the test, the system is idle, but network traffic uses a lot of
> hard IRQ and soft-irq time. "vmstat" would have the correct stats if
> account_idle_ticks(ticks) in kernel/time/tick-sched.c doesn't add more
> idle time in "vmstat". In the same test, if I disable "CONFIG_NO_HZ"
> in kernel, idle percent in "vmstat" and "oprofile" would match.
> 
> My APM82181 kernel configuration is "CONFIG_NO_HZ", "CONFIG_HZ_250=y",
> "CONFIG_HZ=250", and "CONFIG_HIGH_RES_TIMERS".
> 
> My question is that if kernel enables "CONFIG_NO_HZ", how would kernel
> report correct stats.

Hi Thomas ! Any idea what we're doing wrong ? :-)

Cheers,
Ben.

> Thanks,
> Fushen


* Re: CONFIG_NO_HZ added too much idle time in /proc/stat during throughput test.
  2011-12-13 23:24 ` Benjamin Herrenschmidt
@ 2011-12-13 23:28   ` Thomas Gleixner
  2011-12-13 23:34     ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 7+ messages in thread
From: Thomas Gleixner @ 2011-12-13 23:28 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Peter Zijlstra, Linuxppc-dev Development, Fushen Chen

On Wed, 14 Dec 2011, Benjamin Herrenschmidt wrote:

> On Tue, 2011-12-13 at 12:42 -0800, Fushen Chen wrote:
> > On APM82181,  "vmstat" (/proc/stat)  doesn't show correct idle
> > percent, if kernel enables "CONFIG_NO_HZ" (Tickless System / Dynamic
> > Tick).
> > 
> > When I run wireless throughput test with heavy traffic, "vmstat" shows
> > very high idle percent while "oprofile" shows very low idle percent.
> > During the test, the system is idle, but network traffic uses a lot of
> > hard IRQ and soft-irq time. "vmstat" would have the correct stats if
> > account_idle_ticks(ticks) in kernel/time/tick-sched.c doesn't add more
> > idle time in "vmstat". In the same test, if I disable "CONFIG_NO_HZ"
> > in kernel, idle percent in "vmstat" and "oprofile" would match.
> > 
> > My APM82181 kernel configuration is "CONFIG_NO_HZ", "CONFIG_HZ_250=y",
> > "CONFIG_HZ=250", and "CONFIG_HIGH_RES_TIMERS".
> > 
> > My question is that if kernel enables "CONFIG_NO_HZ", how would kernel
> > report correct stats.
> 
> Hi Thomas ! Any idea what we're doing wrong ? :-)

Not really, that had been an issue before and had been fixed. Peter ????

Thanks,

	tglx


* Re: CONFIG_NO_HZ added too much idle time in /proc/stat during throughput test.
  2011-12-13 23:28   ` Thomas Gleixner
@ 2011-12-13 23:34     ` Benjamin Herrenschmidt
  2011-12-14  1:14       ` Fushen Chen
  0 siblings, 1 reply; 7+ messages in thread
From: Benjamin Herrenschmidt @ 2011-12-13 23:34 UTC (permalink / raw)
  To: Fushen Chen; +Cc: Peter Zijlstra, Linuxppc-dev Development, Thomas Gleixner

On Wed, 2011-12-14 at 00:28 +0100, Thomas Gleixner wrote:
> On Wed, 14 Dec 2011, Benjamin Herrenschmidt wrote:
> 
> > On Tue, 2011-12-13 at 12:42 -0800, Fushen Chen wrote:
> > > On APM82181,  "vmstat" (/proc/stat)  doesn't show correct idle
> > > percent, if kernel enables "CONFIG_NO_HZ" (Tickless System / Dynamic
> > > Tick).
> > > 
> > > When I run wireless throughput test with heavy traffic, "vmstat" shows
> > > very high idle percent while "oprofile" shows very low idle percent.
> > > During the test, the system is idle, but network traffic uses a lot of
> > > hard IRQ and soft-irq time. "vmstat" would have the correct stats if
> > > account_idle_ticks(ticks) in kernel/time/tick-sched.c doesn't add more
> > > idle time in "vmstat". In the same test, if I disable "CONFIG_NO_HZ"
> > > in kernel, idle percent in "vmstat" and "oprofile" would match.
> > > 
> > > My APM82181 kernel configuration is "CONFIG_NO_HZ", "CONFIG_HZ_250=y",
> > > "CONFIG_HZ=250", and "CONFIG_HIGH_RES_TIMERS".
> > > 
> > > My question is that if kernel enables "CONFIG_NO_HZ", how would kernel
> > > report correct stats.
> > 
> > Hi Thomas ! Any idea what we're doing wrong ? :-)
> 
> Not really, that had been an issue before and had been fixed. Peter ????

Fushen, what kernel version is this ?

Cheers,
Ben.


* Re: CONFIG_NO_HZ added too much idle time in /proc/stat during throughput test.
       [not found] <CAEu=RPirE=H1N=KjHNjNRBM6H1fRvrugCw6ojqWaTNm2=WTfng__4707.66240400753$1323813396$gmane$org@mail.gmail.com>
@ 2011-12-13 23:57 ` Andreas Schwab
  0 siblings, 0 replies; 7+ messages in thread
From: Andreas Schwab @ 2011-12-13 23:57 UTC (permalink / raw)
  To: Fushen Chen; +Cc: Linuxppc-dev Development

Does this help?

<http://permalink.gmane.org/gmane.linux.ports.ppc.embedded/47530>

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: CONFIG_NO_HZ added too much idle time in /proc/stat during throughput test.
  2011-12-13 23:34     ` Benjamin Herrenschmidt
@ 2011-12-14  1:14       ` Fushen Chen
  2011-12-14  3:17         ` Anton Blanchard
  0 siblings, 1 reply; 7+ messages in thread
From: Fushen Chen @ 2011-12-14  1:14 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Peter Zijlstra, Linuxppc-dev Development, Thomas Gleixner

This is 2.6.32, but I think 2.6.36 is the same.
Thanks,
Fushen


* Re: CONFIG_NO_HZ added too much idle time in /proc/stat during throughput test.
  2011-12-14  1:14       ` Fushen Chen
@ 2011-12-14  3:17         ` Anton Blanchard
  0 siblings, 0 replies; 7+ messages in thread
From: Anton Blanchard @ 2011-12-14  3:17 UTC (permalink / raw)
  To: Fushen Chen; +Cc: Peter Zijlstra, Thomas Gleixner, Linuxppc-dev Development


Hi,

> This is 2.6.32, but I think 2.6.36 is the same.

Sounds a bit like this, merged in 2.6.39.

Anton
--

commit ad5d1c888e556bc00c4e86f452cad4a3a87d22c1
Author: Anton Blanchard <anton@samba.org>
Date:   Sun Mar 20 15:28:03 2011 +0000

    powerpc: Fix accounting of softirq time when idle
    
    commit cf9efce0ce31 (powerpc: Account time using timebase rather
    than PURR) used in_irq() to detect if the time was spent in
    interrupt processing. This only catches hardirq context so if we
    are in softirq context and in the idle loop we end up accounting it
    as idle time. If we instead use in_interrupt() we catch both softirq
    and hardirq time.
    
    The issue was found when running a network intensive workload. top
    showed the following:
    
    0.0%us,  1.1%sy,  0.0%ni, 85.7%id,  0.0%wa,  9.9%hi,  3.3%si,  0.0%st
    
    85.7% idle. But this was wildly different to the perf events data.
    To confirm the suspicion I ran something to keep the core busy:
    
    # yes > /dev/null &
    
    8.2%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa, 10.3%hi, 81.4%si,  0.0%st
    
    We only got 8.2% of the CPU for the userspace task and softirq has
    shot up to 81.4%.
    
    With the patch below top shows the correct stats:
    
    0.0%us,  0.0%sy,  0.0%ni,  5.3%id,  0.0%wa, 13.3%hi, 81.3%si,  0.0%st
    
    Signed-off-by: Anton Blanchard <anton@samba.org>
    Cc: stable@kernel.org
    Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
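
The distinction the commit message draws can be illustrated with a toy
model, written in Python purely for readability. The mask values and the
classify() helper are assumptions made up for this sketch. The real checks
are C macros over preempt_count, and the real in_interrupt() also covers
NMI context. Only the in_irq() versus in_interrupt() behaviour is taken
from the commit text.

```python
# Illustrative bit layout for a preempt_count-style word (assumed values,
# not copied from any particular kernel version).
SOFTIRQ_MASK = 0x0000FF00
HARDIRQ_MASK = 0x03FF0000
SOFTIRQ_OFFSET = 0x00000100  # one level of softirq nesting
HARDIRQ_OFFSET = 0x00010000  # one level of hardirq nesting


def in_irq(preempt_count):
    """True only in hardirq context."""
    return bool(preempt_count & HARDIRQ_MASK)


def in_interrupt(preempt_count):
    """True in hardirq or softirq context."""
    return bool(preempt_count & (HARDIRQ_MASK | SOFTIRQ_MASK))


def classify(preempt_count, is_idle_task, interrupt_check):
    """Decide which bucket a slice of CPU time is charged to.

    interrupt_check is the predicate used to detect interrupt context:
    in_irq models the behaviour before the fix, in_interrupt the
    behaviour after it.
    """
    if interrupt_check(preempt_count):
        return "hardirq" if preempt_count & HARDIRQ_MASK else "softirq"
    return "idle" if is_idle_task else "system"


# Softirq work that interrupts the idle loop:
buggy = classify(SOFTIRQ_OFFSET, True, in_irq)        # charged to idle
fixed = classify(SOFTIRQ_OFFSET, True, in_interrupt)  # charged to softirq
```

With the in_irq check, softirq time that lands on the idle task falls
through to the idle bucket, which matches the inflated idle figure in the
first top sample above.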
