Intel-GFX Archive on lore.kernel.org
* less load less performance
@ 2010-10-31 16:44 Alexey Fisher
  2010-10-31 17:01 ` Peter Clifton
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Alexey Fisher @ 2010-10-31 16:44 UTC (permalink / raw)
  To: intel-gfx@lists.freedesktop.org; +Cc: power@bughost.org

Hello all,

After testing the latest intel_drm_next v2.6.36-07547-g100519e on my netbook
I have seen an interesting issue. glxgears started normally (in a window)
does not perform well; it sometimes drops to 25 fps.
If I start glxgears in fullscreen mode it performs well, at about 60 fps.

Here is powertop --dump with glxgears in a standard window:
====================================================
C0 (cpu running) ( 6,7%)
polling 0,0ms ( 0,0%)
C1 mwait 0,4ms ( 0,2%)
C2 mwait 2,0ms ( 6,1%)
C4 mwait 5,6ms (87,0%)
P-states (frequencies)
1,67 GHz 13,4%
1333 MHz 0,1%
1000 MHz 86,4%

Dump with glxgears in fullscreen mode:
====================================================
C0 (cpu running) (18,5%)
polling 0,0ms ( 0,0%)
C1 mwait 0,4ms ( 0,4%)
C2 mwait 2,2ms (29,0%)
C4 mwait 1,8ms (52,2%)
P-states (frequencies)
1,67 GHz 11,7%
1333 MHz 0,1%
1000 MHz 88,2%

As far as I understand, if the CPU does not get enough load it stays mostly
in C4, and graphics performance drops too. I think there is something wrong
with this logic :)
-- 
Regards,
        Alexey

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: less load less performance
  2010-10-31 16:44 less load less performance Alexey Fisher
@ 2010-10-31 17:01 ` Peter Clifton
  2010-10-31 17:40   ` Alexey Fisher
  2010-10-31 17:33 ` Vasily Khoruzhick
  2010-10-31 22:07 ` Arjan van de Ven
  2 siblings, 1 reply; 9+ messages in thread
From: Peter Clifton @ 2010-10-31 17:01 UTC (permalink / raw)
  To: Alexey Fisher; +Cc: intel-gfx@lists.freedesktop.org, power@bughost.org

On Sun, 2010-10-31 at 17:44 +0100, Alexey Fisher wrote:
> Hello all,
> 
> As far as I understand, if the CPU does not get enough load it stays mostly
> in C4, and graphics performance drops too. I think there is something wrong
> with this logic :)

Yes, a little messed up.. try running your test at low screen-res with
this app running (once per core):

int main( int argc, char **argv )
{
  while (1);
}

(gcc loop.c -o loop)

Do you get the high frames per second (non-full-screen) then?
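
A time-bounded variant of the loop above can be handy, since it stops on its own after the test. This is only a sketch: the names spin_for and now_seconds are made up for illustration, and it assumes a POSIX system that provides clock_gettime with CLOCK_MONOTONIC.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Current time on the monotonic clock, in seconds. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Spin without sleeping for roughly `seconds`, so the logical CPU running
 * this thread stays busy instead of entering an idle state.
 * Returns the number of loop iterations performed.
 * (Hypothetical helper, not part of the original test program.)
 */
unsigned long spin_for(double seconds)
{
    assert(seconds >= 0.0);
    const double deadline = now_seconds() + seconds;
    /* volatile keeps the compiler from optimizing the counter away */
    volatile unsigned long iterations = 0;

    while (now_seconds() < deadline)
        iterations++;
    return iterations;
}
```

Calling spin_for(60.0) from main(), once per logical CPU, should keep the CPU busy for about a minute while glxgears runs in its window.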

-- 
Peter Clifton

Electrical Engineering Division,
Engineering Department,
University of Cambridge,
9, JJ Thomson Avenue,
Cambridge
CB3 0FA

Tel: +44 (0)7729 980173 - (No signal in the lab!)
Tel: +44 (0)1223 748328 - (Shared lab phone, ask for me)


* Re: less load less performance
  2010-10-31 16:44 less load less performance Alexey Fisher
  2010-10-31 17:01 ` Peter Clifton
@ 2010-10-31 17:33 ` Vasily Khoruzhick
  2010-10-31 22:07 ` Arjan van de Ven
  2 siblings, 0 replies; 9+ messages in thread
From: Vasily Khoruzhick @ 2010-10-31 17:33 UTC (permalink / raw)
  To: intel-gfx; +Cc: power@bughost.org

On Sunday 31 October 2010 18:44:47 Alexey Fisher wrote:
> Hello all,
> 
> After testing the latest intel_drm_next v2.6.36-07547-g100519e on my netbook
> I have seen an interesting issue. glxgears started normally (in a window)
> does not perform well; it sometimes drops to 25 fps.
> If I start glxgears in fullscreen mode it performs well, at about 60 fps.

It's a known bug for the 945GM(E) chipset:

https://bugs.freedesktop.org/show_bug.cgi?id=30364

Regards
Vasily


* Re: less load less performance
  2010-10-31 17:01 ` Peter Clifton
@ 2010-10-31 17:40   ` Alexey Fisher
  2010-10-31 19:18     ` [Intel-gfx] " Andreas Mohr
  0 siblings, 1 reply; 9+ messages in thread
From: Alexey Fisher @ 2010-10-31 17:40 UTC (permalink / raw)
  To: Peter Clifton; +Cc: intel-gfx@lists.freedesktop.org, power@bughost.org

On Sunday, 31.10.2010, at 17:01 +0000, Peter Clifton wrote:
> On Sun, 2010-10-31 at 17:44 +0100, Alexey Fisher wrote:
> > Hello all,
> > 
> > As far as I understand, if the CPU does not get enough load it stays mostly
> > in C4, and graphics performance drops too. I think there is something wrong
> > with this logic :)
> 
> Yes, a little messed up.. try running your test at low screen-res with
> this app running (once per core):
> 
> int main( int argc, char **argv )
> {
>   while (1);
> }
> 
> (gcc loop.c -o loop)
> 
> Do you get the high frames per second (non-full-screen) then?

Yes! It runs smoothly at 60 fps (I have only a single-core Atom with HT
enabled).

-- 
Regards,
        Alexey


* Re: [Intel-gfx] less load less performance
  2010-10-31 17:40   ` Alexey Fisher
@ 2010-10-31 19:18     ` Andreas Mohr
  2010-10-31 19:44       ` Alexey Fisher
  0 siblings, 1 reply; 9+ messages in thread
From: Andreas Mohr @ 2010-10-31 19:18 UTC (permalink / raw)
  To: Alexey Fisher
  Cc: Peter Clifton, power-072X8lT/F9NAfugRpC6u6w@public.gmane.org,
	intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org

On Sun, Oct 31, 2010 at 06:40:26PM +0100, Alexey Fisher wrote:
> On Sunday, 31.10.2010, at 17:01 +0000, Peter Clifton wrote:
> > On Sun, 2010-10-31 at 17:44 +0100, Alexey Fisher wrote:
> > > Hello all,
> > > 
> > > As far as I understand, if the CPU does not get enough load it stays mostly
> > > in C4, and graphics performance drops too. I think there is something wrong
> > > with this logic :)
> > 
> > Yes, a little messed up.. try running your test at low screen-res with
> > this app running (once per core):
> > 
> > int main( int argc, char **argv )
> > {
> >   while (1);
> > }
> > 
> > (gcc loop.c -o loop)
> > 
> > Do you get the high frames per second (non-full-screen) then?
> 
> Yes! It runs smoothly at 60 fps (I have only a single-core Atom with HT
> enabled).

Why painfully compile a custom C app to keep the CPU busy?

Boot with processor.max_cstate=1
Much better performance? --> "BUG"!
("BUG" == "something should probably be done about these power management side
effects")

Andreas Mohr


* Re: less load less performance
  2010-10-31 19:18     ` [Intel-gfx] " Andreas Mohr
@ 2010-10-31 19:44       ` Alexey Fisher
  2010-11-02 15:53         ` [Intel-gfx] " Thomas Renninger
  0 siblings, 1 reply; 9+ messages in thread
From: Alexey Fisher @ 2010-10-31 19:44 UTC (permalink / raw)
  To: Andreas Mohr; +Cc: power@bughost.org, intel-gfx@lists.freedesktop.org

On Sunday, 31.10.2010, at 20:18 +0100, Andreas Mohr wrote:
> On Sun, Oct 31, 2010 at 06:40:26PM +0100, Alexey Fisher wrote:
> > On Sunday, 31.10.2010, at 17:01 +0000, Peter Clifton wrote:
> > > On Sun, 2010-10-31 at 17:44 +0100, Alexey Fisher wrote:
> > > > Hello all,
> > > > 
> > > > As far as I understand, if the CPU does not get enough load it stays mostly
> > > > in C4, and graphics performance drops too. I think there is something wrong
> > > > with this logic :)
> > > 
> > > Yes, a little messed up.. try running your test at low screen-res with
> > > this app running (once per core):
> > > 
> > > int main( int argc, char **argv )
> > > {
> > >   while (1);
> > > }
> > > 
> > > (gcc loop.c -o loop)
> > > 
> > > Do you get the high frames per second (non-full-screen) then?
> > 
> > Yes! It runs smoothly at 60 fps (I have only a single-core Atom with HT
> > enabled).
> 
> Why painfully compile a custom C app to keep the CPU busy?
> 
> Boot with processor.max_cstate=1
> Much better performance? --> "BUG"!
> ("BUG" == "something should probably be done about these power management side
> effects")

For some reason "processor.max_cstate=1" does not make any difference;
the CPU still uses C4. Interestingly, maxcpus=1 does make a difference: C4
is used and it performs well too. So what can it be? Some SMP scheduler
problem, IRQ balancing?
I know Intel CPUs had a PM problem where the CPU consumes more power if
one core is disabled (maybe no C4?). What speaks against this theory:
1. If I boot with SMP and take one core offline, it makes no difference,
so maxcpus=1 and "echo 0 > /sys/devices/system/cpu/cpu1/online" are not
the same.
2. I use an Atom N280; there is only one core, but HT is enabled.
-- 
Regards,
        Alexey


* Re: less load less performance
  2010-10-31 16:44 less load less performance Alexey Fisher
  2010-10-31 17:01 ` Peter Clifton
  2010-10-31 17:33 ` Vasily Khoruzhick
@ 2010-10-31 22:07 ` Arjan van de Ven
  2010-11-01  8:50   ` Alexey Fisher
  2 siblings, 1 reply; 9+ messages in thread
From: Arjan van de Ven @ 2010-10-31 22:07 UTC (permalink / raw)
  To: Alexey Fisher
  Cc: intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
	power-072X8lT/F9NAfugRpC6u6w@public.gmane.org

On 10/31/2010 9:44 AM, Alexey Fisher wrote:
> Hello all,
>
> After testing the latest intel_drm_next v2.6.36-07547-g100519e on my netbook
> I have seen an interesting issue. glxgears started normally (in a window)
> does not perform well; it sometimes drops to 25 fps.
> If I start glxgears in fullscreen mode it performs well, at about 60 fps.


funny that you mention this; I was just talking to Eric earlier about 
this topic in a cab to the conference...
... we have some ideas on how to fix this.


* Re: less load less performance
  2010-10-31 22:07 ` Arjan van de Ven
@ 2010-11-01  8:50   ` Alexey Fisher
  0 siblings, 0 replies; 9+ messages in thread
From: Alexey Fisher @ 2010-11-01  8:50 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: intel-gfx@lists.freedesktop.org, power@bughost.org

On Sunday, 31.10.2010, at 15:07 -0700, Arjan van de Ven wrote:
> On 10/31/2010 9:44 AM, Alexey Fisher wrote:
> > Hello all,
> >
> > After testing the latest intel_drm_next v2.6.36-07547-g100519e on my netbook
> > I have seen an interesting issue. glxgears started normally (in a window)
> > does not perform well; it sometimes drops to 25 fps.
> > If I start glxgears in fullscreen mode it performs well, at about 60 fps.
> 
> 
> funny that you mention this; I was just talking to Eric earlier about 
> this topic in a cab to the conference...
> ... we have some ideas on how to fix this.

It would be great to see some patches :D
I did some more powertop debugging to see what the difference is between
an SMP and a non-SMP system.
Here are two dumps. I tried to keep them more or less clean: "sleep 10;
powertop --dump"

dump with maxcpus=1:
==========================================================
Cn	           Avg residency
C0 (cpu running)    (33,0%)
polling	  0,0ms ( 0,0%)
C1 mwait	  0,2ms ( 0,1%)
C2 mwait	  1,4ms ( 1,8%)
C4 mwait	  5,6ms (65,2%)
P-states (frequencies)
  1,67 GHz    23,8%
  1333 MHz     0,1%
  1000 MHz    76,1%
Wakeups-from-idle per second : 132,2	interval: 15,0s
Power usage (ACPI estimate): 10,2W (6,1 hours)
Top causes for wakeups:
  34,1% ( 88,7)   [kernel scheduler] Load balancing tick
  23,1% ( 60,1)   [i915, uhci_hcd:usb5] <interrupt>
  12,1% ( 31,4)   firefox-bin
   9,2% ( 23,9)   [ath9k] <interrupt>
   4,7% ( 12,1)   evince


dump with smp/ht enabled:
===============================================
Your CPU supports the following C-states: C1 C2 C4
Your BIOS reports the following C-states: C1 C2 C4
Cn	           Avg residency
C0 (cpu running)    (13,4%)
polling	  0,0ms ( 0,0%)
C1 mwait	  2,2ms ( 2,6%)
C2 mwait	  3,6ms (41,8%)
C4 mwait	  1,5ms (42,3%)
P-states (frequencies)
  1,67 GHz    11,3%
  1333 MHz     0,1%
  1000 MHz    88,6%
Wakeups-from-idle per second : 411,0	interval: 15,0s
Power usage (ACPI estimate): 9,2W (6,5 hours)
Top causes for wakeups:
  19,3% ( 50,0)   kworker/0:0
  17,9% ( 46,3)   [i915, uhci_hcd:usb5] <interrupt>
  15,6% ( 40,3)   [kernel scheduler] Load balancing tick
  13,3% ( 34,4)   PS/2 keyboard/mouse/touchpad interrupt
   8,3% ( 21,4)   [ath9k] <interrupt>
   4,7% ( 12,1)   evince
   4,0% ( 10,5)   [kernel core] hrtimer_start (tick_sched_timer)
========================================================================

With SMP it uses less C0 and more C2 and C4; without SMP it uses mostly C0
and C4, but it uses _more_ power ...
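
To compare such dumps without reading them by eye, the residency percentages can be pulled out of the C-state lines. The following is a rough sketch, assuming the line layout shown in the dumps above (a trailing "(NN,N%)" group with a decimal comma); residency_percent is a hypothetical helper name, not part of powertop.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Extract the percentage from the last "( ... %)" group of one powertop
 * C-state line, e.g. "C4 mwait 5,6ms (65,2%)" -> 65.2.
 * Accepts a decimal comma or a decimal point.
 * Returns -1.0 if the line has no such group.
 */
double residency_percent(const char *line)
{
    assert(line != NULL);
    const char *p = strrchr(line, '(');
    if (p == NULL)
        return -1.0;

    char buf[32];
    size_t n = 0;
    for (p++; *p != '\0' && *p != '%'; p++) {
        if (*p == ' ')
            continue;                       /* powertop pads with spaces */
        if (n + 1 >= sizeof buf)
            return -1.0;                    /* too long to be a number */
        buf[n++] = (*p == ',') ? '.' : *p;  /* normalize the decimal comma */
    }
    if (*p != '%')
        return -1.0;                        /* group did not end in '%' */
    buf[n] = '\0';

    char *end;
    double value = strtod(buf, &end);
    if (end == buf || *end != '\0')
        return -1.0;
    return value;
}
```

Fed the C4 lines of the two dumps above, this yields 65.2 for maxcpus=1 and 42.3 with SMP/HT enabled. Lines without a percentage in parentheses, such as the P-state rows, return -1.0.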


If I boot the normal SMP kernel but set one core offline after boot, I
get this:
==========================================================================
Your CPU supports the following C-states: C1 C2 C4
Your BIOS reports the following C-states: C1 C2 C4
Cn	           Avg residency
C0 (cpu running)    (39,8%)
polling	  0,0ms ( 0,0%)
C1 mwait	  4,9ms ( 8,0%)
C2 mwait	 11,4ms (51,0%)
C4 mwait	  0,1ms ( 1,3%)
P-states (frequencies)
  1,67 GHz    34,0%
  1333 MHz     0,4%
  1000 MHz    65,6%
Wakeups-from-idle per second : 301,1	interval: 15,0s
Power usage (ACPI estimate): 9,7W (5,8 hours)
Top causes for wakeups:
  39,1% ( 92,6)   [kernel scheduler] Load balancing tick
  22,6% ( 53,5)   [i915, uhci_hcd:usb5] <interrupt>
   5,7% ( 13,6)   [ahci] <interrupt>
   5,1% ( 12,1)   evince
   4,8% ( 11,3)   glxgears
   4,2% (  9,9)   desktopcouch-se


C4 is almost unused, but I still get bad performance.

Does PM behave differently if I boot with one core than if I disable one
core after boot?



* Re: [Intel-gfx] less load less performance
  2010-10-31 19:44       ` Alexey Fisher
@ 2010-11-02 15:53         ` Thomas Renninger
  0 siblings, 0 replies; 9+ messages in thread
From: Thomas Renninger @ 2010-11-02 15:53 UTC (permalink / raw)
  To: power-072X8lT/F9NAfugRpC6u6w
  Cc: Peter Clifton,
	intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org

On Sunday 31 October 2010 20:44:27 Alexey Fisher wrote:
> On Sunday, 31.10.2010, at 20:18 +0100, Andreas Mohr wrote:
...
> > 
> > Why painfully compile a custom C app to keep the CPU busy?
> > 
> > Boot with processor.max_cstate=1
> > Much better performance? --> "BUG"!
> > ("BUG" == "something should probably be done about these power management side
> > effects")
> 
> For some reason "processor.max_cstate=1" does not make any difference;
> the CPU still uses C4.
This is because the new intel_idle driver is in use. You can check with:
cat /sys/devices/system/cpu/cpuidle/current_driver
Either pass both:
intel_idle.max_cstate=0 processor.max_cstate=1
or, with the patch I posted today to the linux-acpi list,
idle=halt (C1) or idle=poll (busy idling, no power saving at all) can
be used:
[PATCH] intel_idle: Do not load if user overrides idle function via idle= boot param

Hmm, a more generic cpuidle param:
cpuidle.max_state=
may make sense as well.
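
The same check can be done from a small C program, for example in a test harness. This is a minimal sketch: read_first_line and intel_idle_active are hypothetical helper names, and the only path assumed is the sysfs file quoted above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* sysfs attribute quoted in the message above */
#define CPUIDLE_DRIVER_PATH "/sys/devices/system/cpu/cpuidle/current_driver"

/*
 * Read the first line of a small text file (such as a sysfs attribute)
 * into `out`, stripping the trailing newline.
 * Returns 0 on success, -1 if the file cannot be opened or is empty.
 */
int read_first_line(const char *path, char *out, size_t out_size)
{
    assert(path != NULL && out != NULL && out_size > 0);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    char *got = fgets(out, (int)out_size, f);
    fclose(f);
    if (got == NULL)
        return -1;
    out[strcspn(out, "\n")] = '\0';
    return 0;
}

/*
 * Returns 1 if the file at `path` names intel_idle as the cpuidle driver,
 * 0 if it names another driver, and -1 if it cannot be read.
 */
int intel_idle_active(const char *path)
{
    char name[64];
    if (read_first_line(path, name, sizeof name) != 0)
        return -1;
    return strcmp(name, "intel_idle") == 0;
}
```

Calling intel_idle_active(CPUIDLE_DRIVER_PATH) returning 1 would explain why processor.max_cstate=1 alone has no effect, since that parameter belongs to the ACPI processor driver.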

> Interestingly, maxcpus=1 does make a difference: C4 is used and
> it performs well too.
I am not familiar with the details of the Atom's very deep
C-state implementation, but it could be that all cores/siblings
of a CPU socket need to request a sleep state before C4, or
whatever HW-triggered internal power saving exists, takes effect.
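
To see which logical CPUs are siblings in the sense above, the kernel exposes a list per CPU in sysfs, e.g. /sys/devices/system/cpu/cpu0/topology/thread_siblings_list (this path is not from the thread; it is the standard topology attribute). A rough sketch for counting the entries of such a "cpulist" string, with cpulist_count as a made-up helper name:

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Count the CPUs named in a kernel "cpulist" string such as "0-1" or
 * "0,2-3" (an optional trailing newline is tolerated).
 * Returns -1 on malformed input.
 */
int cpulist_count(const char *list)
{
    assert(list != NULL);
    int count = 0;
    const char *p = list;

    while (*p != '\0' && *p != '\n') {
        char *end;
        long first = strtol(p, &end, 10);
        if (end == p || first < 0)
            return -1;              /* no number, or a negative one */
        long last = first;
        p = end;
        if (*p == '-') {            /* a range "first-last" */
            p++;
            last = strtol(p, &end, 10);
            if (end == p || last < first)
                return -1;
            p = end;
        }
        count += (int)(last - first + 1);
        if (*p == ',')
            p++;                    /* move on to the next entry */
        else if (*p != '\0' && *p != '\n')
            return -1;              /* unexpected separator */
    }
    return count;
}
```

On a single-core CPU with HT enabled, the sibling list for cpu0 would be expected to hold two entries, so cpulist_count would return 2. With maxcpus=1 only one logical CPU is brought up, which may be why the deep C-state request behaves differently there.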

> So what can it be? Some SMP scheduler problem, IRQ
> balancing?
> I know Intel CPUs had a PM problem where the CPU consumes more power if
> one core is disabled (maybe no C4?).
Sounds like this is the case...

    Thomas



Thread overview: 9+ messages
2010-10-31 16:44 less load less performance Alexey Fisher
2010-10-31 17:01 ` Peter Clifton
2010-10-31 17:40   ` Alexey Fisher
2010-10-31 19:18     ` [Intel-gfx] " Andreas Mohr
2010-10-31 19:44       ` Alexey Fisher
2010-11-02 15:53         ` [Intel-gfx] " Thomas Renninger
2010-10-31 17:33 ` Vasily Khoruzhick
2010-10-31 22:07 ` Arjan van de Ven
2010-11-01  8:50   ` Alexey Fisher
