* less load less performance
@ 2010-10-31 16:44 Alexey Fisher
2010-10-31 17:01 ` Peter Clifton
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Alexey Fisher @ 2010-10-31 16:44 UTC
To: intel-gfx@lists.freedesktop.org; +Cc: power@bughost.org
Hello all,

after testing the latest intel_drm_next (v2.6.36-07547-g100519e) on my netbook
I have seen an interesting issue. glxgears started normally (windowed) does
not perform well; it sometimes drops to 25 fps.
If I start glxgears in fullscreen mode it performs well, at about 60 fps.

Here is powertop --dump with glxgears in a standard window:
====================================================
C0 (processor running)        ( 6.7%)
polling            0.0ms      ( 0.0%)
C1 mwait           0.4ms      ( 0.2%)
C2 mwait           2.0ms      ( 6.1%)
C4 mwait           5.6ms      (87.0%)
P-states (frequencies)
1.67 GHz    13.4%
1333 MHz     0.1%
1000 MHz    86.4%

Dump with glxgears in fullscreen mode:
====================================================
C0 (processor running)        (18.5%)
polling            0.0ms      ( 0.0%)
C1 mwait           0.4ms      ( 0.4%)
C2 mwait           2.2ms      (29.0%)
C4 mwait           1.8ms      (52.2%)
P-states (frequencies)
1.67 GHz    11.7%
1333 MHz     0.1%
1000 MHz    88.2%

As far as I understand, if the CPU does not get enough load it stays mostly
in C4 and graphics performance drops too. I think there is something wrong
with this logic :)
--
Regards,
Alexey
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
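The comparison in this message rests on per-state residency percentages. As a rough illustration of how such figures can be derived, here is a minimal C sketch. It assumes the Linux cpuidle sysfs layout (`/sys/devices/system/cpu/cpuN/cpuidle/stateM/time`, a cumulative counter in microseconds). `read_state_time_us` and `residency_percent` are hypothetical helper names, and this is not necessarily how powertop computes its numbers.

```c
#include <stdio.h>

/* Read the cumulative residency counter (microseconds) of one idle state.
 * Assumes the cpuidle sysfs layout; returns -1 if the file cannot be read. */
long long read_state_time_us(int cpu, int state)
{
    char path[128];
    long long us = -1;
    FILE *f;

    snprintf(path, sizeof(path),
             "/sys/devices/system/cpu/cpu%d/cpuidle/state%d/time", cpu, state);
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (fscanf(f, "%lld", &us) != 1)
        us = -1;
    fclose(f);
    return us;
}

/* Share of a sampling interval spent in one state, in percent.
 * Returns -1.0 for an empty interval or a counter that went backwards. */
double residency_percent(long long before_us, long long after_us,
                         long long interval_us)
{
    if (interval_us <= 0 || after_us < before_us)
        return -1.0;
    return 100.0 * (double)(after_us - before_us) / (double)interval_us;
}
```

Sampling `read_state_time_us` twice, 15 seconds apart, and passing both values together with `15000000` to `residency_percent` should give a figure comparable to one row of the dump. Which state index corresponds to C4 depends on the idle driver and has to be checked in the state's `name` file.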

* Re: less load less performance
2010-10-31 16:44 less load less performance Alexey Fisher
@ 2010-10-31 17:01 ` Peter Clifton
2010-10-31 17:40 ` Alexey Fisher
2010-10-31 17:33 ` Vasily Khoruzhick
2010-10-31 22:07 ` Arjan van de Ven
2 siblings, 1 reply; 9+ messages in thread
From: Peter Clifton @ 2010-10-31 17:01 UTC
To: Alexey Fisher; +Cc: intel-gfx@lists.freedesktop.org, power@bughost.org

On Sun, 2010-10-31 at 17:44 +0100, Alexey Fisher wrote:
> Hello all,
>
> As far as I understand, if the CPU does not get enough load it stays mostly
> in C4 and graphics performance drops too. I think there is something wrong
> with this logic :)

Yes, a little messed up.. try running your test at low screen-res with
this app running (once per core):

int main( int argc, char **argv )
{
  while (1);
}

(gcc loop.c -o loop)

Do you get the high frames per second (non-full-screen) then?

--
Peter Clifton

Electrical Engineering Division,
Engineering Department,
University of Cambridge,
9, JJ Thomson Avenue,
Cambridge
CB3 0FA

Tel: +44 (0)7729 980173 - (No signal in the lab!)
Tel: +44 (0)1223 748328 - (Shared lab phone, ask for me)
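A bounded variant of the busy loop in this message can be sketched as follows. It forks one spinner per requested CPU and stops after a fixed time instead of running until killed. `spawn_spinners` and `online_cpus` are hypothetical names introduced for illustration. The children are not pinned to specific CPUs, so it is an assumption that the scheduler spreads them across the logical CPUs.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Fork n children that each busy-loop for roughly `seconds` seconds,
 * then wait for them. Returns the number of children reaped,
 * or -1 if not all n children could be started. */
int spawn_spinners(int n, unsigned seconds)
{
    int started = 0, reaped = 0;

    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0) {
            time_t end = time(NULL) + (time_t)seconds;
            while (time(NULL) < end)
                ;               /* spin so the CPU has no reason to idle */
            _exit(0);
        }
        started++;
    }
    while (reaped < started && wait(NULL) > 0)
        reaped++;
    return started == n ? reaped : -1;
}

/* Number of logical CPUs currently online (HT siblings count separately). */
long online_cpus(void)
{
    return sysconf(_SC_NPROCESSORS_ONLN);
}
```

Calling `spawn_spinners((int)online_cpus(), 30)` while glxgears runs windowed would approximate the "once per core" experiment for 30 seconds.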

* Re: less load less performance
2010-10-31 17:01 ` Peter Clifton
@ 2010-10-31 17:40 ` Alexey Fisher
2010-10-31 19:18 ` [Intel-gfx] " Andreas Mohr
0 siblings, 1 reply; 9+ messages in thread
From: Alexey Fisher @ 2010-10-31 17:40 UTC
To: Peter Clifton; +Cc: intel-gfx@lists.freedesktop.org, power@bughost.org

On Sunday, 2010-10-31 at 17:01 +0000, Peter Clifton wrote:
> On Sun, 2010-10-31 at 17:44 +0100, Alexey Fisher wrote:
> > Hello all,
> >
> > As far as I understand, if the CPU does not get enough load it stays mostly
> > in C4 and graphics performance drops too. I think there is something wrong
> > with this logic :)
>
> Yes, a little messed up.. try running your test at low screen-res with
> this app running (once per core):
>
> int main( int argc, char **argv )
> {
>   while (1);
> }
>
> (gcc loop.c -o loop)
>
> Do you get the high frames per second (non-full-screen) then?

Yes! It runs smoothly, at 60 fps (I have only a single-core Atom with HT
enabled).

--
Regards,
Alexey

* Re: [Intel-gfx] less load less performance
2010-10-31 17:40 ` Alexey Fisher
@ 2010-10-31 19:18 ` Andreas Mohr
2010-10-31 19:44 ` Alexey Fisher
0 siblings, 1 reply; 9+ messages in thread
From: Andreas Mohr @ 2010-10-31 19:18 UTC
To: Alexey Fisher
Cc: Peter Clifton, power-072X8lT/F9NAfugRpC6u6w@public.gmane.org, intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org

On Sun, Oct 31, 2010 at 06:40:26PM +0100, Alexey Fisher wrote:
> On Sunday, 2010-10-31 at 17:01 +0000, Peter Clifton wrote:
> > On Sun, 2010-10-31 at 17:44 +0100, Alexey Fisher wrote:
> > > Hello all,
> > >
> > > As far as I understand, if the CPU does not get enough load it stays mostly
> > > in C4 and graphics performance drops too. I think there is something wrong
> > > with this logic :)
> >
> > Yes, a little messed up.. try running your test at low screen-res with
> > this app running (once per core):
> >
> > int main( int argc, char **argv )
> > {
> >   while (1);
> > }
> >
> > (gcc loop.c -o loop)
> >
> > Do you get the high frames per second (non-full-screen) then?
>
> Yes! It runs smoothly, at 60 fps (I have only a single-core Atom with HT
> enabled).

Why painfully compile a custom C app to keep the CPU busy?

Boot with processor.max_cstate=1
Much better performance? --> "BUG"!
("BUG" == "something should probably be done about these power management side
effects")

Andreas Mohr

* Re: less load less performance
2010-10-31 19:18 ` [Intel-gfx] " Andreas Mohr
@ 2010-10-31 19:44 ` Alexey Fisher
2010-11-02 15:53 ` [Intel-gfx] " Thomas Renninger
0 siblings, 1 reply; 9+ messages in thread
From: Alexey Fisher @ 2010-10-31 19:44 UTC
To: Andreas Mohr; +Cc: power@bughost.org, intel-gfx@lists.freedesktop.org

On Sunday, 2010-10-31 at 20:18 +0100, Andreas Mohr wrote:
> On Sun, Oct 31, 2010 at 06:40:26PM +0100, Alexey Fisher wrote:
> > On Sunday, 2010-10-31 at 17:01 +0000, Peter Clifton wrote:
> > > On Sun, 2010-10-31 at 17:44 +0100, Alexey Fisher wrote:
> > > > Hello all,
> > > >
> > > > As far as I understand, if the CPU does not get enough load it stays mostly
> > > > in C4 and graphics performance drops too. I think there is something wrong
> > > > with this logic :)
> > >
> > > Yes, a little messed up.. try running your test at low screen-res with
> > > this app running (once per core):
> > >
> > > int main( int argc, char **argv )
> > > {
> > >   while (1);
> > > }
> > >
> > > (gcc loop.c -o loop)
> > >
> > > Do you get the high frames per second (non-full-screen) then?
> >
> > Yes! It runs smoothly, at 60 fps (I have only a single-core Atom with HT
> > enabled).
>
> Why painfully compile a custom C app to keep the CPU busy?
>
> Boot with processor.max_cstate=1
> Much better performance? --> "BUG"!
> ("BUG" == "something should probably be done about these power management side
> effects")

For some reason "processor.max_cstate=1" does not make any difference;
the CPU still uses C4.
Interestingly, maxcpus=1 does make a difference: C4 is still used and
it performs well too.
So what can it be? Some SMP scheduler problem, IRQ balancing?
I know Intel CPUs had some PM problem: if one core is disabled it consumes
more power (maybe no C4?).

What speaks against this theory:
1. If I boot SMP and take one core offline, it makes no difference, so
maxcpus=1 and "echo 0 > /sys/devices/system/cpu/cpu1/online" are not the same.
2. I use an Atom N280; there is only one core, but HT is enabled.

--
Regards,
Alexey
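When comparing maxcpus=1 with offlining cpu1 at runtime, it can help to confirm which logical CPUs the kernel actually reports as online in each case. The following C sketch parses the kernel's cpulist format (for example "0-1" or "0,2-3") as exposed in `/sys/devices/system/cpu/online`. `count_cpulist` and `online_cpu_count` are hypothetical helper names, and the parser is a simplified assumption about that format.

```c
#include <stdio.h>
#include <stdlib.h>

/* Count CPUs in a cpulist string such as "0-1" or "0,2-3".
 * Returns -1 on malformed input. */
int count_cpulist(const char *s)
{
    int count = 0;

    while (*s && *s != '\n') {
        char *end;
        long lo = strtol(s, &end, 10), hi;

        if (end == s || lo < 0)
            return -1;
        hi = lo;
        s = end;
        if (*s == '-') {                 /* range "lo-hi" */
            hi = strtol(s + 1, &end, 10);
            if (end == s + 1 || hi < lo)
                return -1;
            s = end;
        }
        count += (int)(hi - lo + 1);
        if (*s == ',')
            s++;
        else if (*s && *s != '\n')
            return -1;
    }
    return count;
}

/* Number of online logical CPUs according to sysfs, or -1 on error. */
int online_cpu_count(void)
{
    char buf[256];
    FILE *f = fopen("/sys/devices/system/cpu/online", "r");
    int n = -1;

    if (!f)
        return -1;
    if (fgets(buf, sizeof(buf), f))
        n = count_cpulist(buf);
    fclose(f);
    return n;
}
```

On the HT-enabled Atom described here one would expect `online_cpu_count()` to report 2 on a normal boot and 1 in both single-CPU setups, which would confirm that the two setups differ in something other than the number of online CPUs.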

* Re: [Intel-gfx] less load less performance
2010-10-31 19:44 ` Alexey Fisher
@ 2010-11-02 15:53 ` Thomas Renninger
0 siblings, 0 replies; 9+ messages in thread
From: Thomas Renninger @ 2010-11-02 15:53 UTC
To: power-072X8lT/F9NAfugRpC6u6w
Cc: Peter Clifton, intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org

On Sunday 31 October 2010 20:44:27 Alexey Fisher wrote:
> On Sunday, 2010-10-31 at 20:18 +0100, Andreas Mohr wrote:
...
> >
> > Why painfully compile a custom C app to keep the CPU busy?
> >
> > Boot with processor.max_cstate=1
> > Much better performance? --> "BUG"!
> > ("BUG" == "something should probably be done about these power management side
> > effects")
>
> For some reason "processor.max_cstate=1" does not make any difference;
> the CPU still uses C4.
This is because the new intel_idle driver is used:
cat /sys/devices/system/cpu/cpuidle/current_driver

Either you pass both:
intel_idle.max_cstate=0 processor.max_cstate=1
or, with the patch I posted today to the linux-acpi list,
idle=halt (C1)
idle=poll (busy idling, no power saving at all)
can be used:
[PATCH] intel_idle: Do not load if user overrides idle function via idle= boot param

Hmm, a more generic cpuidle param:
cpuidle.max_state=
may make sense as well.

> Interestingly, maxcpus=1 does make a difference: C4 is still used and
> it performs well too.
I am not familiar with the details of the Atom's very deep C-state
implementation, but it could be that all cores/siblings of a CPU socket
need to request sleep states before C4 or whatever hardware-triggered
internal power savings take place.

> So what can it be? Some SMP scheduler problem, IRQ balancing?
> I know Intel CPUs had some PM problem: if one core is disabled it consumes
> more power (maybe no C4?).
Sounds like this is the case...

Thomas
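The driver check described in this message can also be done from a program. The C sketch below reads the same sysfs file as the `cat` command and flags the case where `processor.max_cstate` alone is not expected to cap C-states. `read_cpuidle_driver` and `needs_intel_idle_override` are hypothetical helper names, and the rule encoded in the second function is only a restatement of the explanation given here, not a kernel API.

```c
#include <stdio.h>
#include <string.h>

/* Read the active cpuidle driver name into buf; returns 0 on success,
 * -1 if the sysfs file is missing or buf has no room. */
int read_cpuidle_driver(char *buf, size_t len)
{
    FILE *f = fopen("/sys/devices/system/cpu/cpuidle/current_driver", "r");

    if (!f || len == 0) {
        if (f)
            fclose(f);
        return -1;
    }
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline */
    return 0;
}

/* Returns 1 when intel_idle is in charge, i.e. when intel_idle.max_cstate=0
 * would be needed in addition to processor.max_cstate; 0 otherwise. */
int needs_intel_idle_override(const char *driver)
{
    return strcmp(driver, "intel_idle") == 0;
}
```

A caller would read the driver name once at startup and print a hint about the extra boot parameter when `needs_intel_idle_override` returns 1.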

* Re: less load less performance
2010-10-31 16:44 less load less performance Alexey Fisher
2010-10-31 17:01 ` Peter Clifton
@ 2010-10-31 17:33 ` Vasily Khoruzhick
2010-10-31 22:07 ` Arjan van de Ven
2 siblings, 0 replies; 9+ messages in thread
From: Vasily Khoruzhick @ 2010-10-31 17:33 UTC
To: intel-gfx; +Cc: power@bughost.org

On Sunday 31 October 2010 18:44:47 Alexey Fisher wrote:
> Hello all,
>
> after testing the latest intel_drm_next (v2.6.36-07547-g100519e) on my netbook
> I have seen an interesting issue. glxgears started normally (windowed) does
> not perform well; it sometimes drops to 25 fps.
> If I start glxgears in fullscreen mode it performs well, at about 60 fps.

It's a known bug for the 945GM(E) chipset:

https://bugs.freedesktop.org/show_bug.cgi?id=30364

Regards
Vasily

* Re: less load less performance
2010-10-31 16:44 less load less performance Alexey Fisher
2010-10-31 17:01 ` Peter Clifton
2010-10-31 17:33 ` Vasily Khoruzhick
@ 2010-10-31 22:07 ` Arjan van de Ven
2010-11-01 8:50 ` Alexey Fisher
2 siblings, 1 reply; 9+ messages in thread
From: Arjan van de Ven @ 2010-10-31 22:07 UTC
To: Alexey Fisher
Cc: intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org, power-072X8lT/F9NAfugRpC6u6w@public.gmane.org

On 10/31/2010 9:44 AM, Alexey Fisher wrote:
> Hello all,
>
> after testing the latest intel_drm_next (v2.6.36-07547-g100519e) on my netbook
> I have seen an interesting issue. glxgears started normally (windowed) does
> not perform well; it sometimes drops to 25 fps.
> If I start glxgears in fullscreen mode it performs well, at about 60 fps.

funny that you mention this; I was just talking to Eric earlier about
this topic in a cab to the conference...
... we have some ideas on how to fix this.

* Re: less load less performance
2010-10-31 22:07 ` Arjan van de Ven
@ 2010-11-01 8:50 ` Alexey Fisher
0 siblings, 0 replies; 9+ messages in thread
From: Alexey Fisher @ 2010-11-01 8:50 UTC
To: Arjan van de Ven; +Cc: intel-gfx@lists.freedesktop.org, power@bughost.org

On Sunday, 2010-10-31 at 15:07 -0700, Arjan van de Ven wrote:
> On 10/31/2010 9:44 AM, Alexey Fisher wrote:
> > Hello all,
> >
> > after testing the latest intel_drm_next (v2.6.36-07547-g100519e) on my netbook
> > I have seen an interesting issue. glxgears started normally (windowed) does
> > not perform well; it sometimes drops to 25 fps.
> > If I start glxgears in fullscreen mode it performs well, at about 60 fps.
>
>
> funny that you mention this; I was just talking to Eric earlier about
> this topic in a cab to the conference...
> ... we have some ideas on how to fix this.

It would be great to see some patches :D

I did some more powertop debugging to see what the difference is between an
SMP and a non-SMP system. Here are two dumps; I tried to keep them more or
less clean: "sleep 10; powertop --dump"

Dump with maxcpus=1:
==========================================================
Cn                 Avg residency
C0 (processor running)        (33.0%)
polling            0.0ms      ( 0.0%)
C1 mwait           0.2ms      ( 0.1%)
C2 mwait           1.4ms      ( 1.8%)
C4 mwait           5.6ms      (65.2%)
P-states (frequencies)
1.67 GHz    23.8%
1333 MHz     0.1%
1000 MHz    76.1%
Wakeups per second : 132.2      interval: 15.0s
Power usage (ACPI estimate): 10.2W (6.1 hours)
Top causes for wakeups:
34.1% ( 88.7) [kernel scheduler] Load balancing tick
23.1% ( 60.1) [i915, uhci_hcd:usb5] <interrupt>
12.1% ( 31.4) firefox-bin
 9.2% ( 23.9) [ath9k] <interrupt>
 4.7% ( 12.1) evince

Dump with SMP/HT enabled:
===============================================
Your CPU supports the following C-states: C1 C2 C4
Your BIOS reports the following C-states: C1 C2 C4
Cn                 Avg residency
C0 (processor running)        (13.4%)
polling            0.0ms      ( 0.0%)
C1 mwait           2.2ms      ( 2.6%)
C2 mwait           3.6ms      (41.8%)
C4 mwait           1.5ms      (42.3%)
P-states (frequencies)
1.67 GHz    11.3%
1333 MHz     0.1%
1000 MHz    88.6%
Wakeups per second : 411.0      interval: 15.0s
Power usage (ACPI estimate): 9.2W (6.5 hours)
Top causes for wakeups:
19.3% ( 50.0) kworker/0:0
17.9% ( 46.3) [i915, uhci_hcd:usb5] <interrupt>
15.6% ( 40.3) [kernel scheduler] Load balancing tick
13.3% ( 34.4) PS/2 keyboard/mouse/touchpad interrupt
 8.3% ( 21.4) [ath9k] <interrupt>
 4.7% ( 12.1) evince
 4.0% ( 10.5) [kernel core] hrtimer_start (tick_sched_timer)
========================================================================

With SMP it uses less C0 and more C2 and C4; without SMP it uses mostly C0
and C4, but it uses _more_ power ...

If I boot the normal (SMP) kernel but set one core offline after boot, I get
this:
==========================================================================
Your CPU supports the following C-states: C1 C2 C4
Your BIOS reports the following C-states: C1 C2 C4
Cn                 Avg residency
C0 (processor running)        (39.8%)
polling            0.0ms      ( 0.0%)
C1 mwait           4.9ms      ( 8.0%)
C2 mwait          11.4ms      (51.0%)
C4 mwait           0.1ms      ( 1.3%)
P-states (frequencies)
1.67 GHz    34.0%
1333 MHz     0.4%
1000 MHz    65.6%
Wakeups per second : 301.1      interval: 15.0s
Power usage (ACPI estimate): 9.7W (5.8 hours)
Top causes for wakeups:
39.1% ( 92.6) [kernel scheduler] Load balancing tick
22.6% ( 53.5) [i915, uhci_hcd:usb5] <interrupt>
 5.7% ( 13.6) [ahci] <interrupt>
 5.1% ( 12.1) evince
 4.8% ( 11.3) glxgears
 4.2% (  9.9) desktopcouch-se

C4 is almost unused, but I still get bad performance. Does PM behave
differently if I boot with one core than if I disable one core after boot?

Thread overview: 9+ messages
2010-10-31 16:44 less load less performance Alexey Fisher
2010-10-31 17:01 ` Peter Clifton
2010-10-31 17:40   ` Alexey Fisher
2010-10-31 19:18     ` [Intel-gfx] " Andreas Mohr
2010-10-31 19:44       ` Alexey Fisher
2010-11-02 15:53         ` [Intel-gfx] " Thomas Renninger
2010-10-31 17:33 ` Vasily Khoruzhick
2010-10-31 22:07 ` Arjan van de Ven
2010-11-01  8:50   ` Alexey Fisher