* Issues with ondemand governor
@ 2010-11-22 13:18 Vishwanath Sripathy
2010-11-22 16:09 ` David C Niemi
0 siblings, 1 reply; 14+ messages in thread
From: Vishwanath Sripathy @ 2010-11-22 13:18 UTC (permalink / raw)
To: cpufreq-u79uwXL29TY76Z2rM5mHXA; +Cc: linaro-dev-cunTk1MwBs8s++Sfvej+rw
[-- Attachment #1: Type: text/plain, Size: 1038 bytes --]
Hi,
I have been investigating performance issues that we are seeing in
some use cases, such as video playback, on OMAP platforms with the
ondemand governor.
As part of this, I found a tool called cpufreq-bench
(http://lwn.net/Articles/339862) which can be used to determine the
performance impact of the ondemand governor compared to the
performance governor.
When I ran this tool on an OMAP3 (ZOOM3) platform using the 2.6.36
kernel with the command below, the worst-case ondemand performance was
about 35% of what the performance governor achieves.
cpufreq-bench -l 50000 -s 100000 -x 50000 -y 100000 -g ondemand -r 5 -n 5 -v
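For readers unfamiliar with the tool, here is a minimal sketch of how the per-round timings appear to grow. It assumes that -l/-s are the initial load and sleep times and -x/-y their per-round increments, which is what the config dump in the attached logs suggests (load=50000, sleep=100000, load_step=50000, sleep_step=100000). The function name round_schedule is only illustrative and is not part of cpufreq-bench.

```python
def round_schedule(load_us, sleep_us, load_step_us, sleep_step_us, rounds):
    """Return a list of (load, sleep) pairs in microseconds, one per round.

    Assumption: each round adds one step to both values, as the
    "round N: ... for <load>us" lines in the attached logs suggest.
    """
    return [
        (load_us + i * load_step_us, sleep_us + i * sleep_step_us)
        for i in range(rounds)
    ]


# Values taken from the command line above (-l 50000 -s 100000 -x 50000 -y 100000).
for n, (load, sleep) in enumerate(
    round_schedule(50000, 100000, 50000, 100000, 5), start=1
):
    print(f"round {n}: load {load}us, sleep {sleep}us")
```

With these values round 2 uses a 100000us load and a 200000us sleep, which matches the round 2 lines in both logs.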
I tried the same on x86, and there the worst-case performance is
around 88%.
Attached are the cpufreq-bench logs for x86 and omap3.
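The percentage quoted above can be reproduced from the logs. The sketch below is inferred from the log output only, not from the cpufreq-bench sources: it assumes the reported "performance time" and "powersave time" are the sums of (cycle time minus sleep time) over a round, and that the lines labelled "powersave" are the runs under the governor being tested (ondemand here). The helper names are hypothetical.

```python
import re

# Matches lines such as:
# "performance cycle took 149347us, sleep: 100000us, load: 50000us, rounds: 41"
CYCLE_RE = re.compile(r"^(performance|powersave) cycle took (\d+)us, sleep: (\d+)us")


def busy_times(log_lines):
    """Sum (cycle time - sleep time) separately for both kinds of cycle."""
    totals = {"performance": 0, "powersave": 0}
    for line in log_lines:
        match = CYCLE_RE.match(line.strip())
        if match:
            kind = match.group(1)
            took_us = int(match.group(2))
            sleep_us = int(match.group(3))
            totals[kind] += took_us - sleep_us
    return totals


def relative_performance(totals):
    """Tested governor's score as a percentage of the performance governor."""
    return 100.0 * totals["performance"] / totals["powersave"]


# Totals printed for round 1 in the two attached logs.
x86_round1 = {"performance": 496462, "powersave": 559908}
omap3_round1 = {"performance": 456116, "powersave": 1318573}
print(f"{relative_performance(x86_round1):.2f}")    # 88.67
print(f"{relative_performance(omap3_round1):.2f}")  # 34.59
```

Both results agree with the "performance is at" lines that the tool prints for round 1.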
Questions:
1. Is this a known limitation of the ondemand governor?
2. How do we support system use cases (like video playback) with the
ondemand governor if the governor is not able to scale the frequencies
in real time? Are applications expected to adjust scaling_min_freq to
increase the MPU frequency?
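To make question 2 concrete, this is roughly what raising the floor from user space would look like. It is a sketch only: it assumes the usual cpufreq sysfs layout under /sys/devices/system/cpu/cpu0/cpufreq, that the driver exposes scaling_available_frequencies, and that the caller has permission to write there (normally root). The function names are illustrative.

```python
from pathlib import Path

# Assumed sysfs location of the cpufreq policy for CPU 0.
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def available_frequencies(cpufreq_dir=CPUFREQ_DIR):
    """Read the frequencies (in kHz) that the driver advertises."""
    text = (Path(cpufreq_dir) / "scaling_available_frequencies").read_text()
    return sorted(int(freq) for freq in text.split())


def set_min_freq(khz, cpufreq_dir=CPUFREQ_DIR):
    """Raise or lower the governor's frequency floor to `khz`.

    Only values the driver advertises are accepted here, to avoid
    writing a frequency the hardware does not support.
    """
    freqs = available_frequencies(cpufreq_dir)
    if khz not in freqs:
        raise ValueError(f"{khz} kHz is not one of {freqs}")
    (Path(cpufreq_dir) / "scaling_min_freq").write_text(f"{khz}\n")
```

On the OMAP3 board above, a video player could call set_min_freq(600000) before playback and restore 300000 afterwards. Whether applications should be doing this at all is exactly the question.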
Regards
Vishwa
[-- Attachment #2: cpufreq_x86.log --]
[-- Type: application/octet-stream, Size: 21019 bytes --]
/proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 13
model name : Intel(R) Pentium(R) M processor 1.60GHz
stepping : 8
cpu MHz : 800.000
cache size : 2048 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 ss tm pbe nx up bts est tm2
bogomips : 1595.90
clflush size : 64
cache_alignment : 64
address sizes : 32 bits physical, 32 bits virtual
power management:
DMI info (x86-only)
# dmidecode 2.9
SMBIOS 2.3 present.
Handle 0x0000, DMI type 0, 20 bytes
BIOS Information
Vendor: Dell Inc.
Version: A06
Release Date: 10/02/2005
Address: 0xF0000
Runtime Size: 64 kB
ROM Size: 576 kB
Characteristics:
ISA is supported
PCI is supported
PC Card (PCMCIA) is supported
PNP is supported
BIOS is upgradeable
BIOS shadowing is allowed
Boot from CD is supported
Selectable boot is supported
3.5"/720 KB floppy services are supported (int 13h)
Print screen service is supported (int 5h)
8042 keyboard services are supported (int 9h)
Serial services are supported (int 14h)
Printer services are supported (int 17h)
CGA/mono video services are supported (int 10h)
ACPI is supported
USB legacy is supported
AGP is supported
Smart battery is supported
BIOS boot specification is supported
Function key-initiated network boot is supported
Handle 0x0400, DMI type 4, 32 bytes
Processor Information
Socket Designation: Microprocessor
Type: Central Processor
Family: Pentium M
Manufacturer: Intel
ID: D8 06 00 00 FF FB E9 AF
Signature: Type 0, Family 6, Model 13, Stepping 8
Flags:
FPU (Floating-point unit on-chip)
VME (Virtual mode extension)
DE (Debugging extension)
PSE (Page size extension)
TSC (Time stamp counter)
MSR (Model specific registers)
PAE (Physical address extension)
MCE (Machine check exception)
CX8 (CMPXCHG8 instruction supported)
APIC (On-chip APIC hardware supported)
SEP (Fast system call)
MTRR (Memory type range registers)
PGE (Page global enable)
MCA (Machine check architecture)
CMOV (Conditional move instruction supported)
PAT (Page attribute table)
CLFSH (CLFLUSH instruction supported)
DS (Debug store)
ACPI (ACPI supported)
MMX (MMX technology supported)
FXSR (Fast floating-point save and restore)
SSE (Streaming SIMD extensions)
SSE2 (Streaming SIMD extensions 2)
SS (Self-snoop)
TM (Thermal monitor supported)
PBE (Pending break enabled)
Version: Not Specified
Voltage: 3.3 V
External Clock: 133 MHz
Max Speed: 1800 MHz
Current Speed: 1600 MHz
Status: Populated, Enabled
Upgrade: None
L1 Cache Handle: 0x0700
L2 Cache Handle: 0x0701
L3 Cache Handle: Not Provided
Available frequencies
1600000 1333000 1067000 800000
Starting cpufreq-bench
starting benchmark with parameters:
config:
sleep=100000
load=50000
sleep_step=100000
load_step=50000
cpu=0
cycles=10
rounds=10
governor=ondemand
approx. test duration: 2m
your terminal may hardly be responsible while the benchmark is running
benchmark starts in 0s
set cpu affinity to cpu #0
high priority condition requested
round 1: doing 10 cycles with 12220664 calculations for 50000us
calibrating load of 50000us, please wait...
calibration done
avarage: 1219us, rps:820
performance cycle took 149347us, sleep: 100000us, load: 50000us, rounds: 41
performance cycle took 149298us, sleep: 100000us, load: 50000us, rounds: 41
performance cycle took 149813us, sleep: 100000us, load: 50000us, rounds: 41
performance cycle took 149452us, sleep: 100000us, load: 50000us, rounds: 41
performance cycle took 149409us, sleep: 100000us, load: 50000us, rounds: 41
performance cycle took 149237us, sleep: 100000us, load: 50000us, rounds: 41
performance cycle took 149970us, sleep: 100000us, load: 50000us, rounds: 41
performance cycle took 149530us, sleep: 100000us, load: 50000us, rounds: 41
performance cycle took 149452us, sleep: 100000us, load: 50000us, rounds: 41
performance cycle took 150954us, sleep: 100000us, load: 50000us, rounds: 41
powersave cycle took 157502us, sleep: 100000us, load: 50000us, rounds: 41
powersave cycle took 155333us, sleep: 100000us, load: 50000us, rounds: 41
powersave cycle took 156021us, sleep: 100000us, load: 50000us, rounds: 41
powersave cycle took 156209us, sleep: 100000us, load: 50000us, rounds: 41
powersave cycle took 155909us, sleep: 100000us, load: 50000us, rounds: 41
powersave cycle took 155484us, sleep: 100000us, load: 50000us, rounds: 41
powersave cycle took 156111us, sleep: 100000us, load: 50000us, rounds: 41
powersave cycle took 155741us, sleep: 100000us, load: 50000us, rounds: 41
powersave cycle took 155753us, sleep: 100000us, load: 50000us, rounds: 41
powersave cycle took 155845us, sleep: 100000us, load: 50000us, rounds: 41
performance time = 496462, powersave time = 559908
performance is at 88.67%
round 2: doing 10 cycles with 41 calculations for 100000us
calibrating load of 100000us, please wait...
calibration done
avarage: 1351us, rps:740
performance cycle took 298570us, sleep: 200000us, load: 100000us, rounds: 74
performance cycle took 298744us, sleep: 200000us, load: 100000us, rounds: 74
performance cycle took 299508us, sleep: 200000us, load: 100000us, rounds: 74
performance cycle took 304618us, sleep: 200000us, load: 100000us, rounds: 74
performance cycle took 299670us, sleep: 200000us, load: 100000us, rounds: 74
performance cycle took 298520us, sleep: 200000us, load: 100000us, rounds: 74
performance cycle took 298760us, sleep: 200000us, load: 100000us, rounds: 74
performance cycle took 298680us, sleep: 200000us, load: 100000us, rounds: 74
performance cycle took 298538us, sleep: 200000us, load: 100000us, rounds: 74
performance cycle took 298605us, sleep: 200000us, load: 100000us, rounds: 74
powersave cycle took 301024us, sleep: 200000us, load: 100000us, rounds: 74
powersave cycle took 306723us, sleep: 200000us, load: 100000us, rounds: 74
powersave cycle took 312544us, sleep: 200000us, load: 100000us, rounds: 74
powersave cycle took 303118us, sleep: 200000us, load: 100000us, rounds: 74
powersave cycle took 301949us, sleep: 200000us, load: 100000us, rounds: 74
powersave cycle took 306729us, sleep: 200000us, load: 100000us, rounds: 74
powersave cycle took 303289us, sleep: 200000us, load: 100000us, rounds: 74
powersave cycle took 301563us, sleep: 200000us, load: 100000us, rounds: 74
powersave cycle took 306871us, sleep: 200000us, load: 100000us, rounds: 74
powersave cycle took 305971us, sleep: 200000us, load: 100000us, rounds: 74
performance time = 994213, powersave time = 1049781
performance is at 94.71%
round 3: doing 10 cycles with 74 calculations for 150000us
calibrating load of 150000us, please wait...
calibration done
avarage: 1428us, rps:700
performance cycle took 450110us, sleep: 300000us, load: 150000us, rounds: 105
performance cycle took 450259us, sleep: 300000us, load: 150000us, rounds: 105
performance cycle took 451149us, sleep: 300000us, load: 150000us, rounds: 105
performance cycle took 449609us, sleep: 300000us, load: 150000us, rounds: 105
performance cycle took 452592us, sleep: 300000us, load: 150000us, rounds: 105
performance cycle took 450578us, sleep: 300000us, load: 150000us, rounds: 105
performance cycle took 450928us, sleep: 300000us, load: 150000us, rounds: 105
performance cycle took 450315us, sleep: 300000us, load: 150000us, rounds: 105
performance cycle took 450961us, sleep: 300000us, load: 150000us, rounds: 105
performance cycle took 450916us, sleep: 300000us, load: 150000us, rounds: 105
powersave cycle took 457547us, sleep: 300000us, load: 150000us, rounds: 105
powersave cycle took 456943us, sleep: 300000us, load: 150000us, rounds: 105
powersave cycle took 456405us, sleep: 300000us, load: 150000us, rounds: 105
powersave cycle took 456942us, sleep: 300000us, load: 150000us, rounds: 105
powersave cycle took 456906us, sleep: 300000us, load: 150000us, rounds: 105
powersave cycle took 461478us, sleep: 300000us, load: 150000us, rounds: 105
powersave cycle took 452238us, sleep: 300000us, load: 150000us, rounds: 105
powersave cycle took 471539us, sleep: 300000us, load: 150000us, rounds: 105
powersave cycle took 453829us, sleep: 300000us, load: 150000us, rounds: 105
powersave cycle took 454076us, sleep: 300000us, load: 150000us, rounds: 105
performance time = 1507417, powersave time = 1577903
performance is at 95.53%
round 4: doing 10 cycles with 105 calculations for 200000us
calibrating load of 200000us, please wait...
calibration done
avarage: 1503us, rps:665
performance cycle took 598692us, sleep: 400000us, load: 200000us, rounds: 133
performance cycle took 601277us, sleep: 400000us, load: 200000us, rounds: 133
performance cycle took 598993us, sleep: 400000us, load: 200000us, rounds: 133
performance cycle took 598526us, sleep: 400000us, load: 200000us, rounds: 133
performance cycle took 599131us, sleep: 400000us, load: 200000us, rounds: 133
performance cycle took 598801us, sleep: 400000us, load: 200000us, rounds: 133
performance cycle took 598797us, sleep: 400000us, load: 200000us, rounds: 133
performance cycle took 598302us, sleep: 400000us, load: 200000us, rounds: 133
performance cycle took 598378us, sleep: 400000us, load: 200000us, rounds: 133
performance cycle took 598446us, sleep: 400000us, load: 200000us, rounds: 133
powersave cycle took 609434us, sleep: 400000us, load: 200000us, rounds: 133
powersave cycle took 613549us, sleep: 400000us, load: 200000us, rounds: 133
powersave cycle took 603543us, sleep: 400000us, load: 200000us, rounds: 133
powersave cycle took 608245us, sleep: 400000us, load: 200000us, rounds: 133
powersave cycle took 619328us, sleep: 400000us, load: 200000us, rounds: 133
powersave cycle took 606461us, sleep: 400000us, load: 200000us, rounds: 133
powersave cycle took 600797us, sleep: 400000us, load: 200000us, rounds: 133
powersave cycle took 602219us, sleep: 400000us, load: 200000us, rounds: 133
powersave cycle took 601188us, sleep: 400000us, load: 200000us, rounds: 133
powersave cycle took 621789us, sleep: 400000us, load: 200000us, rounds: 133
performance time = 1989343, powersave time = 2086553
performance is at 95.34%
round 5: doing 10 cycles with 133 calculations for 250000us
calibrating load of 250000us, please wait...
calibration done
avarage: 1533us, rps:652
performance cycle took 752552us, sleep: 500000us, load: 250000us, rounds: 163
performance cycle took 751894us, sleep: 500000us, load: 250000us, rounds: 163
performance cycle took 751851us, sleep: 500000us, load: 250000us, rounds: 163
performance cycle took 753625us, sleep: 500000us, load: 250000us, rounds: 163
performance cycle took 751755us, sleep: 500000us, load: 250000us, rounds: 163
performance cycle took 751443us, sleep: 500000us, load: 250000us, rounds: 163
performance cycle took 751985us, sleep: 500000us, load: 250000us, rounds: 163
performance cycle took 750828us, sleep: 500000us, load: 250000us, rounds: 163
performance cycle took 753281us, sleep: 500000us, load: 250000us, rounds: 163
performance cycle took 753092us, sleep: 500000us, load: 250000us, rounds: 163
powersave cycle took 756531us, sleep: 500000us, load: 250000us, rounds: 163
powersave cycle took 757968us, sleep: 500000us, load: 250000us, rounds: 163
powersave cycle took 762608us, sleep: 500000us, load: 250000us, rounds: 163
powersave cycle took 758028us, sleep: 500000us, load: 250000us, rounds: 163
powersave cycle took 756024us, sleep: 500000us, load: 250000us, rounds: 163
powersave cycle took 755373us, sleep: 500000us, load: 250000us, rounds: 163
powersave cycle took 758089us, sleep: 500000us, load: 250000us, rounds: 163
powersave cycle took 754402us, sleep: 500000us, load: 250000us, rounds: 163
powersave cycle took 755947us, sleep: 500000us, load: 250000us, rounds: 163
powersave cycle took 758036us, sleep: 500000us, load: 250000us, rounds: 163
performance time = 2522306, powersave time = 2573006
performance is at 98.03%
round 6: doing 10 cycles with 163 calculations for 300000us
calibrating load of 300000us, please wait...
calibration done
avarage: 1604us, rps:623
performance cycle took 894810us, sleep: 600000us, load: 300000us, rounds: 187
performance cycle took 893902us, sleep: 600000us, load: 300000us, rounds: 187
performance cycle took 899683us, sleep: 600000us, load: 300000us, rounds: 187
performance cycle took 897401us, sleep: 600000us, load: 300000us, rounds: 187
performance cycle took 896494us, sleep: 600000us, load: 300000us, rounds: 187
performance cycle took 895780us, sleep: 600000us, load: 300000us, rounds: 187
performance cycle took 895338us, sleep: 600000us, load: 300000us, rounds: 187
performance cycle took 893959us, sleep: 600000us, load: 300000us, rounds: 187
performance cycle took 895157us, sleep: 600000us, load: 300000us, rounds: 187
performance cycle took 895058us, sleep: 600000us, load: 300000us, rounds: 187
powersave cycle took 901089us, sleep: 600000us, load: 300000us, rounds: 187
powersave cycle took 899842us, sleep: 600000us, load: 300000us, rounds: 187
powersave cycle took 903532us, sleep: 600000us, load: 300000us, rounds: 187
powersave cycle took 907361us, sleep: 600000us, load: 300000us, rounds: 187
powersave cycle took 904548us, sleep: 600000us, load: 300000us, rounds: 187
powersave cycle took 898717us, sleep: 600000us, load: 300000us, rounds: 187
powersave cycle took 899267us, sleep: 600000us, load: 300000us, rounds: 187
powersave cycle took 905573us, sleep: 600000us, load: 300000us, rounds: 187
powersave cycle took 903137us, sleep: 600000us, load: 300000us, rounds: 187
powersave cycle took 902353us, sleep: 600000us, load: 300000us, rounds: 187
performance time = 2957582, powersave time = 3025419
performance is at 97.76%
round 7: doing 10 cycles with 187 calculations for 350000us
calibrating load of 350000us, please wait...
calibration done
avarage: 1605us, rps:622
performance cycle took 1055098us, sleep: 700000us, load: 350000us, rounds: 218
performance cycle took 1051366us, sleep: 700000us, load: 350000us, rounds: 218
performance cycle took 1054538us, sleep: 700000us, load: 350000us, rounds: 218
performance cycle took 1052109us, sleep: 700000us, load: 350000us, rounds: 218
performance cycle took 1054706us, sleep: 700000us, load: 350000us, rounds: 218
performance cycle took 1051716us, sleep: 700000us, load: 350000us, rounds: 218
performance cycle took 1052074us, sleep: 700000us, load: 350000us, rounds: 218
performance cycle took 1052864us, sleep: 700000us, load: 350000us, rounds: 218
performance cycle took 1051216us, sleep: 700000us, load: 350000us, rounds: 218
performance cycle took 1071208us, sleep: 700000us, load: 350000us, rounds: 218
powersave cycle took 1149631us, sleep: 700000us, load: 350000us, rounds: 218
powersave cycle took 1059418us, sleep: 700000us, load: 350000us, rounds: 218
powersave cycle took 1060026us, sleep: 700000us, load: 350000us, rounds: 218
powersave cycle took 1070900us, sleep: 700000us, load: 350000us, rounds: 218
powersave cycle took 1055027us, sleep: 700000us, load: 350000us, rounds: 218
powersave cycle took 1058177us, sleep: 700000us, load: 350000us, rounds: 218
powersave cycle took 1060803us, sleep: 700000us, load: 350000us, rounds: 218
powersave cycle took 1061344us, sleep: 700000us, load: 350000us, rounds: 218
powersave cycle took 1061984us, sleep: 700000us, load: 350000us, rounds: 218
powersave cycle took 1060078us, sleep: 700000us, load: 350000us, rounds: 218
performance time = 3546895, powersave time = 3697388
performance is at 95.93%
round 8: doing 10 cycles with 218 calculations for 400000us
calibrating load of 400000us, please wait...
calibration done
avarage: 1652us, rps:605
performance cycle took 1206501us, sleep: 800000us, load: 400000us, rounds: 242
performance cycle took 1196900us, sleep: 800000us, load: 400000us, rounds: 242
performance cycle took 1195977us, sleep: 800000us, load: 400000us, rounds: 242
performance cycle took 1196262us, sleep: 800000us, load: 400000us, rounds: 242
performance cycle took 1197569us, sleep: 800000us, load: 400000us, rounds: 242
performance cycle took 1201550us, sleep: 800000us, load: 400000us, rounds: 242
performance cycle took 1204615us, sleep: 800000us, load: 400000us, rounds: 242
performance cycle took 1196243us, sleep: 800000us, load: 400000us, rounds: 242
performance cycle took 1196142us, sleep: 800000us, load: 400000us, rounds: 242
performance cycle took 1201843us, sleep: 800000us, load: 400000us, rounds: 242
powersave cycle took 1207840us, sleep: 800000us, load: 400000us, rounds: 242
powersave cycle took 1261032us, sleep: 800000us, load: 400000us, rounds: 242
powersave cycle took 1204220us, sleep: 800000us, load: 400000us, rounds: 242
powersave cycle took 1207517us, sleep: 800000us, load: 400000us, rounds: 242
powersave cycle took 1204487us, sleep: 800000us, load: 400000us, rounds: 242
powersave cycle took 1205677us, sleep: 800000us, load: 400000us, rounds: 242
powersave cycle took 1198789us, sleep: 800000us, load: 400000us, rounds: 242
powersave cycle took 1199029us, sleep: 800000us, load: 400000us, rounds: 242
powersave cycle took 1203750us, sleep: 800000us, load: 400000us, rounds: 242
powersave cycle took 1205140us, sleep: 800000us, load: 400000us, rounds: 242
performance time = 3993602, powersave time = 4097481
performance is at 97.46%
round 9: doing 10 cycles with 242 calculations for 450000us
calibrating load of 450000us, please wait...
calibration done
avarage: 1666us, rps:600
performance cycle took 1350461us, sleep: 900000us, load: 450000us, rounds: 270
performance cycle took 1349106us, sleep: 900000us, load: 450000us, rounds: 270
performance cycle took 1352866us, sleep: 900000us, load: 450000us, rounds: 270
performance cycle took 1348477us, sleep: 900000us, load: 450000us, rounds: 270
performance cycle took 1350252us, sleep: 900000us, load: 450000us, rounds: 270
performance cycle took 1352472us, sleep: 900000us, load: 450000us, rounds: 270
performance cycle took 1350970us, sleep: 900000us, load: 450000us, rounds: 270
performance cycle took 1348821us, sleep: 900000us, load: 450000us, rounds: 270
performance cycle took 1352365us, sleep: 900000us, load: 450000us, rounds: 270
performance cycle took 1349422us, sleep: 900000us, load: 450000us, rounds: 270
powersave cycle took 1366041us, sleep: 900000us, load: 450000us, rounds: 270
powersave cycle took 1356204us, sleep: 900000us, load: 450000us, rounds: 270
powersave cycle took 1401714us, sleep: 900000us, load: 450000us, rounds: 270
powersave cycle took 1356358us, sleep: 900000us, load: 450000us, rounds: 270
powersave cycle took 1354570us, sleep: 900000us, load: 450000us, rounds: 270
powersave cycle took 1354405us, sleep: 900000us, load: 450000us, rounds: 270
powersave cycle took 1355460us, sleep: 900000us, load: 450000us, rounds: 270
powersave cycle took 1355028us, sleep: 900000us, load: 450000us, rounds: 270
powersave cycle took 1357778us, sleep: 900000us, load: 450000us, rounds: 270
powersave cycle took 1355245us, sleep: 900000us, load: 450000us, rounds: 270
performance time = 4505212, powersave time = 4612803
performance is at 97.67%
round 10: doing 10 cycles with 270 calculations for 500000us
calibrating load of 500000us, please wait...
calibration done
avarage: 1872us, rps:534
performance cycle took 1452886us, sleep: 1000000us, load: 500000us, rounds: 267
performance cycle took 1444442us, sleep: 1000000us, load: 500000us, rounds: 267
performance cycle took 1443612us, sleep: 1000000us, load: 500000us, rounds: 267
performance cycle took 1449110us, sleep: 1000000us, load: 500000us, rounds: 267
performance cycle took 1443888us, sleep: 1000000us, load: 500000us, rounds: 267
performance cycle took 1444513us, sleep: 1000000us, load: 500000us, rounds: 267
performance cycle took 1450777us, sleep: 1000000us, load: 500000us, rounds: 267
performance cycle took 1444843us, sleep: 1000000us, load: 500000us, rounds: 267
performance cycle took 1449112us, sleep: 1000000us, load: 500000us, rounds: 267
performance cycle took 1443121us, sleep: 1000000us, load: 500000us, rounds: 267
powersave cycle took 1455590us, sleep: 1000000us, load: 500000us, rounds: 267
powersave cycle took 1453884us, sleep: 1000000us, load: 500000us, rounds: 267
powersave cycle took 1449484us, sleep: 1000000us, load: 500000us, rounds: 267
powersave cycle took 1448273us, sleep: 1000000us, load: 500000us, rounds: 267
powersave cycle took 1454833us, sleep: 1000000us, load: 500000us, rounds: 267
powersave cycle took 1448872us, sleep: 1000000us, load: 500000us, rounds: 267
powersave cycle took 1452681us, sleep: 1000000us, load: 500000us, rounds: 267
powersave cycle took 1462866us, sleep: 1000000us, load: 500000us, rounds: 267
powersave cycle took 1454852us, sleep: 1000000us, load: 500000us, rounds: 267
powersave cycle took 1448686us, sleep: 1000000us, load: 500000us, rounds: 267
performance time = 4466304, powersave time = 4530021
performance is at 98.59%
End
[-- Attachment #3: cpufreq_omap3.log --]
[-- Type: application/octet-stream, Size: 18293 bytes --]
Processor : ARMv7 Processor rev 2 (v7l)
processor : 0
BogoMIPS : 597.64
Features : swp half thumb fastmult vfp edsp thumbee neon vfpv3
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x3
CPU part : 0xc08
CPU revision : 2
Hardware : OMAP Zoom3 board
Revision : 0010
Serial : 0000000000000000
300000 600000 800000 1000000
starting benchmark with parameters:
config:
sleep=100000
load=50000
sleep_step=100000
load_step=50000
cpu=0
cycles=10
rounds=10
governor=ondemand
approx. test duration: 2m
your terminal may hardly be responsible while the benchmark is running
benchmark starts in 0s
set cpu affinity to cpu #0
high priority condition requested
round 1: doing 10 cycles with 0 calculations for 50000us
calibrating load of 50000us, please wait...
calibration done
avarage: 10000us, rps:100
performance cycle took 145691us, sleep: 100000us, load: 50000us, rounds: 5
performance cycle took 145661us, sleep: 100000us, load: 50000us, rounds: 5
performance cycle took 145568us, sleep: 100000us, load: 50000us, rounds: 5
performance cycle took 145569us, sleep: 100000us, load: 50000us, rounds: 5
performance cycle took 145661us, sleep: 100000us, load: 50000us, rounds: 5
performance cycle took 145569us, sleep: 100000us, load: 50000us, rounds: 5
performance cycle took 145782us, sleep: 100000us, load: 50000us, rounds: 5
performance cycle took 145508us, sleep: 100000us, load: 50000us, rounds: 5
performance cycle took 145660us, sleep: 100000us, load: 50000us, rounds: 5
performance cycle took 145447us, sleep: 100000us, load: 50000us, rounds: 5
powersave cycle took 146118us, sleep: 100000us, load: 50000us, rounds: 5
powersave cycle took 145905us, sleep: 100000us, load: 50000us, rounds: 5
powersave cycle took 253174us, sleep: 100000us, load: 50000us, rounds: 5
powersave cycle took 253143us, sleep: 100000us, load: 50000us, rounds: 5
powersave cycle took 253143us, sleep: 100000us, load: 50000us, rounds: 5
powersave cycle took 253357us, sleep: 100000us, load: 50000us, rounds: 5
powersave cycle took 253845us, sleep: 100000us, load: 50000us, rounds: 5
powersave cycle took 253418us, sleep: 100000us, load: 50000us, rounds: 5
powersave cycle took 253266us, sleep: 100000us, load: 50000us, rounds: 5
powersave cycle took 253204us, sleep: 100000us, load: 50000us, rounds: 5
performance time = 456116, powersave time = 1318573
performance is at 34.59%
round 2: doing 10 cycles with 5 calculations for 100000us
calibrating load of 100000us, please wait...
calibration done
avarage: 10000us, rps:100
performance cycle took 291657us, sleep: 200000us, load: 100000us, rounds: 10
performance cycle took 291626us, sleep: 200000us, load: 100000us, rounds: 10
performance cycle took 291809us, sleep: 200000us, load: 100000us, rounds: 10
performance cycle took 291687us, sleep: 200000us, load: 100000us, rounds: 10
performance cycle took 291748us, sleep: 200000us, load: 100000us, rounds: 10
performance cycle took 291779us, sleep: 200000us, load: 100000us, rounds: 10
performance cycle took 291809us, sleep: 200000us, load: 100000us, rounds: 10
performance cycle took 291717us, sleep: 200000us, load: 100000us, rounds: 10
performance cycle took 291809us, sleep: 200000us, load: 100000us, rounds: 10
performance cycle took 291687us, sleep: 200000us, load: 100000us, rounds: 10
powersave cycle took 292602us, sleep: 200000us, load: 100000us, rounds: 10
powersave cycle took 500610us, sleep: 200000us, load: 100000us, rounds: 10
powersave cycle took 510192us, sleep: 200000us, load: 100000us, rounds: 10
powersave cycle took 510437us, sleep: 200000us, load: 100000us, rounds: 10
powersave cycle took 510131us, sleep: 200000us, load: 100000us, rounds: 10
powersave cycle took 509613us, sleep: 200000us, load: 100000us, rounds: 10
powersave cycle took 292756us, sleep: 200000us, load: 100000us, rounds: 10
powersave cycle took 495849us, sleep: 200000us, load: 100000us, rounds: 10
powersave cycle took 508667us, sleep: 200000us, load: 100000us, rounds: 10
powersave cycle took 292328us, sleep: 200000us, load: 100000us, rounds: 10
performance time = 917328, powersave time = 2423185
performance is at 37.86%
round 3: doing 10 cycles with 10 calculations for 150000us
calibrating load of 150000us, please wait...
calibration done
avarage: 9375us, rps:106
performance cycle took 447510us, sleep: 300000us, load: 150000us, rounds: 16
performance cycle took 447326us, sleep: 300000us, load: 150000us, rounds: 16
performance cycle took 448914us, sleep: 300000us, load: 150000us, rounds: 16
performance cycle took 447602us, sleep: 300000us, load: 150000us, rounds: 16
performance cycle took 447723us, sleep: 300000us, load: 150000us, rounds: 16
performance cycle took 447357us, sleep: 300000us, load: 150000us, rounds: 16
performance cycle took 447632us, sleep: 300000us, load: 150000us, rounds: 16
performance cycle took 447387us, sleep: 300000us, load: 150000us, rounds: 16
performance cycle took 447480us, sleep: 300000us, load: 150000us, rounds: 16
performance cycle took 447479us, sleep: 300000us, load: 150000us, rounds: 16
powersave cycle took 663421us, sleep: 300000us, load: 150000us, rounds: 16
powersave cycle took 663238us, sleep: 300000us, load: 150000us, rounds: 16
powersave cycle took 663178us, sleep: 300000us, load: 150000us, rounds: 16
powersave cycle took 663330us, sleep: 300000us, load: 150000us, rounds: 16
powersave cycle took 663178us, sleep: 300000us, load: 150000us, rounds: 16
powersave cycle took 663421us, sleep: 300000us, load: 150000us, rounds: 16
powersave cycle took 663208us, sleep: 300000us, load: 150000us, rounds: 16
powersave cycle took 663300us, sleep: 300000us, load: 150000us, rounds: 16
powersave cycle took 663330us, sleep: 300000us, load: 150000us, rounds: 16
powersave cycle took 663269us, sleep: 300000us, load: 150000us, rounds: 16
performance time = 1476410, powersave time = 3632873
performance is at 40.64%
round 4: doing 10 cycles with 16 calculations for 200000us
calibrating load of 200000us, please wait...
calibration done
avarage: 9523us, rps:105
performance cycle took 594055us, sleep: 400000us, load: 200000us, rounds: 21
performance cycle took 594025us, sleep: 400000us, load: 200000us, rounds: 21
performance cycle took 593841us, sleep: 400000us, load: 200000us, rounds: 21
performance cycle took 594421us, sleep: 400000us, load: 200000us, rounds: 21
performance cycle took 593872us, sleep: 400000us, load: 200000us, rounds: 21
performance cycle took 594574us, sleep: 400000us, load: 200000us, rounds: 21
performance cycle took 594390us, sleep: 400000us, load: 200000us, rounds: 21
performance cycle took 594574us, sleep: 400000us, load: 200000us, rounds: 21
performance cycle took 593933us, sleep: 400000us, load: 200000us, rounds: 21
performance cycle took 594360us, sleep: 400000us, load: 200000us, rounds: 21
powersave cycle took 810333us, sleep: 400000us, load: 200000us, rounds: 21
powersave cycle took 810486us, sleep: 400000us, load: 200000us, rounds: 21
powersave cycle took 810120us, sleep: 400000us, load: 200000us, rounds: 21
powersave cycle took 810242us, sleep: 400000us, load: 200000us, rounds: 21
powersave cycle took 940216us, sleep: 400000us, load: 200000us, rounds: 21
powersave cycle took 982238us, sleep: 400000us, load: 200000us, rounds: 21
powersave cycle took 994751us, sleep: 400000us, load: 200000us, rounds: 21
powersave cycle took 997986us, sleep: 400000us, load: 200000us, rounds: 21
powersave cycle took 999847us, sleep: 400000us, load: 200000us, rounds: 21
powersave cycle took 999908us, sleep: 400000us, load: 200000us, rounds: 21
performance time = 1942045, powersave time = 5156127
performance is at 37.66%
round 5: doing 10 cycles with 21 calculations for 250000us
calibrating load of 250000us, please wait...
calibration done
avarage: 9259us, rps:108
performance cycle took 750305us, sleep: 500000us, load: 250000us, rounds: 27
performance cycle took 750702us, sleep: 500000us, load: 250000us, rounds: 27
performance cycle took 750793us, sleep: 500000us, load: 250000us, rounds: 27
performance cycle took 750489us, sleep: 500000us, load: 250000us, rounds: 27
performance cycle took 750611us, sleep: 500000us, load: 250000us, rounds: 27
performance cycle took 750091us, sleep: 500000us, load: 250000us, rounds: 27
performance cycle took 749970us, sleep: 500000us, load: 250000us, rounds: 27
performance cycle took 750275us, sleep: 500000us, load: 250000us, rounds: 27
performance cycle took 750550us, sleep: 500000us, load: 250000us, rounds: 27
performance cycle took 750122us, sleep: 500000us, load: 250000us, rounds: 27
powersave cycle took 967224us, sleep: 500000us, load: 250000us, rounds: 27
powersave cycle took 966828us, sleep: 500000us, load: 250000us, rounds: 27
powersave cycle took 966401us, sleep: 500000us, load: 250000us, rounds: 27
powersave cycle took 966949us, sleep: 500000us, load: 250000us, rounds: 27
powersave cycle took 967896us, sleep: 500000us, load: 250000us, rounds: 27
powersave cycle took 967926us, sleep: 500000us, load: 250000us, rounds: 27
powersave cycle took 971283us, sleep: 500000us, load: 250000us, rounds: 27
powersave cycle took 982849us, sleep: 500000us, load: 250000us, rounds: 27
powersave cycle took 990143us, sleep: 500000us, load: 250000us, rounds: 27
powersave cycle took 993652us, sleep: 500000us, load: 250000us, rounds: 27
performance time = 2503908, powersave time = 4741151
performance is at 52.81%
round 6: doing 10 cycles with 27 calculations for 300000us
calibrating load of 300000us, please wait...
calibration done
avarage: 9375us, rps:106
performance cycle took 896514us, sleep: 600000us, load: 300000us, rounds: 32
performance cycle took 896668us, sleep: 600000us, load: 300000us, rounds: 32
performance cycle took 896973us, sleep: 600000us, load: 300000us, rounds: 32
performance cycle took 896759us, sleep: 600000us, load: 300000us, rounds: 32
performance cycle took 897125us, sleep: 600000us, load: 300000us, rounds: 32
performance cycle took 897095us, sleep: 600000us, load: 300000us, rounds: 32
performance cycle took 896668us, sleep: 600000us, load: 300000us, rounds: 32
performance cycle took 896423us, sleep: 600000us, load: 300000us, rounds: 32
performance cycle took 896515us, sleep: 600000us, load: 300000us, rounds: 32
performance cycle took 896393us, sleep: 600000us, load: 300000us, rounds: 32
powersave cycle took 1113892us, sleep: 600000us, load: 300000us, rounds: 32
powersave cycle took 1113617us, sleep: 600000us, load: 300000us, rounds: 32
powersave cycle took 1113007us, sleep: 600000us, load: 300000us, rounds: 32
powersave cycle took 1114410us, sleep: 600000us, load: 300000us, rounds: 32
powersave cycle took 1113464us, sleep: 600000us, load: 300000us, rounds: 32
powersave cycle took 1113617us, sleep: 600000us, load: 300000us, rounds: 32
powersave cycle took 1312805us, sleep: 600000us, load: 300000us, rounds: 32
powersave cycle took 1114411us, sleep: 600000us, load: 300000us, rounds: 32
powersave cycle took 1114227us, sleep: 600000us, load: 300000us, rounds: 32
powersave cycle took 1113342us, sleep: 600000us, load: 300000us, rounds: 32
performance time = 2967133, powersave time = 5336792
performance is at 55.60%
round 7: doing 10 cycles with 32 calculations for 350000us
calibrating load of 350000us, please wait...
calibration done
avarage: 9459us, rps:105
performance cycle took 1043121us, sleep: 700000us, load: 350000us, rounds: 37
performance cycle took 1043304us, sleep: 700000us, load: 350000us, rounds: 37
performance cycle took 1043274us, sleep: 700000us, load: 350000us, rounds: 37
performance cycle took 1043060us, sleep: 700000us, load: 350000us, rounds: 37
performance cycle took 1042938us, sleep: 700000us, load: 350000us, rounds: 37
performance cycle took 1043182us, sleep: 700000us, load: 350000us, rounds: 37
performance cycle took 1043335us, sleep: 700000us, load: 350000us, rounds: 37
performance cycle took 1043122us, sleep: 700000us, load: 350000us, rounds: 37
performance cycle took 1042907us, sleep: 700000us, load: 350000us, rounds: 37
performance cycle took 1043274us, sleep: 700000us, load: 350000us, rounds: 37
powersave cycle took 1261017us, sleep: 700000us, load: 350000us, rounds: 37
powersave cycle took 1260224us, sleep: 700000us, load: 350000us, rounds: 37
powersave cycle took 1260834us, sleep: 700000us, load: 350000us, rounds: 37
powersave cycle took 1293396us, sleep: 700000us, load: 350000us, rounds: 37
powersave cycle took 1263428us, sleep: 700000us, load: 350000us, rounds: 37
powersave cycle took 1260406us, sleep: 700000us, load: 350000us, rounds: 37
powersave cycle took 1445770us, sleep: 700000us, load: 350000us, rounds: 37
powersave cycle took 1263214us, sleep: 700000us, load: 350000us, rounds: 37
powersave cycle took 1260742us, sleep: 700000us, load: 350000us, rounds: 37
powersave cycle took 1253205us, sleep: 700000us, load: 350000us, rounds: 37
performance time = 3431517, powersave time = 5822236
performance is at 58.94%
round 8: doing 10 cycles with 37 calculations for 400000us
calibrating load of 400000us, please wait...
calibration done
avarage: 9302us, rps:107
performance cycle took 1199615us, sleep: 800000us, load: 400000us, rounds: 43
performance cycle took 1199646us, sleep: 800000us, load: 400000us, rounds: 43
performance cycle took 1199310us, sleep: 800000us, load: 400000us, rounds: 43
performance cycle took 1198669us, sleep: 800000us, load: 400000us, rounds: 43
performance cycle took 1199280us, sleep: 800000us, load: 400000us, rounds: 43
performance cycle took 1199616us, sleep: 800000us, load: 400000us, rounds: 43
performance cycle took 1198913us, sleep: 800000us, load: 400000us, rounds: 43
performance cycle took 1198761us, sleep: 800000us, load: 400000us, rounds: 43
performance cycle took 1199371us, sleep: 800000us, load: 400000us, rounds: 43
performance cycle took 1199371us, sleep: 800000us, load: 400000us, rounds: 43
powersave cycle took 1417541us, sleep: 800000us, load: 400000us, rounds: 43
powersave cycle took 1464599us, sleep: 800000us, load: 400000us, rounds: 43
powersave cycle took 1417603us, sleep: 800000us, load: 400000us, rounds: 43
powersave cycle took 1423096us, sleep: 800000us, load: 400000us, rounds: 43
powersave cycle took 1420685us, sleep: 800000us, load: 400000us, rounds: 43
powersave cycle took 1421325us, sleep: 800000us, load: 400000us, rounds: 43
powersave cycle took 1421539us, sleep: 800000us, load: 400000us, rounds: 43
powersave cycle took 1416351us, sleep: 800000us, load: 400000us, rounds: 43
powersave cycle took 1474609us, sleep: 800000us, load: 400000us, rounds: 43
powersave cycle took 1601807us, sleep: 800000us, load: 400000us, rounds: 43
performance time = 3992552, powersave time = 6479155
performance is at 61.62%
round 9: doing 10 cycles with 43 calculations for 450000us
calibrating load of 450000us, please wait...
calibration done
avarage: 9375us, rps:106
performance cycle took 1345886us, sleep: 900000us, load: 450000us, rounds: 48
performance cycle took 1345825us, sleep: 900000us, load: 450000us, rounds: 48
performance cycle took 1346008us, sleep: 900000us, load: 450000us, rounds: 48
performance cycle took 1346100us, sleep: 900000us, load: 450000us, rounds: 48
performance cycle took 1346313us, sleep: 900000us, load: 450000us, rounds: 48
performance cycle took 1345825us, sleep: 900000us, load: 450000us, rounds: 48
performance cycle took 1345489us, sleep: 900000us, load: 450000us, rounds: 48
performance cycle took 1345704us, sleep: 900000us, load: 450000us, rounds: 48
performance cycle took 1345306us, sleep: 900000us, load: 450000us, rounds: 48
performance cycle took 1345734us, sleep: 900000us, load: 450000us, rounds: 48
powersave cycle took 1567169us, sleep: 900000us, load: 450000us, rounds: 48
powersave cycle took 1738525us, sleep: 900000us, load: 450000us, rounds: 48
powersave cycle took 1564667us, sleep: 900000us, load: 450000us, rounds: 48
powersave cycle took 1564575us, sleep: 900000us, load: 450000us, rounds: 48
powersave cycle took 1563659us, sleep: 900000us, load: 450000us, rounds: 48
powersave cycle took 1567810us, sleep: 900000us, load: 450000us, rounds: 48
powersave cycle took 1739319us, sleep: 900000us, load: 450000us, rounds: 48
powersave cycle took 1669647us, sleep: 900000us, load: 450000us, rounds: 48
powersave cycle took 1567352us, sleep: 900000us, load: 450000us, rounds: 48
powersave cycle took 1755310us, sleep: 900000us, load: 450000us, rounds: 48
performance time = 4458190, powersave time = 7298033
performance is at 61.09%
round 10: doing 10 cycles with 48 calculations for 500000us
calibrating load of 500000us, please wait...
calibration done
avarage: 9433us, rps:106
performance cycle took 1493531us, sleep: 1000000us, load: 500000us, rounds: 53
performance cycle took 1491485us, sleep: 1000000us, load: 500000us, rounds: 53
performance cycle took 1491730us, sleep: 1000000us, load: 500000us, rounds: 53
performance cycle took 1491333us, sleep: 1000000us, load: 500000us, rounds: 53
performance cycle took 1492401us, sleep: 1000000us, load: 500000us, rounds: 53
performance cycle took 1491272us, sleep: 1000000us, load: 500000us, rounds: 53
performance cycle took 1492126us, sleep: 1000000us, load: 500000us, rounds: 53
performance cycle took 1492157us, sleep: 1000000us, load: 500000us, rounds: 53
performance cycle took 1491821us, sleep: 1000000us, load: 500000us, rounds: 53
performance cycle took 1491913us, sleep: 1000000us, load: 500000us, rounds: 53
powersave cycle took 1706055us, sleep: 1000000us, load: 500000us, rounds: 53
powersave cycle took 1714295us, sleep: 1000000us, load: 500000us, rounds: 53
powersave cycle took 1876679us, sleep: 1000000us, load: 500000us, rounds: 53
powersave cycle took 1711670us, sleep: 1000000us, load: 500000us, rounds: 53
powersave cycle took 1710968us, sleep: 1000000us, load: 500000us, rounds: 53
powersave cycle took 1711212us, sleep: 1000000us, load: 500000us, rounds: 53
powersave cycle took 1869720us, sleep: 1000000us, load: 500000us, rounds: 53
powersave cycle took 1714935us, sleep: 1000000us, load: 500000us, rounds: 53
powersave cycle took 1711182us, sleep: 1000000us, load: 500000us, rounds: 53
powersave cycle took 1710907us, sleep: 1000000us, load: 500000us, rounds: 53
performance time = 4919769, powersave time = 7437623
performance is at 66.15%
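
For anyone checking the numbers in this log: the two totals printed at the end of each round appear to be the summed cycle times with the fixed sleep time subtracted, and the reported percentage is their ratio. The sketch below is not part of cpufreq-bench; the function names are hypothetical, and reading the "powersave" label as the governor under test is an assumption. It reproduces the round 10 figures from the cycle lines above.

```python
import re

# Cycle times (us) copied from round 10 of the log above.
PERF = [1493531, 1491485, 1491730, 1491333, 1492401,
        1491272, 1492126, 1492157, 1491821, 1491913]
SAVE = [1706055, 1714295, 1876679, 1711670, 1710968,
        1711212, 1869720, 1714935, 1711182, 1710907]

# Rebuild the log lines in the same shape cpufreq-bench printed them.
TAIL = "us, sleep: 1000000us, load: 500000us, rounds: 53"
ROUND_10 = "\n".join(
    [f"performance cycle took {t}{TAIL}" for t in PERF]
    + [f"powersave cycle took {t}{TAIL}" for t in SAVE]
)

CYCLE = re.compile(r"^(performance|powersave) cycle took (\d+)us, sleep: (\d+)us")


def busy_times(log_text):
    """Sum (cycle time - sleep time) per label, in microseconds."""
    totals = {"performance": 0, "powersave": 0}
    for line in log_text.splitlines():
        m = CYCLE.match(line.strip())
        if m:
            label, took, sleep = m.group(1), int(m.group(2)), int(m.group(3))
            totals[label] += took - sleep
    return totals


def relative_performance(totals):
    """Busy time under the performance governor as a percentage of the other."""
    return round(100.0 * totals["performance"] / totals["powersave"], 2)


totals = busy_times(ROUND_10)
print(totals, relative_performance(totals))
```

The same calculation applied to rounds 5 and 6 gives the 52.81% and 55.60% shown in the log.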
[-- Attachment #4: Type: text/plain, Size: 175 bytes --]
_______________________________________________
linaro-dev mailing list
linaro-dev-cunTk1MwBs8s++Sfvej+rw@public.gmane.org
http://lists.linaro.org/mailman/listinfo/linaro-dev
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Issues with ondemand governor
2010-11-22 13:18 Issues with ondemand governor Vishwanath Sripathy
@ 2010-11-22 16:09 ` David C Niemi
2010-11-23 12:29 ` Vishwanath Sripathy
[not found] ` <4CEA959F.9000505-0nFLJxsdniVWk0Htik3J/w@public.gmane.org>
0 siblings, 2 replies; 14+ messages in thread
From: David C Niemi @ 2010-11-22 16:09 UTC (permalink / raw)
To: Vishwanath Sripathy; +Cc: cpufreq, linaro-dev
The general problem here is that the ondemand governor is aimed more at
power savings than performance. In cases where the ondemand governor
performs worse than the performance governor, the "sampling_down_factor"
tunable is often useful. I submitted the patch to add this tunable a
few weeks ago and it was acked by Venki, but I don't know what happened
to it after that. It helps in two ways:
1) the governor spends less overhead re-evaluating load when the
CPU is truly busy
2) the governor is a lot less eager to downshift when the CPU is busy --
without this patch, even on a busy system ondemand will blip down in
clock speed surprisingly often, hurting performance.
This patch is all about improving peak load performance. On quite a few
loads I've tried, this patch with a sampling_down_factor of 100 matches
the performance governor quite well, while the original ondemand
performance was poor. On the other hand, it is not much help if you are
trying to minimize power consumption on light to medium loads. If you
set sampling_down_factor to "1" it preserves default behavior.
David C Niemi
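
The effect described above can be illustrated with a toy model. This is a simplified sketch, not the kernel's ondemand code: the tick granularity, the two-level frequency, the averaging window and the 80% up_threshold are all assumptions made for illustration. The only behaviour taken from the description is that, while at maximum speed, load is re-evaluated sampling_down_factor times less often.

```python
def simulate(loads, sampling_down_factor=1, up_threshold=80):
    """Toy two-speed governor.

    loads: CPU load (%) for each base sampling tick.
    Returns (ticks spent at max speed, number of load evaluations).
    """
    at_max = False
    window = []        # loads seen since the last evaluation
    wait = 1           # ticks until the next evaluation
    ticks_at_max = 0
    evaluations = 0
    for load in loads:
        if at_max:
            ticks_at_max += 1
        window.append(load)
        wait -= 1
        if wait == 0:
            evaluations += 1
            average = sum(window) / len(window)
            window = []
            at_max = average > up_threshold
            # At max speed, hold off the next evaluation for longer.
            wait = sampling_down_factor if at_max else 1
    return ticks_at_max, evaluations


# A busy workload with a brief dip every fourth tick.
busy_with_dips = [100, 100, 100, 40] * 5

print(simulate(busy_with_dips, sampling_down_factor=1))
print(simulate(busy_with_dips, sampling_down_factor=4))
```

In this model a factor of 1 drops out of max speed after every dip, while a factor of 4 averages over the dip and stays at max speed with a quarter of the evaluations.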
Vishwanath Sripathy wrote:
> Hi,
>
> I was trying to investigate performance issues that we were seeing
> with some usecases like Video playback on OMAP Platforms with ondemand
> governor.
> As part of this, I found a tool called cpufreq-bench
> (http://lwn.net/Articles/339862) which can be used to determine the
> performance impact of ondemand governor compared to performance
> governor.
> When I ran this tool on OMAP3 (ZOOM3) platform using 2.6.36 kernel
> with below command, the worstcase ondemand performance is 35% compared
> to performance governor.
> cpufreq-bench -l 50000 -s 100000 -x 50000 -y 100000 -g ondemand -r 5 -n 5 -v
>
> I tried the same on x86 platforms and there the worstcase performance
> is around 88%.
> Attached are the cpufreq-bench logs for x86 and omap3.
>
> Questions:
> 1. Is this a known limitation of ondemand governor?
> 2. How do we support system usecases (like video playback etc) with
> ondemand governor if governor is not able to scale the frequencies in
> realtime? Are applications expected to play with scaling_min_freq to
> increase mpu frequency?
>
> Regards
> Vishwa
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Issues with ondemand governor
2010-11-22 16:09 ` David C Niemi
@ 2010-11-23 12:29 ` Vishwanath Sripathy
2010-11-23 14:52 ` Amit Kucheria
[not found] ` <4CEA959F.9000505-0nFLJxsdniVWk0Htik3J/w@public.gmane.org>
1 sibling, 1 reply; 14+ messages in thread
From: Vishwanath Sripathy @ 2010-11-23 12:29 UTC (permalink / raw)
To: David C Niemi; +Cc: cpufreq, linaro-dev
Thanks David for the inputs.
I tried your patch. In addition to that I reduced transition_latency.
With these 2 changes, I do see much better results (worst case
performance of ondemand is 88%).
Vishwa
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Issues with ondemand governor
2010-11-23 12:29 ` Vishwanath Sripathy
@ 2010-11-23 14:52 ` Amit Kucheria
2010-11-24 11:57 ` Vishwanath Sripathy
0 siblings, 1 reply; 14+ messages in thread
From: Amit Kucheria @ 2010-11-23 14:52 UTC (permalink / raw)
To: Vishwanath Sripathy; +Cc: David C Niemi, linaro-dev, cpufreq
Vishwa,
Have you had a chance to do some usetime tests with these changes?
It would be interesting to measure the power consumption with and
without these changes.
/Amit
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Issues with ondemand governor
2010-11-23 14:52 ` Amit Kucheria
@ 2010-11-24 11:57 ` Vishwanath Sripathy
2010-11-24 14:12 ` David C Niemi
0 siblings, 1 reply; 14+ messages in thread
From: Vishwanath Sripathy @ 2010-11-24 11:57 UTC (permalink / raw)
To: Amit Kucheria; +Cc: David C Niemi, linaro-dev, cpufreq
Amit,
On Tue, Nov 23, 2010 at 8:22 PM, Amit Kucheria <amit.kucheria@linaro.org> wrote:
> Vishwa,
>
> Have you had a chance to do some usetime tests with these changes?
I did test USB performance with this and I see ondemand at about 90% of
the performance governor.
>
> It would be interesting to measure the power consumption with and
> without these changes.
The power consumption impact can vary from usecase to usecase, and extra
performance will have some power impact.
However, in the idle scenario I feel this should not have much impact,
since the ondemand timer is a deferrable timer, which means that it does
not prevent cpuidle. I will try to measure it for some usecases and
compare the power impact.
Vishwa
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Issues with ondemand governor
2010-11-24 11:57 ` Vishwanath Sripathy
@ 2010-11-24 14:12 ` David C Niemi
2010-11-25 12:05 ` Vishwanath Sripathy
0 siblings, 1 reply; 14+ messages in thread
From: David C Niemi @ 2010-11-24 14:12 UTC (permalink / raw)
To: Vishwanath Sripathy; +Cc: Amit Kucheria, linaro-dev, cpufreq
Thanks for running the tests, Vishwa. Your results are what I'd expect
but it's good to see independent confirmation. In my benchmarks I saw
95-100% of the performance governor's performance, but the conditions
were more favorable and the original ondemand governor was "only"
degrading performance 20-30% to begin with.
There should be absolutely no changes in power consumption at all for
the patch itself, as behavior does not change until you raise
sampling_down_factor above 1 (the default). If you set it high, I would
expect higher power consumption (but also higher performance) under load
and no change in power consumption when idle or close to idle. Setting
a high sampling_down_factor causes the governor to reevaluate load less
often when at max cpu speed, both to reduce overhead and to let it
remain at maximum performance more consistently. Without this change,
the ondemand governor jitters a lot in and out of max clock speed when
under high loads, which is why its performance can be much worse than
the performance governor. Reducing the number of transitions and load
evaluations should also improve performance per watt, though the details
of that depend on the relative efficiency of the CPU's respective clock
speeds.
If you want to balance power consumption and performance, a middle
setting of sampling_down_factor like "10" should make a noticeable
improvement in performance while not having as much impact on power.
But if you want to match the performance governor's performance and are
less concerned about transient power consumption, you will want to set
it higher.
Another note: I recommend setting io_is_busy to 1 when using
sampling_down_factor above 1, as it improves responsiveness to quick
load transients involving some I/O. It's also worth considering
lowering up_threshold to 50 or even down to 15-20.
David C Niemi
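
The settings suggested above could be applied from user space roughly as follows. This is a minimal sketch: the sysfs directory is an assumption (its location differs between kernel versions), apply_tunables is a hypothetical helper, and the values are simply the "balanced" ones mentioned above. The demo writes to a scratch directory so it is safe to run as-is.

```python
import os
import tempfile

# Assumed location of the ondemand tunables; check your kernel version.
ONDEMAND_DIR = "/sys/devices/system/cpu/cpufreq/ondemand"

# Values taken from the suggestions above.
BALANCED = {"sampling_down_factor": 10, "io_is_busy": 1, "up_threshold": 50}


def apply_tunables(tunables, base_dir=ONDEMAND_DIR):
    """Write each tunable to base_dir/<name> and return the values read back."""
    readback = {}
    for name, value in tunables.items():
        path = os.path.join(base_dir, name)
        with open(path, "w") as f:
            f.write(f"{value}\n")
        with open(path) as f:
            readback[name] = int(f.read().strip())
    return readback


# Dry run against a scratch directory; on a real system one would call
# apply_tunables(BALANCED) as root with the ondemand governor active.
with tempfile.TemporaryDirectory() as scratch:
    result = apply_tunables(BALANCED, base_dir=scratch)
print(result)
```

Reading the values back is worthwhile on real hardware, since the kernel may reject or clamp out-of-range writes.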
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Issues with ondemand governor
2010-11-24 14:12 ` David C Niemi
@ 2010-11-25 12:05 ` Vishwanath Sripathy
2010-11-25 14:31 ` Thomas Renninger
0 siblings, 1 reply; 14+ messages in thread
From: Vishwanath Sripathy @ 2010-11-25 12:05 UTC (permalink / raw)
To: David C Niemi; +Cc: Amit Kucheria, linaro-dev, cpufreq
Thanks David.
If I would like to fine-tune up_threshold and sampling_down_factor for,
say, the OMAP platform, is there any way to do it in the kernel itself?
I know these are configurable via sysfs entries. But if I want to
optimize them in the kernel itself, is there any way? I see that the default
values are set in cpufreq_ondemand.c, which is a common kernel file. I
would like to know if these can be set in platform-specific code.
Vishwa
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Issues with ondemand governor
2010-11-25 12:05 ` Vishwanath Sripathy
@ 2010-11-25 14:31 ` Thomas Renninger
0 siblings, 0 replies; 14+ messages in thread
From: Thomas Renninger @ 2010-11-25 14:31 UTC (permalink / raw)
To: Vishwanath Sripathy; +Cc: David C Niemi, Amit Kucheria, linaro-dev, cpufreq
On Thursday 25 November 2010 13:05:49 Vishwanath Sripathy wrote:
> Thanks David.
> If I would like to fine-tune up_threshold and sampling_down_factor for,
> say, the OMAP platform, is there any way to do it in the kernel itself?
> I know these are configurable via sysfs entries. But if I want to
> optimize them in the kernel itself, is there any way? I see that the default
> values are set in cpufreq_ondemand.c, which is a common kernel file. I
> would like to know if these can be set in platform-specific code.
Ugly to do...
Should this be done globally and not per CPU?
If per CPU, it could be added to the policy, similar to the latency:
set in the driver's init func, evaluated in the governor later.
Possibly a struct could be added to cpufreq.h:
struct cpufreq_governor_hints {
	unsigned int sampling_down_factor;
	...
};
which gets filled in by the platform driver in its .init func
and then gets evaluated in the governor, but only once, also at init
time. Doing this when the governor is already active involves
locking, which must be avoided.
Thomas
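
The flow proposed above (driver fills in hints at init, governor reads them once at its own init) can be modelled in a few lines. This is only an illustration of the idea in Python, not kernel code: every name here is hypothetical, and the default values are illustrative apart from sampling_down_factor = 1, which the thread gives as the default.

```python
from dataclasses import dataclass, field
from typing import Optional

# Illustrative governor defaults (up_threshold value is an assumption).
GOVERNOR_DEFAULTS = {"sampling_down_factor": 1, "up_threshold": 80}


@dataclass
class GovernorHints:
    """Optional per-platform overrides; None means 'use the governor default'."""
    sampling_down_factor: Optional[int] = None
    up_threshold: Optional[int] = None


@dataclass
class Policy:
    cpu: int
    hints: GovernorHints = field(default_factory=GovernorHints)


def platform_driver_init(policy):
    """Hypothetical platform driver .init hook: fill in the hints once."""
    policy.hints.sampling_down_factor = 10


def governor_init(policy):
    """Evaluate the hints once at governor start; no later updates, no locking."""
    tunables = dict(GOVERNOR_DEFAULTS)
    for name in tunables:
        hint = getattr(policy.hints, name)
        if hint is not None:
            tunables[name] = hint
    return tunables


hinted = Policy(cpu=0)
platform_driver_init(hinted)
print(governor_init(hinted))          # platform override applied
print(governor_init(Policy(cpu=1)))   # no hints, plain defaults
```

Because the hints are read only at governor init, values changed later through sysfs would still take precedence at run time.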
^ permalink raw reply [flat|nested] 14+ messages in thread
[parent not found: <4CEA959F.9000505-0nFLJxsdniVWk0Htik3J/w@public.gmane.org>]
* Re: Issues with ondemand governor
[not found] ` <4CEA959F.9000505-0nFLJxsdniVWk0Htik3J/w@public.gmane.org>
@ 2010-11-26 22:38 ` Christian Robottom Reis
[not found] ` <20101126223815.GU30563-J1k5CargkBPB0jqWMgOSsQh0onu2mTI+@public.gmane.org>
2010-11-29 15:16 ` David C Niemi
0 siblings, 2 replies; 14+ messages in thread
From: Christian Robottom Reis @ 2010-11-26 22:38 UTC (permalink / raw)
To: David C Niemi, Nicolas Pitre
Cc: linaro-dev-cunTk1MwBs8s++Sfvej+rw, cpufreq-u79uwXL29TY76Z2rM5mHXA
On Mon, Nov 22, 2010 at 11:09:03AM -0500, David C Niemi wrote:
> The general problem here is that the ondemand governor is aimed more at
> power savings than performance. In cases where the ondemand governor
> performs worse than the performance governor, the "sampling_down_factor"
> tunable is often useful. I submitted the patch to add this tunable a
> few weeks ago and it was acked by Venki, but I don't know what happened
> to it after that.
Would you like to get it merged into linux-linaro? Given it's been ack'd
I think Nicolas might be willing to consider it:
http://kerneltrap.org/mailarchive/linux-kernel/2010/10/6/4628889/thread
--
Christian Robottom Reis | [+55] 16 9112 6430 | http://launchpad.net/~kiko
Linaro Engineering VP | [ +1] 612 216 4935 | http://async.com.br/~kiko
^ permalink raw reply [flat|nested] 14+ messages in thread
[parent not found: <20101126223815.GU30563-J1k5CargkBPB0jqWMgOSsQh0onu2mTI+@public.gmane.org>]
* Re: Issues with ondemand governor
[not found] ` <20101126223815.GU30563-J1k5CargkBPB0jqWMgOSsQh0onu2mTI+@public.gmane.org>
@ 2010-11-29 9:05 ` Vishwanath Sripathy
0 siblings, 0 replies; 14+ messages in thread
From: Vishwanath Sripathy @ 2010-11-29 9:05 UTC (permalink / raw)
To: Christian Robottom Reis, Nicolas Pitre
Cc: David C Niemi, linaro-dev-cunTk1MwBs8s++Sfvej+rw,
cpufreq-u79uwXL29TY76Z2rM5mHXA
Nicolas,
Can you please merge this patch into the Linaro tree?
Vishwa
On Sat, Nov 27, 2010 at 4:08 AM, Christian Robottom Reis
<kiko-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org> wrote:
> On Mon, Nov 22, 2010 at 11:09:03AM -0500, David C Niemi wrote:
>> The general problem here is that the ondemand governor is aimed more at
>> power savings than performance. In cases where the ondemand governor
>> performs worse than the performance governor, the "sampling_down_factor"
>> tunable is often useful. I submitted the patch to add this tunable a
>> few weeks ago and it was acked by Venki, but I don't know what happened
>> to it after that.
>
> Would you like to get it merged into linux-linaro? Given it's been ack'd
> I think Nicolas might be willing to consider it:
>
> http://kerneltrap.org/mailarchive/linux-kernel/2010/10/6/4628889/thread
> --
> Christian Robottom Reis | [+55] 16 9112 6430 | http://launchpad.net/~kiko
> Linaro Engineering VP | [ +1] 612 216 4935 | http://async.com.br/~kiko
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Issues with ondemand governor
2010-11-26 22:38 ` Christian Robottom Reis
[not found] ` <20101126223815.GU30563-J1k5CargkBPB0jqWMgOSsQh0onu2mTI+@public.gmane.org>
@ 2010-11-29 15:16 ` David C Niemi
[not found] ` <4CF3C3B4.3000209-0nFLJxsdniVWk0Htik3J/w@public.gmane.org>
1 sibling, 1 reply; 14+ messages in thread
From: David C Niemi @ 2010-11-29 15:16 UTC (permalink / raw)
To: Christian Robottom Reis
Cc: Nicolas Pitre, Vishwanath Sripathy, linaro-dev, cpufreq
I certainly have no objections to it going into the Linaro tree, though
I was hoping to get it into the main kernel tree too.
DCN
Christian Robottom Reis wrote:
> On Mon, Nov 22, 2010 at 11:09:03AM -0500, David C Niemi wrote:
>
>> The general problem here is that the ondemand governor is aimed more at
>> power savings than performance. In cases where the ondemand governor
>> performs worse than the performance governor, the "sampling_down_factor"
>> tunable is often useful. I submitted the patch to add this tunable a
>> few weeks ago and it was acked by Venki, but I don't know what happened
>> to it after that.
>>
>
> Would you like to get it merged into linux-linaro? Given it's been ack'd
> I think Nicolas might be willing to consider it:
>
> http://kerneltrap.org/mailarchive/linux-kernel/2010/10/6/4628889/thread
>
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2010-11-29 20:03 UTC | newest]
Thread overview: 14+ messages
2010-11-22 13:18 Issues with ondemand governor Vishwanath Sripathy
2010-11-22 16:09 ` David C Niemi
2010-11-23 12:29 ` Vishwanath Sripathy
2010-11-23 14:52 ` Amit Kucheria
2010-11-24 11:57 ` Vishwanath Sripathy
2010-11-24 14:12 ` David C Niemi
2010-11-25 12:05 ` Vishwanath Sripathy
2010-11-25 14:31 ` Thomas Renninger
[not found] ` <4CEA959F.9000505-0nFLJxsdniVWk0Htik3J/w@public.gmane.org>
2010-11-26 22:38 ` Christian Robottom Reis
[not found] ` <20101126223815.GU30563-J1k5CargkBPB0jqWMgOSsQh0onu2mTI+@public.gmane.org>
2010-11-29 9:05 ` Vishwanath Sripathy
2010-11-29 15:16 ` David C Niemi
[not found] ` <4CF3C3B4.3000209-0nFLJxsdniVWk0Htik3J/w@public.gmane.org>
2010-11-29 15:38 ` Nicolas Pitre
2010-11-29 18:00 ` Dave Jones
2010-11-29 20:03 ` David C Niemi