All of lore.kernel.org
 help / color / mirror / Atom feed
From: Patrick Pannuto <ppannuto@codeaurora.org>
To: Daniel Walker <dwalker@codeaurora.org>
Cc: linux-kernel@vger.kernel.org, sboyd@codeaurora.org,
	tglx@linutronix.de, mingo@elte.hu, heiko.carstens@de.ibm.com,
	eranian@google.com, schwidefsky@de.ibm.com,
	linux-arm-msm@vger.kernel.org
Subject: Re: [RFC] [PATCH] timer: Added usleep[_range][_interruptable] timer
Date: Mon, 28 Jun 2010 11:03:11 -0700	[thread overview]
Message-ID: <4C28E3DF.1000907@codeaurora.org> (raw)
In-Reply-To: <1277326605.15159.70.camel@c-dwalke-linux.qualcomm.com>

[-- Attachment #1: Type: text/plain, Size: 19995 bytes --]

> 
> You could test residency in specific power states .. Since you want to
> test if your reducing power consumption .. However, I'd replace a ton
> more of these udelay()'s , I don't think you'll get a decent results
> with out that.
> 

Ok.

> 
> The busy waits your replacing are small in length on average .. If you
> have timers that trigger in small intervals then your not going to
> increase residency in any _lower_ power states .. It's possible that you
> could increase residency in the top level power state, but it seem like
> it would be really marginal .. You need to show that udelay()'s have an
> outside of noise impact on something ..
> 

Ok.

> 
> I think you need to do some more research on what your actually doing to
> the system. From what your showing us one could make a lot of different
> arguments as to what this change will actually do. You really need some
> sort of test that doesn't leave a lot of room for argument.
> 

Well here we go then.

First, I would note that IMO is both telling and remarkable that the
following (almost) worked and allowed the system to boot successfully:

(a little long since you can cut off half of it and just get a list of
affected / candidate files, the following is a glorified sed that
replaces all instances of udelay with usleep for udelay's >= 50us)

git grep 'udelay([[:digit:]]\+)' | 
	perl -F"[\(\)]" -anl -e 'print if $F[1] >= 50' | 
	cut -d: -f1 | 
	uniq | 
	grep -v staging | 
	xargs perl -i -wlpe 
		's/udelay/usleep/ if m/udelay\((\d+)\)/g && $1 >= 50'

The usb driver and one power routine that was in a critical section had
to be reverted; everything else ran successfully.


*** MEASUREMENT TECHNIQUE ***

/proc/msm_pm_stats will dump information on residency in power states as
the system is running; dumps were taken 15s, 30s, 60s, 120s, 180s, and
240s from boot. 10 trials were performed for each build. A trial was run
replacing udelay >1000, >500, >100, >=100, >=50.

	origin:		0 changes	(0 total)
	over1000:	106 changes	(106 total)
	over500:	193 changes	(299 total)
	over100:	302 changes	(601 total)
	equal100:	423 changes	(1024 total)
	equal50:	220 changes	(1244 total)

Now, a concern is obviously that these changes hit drivers that are not
used in this system, or that many will be hit only once in boot / 
initialization / etc; this shows plainly in the results and will be
addressed.


*** RESULTS ***

Complete results are attached (please forgive the ugly stats.py, it
evolved; it's functional for the task at hand, but certainly not pretty).

--> Long results also at end of email

>From the final results, after 240s:

TIME	STATE		#TIMES CALLED	 TIME IN STATE		| DELTA FROM ORG

////////////////////////////////////////////////////////////////////////////////
=== origin (10 samples) =================================== RUNTIME: 241.535 ===
240	idle-spin    --	2.3              1.45001e-05     
240	not-idle     --	88368.2          67.8311703321   
240	idle-request --	88367.2          23337.6322655   
240	idle-wfi     --	88364.9          172.296383696   
=== over1000 (10 samples) ====== (delta: -0.00999999998604) RUNTIME: 241.525 ===
240	idle-spin    --	3.1              1.91666e-05      	| 4.6665e-06
240	not-idle     --	88723.9          65.6361809172    	| -2.1949894149
240	idle-request --	88722.9          23311.9855603    	| -25.6467051707
240	idle-wfi     --	88719.8          174.493487111    	| 2.1971034149
=== over500 (10 samples) ======== (delta: -0.0339999999851) RUNTIME: 241.501 ===
240	idle-spin    --	2.3              1.88334e-05      	| 4.3333e-06
240	not-idle     --	88636.3          67.0242803241    	| -0.806890008
240	idle-request --	88635.3          23280.1632631    	| -57.469002442
240	idle-wfi     --	88633.0          173.077055869    	| 0.7806721723
=== over100 (10 samples) ======== (delta: -0.0539999999921) RUNTIME: 241.481 ===
240	idle-spin    --	0.9              6.6666e-06       	| -7.8335e-06
240	not-idle     --	88599.0          67.190273164     	| -0.6408971681
240	idle-request --	88597.9          23253.4797828    	| -84.1524827638
240	idle-wfi     --	88597.0          172.884529866    	| 0.5881461694
=== equal100 (10 samples) ================== (delta: -0.025) RUNTIME: 241.51 ===
240	idle-spin    --	1.4              9.5002e-06       	| -4.9999e-06
240	not-idle     --	88685.9          66.5067348407    	| -1.3244354914
240	idle-request --	88684.9          23294.4341497    	| -43.1981158192
240	idle-wfi     --	88683.5          173.60379269     	| 1.3074089936
=== equal50 (10 samples) ======== (delta: -0.0289999999804) RUNTIME: 241.506 ===
240	idle-spin    --	2.0              9.6664e-06       	| -4.8337e-06
240	not-idle     --	88537.4          65.8619214556    	| -1.9692488765
240	idle-request --	88536.4          22979.3665406    	| -358.265724952
240	idle-wfi     --	88534.4          174.247270576    	| 1.9508868794


There appears to be very little change outside of noise from origin through
equal100, however, once equal50 is reached, there is a noticeable change.
Further investigation revealed that before equal100, none of the changes
could be guaranteed to hit the testing target (which, if nothing else,
provides a nice baseline for noise).  The equal100 test included 3 changes
that could positively be sure to hit target and the equal50 test another
4 changes.  The equal50 test was the only test that seems to conclusively
show immediate benefit, but it *does* show benefit.

(here "positively sure" are arch-specific drivers that changed, easy to
search for; it's likely that more did hit the target along the way)


*** CONCLUSIONS ***

I submit that the concept of usleep introduces no harm and does have the
potential for benefit.  I further submit that as processors get faster, the
benefit from a usleep versus a udelay will only improve. Finally, I would note
that the success of the "sed" implies to me that many users of udelay would
prefer the semantics of a usleep call; it is simply unavailable and msleep is
simply too imprecise for callers to use, which is in and of itself a case
for a usleep API.

I invite comments, questions, etc, and am happy to run more tests; however,
I would like to request that if you for any reason believe this data to be
imprecise or insufficient that you please provide me with the *exact* test you
would like me to run and would like to see data from, since it does take a
fair amount of time to run these tests.

Finally, I would like to re-iterate my point that unnecessary busy-waiting,
while it *can* be pre-empted, *is* wasteful and __if it can easily be avoided,
it should be__.

-Pat


*** LONG RESULTS ***

==> Attached is pm_stats.tar.gz, it contains results as TEST_TRIALNO,
flash.sh to run tests (./flash.sh pm_stats TESTNAME) run from an Android
product directory, and stats.py to generate statistics. Forgive stats.py,
it is ugly and an abuse of python.

TIME	STATE		#TIMES CALLED	 TIME IN STATE		| DELTA FROM ORG

////////////////////////////////////////////////////////////////////////////////
=== origin (10 samples) ==================================== RUNTIME: 15.253 ===
15	idle-spin    --	1.1              4.8332e-06      
15	not-idle     --	59265.1          9.3842608833    
15	idle-request --	59262.2          21892.7762788   
15	idle-wfi     --	59261.1          4.4754141812    
=== over1000 (10 samples) ======== (delta: -0.00299999999115) RUNTIME: 15.25 ===
15	idle-spin    --	0.7              3.6666e-06       	| -1.1666e-06
15	not-idle     --	59243.9          9.3891611869     	| 0.0049003036
15	idle-request --	59237.1          21897.8631197    	| 5.0868408908
15	idle-wfi     --	59236.4          4.4665700454     	| -0.0088441358
=== over500 (10 samples) ======== (delta: -0.00899999999674) RUNTIME: 15.244 ===
15	idle-spin    --	0.5              2.8334e-06       	| -1.9998e-06
15	not-idle     --	59267.2          9.3819648583     	| -0.002296025
15	idle-request --	59263.2          21894.5342115    	| 1.75793271321
15	idle-wfi     --	59262.7          4.4662633725     	| -0.0091508087
=== over100 (10 samples) ======== (delta: -0.00200000000186) RUNTIME: 15.251 ===
15	idle-spin    --	0.1              1.1666e-06       	| -3.6666e-06
15	not-idle     --	59222.3          9.4040358354     	| 0.0197749521
15	idle-request --	59213.6          21893.1330305    	| 0.356751724303
15	idle-wfi     --	59213.5          4.4462678975     	| -0.0291462837
=== equal100 (10 samples) ======= (delta: -0.00599999999395) RUNTIME: 15.247 ===
15	idle-spin    --	0.1              5e-07            	| -4.3332e-06
15	not-idle     --	59311.6          9.392722999      	| 0.0084621157
15	idle-request --	59306.0          21900.290705     	| 7.5144261487
15	idle-wfi     --	59305.9          4.4606828991     	| -0.0147312821
=== equal50 (10 samples) ====================== (delta: 0.0) RUNTIME: 15.253 ===
15	idle-spin    --	0.6              1.8333e-06       	| -2.9999e-06
15	not-idle     --	59267.0          9.4182486803     	| 0.033987797
15	idle-request --	59261.4          21613.5706545    	| -279.205624353
15	idle-wfi     --	59260.8          4.4418992194     	| -0.0335149618
////////////////////////////////////////////////////////////////////////////////
=== origin (10 samples) ==================================== RUNTIME: 30.494 ===
30	idle-spin    --	1.2              5.1666e-06      
30	not-idle     --	69462.2          21.80567204     
30	idle-request --	69462.2          22568.1921778   
30	idle-wfi     --	69461.0          6.8173021886    
=== over1000 (10 samples) ======= (delta: -0.00100000001257) RUNTIME: 30.493 ===
30	idle-spin    --	1.1              6.3332e-06       	| 1.1666e-06
30	not-idle     --	69436.0          21.7969753263    	| -0.0086967137
30	idle-request --	69436.0          22564.1587592    	| -4.0334186002
30	idle-wfi     --	69434.9          6.8029330699     	| -0.0143691187
=== over500 (10 samples) ======== (delta: -0.00700000001816) RUNTIME: 30.487 ===
30	idle-spin    --	0.6              3.1667e-06       	| -1.9999e-06
30	not-idle     --	69474.6          21.7808043443    	| -0.0248676957
30	idle-request --	69474.6          22564.0631218    	| -4.1290559666
30	idle-wfi     --	69474.0          6.7958443852     	| -0.0214578034
=== over100 (10 samples) ========= (delta: 0.00099999998929) RUNTIME: 30.495 ===
30	idle-spin    --	0.2              1.6666e-06       	| -3.5e-06
30	not-idle     --	69379.1          21.8042313156    	| -0.0014407244
30	idle-request --	69379.1          22551.153217     	| -17.0389607525
30	idle-wfi     --	69378.9          6.7590274145     	| -0.0582747741
=== equal100 (10 samples) ======= (delta: -0.00300000001443) RUNTIME: 30.491 ===
30	idle-spin    --	0.2              1e-06            	| -4.1666e-06
30	not-idle     --	69455.7          21.8058526573    	| 0.0001806173
30	idle-request --	69455.7          22570.70418      	| 2.51200219339
30	idle-wfi     --	69455.5          6.7603640729     	| -0.0569381157
=== equal50 (10 samples) ======== (delta: -0.00100000002421) RUNTIME: 30.493 ===
30	idle-spin    --	0.9              3.8332e-06       	| -1.3334e-06
30	not-idle     --	69449.7          21.8319953111    	| 0.0263232711
30	idle-request --	69449.7          22271.8053346    	| -296.386843177
30	idle-wfi     --	69448.8          6.7478177546     	| -0.069484434
////////////////////////////////////////////////////////////////////////////////
=== origin (10 samples) ===================================== RUNTIME: 60.78 ===
60	idle-spin    --	1.9              1.16667e-05     
60	not-idle     --	79366.8          49.3036608463   
60	idle-request --	79366.3          22907.1705373   
60	idle-wfi     --	79364.4          9.0110567161    
=== over1000 (10 samples) ========= (delta: 0.0139999999898) RUNTIME: 60.794 ===
60	idle-spin    --	2.0              1.36668e-05      	| 2.0001e-06
60	not-idle     --	79493.9          49.8914426181    	| 0.5877817718
60	idle-request --	79491.4          22894.3020036    	| -12.8685336944
60	idle-wfi     --	79489.4          9.0156314459     	| 0.0045747298
=== over500 (10 samples) ========== (delta: -0.010999999987) RUNTIME: 60.769 ===
60	idle-spin    --	1.5              1.14999e-05      	| -1.668e-07
60	not-idle     --	79513.5          49.7980203079    	| 0.4943594616
60	idle-request --	79512.6          22869.995713     	| -37.1748243636
60	idle-wfi     --	79511.1          9.0027399223     	| -0.0083167938
=== over100 (10 samples) ========= (delta: -0.0229999999865) RUNTIME: 60.757 ===
60	idle-spin    --	0.8              6.3332e-06       	| -5.3335e-06
60	not-idle     --	79399.4          49.8599212971    	| 0.5562604508
60	idle-request --	79396.8          22849.5517755    	| -57.6187617949
60	idle-wfi     --	79396.0          8.9517647664     	| -0.0592919497
=== equal100 (10 samples) ======= (delta: -0.00699999999488) RUNTIME: 60.773 ===
60	idle-spin    --	1.0              6.1667e-06       	| -5.5e-06
60	not-idle     --	79508.7          49.7887594666    	| 0.4850986203
60	idle-request --	79506.3          22878.2886511    	| -28.8818862207
60	idle-wfi     --	79505.3          8.9810057637     	| -0.0300509524
=== equal50 (10 samples) ======== (delta: -0.00799999999581) RUNTIME: 60.772 ===
60	idle-spin    --	1.4              5.9998e-06       	| -5.6669e-06
60	not-idle     --	79401.8          49.7245356251    	| 0.4208747788
60	idle-request --	79401.8          22589.3334706    	| -317.837066708
60	idle-wfi     --	79400.4          8.9224572725     	| -0.0885994436
////////////////////////////////////////////////////////////////////////////////
=== origin (10 samples) =================================== RUNTIME: 121.018 ===
120	idle-spin    --	2.3              1.45001e-05     
120	not-idle     --	84534.3          65.4829273526   
120	idle-request --	84533.2          23073.5171825   
120	idle-wfi     --	84530.9          54.1388846978   
=== over1000 (10 samples) ======== (delta: 0.0149999999907) RUNTIME: 121.033 ===
120	idle-spin    --	2.9              1.76667e-05      	| 3.1666e-06
120	not-idle     --	84899.7          62.9539904388    	| -2.5289369138
120	idle-request --	84898.6          23058.6542567    	| -14.8629258454
120	idle-wfi     --	84895.7          56.6840665995    	| 2.5451819017
=== over500 (10 samples) ======= (delta: -0.00999999999767) RUNTIME: 121.008 ===
120	idle-spin    --	2.1              1.61667e-05      	| 1.6666e-06
120	not-idle     --	84810.4          64.5704918038    	| -0.9124355488
120	idle-request --	84809.3          23020.7240229    	| -52.7931596084
120	idle-wfi     --	84807.2          55.0429809068    	| 0.904096209
=== over100 (10 samples) ======== (delta: -0.0209999999963) RUNTIME: 120.997 ===
120	idle-spin    --	0.9              6.6666e-06       	| -7.8335e-06
120	not-idle     --	84773.7          64.6287104755    	| -0.8542168771
120	idle-request --	84772.7          23002.4330668    	| -71.0841157664
120	idle-wfi     --	84771.8          54.9635557389    	| 0.8246710411
=== equal100 (10 samples) ======= (delta: 0.00600000000559) RUNTIME: 121.024 ===
120	idle-spin    --	1.4              9.5002e-06       	| -4.9999e-06
120	not-idle     --	84843.8          63.8751059888    	| -1.6078213638
120	idle-request --	84842.8          23042.7777274    	| -30.7394551461
120	idle-wfi     --	84841.4          55.7466510555    	| 1.6077663577
=== equal50 (10 samples) ======= (delta: -0.00099999998929) RUNTIME: 121.017 ===
120	idle-spin    --	2.0              9.6664e-06       	| -4.8337e-06
120	not-idle     --	84767.4          63.0424664541    	| -2.4404608985
120	idle-request --	84766.4          22740.2895937    	| -333.227588835
120	idle-wfi     --	84764.4          56.5740535877    	| 2.4351688899
////////////////////////////////////////////////////////////////////////////////
=== origin (10 samples) =================================== RUNTIME: 181.273 ===
180	idle-spin    --	2.3              1.45001e-05     
180	not-idle     --	86458.1          66.7356990125   
180	idle-request --	86457.1          23202.3433604   
180	idle-wfi     --	86454.8          113.126771849   
=== over1000 (10 samples) ========= (delta: 0.010999999987) RUNTIME: 181.284 ===
180	idle-spin    --	3.0              1.83333e-05      	| 3.8332e-06
180	not-idle     --	86831.3          64.5482361081    	| -2.1874629044
180	idle-request --	86830.3          23181.3150713    	| -21.028289176
180	idle-wfi     --	86827.3          115.331773087    	| 2.2050012376
=== over500 (10 samples) ======== (delta: -0.0250000000117) RUNTIME: 181.248 ===
180	idle-spin    --	2.2              1.68334e-05      	| 2.3333e-06
180	not-idle     --	86733.7          65.9390319914    	| -0.7966670211
180	idle-request --	86732.7          23151.5750952    	| -50.7682652808
180	idle-wfi     --	86730.5          113.915960535    	| 0.7891886856
=== over100 (10 samples) ======== (delta: -0.0339999999968) RUNTIME: 181.239 ===
180	idle-spin    --	0.9              6.6666e-06       	| -7.8335e-06
180	not-idle     --	86696.8          66.0923128287    	| -0.6433861838
180	idle-request --	86695.7          23122.6406168    	| -79.7027435974
180	idle-wfi     --	86694.8          113.744699867    	| 0.6179280181
=== equal100 (10 samples) ======= (delta: -0.0060000000056) RUNTIME: 181.267 ===
180	idle-spin    --	1.4              9.5002e-06       	| -4.9999e-06
180	not-idle     --	86765.3          65.4099056755    	| -1.325793337
180	idle-request --	86764.2          23156.7172921    	| -45.6260683097
180	idle-wfi     --	86762.8          114.457775355    	| 1.3310035056
=== equal50 (10 samples) ======= (delta: -0.00799999998418) RUNTIME: 181.265 ===
180	idle-spin    --	2.0              9.6664e-06       	| -4.8337e-06
180	not-idle     --	86665.1          64.770143973     	| -1.9655550395
180	idle-request --	86664.1          22851.7003915    	| -350.642968977
180	idle-wfi     --	86662.1          115.098642558    	| 1.971870709
////////////////////////////////////////////////////////////////////////////////
=== origin (10 samples) =================================== RUNTIME: 241.535 ===
240	idle-spin    --	2.3              1.45001e-05     
240	not-idle     --	88368.2          67.8311703321   
240	idle-request --	88367.2          23337.6322655   
240	idle-wfi     --	88364.9          172.296383696   
=== over1000 (10 samples) ====== (delta: -0.00999999998604) RUNTIME: 241.525 ===
240	idle-spin    --	3.1              1.91666e-05      	| 4.6665e-06
240	not-idle     --	88723.9          65.6361809172    	| -2.1949894149
240	idle-request --	88722.9          23311.9855603    	| -25.6467051707
240	idle-wfi     --	88719.8          174.493487111    	| 2.1971034149
=== over500 (10 samples) ======== (delta: -0.0339999999851) RUNTIME: 241.501 ===
240	idle-spin    --	2.3              1.88334e-05      	| 4.3333e-06
240	not-idle     --	88636.3          67.0242803241    	| -0.806890008
240	idle-request --	88635.3          23280.1632631    	| -57.469002442
240	idle-wfi     --	88633.0          173.077055869    	| 0.7806721723
=== over100 (10 samples) ======== (delta: -0.0539999999921) RUNTIME: 241.481 ===
240	idle-spin    --	0.9              6.6666e-06       	| -7.8335e-06
240	not-idle     --	88599.0          67.190273164     	| -0.6408971681
240	idle-request --	88597.9          23253.4797828    	| -84.1524827638
240	idle-wfi     --	88597.0          172.884529866    	| 0.5881461694
=== equal100 (10 samples) ================== (delta: -0.025) RUNTIME: 241.51 ===
240	idle-spin    --	1.4              9.5002e-06       	| -4.9999e-06
240	not-idle     --	88685.9          66.5067348407    	| -1.3244354914
240	idle-request --	88684.9          23294.4341497    	| -43.1981158192
240	idle-wfi     --	88683.5          173.60379269     	| 1.3074089936
=== equal50 (10 samples) ======== (delta: -0.0289999999804) RUNTIME: 241.506 ===
240	idle-spin    --	2.0              9.6664e-06       	| -4.8337e-06
240	not-idle     --	88537.4          65.8619214556    	| -1.9692488765
240	idle-request --	88536.4          22979.3665406    	| -358.265724952
240	idle-wfi     --	88534.4          174.247270576    	| 1.9508868794

[-- Attachment #2: pm_stats.tar.gz --]
[-- Type: application/x-gzip, Size: 119862 bytes --]

  parent reply	other threads:[~2010-06-28 18:03 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-06-23 19:22 [RFC] [PATCH] timer: Added usleep[_range][_interruptable] timer Patrick Pannuto
2010-06-23 20:05 ` Daniel Walker
2010-06-23 20:21   ` Patrick Pannuto
2010-06-23 20:56     ` Daniel Walker
2010-06-23 22:04       ` Andreas Mohr
2010-06-26 21:43       ` Pavel Machek
2010-06-28 18:03       ` Patrick Pannuto [this message]
2010-06-28 19:39         ` Daniel Walker
2010-06-24 10:22 ` Andi Kleen
2010-06-24 16:51   ` Patrick Pannuto

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4C28E3DF.1000907@codeaurora.org \
    --to=ppannuto@codeaurora.org \
    --cc=dwalker@codeaurora.org \
    --cc=eranian@google.com \
    --cc=heiko.carstens@de.ibm.com \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=sboyd@codeaurora.org \
    --cc=schwidefsky@de.ibm.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.