* Statistical methods for latency profiling
@ 2004-07-31  5:22 Lee Revell
  2004-08-01  2:55 ` Matt Mackall
From: Lee Revell @ 2004-07-31  5:22 UTC (permalink / raw)
  To: jackit-devel; +Cc: Ingo Molnar, linux-kernel

Hey,

Recently Ingo Molnar asked in one of the voluntary-preempt threads for
the minimum and average scheduling delay reported by jackd.  JACK does
not currently maintain these statistics.

I realized that the maximum latency reported on each process cycle
appears to be fairly normally distributed.  If this data does follow a
normal distribution, it becomes much easier to make generalizations
about it - the standard deviation becomes meaningful, we can tell
whether a change is statistically significant, etc.  It would also
indicate the effectiveness of the Linux scheduler by exposing any skew
- the distribution should in fact be normal, as the only factor that
should influence this value is the variance in the lengths of whatever
non-preemptible sections the kernel was in when we became runnable.

The first step in being able to meaningfully analyze these results is
to determine the degree to which the distribution is in fact normal,
and the easiest way to start is simply to look at a histogram of the
data.

In any jackd engine, the allowable window for scheduler latency is:

	(0,  period_usecs/2)

A latency in the range:

	[period_usecs/2, period_usecs]

will cause jackd to restart, and of course anything over period_usecs is
an XRUN.

Thus we can build a histogram of the observed latencies by creating an
array with period_usecs/2 elements and incrementing the corresponding
"bin" each time we observe that delay.  Using 32 frames at 48 kHz, we
have 333 "bins".

The overhead is not too bad, because we only look at the highest
latency for each period (max_usecs), so we only have to update the
histogram once per period - if the underlying distribution is normal
then the distribution of the maximums will be as well.
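
A minimal sketch of that bookkeeping (illustrative only - the names,
constants, and dump format below are assumptions on my part, not the
actual jackd code):

/*
 * Illustrative sketch of the histogram described above.  Names and
 * constants are assumptions, not the actual jackd patch.
 */
#include <stdio.h>

#define PERIOD_USECS	666			/* 32 frames at 48 kHz */
#define NBINS		(PERIOD_USECS / 2)	/* 333 one-usec bins */

static unsigned long delay_hist[NBINS];
static unsigned long restarts, xruns;

/* Call once per process cycle with the worst delay seen in that cycle. */
static void record_delay(unsigned int max_usecs)
{
	if (max_usecs >= PERIOD_USECS)
		xruns++;		/* over the period: an XRUN */
	else if (max_usecs >= NBINS)
		restarts++;		/* [period/2, period): engine restart */
	else
		delay_hist[max_usecs]++;
}

/* Dump as a simple two-column Delay/Count table. */
static void dump_histogram(FILE *out)
{
	unsigned int i;

	fprintf(out, "Delay\tCount\n-----\t-----\n");
	for (i = 0; i < NBINS; i++)
		if (delay_hist[i])
			fprintf(out, "%u\t%lu\n", i, delay_hist[i]);
	fprintf(out, "restarts: %lu  xruns: %lu\n", restarts, xruns);
}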

The ability to use statistical methods to generalize about this data
will be increasingly important moving forward, because this is the
*only* way we can provide hard numbers - confidence intervals,
p-values, standard deviations, etc. - rather than "It feels faster" or
"Since the $FOO change the system has felt sluggish".  Statistical
methods would allow us to say "the $FOO change increases the mean
latency from 150 to 200 usecs, but the standard deviation is now 20
rather than 75, so it's actually an improvement".

Any change in the kernel that makes a difference in the "feel" of the
system is now open to analysis.  Many developers apparently do not
consider a user report of "It feels sluggish" to be as much of a
concern as "X used to take Y amount of time, now it takes Z".
Statistical methods can close this gap.

Anyone care to comment?

Lee

PS  I did not do very well in my statistics classes, and it's possible
that many kernel hackers are completely unfamiliar with statistical
methods.  Fortunately the subset that is immediately useful to us is
pretty easy to understand, and could be covered in one or two lectures,
or a HOWTO.  Depending on the level of interest I can create one.

  



* Re: Statistical methods for latency profiling
  2004-07-31  5:22 Statistical methods for latency profiling Lee Revell
@ 2004-08-01  2:55 ` Matt Mackall
  2004-08-01  3:24   ` [Jackit-devel] " Lee Revell
From: Matt Mackall @ 2004-08-01  2:55 UTC (permalink / raw)
  To: Lee Revell; +Cc: jackit-devel, Ingo Molnar, linux-kernel

On Sat, Jul 31, 2004 at 01:22:37AM -0400, Lee Revell wrote:
> Hey,
> 
> Recently Ingo Molnar asked in one of the voluntary-preempt threads for
> the minimum and average scheduling delay reported by jackd.  JACK does
> not currently maintain these statistics.
> 
> I realized that the maximum latency reported on each process cycle
> appears to be fairly normally distributed.

This is not at all what I would expect. Instead, I'd expect to see
something like a gamma distribution, where we have everything
clustered down close to zero, but with a very long tail in the
positive direction falling off exponentially and obviously a hard
limit on the other side.

-- 
Mathematics is the supreme nostalgia of our time.


* Re: [Jackit-devel] Re: Statistical methods for latency profiling
  2004-08-01  2:55 ` Matt Mackall
@ 2004-08-01  3:24   ` Lee Revell
  2004-08-01  5:59     ` Lee Revell
From: Lee Revell @ 2004-08-01  3:24 UTC (permalink / raw)
  To: Matt Mackall; +Cc: jackit-devel, Ingo Molnar, linux-kernel

On Sat, 2004-07-31 at 22:55, Matt Mackall wrote:
> On Sat, Jul 31, 2004 at 01:22:37AM -0400, Lee Revell wrote:
> > Hey,
> > 
> > Recently Ingo Molnar asked in one of the voluntary-preempt threads for
> > the minimum and average scheduling delay reported by jackd.  JACK does
> > not currently maintain these statistics.
> > 
> > I realized that the maximum latency reported on each process cycle
> > appears to be fairly normally distributed.
> 
> This is not at all what I would expect. Instead, I'd expect to see
> something like a gamma distribution, where we have everything
> clustered down close to zero, but with a very long tail in the
> positive direction falling off exponentially and obviously a hard
> limit on the other side.

Right, it is a lot closer to a gamma distribution.  It's been years
since I have used any of this, and I took stat for psych majors, vs stat
for engineers.  I was a lot more interested in playing Doom at the
time...

This looks interesting:

http://www.itl.nist.gov/div898/handbook/eda/section3/ppccplot.htm
http://www.itl.nist.gov/div898/handbook/eda/section4/eda4291.htm

I will have some numbers soon.

Lee







* Re: [Jackit-devel] Re: Statistical methods for latency profiling
  2004-08-01  3:24   ` [Jackit-devel] " Lee Revell
@ 2004-08-01  5:59     ` Lee Revell
  2004-08-01 11:21       ` Ingo Molnar
From: Lee Revell @ 2004-08-01  5:59 UTC (permalink / raw)
  To: Matt Mackall; +Cc: jackit-devel, Ingo Molnar, linux-kernel, Peter Williams

On Sat, 2004-07-31 at 23:24, Lee Revell wrote:

> I will have some numbers soon.

Here is the histogram from running 500000 cycles (~5 minutes), with
max_sectors_kb set to 64, under normal desktop usage.  The kernel is
2.6.8-rc2-M5, set to 1:3, with all irqs threaded except the sound
card's.

Delay   Count
-----   -----
6       2257
7       3210
8       797
9       717
10      283
11      219
12      235
13      250
14      263
15      320
16      334
17      287
18      268
19      260
20      223
21      155
22      109
23      90
24      41
25      19
26      3
27      4
28      1
29      3
30      1
31      2
32      5
33      3
34      1
35      7
36      1
37      3
38      4
39      1
40      5
41      8
42      2
43      1
44      1
47      1
48      1
52      2
62      1
79      1
100     1

The next three tests were done with iozone -a running.

Delay   Count
-----   -----
6       21
7       203
8       595
9       1756
10      1643
11      957
12      618
13      507
14      532
15      540
16      523
17      423
18      400
19      424
20      418
21      293
22      202
23      119
24      71
25      38
26      20
27      8
28      7
29      2
30      1
31      1
32      4
33      4
34      7
35      6
36      1
37      2
38      5
39      7
40      3
41      4
42      4
43      1
44      2
47      1
48      1
50      1
53      1
54      1
57      1
59      1
63      1
69      1
70      2
79      1
91      1
92      1
93      1
100     1
101     1
107     1
109     1
113     1
118     1
123     1
131     1
143     1
145     1
146     1
157     1
161     1

So stressing the filesystem moves the center to the right a bit, from
6-7 to 9-10, and *drastically* lengthens the 'tail'.

These numbers suggest to me that a lot of the latencies from 47 usecs
and up are caused by one code path, because they are so uniformly
distributed over the upper part of the histogram.  The prime suspect of
course being the ide io completions.  I tested this theory by lowering
max_sectors_kb from 64 to 32:

Delay   Count
-----   -----
6       3
7       427
8       742
9       1583
10      3429
11      2010
12      614
13      203
14      200
15      275
16      267
17      177
18      104
19      50
20      55
21      36
22      22
23      21
24      22
25      29
26      26
27      9
28      2
29      3
30      1
32      2
33      2
34      2
35      1
36      1
37      2
41      1
42      1
43      1
48      1
65      1
70      1
71      4
72      3
73      6
74      3
75      1
76      1
77      3
78      2
79      4
80      2
81      3
82      3
83      3
84      3
85      2
86      2
87      4
88      4
90      1
91      1
92      3
94      3
96      2
101     2
102     3
103     1
106     1
111     1
112     1
122     1
132     1

Finally, with max_sectors_kb set to 16:

Delay   Count
-----   -----
6       420
7       1463
8       652
9       1444
10      2365
11      1111
12      537
13      265
14      232
15      241
16      253
17      218
18      173
19      181
20      233
21      188
22      125
23      76
24      55
25      23
26      9
27      15
28      6
29      2
30      2
31      6
32      5
33      9
34      8
35      4
36      2
37      5
38      2
39      1
40      2
41      3
42      3
43      2
45      1
47      1
49      1
51      3
52      1
53      1
54      1
55      1
56      1
57      1
59      2
60      2
61      2
62      2
63      1
64      1
66      2
67      2
68      3
69      1
71      1
72      3
74      2
75      1
76      2
77      1
78      2
79      1
80      1
81      2
82      1
84      1
85      3
86      1
87      1
89      1
90      2
92      1
100     1
101     1

These numbers all point to the ide sg completion code as the only 
thing on the system generating latencies over ~42 usecs.

I have sent the patch separately to jackit-devel so as not to clutter
LKML with unrelated patches.

Lee



* Re: [Jackit-devel] Re: Statistical methods for latency profiling
  2004-08-01  5:59     ` Lee Revell
@ 2004-08-01 11:21       ` Ingo Molnar
  2004-08-01 11:53         ` Lee Revell
From: Ingo Molnar @ 2004-08-01 11:21 UTC (permalink / raw)
  To: Lee Revell
  Cc: Matt Mackall, jackit-devel, Ingo Molnar, linux-kernel,
	Peter Williams


On Sun, 1 Aug 2004, Lee Revell wrote:

> So stressing the filesystem moves the center to the right a bit, from
> 6-7 to 9-10, and *drastically* lengthens the 'tail'.

basically each codepath has a typical latency distribution, and when a
workload uses multiple codepaths then the latencies get intermixed almost
linearly.

> These numbers suggest to me that a lot of the latencies from 47 usecs
> and up are caused by one code path, because they are so uniformly
> distributed over the upper part of the histogram.  The prime suspect of
> course being the ide io completions.  I tested this theory by lowering
> max_sectors_kb from 64 to 32:

> These numbers all point to the ide sg completion code as the only thing
> on the system generating latencies over ~42 usecs.

yep, that's a fair assumption. Once the IO-APIC irq-redirection problems
are solved i'll try to further thread the IDE completion IRQ to remove
that ~100 usecs latency.

	Ingo


* Re: [Jackit-devel] Re: Statistical methods for latency profiling
  2004-08-01 11:21       ` Ingo Molnar
@ 2004-08-01 11:53         ` Lee Revell
From: Lee Revell @ 2004-08-01 11:53 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Matt Mackall, jackit-devel, Ingo Molnar, linux-kernel,
	Peter Williams

On Sun, 2004-08-01 at 07:21, Ingo Molnar wrote:
> On Sun, 1 Aug 2004, Lee Revell wrote:
> 
> > So stressing the filesystem moves the center to the right a bit, from
> > 6-7 to 9-10, and *drastically* lengthens the 'tail'.
> 
> basically each codepath has a typical latency distribution, and when a
> workload uses multiple codepaths then the latencies get intermixed almost
> linearly.
> 

I noticed several distinct spikes with 1M samples that blend into a
smooth Erlang/gamma-type distribution at 5M.  I posted some more
results to jackit-devel.  It seems like each of these spikes would
represent a common code path out of a non-preemptible region.  I
suspect the spike at 70-80 usecs is a bug in my code, from updating the
histogram every 1024 cycles.  I will start posting results on the web
soon, as it's getting big.

> > These numbers suggest to me that a lot of the latencies from 47 usecs
> > and up are caused by one code path, because they are so uniformly
> > distributed over the upper part of the histogram.  The prime suspect of
> > course being the ide io completions.  I tested this theory by lowering
> > max_sectors_kb from 64 to 32:
> 
> > These numbers all point to the ide sg completion code as the only thing
> > on the system generating latencies over ~42 usecs.
> 
> yep, that's a fair assumption. Once the IO-APIC irq-redirection problems
> are solved i'll try to further thread the IDE completion IRQ to remove
> that ~100 usecs latency.
> 

It would be interesting to identify the code paths corresponding to the
other peaks.  It occurred to me that if you suspect a peak in the
histogram is related to a certain code path, you could stick a udelay()
in there and see whether the peak moves up by the same amount.
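
Something along these lines, perhaps (purely illustrative - the
function name below is a stand-in for whatever code path is under
suspicion, not a real patch):

/*
 * Purely illustrative - not a real patch.  suspected_completion_path()
 * stands in for whatever non-preemptible code path is suspected of
 * producing a given histogram peak.
 */
#include <linux/delay.h>

static void suspected_completion_path(void)
{
	/* ... existing non-preemptible completion work ... */

	udelay(50);	/* add a known 50 usec delay; if this path is the
			 * culprit, its peak should move up by ~50 usecs */
}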

Lee



