* Adaptive Readahead V14 - statistics question...
@ 2006-05-29 19:44 Valdis.Kletnieks
       [not found] ` <20060530003757.GA5164@mail.ustc.edu.cn>
  0 siblings, 1 reply; 11+ messages in thread
From: Valdis.Kletnieks @ 2006-05-29 19:44 UTC (permalink / raw)
  To: Wu Fengguang; +Cc: linux-kernel


Running 2.6.17-rc4-mm3 + V14.  I see this in /debug/readahead/events:

[table summary]      total   initial     state   context  contexta  backward  onthrash    onseek      none
random_rate             8%        0%        4%       46%        9%       44%        0%       38%       18%
ra_hit_rate            89%       97%       90%       40%       83%       76%        0%       49%        0%
la_hit_rate            62%       99%       88%       29%       84%     9500%        0%      200%     3700%
var_ra_size            703         4      8064        39      5780         3         0        59      3010
avg_ra_size              6         2        67         6        33         2         0         4        36
avg_la_size             37         1        96         4        45         0         0         0         0

Are the 9500%, 200%, and 3700% numbers in la_hit_rate related to reality
in any way, or is something b0rken?

And is there any documentation on what these mean, so you can tell if it's
doing anything useful? (One thing I've noticed is that xmms, rather than gobble
up 100K of data off disk every 10 seconds or so, snarfs a big 2M chunk every
3-4 minutes, often sucking in an entire song at (nearly) one shot...)

(Complete contents of readahead/events follows, in case it helps diagnose...)

[table requests]     total   initial     state   context  contexta  backward  onthrash    onseek      none
cache_miss            3934       543        93      2013        39      1199         0        47       417
random_read           1772        59        49      1059        11       575         0        19       327
io_congestion            4         0         4         0         0         0         0         0         0
io_cache_hit          1082         1        63       855        14       144         0         5         0
io_block             26320     18973      3519      2225       265      1288         0        50      1371
readahead            18601     15540      1008      1203       110       710         0        30      1483
lookahead             1972       153       671      1050        98         0         0         0         0
lookahead_hit         1241       152       596       312        84        95         0         2        37
lookahead_ignore         0         0         0         0         0         0         0         0         0
readahead_mmap           0         0         0         0         0         0         0         0         0
readahead_eof        14951     14348       569        19        15         0         0         0         0
readahead_shrink         0         0         0         0         0         0         0         0        70
readahead_thrash         0         0         0         0         0         0         0         0         0
readahead_mutilt         0         0         0         0         0         0         0         0         0
readahead_rescue         0         0         0         0         0         0         0         0       138

[table pages]        total   initial     state   context  contexta  backward  onthrash    onseek      none
cache_miss            6541      2472       754      2026        43      1199         0        47      1194
random_read           1784        62        51      1065        12       575         0        19       337
io_congestion          396         0       396         0         0         0         0         0         0
io_cache_hit         10185         2       571      7930      1383       293         0         6         0
readahead           111015     30757     67949      6864      3642      1681         0       122     53677
readahead_hit        98812     30052     61602      2762      3041      1294         0        61       277
lookahead            72607       185     64222      3734      4466         0         0         0         0
lookahead_hit        68640       184     59207      4475      4774         0         0         0         0
lookahead_ignore         0         0         0         0         0         0         0         0         0
readahead_mmap           0         0         0         0         0         0         0         0         0
readahead_eof        39959     25045     14102        64       748         0         0         0         0
readahead_shrink         0         0         0         0         0         0         0         0      1076
readahead_thrash         0         0         0         0         0         0         0         0         0
readahead_mutilt         0         0         0         0         0         0         0         0         0
readahead_rescue         0         0         0         0         0         0         0         0      9538

[table summary]      total   initial     state   context  contexta  backward  onthrash    onseek      none
random_rate             8%        0%        4%       46%        9%       44%        0%       38%       18%
ra_hit_rate            89%       97%       90%       40%       83%       76%        0%       49%        0%
la_hit_rate            62%       99%       88%       29%       84%     9500%        0%      200%     3700%
var_ra_size            703         4      8064        39      5780         3         0        59      3010
avg_ra_size              6         2        67         6        33         2         0         4        36
avg_la_size             37         1        96         4        45         0         0         0         0




* Re: Adaptive Readahead V14 - statistics question...
       [not found] ` <20060530003757.GA5164@mail.ustc.edu.cn>
@ 2006-05-30  0:37   ` Wu Fengguang
  0 siblings, 0 replies; 11+ messages in thread
From: Wu Fengguang @ 2006-05-30  0:37 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: linux-kernel

On Mon, May 29, 2006 at 03:44:59PM -0400, Valdis.Kletnieks@vt.edu wrote:
> Running 2.6.17-rc4-mm3 + V14.  I see this in /debug/readahead/events:
> 
> [table summary]      total   initial     state   context  contexta  backward  onthrash    onseek      none
> random_rate             8%        0%        4%       46%        9%       44%        0%       38%       18%
> ra_hit_rate            89%       97%       90%       40%       83%       76%        0%       49%        0%
> la_hit_rate            62%       99%       88%       29%       84%     9500%        0%      200%     3700%
> var_ra_size            703         4      8064        39      5780         3         0        59      3010
> avg_ra_size              6         2        67         6        33         2         0         4        36
> avg_la_size             37         1        96         4        45         0         0         0         0
> 
> Are the 9500%, 200%, and 3700% numbers in la_hit_rate related to reality
> in any way, or is something b0rken?

It's ok. They are computed from the following lines:
> lookahead             1972       153       671      1050        98         0         0         0         0
> lookahead_hit         1241       152       596       312        84        95         0         2        37
Here 'lookahead_hit' can somehow be greater than 'lookahead', which means
a 'cache hit' happened: the new readahead request overlapped with some
previous ones, and the 'lookahead_hit' was counted into the wrong
column. A 'cache hit' can also make 'readahead_hit' larger or smaller.

This kind of mistake can happen at random because the accounting
mechanism is simple and only expected to work in the normal case. There
is no guarantee of exact accuracy - otherwise the overhead would be
unacceptable.
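
For illustration, those rates are consistent with a per-column
computation like the sketch below, dividing by the 'lookahead' row of
the requests table. (The clamp-to-1 divisor is an assumption inferred
from the printed numbers, not a quote of the real accounting code.)

/* la_hit_rate as the summary table appears to compute it */
static unsigned long la_hit_rate(unsigned long lookahead_hit,
                                 unsigned long lookahead)
{
        return 100 * lookahead_hit / (lookahead ? lookahead : 1);
}

/*
 * backward: la_hit_rate(95, 0)      ==  9500  (%)
 * onseek:   la_hit_rate(2, 0)       ==   200  (%)
 * none:     la_hit_rate(37, 0)      ==  3700  (%)
 * total:    la_hit_rate(1241, 1972) ==    62  (%)
 */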

> And is there any documentation on what these mean, so you can tell if it's

This code snippet helps a bit with understanding (the table rows
apparently map one-to-one onto these events, e.g. 'lookahead_ignore'
is RA_EVENT_LOOKAHEAD_NOACTION and 'readahead_mutilt' is
RA_EVENT_READAHEAD_MUTILATE):

/* Read-ahead events to be accounted. */
enum ra_event {
        RA_EVENT_CACHE_MISS,            /* read cache misses */
        RA_EVENT_RANDOM_READ,           /* random reads */
        RA_EVENT_IO_CONGESTION,         /* i/o congestion */
        RA_EVENT_IO_CACHE_HIT,          /* canceled i/o due to cache hit */
        RA_EVENT_IO_BLOCK,              /* wait for i/o completion */

        RA_EVENT_READAHEAD,             /* read-ahead issued */
        RA_EVENT_READAHEAD_HIT,         /* read-ahead page hit */
        RA_EVENT_LOOKAHEAD,             /* look-ahead issued */
        RA_EVENT_LOOKAHEAD_HIT,         /* look-ahead mark hit */
        RA_EVENT_LOOKAHEAD_NOACTION,    /* look-ahead mark ignored */
        RA_EVENT_READAHEAD_MMAP,        /* read-ahead for mmap access */
        RA_EVENT_READAHEAD_EOF,         /* read-ahead reaches EOF */
        RA_EVENT_READAHEAD_SHRINK,      /* ra_size falls under previous la_size */
        RA_EVENT_READAHEAD_THRASHING,   /* read-ahead thrashing happened */
        RA_EVENT_READAHEAD_MUTILATE,    /* read-ahead mutilated by imbalanced aging */
        RA_EVENT_READAHEAD_RESCUE,      /* read-ahead rescued */

        RA_EVENT_READAHEAD_CUBE,
        RA_EVENT_COUNT
};

> doing anything useful? (One thing I've noticed is that xmms, rather than gobble
> up 100K of data off disk every 10 seconds or so, snarfs a big 2M chunk every
> 3-4 minutes, often sucking in an entire song at (nearly) one shot...)

Hehe, it results from the enlarged default max readahead size (128K => 1M).
Too aggressive? I'm interested to know the recommended size for
desktops, thanks. For now you can adjust it through the 'blockdev
--setra' command (the size is given in 512-byte sectors, so e.g.
'blockdev --setra 256 /dev/hda' restores the old 128K default).

Wu
--

> (Complete contents of readahead/events follows, in case it helps diagnose...)
> 
> [table requests]     total   initial     state   context  contexta  backward  onthrash    onseek      none
> cache_miss            3934       543        93      2013        39      1199         0        47       417
> random_read           1772        59        49      1059        11       575         0        19       327
> io_congestion            4         0         4         0         0         0         0         0         0
> io_cache_hit          1082         1        63       855        14       144         0         5         0
> io_block             26320     18973      3519      2225       265      1288         0        50      1371
> readahead            18601     15540      1008      1203       110       710         0        30      1483
> lookahead             1972       153       671      1050        98         0         0         0         0
> lookahead_hit         1241       152       596       312        84        95         0         2        37
> lookahead_ignore         0         0         0         0         0         0         0         0         0
> readahead_mmap           0         0         0         0         0         0         0         0         0
> readahead_eof        14951     14348       569        19        15         0         0         0         0
> readahead_shrink         0         0         0         0         0         0         0         0        70
> readahead_thrash         0         0         0         0         0         0         0         0         0
> readahead_mutilt         0         0         0         0         0         0         0         0         0
> readahead_rescue         0         0         0         0         0         0         0         0       138
> 
> [table pages]        total   initial     state   context  contexta  backward  onthrash    onseek      none
> cache_miss            6541      2472       754      2026        43      1199         0        47      1194
> random_read           1784        62        51      1065        12       575         0        19       337
> io_congestion          396         0       396         0         0         0         0         0         0
> io_cache_hit         10185         2       571      7930      1383       293         0         6         0
> readahead           111015     30757     67949      6864      3642      1681         0       122     53677
> readahead_hit        98812     30052     61602      2762      3041      1294         0        61       277
> lookahead            72607       185     64222      3734      4466         0         0         0         0
> lookahead_hit        68640       184     59207      4475      4774         0         0         0         0
> lookahead_ignore         0         0         0         0         0         0         0         0         0
> readahead_mmap           0         0         0         0         0         0         0         0         0
> readahead_eof        39959     25045     14102        64       748         0         0         0         0
> readahead_shrink         0         0         0         0         0         0         0         0      1076
> readahead_thrash         0         0         0         0         0         0         0         0         0
> readahead_mutilt         0         0         0         0         0         0         0         0         0
> readahead_rescue         0         0         0         0         0         0         0         0      9538
> 
> [table summary]      total   initial     state   context  contexta  backward  onthrash    onseek      none
> random_rate             8%        0%        4%       46%        9%       44%        0%       38%       18%
> ra_hit_rate            89%       97%       90%       40%       83%       76%        0%       49%        0%
> la_hit_rate            62%       99%       88%       29%       84%     9500%        0%      200%     3700%
> var_ra_size            703         4      8064        39      5780         3         0        59      3010
> avg_ra_size              6         2        67         6        33         2         0         4        36
> avg_la_size             37         1        96         4        45         0         0         0         0
> 




* Re: Adaptive Readahead V14 - statistics question...
@ 2006-05-30  3:36 Voluspa
       [not found] ` <20060530064026.GA4950@mail.ustc.edu.cn>
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Voluspa @ 2006-05-30  3:36 UTC (permalink / raw)
  To: wfg; +Cc: Valdis.Kletnieks, linux-kernel


Sorry about the top-post, I'm not subscribed.

On 2006-05-30 0:37:57 Wu Fengguang wrote:
> On Mon, May 29, 2006 at 03:44:59PM -0400, Valdis Kletnieks wrote:
[...]
>> doing anything useful? (One thing I've noticed is that xmms, rather
>> than gobble up 100K of data off disk every 10 seconds or so, snarfs
>> a big 2M chunk every 3-4 minutes, often sucking in an entire song at
>> (nearly) one shot...)
>
> Hehe, it results from the enlarged default max readahead size (128K
> => 1M). Too aggressive? I'm interested to know the recommended
> size for desktops, thanks. For now you can adjust it through the
> 'blockdev --setra' command.

And notebooks? I'm running a 64bit system with 2gig memory and a 7200
RPM disk. Without your patches a movie like Elephants_Dream_HD.avi
causes a continuous silent read. After patching 2.6.17-rc5 (more on that
later) there's a slow 'click-read-click-read-click-etc' during the
same movie as the head travels _somewhere_ to rest(?) between reads.

Distracting in silent sequences, and perhaps increased disk wear/tear.
I'll try adjusting the readahead size towards silence tomorrow.

But as size slides in a mainstream direction, whence will any benefit
come - in this Joe-average case? It's not a faster 'cp' at least:

_Cold boot between tests - Copy between different partitions_

2.6.17-rc5-proper (Elephants_Dream_HD.avi 854537054 bytes)

real    0m44.050s
user    0m0.076s
sys     0m6.344s

2.6.17-rc5-patched

real    0m49.353s
user    0m0.075s
sys     0m6.287s

2.6.17-rc5-proper (compiled kernel tree linux-2.6.17-rc5 ~339M)

real    0m47.952s
user    0m0.198s
sys     0m6.118s

2.6.17-rc5-patched

real    0m46.513s
user    0m0.200s
sys     0m5.827s

Of course, my failure to see speed-ups could well be 'cos of a botched
patch transfer (or some kind of missing groundwork only available in
-mm). There was one reject in particular which made me pause. I'm no
programmer... and 'continue;' is a weird direction. In the end I settled
on:

[mm/readahead.c]
@@ -184,8 +289,10 @@
 					page->index, GFP_KERNEL)) {
 			ret = mapping->a_ops->readpage(filp, page);
 			if (ret != AOP_TRUNCATED_PAGE) {
-				if (!pagevec_add(&lru_pvec, page))
+				if (!pagevec_add(&lru_pvec, page)) {
+					cond_resched();
 					__pagevec_lru_add(&lru_pvec);
+				}
 				continue;
 			} /* else fall through to release */
 		}

The full 82K experiment can temporarily be found at this location:
http://web.comhem.se/~u46139355/storetmp/adaptive-readahead-v14-linux-2.6.17-rc5-part-01to28of32.patch

At least it hasn't eaten my (backed up) disk yet ;-)

Mvh
Mats Johannesson
--


* Re: Adaptive Readahead V14 - statistics question...
       [not found] ` <20060530064026.GA4950@mail.ustc.edu.cn>
@ 2006-05-30  6:40   ` Wu Fengguang
  0 siblings, 0 replies; 11+ messages in thread
From: Wu Fengguang @ 2006-05-30  6:40 UTC (permalink / raw)
  To: Voluspa; +Cc: Valdis.Kletnieks, linux-kernel

On Tue, May 30, 2006 at 05:36:31AM +0200, Voluspa wrote:
> 
> Sorry about the top-post, I'm not subscribed.
> 
> On 2006-05-30 0:37:57 Wu Fengguang wrote:
> > On Mon, May 29, 2006 at 03:44:59PM -0400, Valdis Kletnieks wrote:
> [...]
> >> doing anything useful? (One thing I've noticed is that xmms, rather
> >> than gobble up 100K of data off disk every 10 seconds or so, snarfs
> >> a big 2M chunk every 3-4 minutes, often sucking in an entire song at
> >> (nearly) one shot...)
> >
> > Hehe, it results from the enlarged default max readahead size (128K
> > => 1M). Too aggressive? I'm interested to know the recommended
> > size for desktops, thanks. For now you can adjust it through the
> > 'blockdev --setra' command.
> 
> And notebooks? I'm running a 64bit system with 2gig memory and a 7200
> RPM disk. Without your patches a movie like Elephants_Dream_HD.avi
> causes a continuous silent read. After patching 2.6.17-rc5 (more on that
> later) there's a slow 'click-read-click-read-click-etc' during the
> same movie as the head travels _somewhere_ to rest(?) between reads.
> 
> Distracting in silent sequences, and perhaps increased disk wear/tear.
> I'll try adjusting the readahead size towards silence tomorrow.

Hmm... It seems risky to increase the default readahead size.
I would appreciate feedback when you have settled on some new
size, thanks.

btw, maybe you will be interested in 'laptop mode'.
It prolongs battery life by making disk activity "bursty":
http://www.xs4all.nl/~bsamwel/laptop_mode/

> But as size slides in a mainstream direction, whence will any benefit
> come - in this Joe-average case? It's not a faster 'cp' at least:
> 
> _Cold boot between tests - Copy between different partitions_

I have never done 'cp' tests, because they involve write caching
problems, which make the results hard to interpret. However I will
try to explain the two tests.

> 2.6.17-rc5-proper (Elephants_Dream_HD.avi 854537054 bytes)
> 
> real    0m44.050s
> user    0m0.076s
> sys     0m6.344s
> 
> 2.6.17-rc5-patched
> 
> real    0m49.353s
> user    0m0.075s
> sys     0m6.287s

- only size matters in this trivial case.
- the increased size generally does not help single-stream reading speed.
- but it helped reduce overhead (i.e. decreased user/sys time)
- not sure why real time increased so much.

> 2.6.17-rc5-proper (compiled kernel tree linux-2.6.17-rc5 ~339M)
> 
> real    0m47.952s
> user    0m0.198s
> sys     0m6.118s
> 
> 2.6.17-rc5-patched
> 
> real    0m46.513s
> user    0m0.200s
> sys     0m5.827s

- the small files optimization in the new logic helped a little

Thanks,
Wu


* Re: Adaptive Readahead V14 - statistics question...
  2006-05-30  3:36 Voluspa
       [not found] ` <20060530064026.GA4950@mail.ustc.edu.cn>
@ 2006-05-30 16:49 ` Valdis.Kletnieks
  2006-05-31 21:06   ` Diego Calleja
  2006-05-31 21:50   ` Voluspa
       [not found] ` <448493E9.9030203@samwel.tk>
  2 siblings, 2 replies; 11+ messages in thread
From: Valdis.Kletnieks @ 2006-05-30 16:49 UTC (permalink / raw)
  To: Voluspa; +Cc: wfg, linux-kernel


On Tue, 30 May 2006 05:36:31 +0200, Voluspa said:
> On 2006-05-30 0:37:57 Wu Fengguang wrote:
> > On Mon, May 29, 2006 at 03:44:59PM -0400, Valdis Kletnieks wrote:
> [...]
> >> doing anything useful? (One thing I've noticed is that xmms, rather
> >> than gobble up 100K of data off disk every 10 seconds or so, snarfs
> >> a big 2M chunk every 3-4 minutes, often sucking in an entire song at
> >> (nearly) one shot...)
> >
> > Hehe, it results from the enlarged default max readahead size (128K
> > => 1M). Too aggressive? I'm interested to know the recommended
> > size for desktops, thanks. For now you can adjust it through the
> > 'blockdev --setra' command.

Actually, it doesn't seem too aggressive at all - I have 768M of memory,
and the larger max readahead means that it hits the disk 1/8th as often
for a bigger slurp.  Since I'm on a laptop with a slow 5400rpm 60g disk,
a 128K seek-and-read "costs" almost exactly the same as a 1M seek-and-read...

(If I was more memory constrained, I'd probably be hitting that --setra though ;)

The only hard numbers I have so far are from a build of a 2.6.17-rc4-mm3 kernel
tree under -mm3+readahead and a slightly older -mm2 - the readahead kernel got
through the build about 30 seconds faster (19 mins 45 secs versus 20:17 - but
that's only 1 trial each).

Oh.. another "hard number" - elapsed time for a 4AM 'tripwire' run from cron
with a -mm3+readahead kernel was 36 minutes. A few days earlier, a -mm3
kernel took 46 minutes for the same thing.  I'll have to go and retry this
with equivalent cache-cold scenarios - I *think* the file cache was roughly
equivalent, but can't prove it...

The desktop "feel" is certainly at least as good, but it's a lot harder
to quantify that - yesterday I was doing some heavy-duty cleaning in my
~/Mail directory (MH-style one message per file, about 250K files and 3G,
obviously seriously in need of cleaning).  I'd often have 2 different
'find | xargs grep' type commands running at a time, and that seemed to
work a lot better than it used to (but again, no numbers).

Damn, this is a lot harder to benchmark than the sort of microbenchmarks
we usually see around here. :)

> And notebooks? I'm running a 64bit system with 2gig memory and a 7200
> RPM disk. Without your patches a movie like Elephants_Dream_HD.avi
> causes a continuous silent read. After patching 2.6.17-rc5 (more on that
> later) there's a slow 'click-read-click-read-click-etc' during the
> same movie as the head travels _somewhere_ to rest(?) between reads.

For my usage patterns, this is a feature, not a bug. As mentioned before,
on this machine anything that reduces the number of seeks is a Good Thing.

> Distracting in silent sequences, and perhaps increased disk wear/tear.

It would be increased wear/tear only if the disk was idle long enough to
spin down. Especially for video, the read-ahead needed to let the disk spin
down (assuming a sane timeout for that) would be enormous. :)

> I'll try adjusting the readahead size towards silence tomorrow.

The onboard sound chip is an ok-quality CS4205, the onboard speakers are crap.
However, running the audio through a nice pair of Kenwood headphones is a good
solution. I don't hear the disk (or sometimes even the phone), and my
co-workers don't have to hear my Malmsteen collection. :)





* Re: Adaptive Readahead V14 - statistics question...
  2006-05-30 16:49 ` Valdis.Kletnieks
@ 2006-05-31 21:06   ` Diego Calleja
  2006-05-31 21:50   ` Voluspa
  1 sibling, 0 replies; 11+ messages in thread
From: Diego Calleja @ 2006-05-31 21:06 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: lista1, wfg, linux-kernel

On Tue, 30 May 2006 12:49:50 -0400,
Valdis.Kletnieks@vt.edu wrote:


> The desktop "feel" is certainly at least as good, but it's a lot harder
> to quantify that - yesterday I was doing some heavy-duty cleaning in my

My desktop seems to boot a bit faster with adaptive readahead. I set up
an environment running kdm with automatic login plus a KDE session which runs
a konqueror window and an openoffice writer window. The time it takes for
the system to show the OO window went from 1:19 to 1:16 (I did a couple of
tests with each kernel). Not a very scientific measurement; bootchart could
probably do it better.


* Re: Adaptive Readahead V14 - statistics question...
  2006-05-30 16:49 ` Valdis.Kletnieks
  2006-05-31 21:06   ` Diego Calleja
@ 2006-05-31 21:50   ` Voluspa
       [not found]     ` <20060601055143.GA5216@mail.ustc.edu.cn>
  1 sibling, 1 reply; 11+ messages in thread
From: Voluspa @ 2006-05-31 21:50 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: wfg, linux-kernel

On Tue, 30 May 2006 12:49:50 -0400 Valdis.Kletnieks wrote:
> On Tue, 30 May 2006 05:36:31 +0200, Voluspa said:
> > On 2006-05-30 0:37:57 Wu Fengguang wrote:
> > > On Mon, May 29, 2006 at 03:44:59PM -0400, Valdis Kletnieks wrote:
[...]
> Damn, this is a lot harder to benchmark than the sort of microbenchmarks
> we usually see around here. :)

I don't even know what a microbenchmark is, but 'cp' and its higher-level
equivalents are such frequent operations that I always begin any test
there.

[...] [Correction, should be: 'click-read-pause, click-read-pause etc']
> > later) there's a slow 'click-read-click-read-click-etc' during the
> > same movie as the head travels _somewhere_ to rest(?) between reads.
> 
> For my usage patterns, this is a feature, not a bug. As mentioned before,
> on this machine anything that reduces the number of seeks is a Good Thing.
> 
> > Distracting in silent sequences, and perhaps increased disk wear/tear.
> 
> It would be increased wear/tear only if the disk was idle long enough to
> spin down. Especially for video, the read-ahead needed to let the disk spin
> down (assuming a sane timeout for that) would be enormous. :)

:-) I was thinking more in terms of disk head _arm_ wear. Somehow there's a
picture in my head of the arm swinging back to a rest position at an outer
(or inner?) "safe" disk track if read/write operations are delayed too much.
And therefore I associate a 'click' with the arm swinging back into action.
Normal quick read/write arm movement noise is distinctly different - to my
uninformed user ears.

I haven't adjusted the readahead size yet, but instead performed a series of
real-world usage tests.

Conclusion: On _this_ machine, with _these_ operations, Adaptive Readahead
in its current incarnation and default settings is a _loss_.

Patch version:
http://web.comhem.se/~u46139355/storetmp/adaptive-readahead-v14-linux-2.6.17-rc5-part-01to28of32-and-update-01to04of04-and-saner-CDVD-medium-error-handling.patch

Relevant hardware:
AMD Athlon 64 Processor 3400+ (2200 MHz top speed) L1 I Cache: 64K (64 
bytes/line), D cache 64K (64 bytes/line), L2 Cache: 1024K (64 bytes/line).
VIA K8M800 chipset with VT8235 south. Kingmax 2x1GB DDR-333MHz SO-DIMM memory.
Hitachi Travelstar 7K100 (HTS721010G9AT00) 100GB 7200RPM Parallel-ATA disk,
http://www.hitachigst.com/hdd/support/7k100/7k100.htm acoustic management
value was set to 254 (fast/"noisy") at delivery.

Soft system:
Is extremely lean and simple. Pure 64bit compiled in a lfs-ish way almost
exactly 1 year ago. No desktop, just a wm (which wasn't even launched in
these tests). Toolchain glibc-2.3.5 (nptl), binutils-2.16.1, gcc-3.4.4

Filesystem:
Journaled ext3 with default mount (ordered data mode) and noatime.

Kernels:
loke:sleipner:~$ ls -l /boot/kernel-2.6.17-rc5*
1440 -rw-r--r--  1 root root 1469211 May 30 02:25 /boot/kernel-2.6.17-rc5
1444 -rw-r--r--  1 root root 1470540 May 30 19:07 /boot/kernel-2.6.17-rc5-ar

All tests were performed as the root user from a machine-standstill "cold
boot" for each iteration, prepared for a 'console login - immediate run',
i.e. any previous output deleted/reset.

_Massive READ_

[/usr had some 490000 files]

"cd /usr ; time find . -type f -exec md5sum {} \;"

2.6.17-rc5 ------- 2.6.17-rc5-ar

real 21m21.009s -- 21m37.663s
user 3m20.784s  -- 3m20.701s
sys  6m34.261s  -- 6m41.735s

I had planned to run this at least three times, but didn't realize I had
12 compiled kernel trees and 3 uncompiled there... So, a one-shot had to
do. But it's still significant.

_READ/WRITE_

[255 .tga files, each is 1244178 bytes]
[1 .wav file which is 1587644 bytes]
[movie becomes 573298 bytes ~9s long]

"time mencoder -ovc lavc -lavcopts aspect=16/9 mf://picsave/kreation/03-logo-joined/*.tga -oac lavc -audiofile kreation-files/kreation-logo-final.wav -o logo-final-widescreen-speedtest.avi"

2.6.17-rc5

real 0m10.164s 0m10.224s 0m10.141s
user 0m3.301s  0m3.304s  0m3.297s
sys  0m1.103s  0m1.097s  0m1.082s

2.6.17-rc5-ar

real 0m10.831s 0m10.816s 0m10.747s
user 0m3.319s  0m3.313s  0m3.324s
sys  0m1.081s  0m1.099s  0m1.042s

A 0.6s slowdown might not seem like such a big deal, but this is on a 9s
movie! Furthermore, the test was conducted on the / root partition which
resides on hda2. Subtracting the 8GB hda1 and the occupied 1.2GB of hda2
places us 9.2GB in from the disk edge (assuming 1 platter). I did a
one-shot test of this movie on hda3 - closest to the spindle - which all
in all gives a distance of ~95GB:

2.6.17-rc5 ------ 2.6.17-rc5-ar

real 0m16.134s -- 0m17.456s
user 0m3.311s  -- 0m3.312s
sys  0m1.111s  -- 0m1.135s

Wow. If nothing else, these tests have made me rethink my partitioning
scheme. I've used the same layout since xx-years ago when proximity of
swap-usr-home on those slow disks really made a difference. And since
I don't touch swap in normal operation nowadays... Power to the Edge!

_Geek usage_

[Kernel compile]
[CONFIG_REORDER "Processor type and features -> Function reordering" adds
ca 30s here]
[Note: I made a mistake by booting the -ar kernel first, and also didn't
alternate like I should have. This was the first set of tests, and chip
temperature rise seems to slow things down. The physics is above my head]

"time make"

2.6.17-rc5-ar

real 5m3.654s  5m3.787s  5m4.390s  5m4.991s
user 4m17.595s 4m17.580s 4m17.701s 4m18.043s
sys  0m31.551s 0m31.506s 0m31.368s 0m31.563s

2.6.17-rc5

real 5m4.606s  5m5.798s  5m4.684s  5m4.508s
user 4m18.586s 4m19.183s 4m19.111s 4m17.799s
sys  0m31.241s 0m31.482s 0m31.278s 0m31.610s

Any difference here should really be considered noise. The file read/write
is too infrequent and slow to really measure.

_Caveat and preemptive Mea Culpa_

The patching of 2.6.17-rc5 has neither been approved nor verified as to
its correctness. The kernel compiles without errors, and the new
/proc/sys/kernel/ sysctls readahead_ratio and readahead_hit_rate turn up
with the defaults 50 and 1. This is, however, not proof of total parity
with the official -mm patch-set.

Mvh
Mats Johannesson
--


* Re: Adaptive Readahead V14 - statistics question...
       [not found]     ` <20060601055143.GA5216@mail.ustc.edu.cn>
@ 2006-06-01  5:51       ` Fengguang Wu
  2006-06-01  6:35         ` Voluspa
  2006-06-08  8:04         ` Voluspa
  0 siblings, 2 replies; 11+ messages in thread
From: Fengguang Wu @ 2006-06-01  5:51 UTC (permalink / raw)
  To: Voluspa; +Cc: Valdis.Kletnieks, linux-kernel

On Wed, May 31, 2006 at 11:50:21PM +0200, Voluspa wrote:
> _Massive READ_
> 
> [/usr had some 490000 files]
> 
> "cd /usr ; time find . -type f -exec md5sum {} \;"
> 
> 2.6.17-rc5 ------- 2.6.17-rc5-ar
> 
> real 21m21.009s -- 21m37.663s
> user 3m20.784s  -- 3m20.701s
> sys  6m34.261s  -- 6m41.735s
> 
> I had planned to run this at least three times, but didn't realize I had
> 12 compiled kernel trees and 3 uncompiled there... So, a one-shot had to
> do. But it's still significant.

Sorry, it is a known regression. I'd like to fix it in the next
release.

Thanks,
Wu


* Re: Adaptive Readahead V14 - statistics question...
  2006-06-01  5:51       ` Fengguang Wu
@ 2006-06-01  6:35         ` Voluspa
  2006-06-08  8:04         ` Voluspa
  1 sibling, 0 replies; 11+ messages in thread
From: Voluspa @ 2006-06-01  6:35 UTC (permalink / raw)
  To: Fengguang Wu; +Cc: Valdis.Kletnieks, diegocg, linux-kernel

On Thu, 1 Jun 2006 13:51:43 +0800 Fengguang Wu wrote:
> On Wed, May 31, 2006 at 11:50:21PM +0200, Voluspa wrote:
> > _Massive READ_
> > 
> > [/usr had some 490000 files]
> > 
> > "cd /usr ; time find . -type f -exec md5sum {} \;"
> > 
> > 2.6.17-rc5 ------- 2.6.17-rc5-ar
> > 
> > real 21m21.009s -- 21m37.663s
> > user 3m20.784s  -- 3m20.701s
> > sys  6m34.261s  -- 6m41.735s
> > 
> > I had planned to run this at least three times, but didn't realize I had
> > 12 compiled kernel trees and 3 uncompiled there... So, a one-shot had to
> > do. But it's still significant.
> 
> Sorry, it is a known regression. I'd like to fix it in the next
> release.

That's cool. I had fun testing (I'm weird) and now have a fixed procedure
to monitor your future work. When/if it hits mainline I'll both back it out
and switch it on/off. Then shout WOLF if I see a regression anywhere ;)

There's still the readahead size to adjust. I'll return with my findings.

Mvh
Mats Johannesson
--


* Re: Adaptive Readahead V14 - statistics question...
       [not found]   ` <20060606033436.GB6071@mail.ustc.edu.cn>
@ 2006-06-06  3:34     ` Wu Fengguang
  0 siblings, 0 replies; 11+ messages in thread
From: Wu Fengguang @ 2006-06-06  3:34 UTC (permalink / raw)
  To: Bart Samwel; +Cc: Voluspa, linux-kernel

On Mon, Jun 05, 2006 at 10:28:25PM +0200, Bart Samwel wrote:
> Hi Mats, Wu,
> 
> Hmmm, video at 1 Mb/s = 128 kB/s (just guessing a typical bitrate) 
> equals 8 seconds between reads at 1 MB readahead, right? That's strange, 
> you should be hearing those sounds normally as well then, as a typical 
> Linux laptop setup accesses the disk less frequently than once every 8 
> seconds. Anyway, _increasing_ the maximum readahead to some film-decent 
> value will probably get rid of the clicking as well.
> 
> I guess the problem is getting this to work without disturbing other 
> applications, i.e. without making slow-but-predictably-reading 
> applications read ahead 10 MB as well. I've been struggling with this 
> with laptop mode for quite some time: last time I checked, there didn't 
> seem to be a good way to do video readahead without making all other 
> reads read ahead too much as well... What I'd like to have is a setting 
> that works based on _time_, so that I can say "read all you think will 
> be needed in the next N seconds".
> 
> I could imagine having the maximum readahead being composed of two settings:
> 
> MAX_BYTES = maximum readahead in bytes
> MAX_TIME = maximum readahead in *time* before the data is expected to be 
> needed
> 
> For instance, if MAX_BYTES = 50MB and MAX_TIME=180 seconds, an 
> application reading at 10 kB/s would get a max readahead of 180*10 = 
> 1800kB, while an application reading at 100 kB/s would get a max 
> readahead of 180*100 = 18000kB. As a use case, the first application 
> would be xmms (128kbit MP3), while the second would be mplayer (800kbit 
> video). In both cases, laptop mode would be able to spin down the disk 
> for a full three minutes between spinups. Ideal for when you're trying 
> to watch a video while on the road.
> 
> Wu, do the adaptive readahead patches have something like this, or could 
> it be included? It would solve a _major_ problem for laptop mode.

Yes, it has the capability you need.

And MAX_TIME is not necessary for this case.
It does not try to estimate time, but rather the relative speeds of all
concurrent readers. It automatically arranges appropriate readahead
sizes for all the concurrent readers, so that no readahead thrashing
will happen, provided that the reading speeds do not fluctuate too
much.

However there are some cases that are not thrashing-protected:
        - normal mmapped reading (without the POSIX_FADV_SEQUENTIAL hint);
        - backward reading;
        - fs stuff (e.g. readahead for dirs).

With that in mind, you can safely set the max readahead size to as large
as 255M when watching videos, with the following trick:

blockdev --setra 524280 /dev/hda[N]     # files opened from now on use this aggressive size
mplayer <your video file on hda[N]>     # open the file
blockdev --setra 2048 /dev/hda[N]       # revert to a sane value
# now continue watching the video ...
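
A per-file alternative for players that can be patched is the standard
posix_fadvise() hint mentioned above; a minimal sketch (how much this
patch honours the hint beyond stock kernel behaviour is an assumption
left unverified here):

#include <fcntl.h>

/* advise sequential access for the whole file (len == 0 means to EOF) */
static int hint_sequential(int fd)
{
        return posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
}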

Cheers,
Wu


* Re: Adaptive Readahead V14 - statistics question...
  2006-06-01  5:51       ` Fengguang Wu
  2006-06-01  6:35         ` Voluspa
@ 2006-06-08  8:04         ` Voluspa
  1 sibling, 0 replies; 11+ messages in thread
From: Voluspa @ 2006-06-08  8:04 UTC (permalink / raw)
  To: Fengguang Wu; +Cc: Valdis.Kletnieks, diegocg, linux-kernel


My patching was borked as can be seen in:
http://marc.theaimsgroup.com/?l=linux-kernel&m=114956084026066&w=2

I've therefore benchmarked two corrected patches from Wu:
http://web.comhem.se/~u46139355/storetmp/adaptive-readahead-2.6.17-rc5-wu-v1.patch
http://web.comhem.se/~u46139355/storetmp/adaptive-readahead-2.6.17-rc5-wu-v2.patch

Revised Conclusion: On _this_ machine, with _these_ operations, Adaptive
Readahead in its current incarnation and default settings is a slight
_loss_. However, if the readahead size is lowered from 2048 to 256, it
becomes a slight _gain_ or at least stays in parity with normal readahead.

I suggest others test multi-thread, multi-CPU, more than/less than my
2GB memory, SATA disks, different disk speeds, etc.

Kernels:
root:sleipner:~# ls -l /boot/kernel-2.6.17-rc5-git10*
1440 -rw-r--r--  1 root root 1469326 Jun  6 22:27 /boot/kernel-2.6.17-rc5-git10
1440 -rw-r--r--  1 root root 1470122 Jun  6 22:36 /boot/kernel-2.6.17-rc5-git10-ar1
1440 -rw-r--r--  1 root root 1470128 Jun  6 22:44 /boot/kernel-2.6.17-rc5-git10-ar2

_Massive READ_

[/usr had some 195000 files]

"cd /usr; time find . -type f -exec md5sum {} \; >/dev/null"

[/sbin/blockdev --setra 256 /dev/hda]  * [/sbin/blockdev --setra 2048 /dev/hda]

6.17-rc5-git10 - git10-ar1 - git10-ar2 * 6.17-rc5-git10 - git10-ar1 - git10-ar2

real 8m18.241s - 8m19.053s - 8m16.639s * real 8m24.042s - 8m22.652s - 8m20.812s
user 1m23.556s - 1m24.526s - 1m23.725s * user 1m23.788s - 1m23.741s - 1m24.023s
sys  2m8.514s  - 2m5.989s  - 2m3.540s  * sys  2m7.369s  - 2m6.914s  - 2m5.317s

real 8m19.171s - 8m17.993s - 8m17.062s * real 8m23.110s - 8m20.409s - 8m19.278s
user 1m23.863s - 1m23.692s - 1m23.980s * user 1m23.770s - 1m23.715s - 1m23.525s
sys  2m9.332s  - 2m4.133s  - 2m3.602s  * sys  2m6.463s  - 2m5.735s  - 2m3.801s

real 8m17.111s - 8m17.102s - 8m16.859s * real 8m21.891s - 8m19.129s - 8m17.321s
user 1m24.071s - 1m24.126s - 1m24.430s * user 1m23.876s - 1m23.592s - 1m23.024s
sys  2m6.292s  - 2m3.543s  - 2m3.142s  * sys  2m4.768s  - 2m4.012s  - 2m3.110s

real 8m20.427s - 8m16.972s - 8m17.847s * real 8m25.359s - 8m21.261s - 8m20.365s
user 1m23.730s - 1m23.260s - 1m23.227s * user 1m24.242s - 1m23.825s - 1m23.895s
sys  2m9.524s  - 2m3.708s  - 2m5.244s  * sys  2m7.894s  - 2m5.366s  - 2m3.971s


_READ/WRITE_

[255 .tga files, each is 1244178 bytes]
[1 .wav file which is 1587644 bytes]
[movie becomes 573298 bytes ~9s long]

"time mencoder -ovc lavc -lavcopts aspect=16/9 mf://picsave/kreation/03-logo-joined/*.tga -oac lavc -audiofile kreation-files/kreation-logo-final.wav -o logo-final-widescreen-speedtest.avi"

[/sbin/blockdev --setra 256 /dev/hda]  * [/sbin/blockdev --setra 2048 /dev/hda]

6.17-rc5-git10 - git10-ar1 - git10-ar2 * 6.17-rc5-git10 - git10-ar1 - git10-ar2

real 0m12.961s - 0m12.864s - 0m12.811s * real 0m16.628s - 0m15.862s - 0m16.754s
user 0m3.315s  - 0m3.319s  - 0m3.316s  * user 0m3.335s  - 0m3.301s  - 0m3.308s
sys  0m1.082s  - 0m1.077s  - 0m1.086s  * sys  0m1.084s  - 0m1.122s  - 0m1.093s

real 0m12.908s - 0m12.793s - 0m12.813s * real 0m16.601s - 0m15.893s - 0m16.736s
user 0m3.323s  - 0m3.305s  - 0m3.312s  * user 0m3.311s  - 0m3.316s  - 0m3.308s
sys  0m1.051s  - 0m1.079s  - 0m1.145s  * sys  0m1.046s  - 0m1.109s  - 0m1.091s


_cp bigfile between different partitions_

[Elephants_Dream_HD.avi 854537054 bytes]

"time cp /home/downloads/Elephants_Dream_HD.avi /root"

[/sbin/blockdev --setra 256 /dev/hda]  * [/sbin/blockdev --setra 2048 /dev/hda]

6.17-rc5-git10 - git10-ar1 - git10-ar2 * 6.17-rc5-git10 - git10-ar1 - git10-ar2

real 0m46.463s - 0m46.909s - 0m45.865s * real 0m50.232s - 0m50.863s - 0m50.549s
user 0m0.081s  - 0m0.073s  - 0m0.068s  * user 0m0.069s  - 0m0.063s  - 0m0.088s
sys  0m6.304s  - 0m7.204s  - 0m5.949s  * sys  0m5.902s  - 0m7.174s  - 0m6.822s

real 0m46.126s - 0m47.305s - 0m47.174s * real 0m50.875s - 0m50.066s - 0m50.862s
user 0m0.091s  - 0m0.095s  - 0m0.070s  * user 0m0.099s  - 0m0.091s  - 0m0.071s
sys  0m5.751s  - 0m7.159s  - 0m6.707s  * sys  0m6.271s  - 0m6.740s  - 0m7.318s


_cp filetree between different partitions_

[compiled kerneltree ~339M]

"time cp -a /usr/src/testing/linux-2.6.17-rc5-git10 /root"

[/sbin/blockdev --setra 256 /dev/hda]  * [/sbin/blockdev --setra 2048 /dev/hda]

6.17-rc5-git10 - git10-ar1 - git10-ar2 * 6.17-rc5-git10 - git10-ar1 - git10-ar2

real 0m51.344s - 0m51.886s - 0m51.077s * real 0m52.502s - 0m52.757s - 0m54.794s
user 0m0.193s  - 0m0.220s  - 0m0.231s  * user 0m0.210s  - 0m0.198s  - 0m0.177s
sys  0m5.508s  - 0m6.003s  - 0m5.205s  * sys  0m5.980s  - 0m5.800s  - 0m6.372s

real 0m51.148s - 0m51.212s - 0m51.768s * real 0m51.488s - 0m52.098s - 0m51.719s
user 0m0.170s  - 0m0.209s  - 0m0.184s  * user 0m0.198s  - 0m0.210s  - 0m0.179s
sys  0m5.697s  - 0m5.604s  - 0m6.438s  * sys  0m5.527s  - 0m5.918s  - 0m5.544s

Mvh
Mats Johannesson
--

