All of lore.kernel.org
 help / color / mirror / Atom feed
From: Fengguang Wu <wfg@mail.ustc.edu.cn>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, Nick Piggin <npiggin@suse.de>
Subject: Re: [PATCH 0/9] mmap read-around and readahead
Date: Tue, 18 Dec 2007 19:46:09 +0800	[thread overview]
Message-ID: <397978379.15961@ustc.edu.cn> (raw)
Message-ID: <E1J4atl-0000UI-4a@localhost> (raw)
In-Reply-To: <alpine.LFD.0.9999.0712161535090.21557@woody.linux-foundation.org>

On Sun, Dec 16, 2007 at 03:35:58PM -0800, Linus Torvalds wrote:
> 
> 
> On Sun, 16 Dec 2007, Fengguang Wu wrote:
> > 
> > Here are the mmap read-around related patches initiated by Linus.
> > They are for linux-2.6.24-rc4-mm1.  The one major new feature -
> > auto detection and early readahead for mmap sequential reads - runs
> > as expected on my desktop :-)
> 
> Just out of interest - did you check to see if it makes any difference to 
> any IO patterns (or even timings)?

No timings for now... but I wrote a debug patch(attached) and watched
it running for about a week.  Here are some interesting numbers:

% grep .so, /var/log/kern.log|grep init0|wc                   
   4085   60806  583895

% grep .so, /var/log/kern.log|grep around|wc
  14438  215265 2107308
% grep .so, /var/log/kern.log|grep around|grep '= 32' | wc    
   3133   46757  462446

% grep .so, /var/log/kern.log|grep interleaved|wc
    997   14866  148921
% grep .so, /var/log/kern.log|grep interleaved|grep '= 0'|wc  
    544    8089   79661
% grep .so, /var/log/kern.log|grep interleaved|grep '= 32'|wc
    179    2683   28233

% grep .so, /var/log/kern.log|grep sequential|wc 
   3499   52275  541319
% grep .so, /var/log/kern.log|grep sequential|grep '= 0' | wc
    915   13598  131953
% grep .so, /var/log/kern.log|grep sequential|grep '= 32' | wc
   1327   19880  212896

That says, there are
   4085 page faults on start-of-lib-file,
  14438 mmap read-around,       22% full ra size
   3499 mmap async readahead,   38% full ra size, or 51% if removing pure cache hits
    997 mmap sync readahead,    18% full ra size, or 40% if removing pure cache hits
That's good numbers: I/O sizes get larger, and possibly less I/O waits :-)

Sure it's rather coarse estimation, but there are some sequential mmap accesses.
E.g.

[11736.998347] readahead-init0(process: sh/23926, file: sda1/w3m, offset=0:-1, ra=0+4-3) = 4
[11737.014985] readahead-around(process: w3m/23926, file: sda1/w3m, offset=0:0, ra=290+32-0) = 17
[11737.019488] readahead-around(process: w3m/23926, file: sda1/w3m, offset=0:0, ra=118+32-0) = 32
[11737.024921] readahead-interleaved(process: w3m/23926, file: sda1/w3m, offset=0:2, ra=4+6-6) = 6
[11737.025726] readahead-sequential(process: w3m/23926, file: sda1/w3m, offset=0:3, ra=10+12-12) = 12
[11737.025794] readahead-around(process: w3m/23926, file: sda1/w3m, offset=0:4, ra=90+32-0) = 28
--- sequential begin ---
[11737.037893] readahead-init(process: w3m/23926, file: sda1/w3m, offset=0:149, ra=150+64-32) = 64
[11737.043928] readahead-sequential(process: w3m/23926, file: sda1/w3m, offset=0:181, ra=214+32-32) = 32
[11737.044086] readahead-sequential(process: w3m/23926, file: sda1/w3m, offset=0:213, ra=246+32-32) = 32
[11737.045633] readahead-sequential(process: w3m/23926, file: sda1/w3m, offset=0:245, ra=278+32-32) = 12
[11737.047321] readahead-sequential(process: w3m/23926, file: sda1/w3m, offset=0:277, ra=310+32-32) = 0
--- sequential end ---
[11737.048296] readahead-around(process: w3m/23926, file: sda1/w3m, offset=0:119, ra=48+32-0) = 32
[11737.066908] readahead-around(process: w3m/23926, file: sda1/w3m, offset=0:63, ra=73+32-0) = 10
[11737.136880] readahead-around(process: w3m/23926, file: sda1/w3m, offset=0:116, ra=30+32-0) = 18
[11737.166005] readahead-around(process: w3m/23926, file: sda1/w3m, offset=0:37, ra=6+32-0) = 8


But also there is one minor problem.

[16416.600720] readahead-init0(process: zsh/30490, file: sda1/bc, offset=0:-1, ra=0+4-3) = 4
[16416.607967] readahead-around(process: bc/30490, file: sda1/bc, offset=0:0, ra=1+32-0) = 14

The 4-page readahead-init0() hurts performance. It occurs before every initial mmap reads.
A longer example:

wfg ~% dmesg|grep mplayer
[ 1221.454230] readahead-init0(process: mutt/7131, file: md0/mplayer-devel, offset=0:-1, ra=0+4-3) = 4
[ 1378.667305] readahead-init0(process: strace/7352, file: sda1/mplayer, offset=0:-1, ra=0+4-3) = 4
[ 1378.692389] readahead-around(process: mplayer/7352, file: sda1/mplayer, offset=0:0, ra=2212+32-0) = 17
[ 1378.703656] readahead-around(process: mplayer/7352, file: sda1/mplayer, offset=0:0, ra=2061+32-0) = 32
[ 1378.715537] readahead-around(process: mplayer/7352, file: sda1/mplayer, offset=0:2077, ra=0+32-0) = 28
[ 1378.716261] readahead-around(process: mplayer/7352, file: sda1/mplayer, offset=0:10, ra=44+32-0) = 32
[ 1378.727570] readahead-init0(process: mplayer/7352, file: sda1/libdirectfb-0.9.so.25.0.0, offset=0:-1, ra=0+4-3) = 4
[ 1378.740579] readahead-around(process: mplayer/7352, file: sda1/libdirectfb-0.9.so.25.0.0, offset=0:0, ra=79+32-0) = 17
[ 1378.744826] readahead-around(process: mplayer/7352, file: sda1/libdirectfb-0.9.so.25.0.0, offset=0:1, ra=0+32-0) = 28
[ 1378.749882] readahead-init0(process: mplayer/7352, file: sda1/libXv.so.1.0.0, offset=0:-1, ra=0+4-3) = 4
[ 1378.754546] readahead-around(process: mplayer/7352, file: sda1/libXv.so.1.0.0, offset=0:0, ra=0+32-0) = 1
[ 1378.758057] readahead-init0(process: mplayer/7352, file: sda1/libXvMC.so.1.0.0, offset=0:-1, ra=0+4-3) = 4
[ 1378.759566] readahead-init0(process: mplayer/7352, file: sda1/libXvMCW.so.1.0.0, offset=0:-1, ra=0+4-3) = 4
[ 1378.764991] readahead-init0(process: mplayer/7352, file: sda1/libXxf86dga.so.1.0.0, offset=0:-1, ra=0+4-3) = 4
[ 1378.766036] readahead-around(process: mplayer/7352, file: sda1/libXxf86dga.so.1.0.0, offset=0:0, ra=0+32-0) = 2
[ 1378.766887] readahead-init0(process: mplayer/7352, file: sda1/libGL.so.1.2, offset=0:-1, ra=0+4-3) = 4
[ 1378.778437] readahead-around(process: mplayer/7352, file: sda1/libGL.so.1.2, offset=0:0, ra=109+32-0) = 17
[ 1378.782107] readahead-around(process: mplayer/7352, file: sda1/libGL.so.1.2, offset=0:2, ra=1+32-0) = 29
[ 1378.792935] readahead-init0(process: mplayer/7352, file: sda1/libggi.so.2.0.2, offset=0:-1, ra=0+4-3) = 4
[ 1378.799236] readahead-around(process: mplayer/7352, file: sda1/libggi.so.2.0.2, offset=0:0, ra=132+32-0) = 18
[ 1378.808167] readahead-around(process: mplayer/7352, file: sda1/libggi.so.2.0.2, offset=0:0, ra=0+32-0) = 28
[ 1378.808759] readahead-init0(process: mplayer/7352, file: sda1/libaa.so.1.0.4, offset=0:-1, ra=0+4-3) = 4
[ 1378.818428] readahead-around(process: mplayer/7352, file: sda1/libaa.so.1.0.4, offset=0:0, ra=12+32-0) = 18
[ 1378.830829] readahead-init0(process: mplayer/7352, file: sda1/libcaca.so.0.99.0, offset=0:-1, ra=0+4-3) = 4
[ 1378.832195] readahead-around(process: mplayer/7352, file: sda1/libcaca.so.0.99.0, offset=0:0, ra=0+32-0) = 6
[ 1378.832945] readahead-init0(process: mplayer/7352, file: sda1/libcucul.so.0.99.0, offset=0:-1, ra=0+4-3) = 4
[ 1378.837474] readahead-around(process: mplayer/7352, file: sda1/libcucul.so.0.99.0, offset=0:0, ra=135+32-0) = 18
[ 1378.844951] readahead-around(process: mplayer/7352, file: sda1/libcucul.so.0.99.0, offset=0:151, ra=1+32-0) = 29
[ 1378.845851] readahead-init0(process: mplayer/7352, file: sda1/libSDL-1.2.so.0.11.0, offset=0:-1, ra=0+4-3) = 4
[ 1378.867151] readahead-around(process: mplayer/7352, file: sda1/libSDL-1.2.so.0.11.0, offset=0:0, ra=88+32-0) = 18
[ 1378.871796] readahead-around(process: mplayer/7352, file: sda1/libSDL-1.2.so.0.11.0, offset=0:0, ra=0+32-0) = 28
[ 1378.873248] readahead-init0(process: mplayer/7352, file: sda1/libartsc.so.0.0.0, offset=0:-1, ra=0+4-3) = 4
[ 1378.885419] readahead-around(process: mplayer/7352, file: sda1/libartsc.so.0.0.0, offset=0:0, ra=0+32-0) = 2
[ 1378.892469] readahead-init0(process: mplayer/7352, file: sda1/libpulse.so.0.2.0, offset=0:-1, ra=0+4-3) = 4
[ 1378.903642] readahead-around(process: mplayer/7352, file: sda1/libpulse.so.0.2.0, offset=0:0, ra=43+32-0) = 17
[ 1378.907206] readahead-around(process: mplayer/7352, file: sda1/libpulse.so.0.2.0, offset=0:1, ra=0+32-0) = 28
[ 1378.918549] readahead-init0(process: mplayer/7352, file: sda1/libjack.so.0.0.23, offset=0:-1, ra=0+4-3) = 4
[ 1378.928575] readahead-around(process: mplayer/7352, file: sda1/libjack.so.0.0.23, offset=0:0, ra=2+32-0) = 16
[ 1378.940046] readahead-init0(process: mplayer/7352, file: sda1/libopenal.so.0.0.0, offset=0:-1, ra=0+4-3) = 4
[ 1378.963093] readahead-around(process: mplayer/7352, file: sda1/libopenal.so.0.0.0, offset=0:0, ra=42+32-0) = 17
[ 1378.981748] readahead-init0(process: mplayer/7352, file: sda1/libfaac.so.0.0.0, offset=0:-1, ra=0+4-3) = 4
[ 1378.993281] readahead-around(process: mplayer/7352, file: sda1/libfaac.so.0.0.0, offset=0:0, ra=0+32-0) = 14
[ 1378.994296] readahead-init0(process: mplayer/7352, file: sda1/libx264.so.55, offset=0:-1, ra=0+4-3) = 4
[ 1379.004907] readahead-around(process: mplayer/7352, file: sda1/libx264.so.55, offset=0:0, ra=112+32-0) = 18
[ 1379.010374] readahead-around(process: mplayer/7352, file: sda1/libx264.so.55, offset=0:0, ra=0+32-0) = 28
[ 1379.025175] readahead-init0(process: mplayer/7352, file: sda1/libsmbclient.so.0.1, offset=0:-1, ra=0+4-3) = 4
[ 1379.040139] readahead-around(process: mplayer/7352, file: sda1/libsmbclient.so.0.1, offset=0:0, ra=530+32-0) = 17
[ 1379.043905] readahead-around(process: mplayer/7352, file: sda1/libsmbclient.so.0.1, offset=0:535, ra=0+32-0) = 28
[ 1379.044276] readahead-around(process: mplayer/7352, file: sda1/libsmbclient.so.0.1, offset=0:8, ra=49+32-0) = 32
[ 1379.083560] readahead-init0(process: mplayer/7352, file: sda1/libungif.so.4.1.4, offset=0:-1, ra=0+4-3) = 4
[ 1379.088050] readahead-around(process: mplayer/7352, file: sda1/libungif.so.4.1.4, offset=0:0, ra=0+32-0) = 4
[ 1379.095605] readahead-init0(process: mplayer/7352, file: sda1/libcdda_interface.so.0.10.0, offset=0:-1, ra=0+4-3) = 4
[ 1379.100462] readahead-around(process: mplayer/7352, file: sda1/libcdda_interface.so.0.10.0, offset=0:0, ra=0+32-0) = 12
[ 1379.100889] readahead-init0(process: mplayer/7352, file: sda1/libcdda_paranoia.so.0.10.0, offset=0:-1, ra=0+4-3) = 4
[ 1379.108911] readahead-around(process: mplayer/7352, file: sda1/libcdda_paranoia.so.0.10.0, offset=0:0, ra=0+32-0) = 4
[ 1379.110094] readahead-init0(process: mplayer/7352, file: sda1/libfribidi.so.0.0.0, offset=0:-1, ra=0+4-3) = 4
[ 1379.111707] readahead-around(process: mplayer/7352, file: sda1/libfribidi.so.0.0.0, offset=0:0, ra=0+32-0) = 11
[ 1379.116159] readahead-init0(process: mplayer/7352, file: sda1/libspeex.so.1.2.0, offset=0:-1, ra=0+4-3) = 4
[ 1379.134065] readahead-around(process: mplayer/7352, file: sda1/libspeex.so.1.2.0, offset=0:0, ra=18+32-0) = 17
[ 1379.137322] readahead-init0(process: mplayer/7352, file: sda1/libtheora.so.0.2.0, offset=0:-1, ra=0+4-3) = 4
[ 1379.137976] readahead-around(process: mplayer/7352, file: sda1/libtheora.so.0.2.0, offset=0:0, ra=33+32-0) = 18
[ 1379.141476] readahead-init0(process: mplayer/7352, file: sda1/libmpcdec.so.3.1.1, offset=0:-1, ra=0+4-3) = 4
[ 1379.150304] readahead-around(process: mplayer/7352, file: sda1/libmpcdec.so.3.1.1, offset=0:0, ra=0+32-0) = 10
[ 1379.151400] readahead-init0(process: mplayer/7352, file: sda1/libamrnb.so.2.0.0, offset=0:-1, ra=0+4-3) = 4
[ 1379.169518] readahead-around(process: mplayer/7352, file: sda1/libamrnb.so.2.0.0, offset=0:0, ra=44+32-0) = 17
[ 1379.171870] readahead-init0(process: mplayer/7352, file: sda1/libamrwb.so.2.0.0, offset=0:-1, ra=0+4-3) = 4
[ 1379.172558] readahead-around(process: mplayer/7352, file: sda1/libamrwb.so.2.0.0, offset=0:0, ra=28+32-0) = 17
[ 1379.179794] readahead-init0(process: mplayer/7352, file: sda1/libdv.so.4.0.3, offset=0:-1, ra=0+4-3) = 4
[ 1379.196072] readahead-around(process: mplayer/7352, file: sda1/libdv.so.4.0.3, offset=0:0, ra=13+32-0) = 17
[ 1379.209467] readahead-init0(process: mplayer/7352, file: sda1/libxvidcore.so.4.1, offset=0:-1, ra=0+4-3) = 4
[ 1379.210581] readahead-around(process: mplayer/7352, file: sda1/libxvidcore.so.4.1, offset=0:0, ra=115+32-0) = 18
[ 1379.225045] readahead-init0(process: mplayer/7352, file: sda1/liblirc_client.so.0.1.0, offset=0:-1, ra=0+4-3) = 4
[ 1379.229523] readahead-around(process: mplayer/7352, file: sda1/liblirc_client.so.0.1.0, offset=0:0, ra=0+32-0) = 2
[ 1379.230907] readahead-init0(process: mplayer/7352, file: sda1/libdirect-0.9.so.25.0.0, offset=0:-1, ra=0+4-3) = 4
[ 1379.237679] readahead-around(process: mplayer/7352, file: sda1/libdirect-0.9.so.25.0.0, offset=0:0, ra=0+32-0) = 12
[ 1379.238163] readahead-init0(process: mplayer/7352, file: sda1/libfusion-0.9.so.25.0.0, offset=0:-1, ra=0+4-3) = 4
[ 1379.245010] readahead-around(process: mplayer/7352, file: sda1/libfusion-0.9.so.25.0.0, offset=0:0, ra=0+32-0) = 3
[ 1379.246950] readahead-init0(process: mplayer/7352, file: sda1/libXxf86vm.so.1.0.0, offset=0:-1, ra=0+4-3) = 4
[ 1379.255703] readahead-around(process: mplayer/7352, file: sda1/libXxf86vm.so.1.0.0, offset=0:0, ra=0+32-0) = 1

There are so many readahead-init0() calls... because ld-linux.so will
do a read(0+832) before doing mmap(in L1):

L0: open("/lib/libc.so.6", O_RDONLY)        = 3
L1: read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\340\342"..., 832) = 832
L2: fstat(3, {st_mode=S_IFREG|0755, st_size=1420624, ...}) = 0
L3: mmap(NULL, 3527256, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fac6e51d000
L4: mprotect(0x7fac6e671000, 2097152, PROT_NONE) = 0
L5: mmap(0x7fac6e871000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x154000) = 0x7fac6e871000
L6: mmap(0x7fac6e876000, 16984, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fac6e876000
L7: close(3)                                = 0


I cannot think of a good solution to it. Teaching ld-linux.so to blindly
do a fadvise(128KB) looks bad. And the kernel can do little about it.

This is also the major reason I disabled the interleaved readahead
support for mmap reads. Otherwise the PG_readahead flag leaved by
ld-linux.so will trigger _small_ interleaved readahead like this:

readahead-interleaved(process: firefox-bin/4596, file: sda1/libmozjs.so, offset=0, ra=4+6-6) = 6

It would be a much larger read-around if we don't do that readahead ;-)

Fengguang


  reply	other threads:[~2007-12-18 11:46 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-12-16 11:59 [PATCH 0/9] mmap read-around and readahead Fengguang Wu
2007-12-16 11:59 ` Fengguang Wu
2007-12-16 23:35   ` Linus Torvalds
2007-12-18 11:46     ` Fengguang Wu [this message]
2007-12-18 11:46       ` Fengguang Wu
     [not found]     ` <20071218114609.GA27778@mail.ustc.edu.cn>
2007-12-18 12:13       ` Fengguang Wu
2007-12-18 12:13         ` Fengguang Wu
2007-12-19  7:37     ` Fengguang Wu
2007-12-19  7:37       ` Fengguang Wu
2007-12-16 11:59 ` [PATCH 1/9] readahead: simplify readahead call scheme Fengguang Wu
2007-12-16 11:59   ` Fengguang Wu
2007-12-16 11:59 ` [PATCH 2/9] readahead: clean up and simplify the code for filemap page fault readahead Fengguang Wu
2007-12-16 11:59   ` Fengguang Wu
2007-12-18  8:19   ` Nick Piggin
2007-12-18 11:50     ` Fengguang Wu
2007-12-18 11:50       ` Fengguang Wu
2007-12-18 23:54       ` Nick Piggin
2007-12-19  6:55         ` Fengguang Wu
2007-12-19  6:55           ` Fengguang Wu
2007-12-16 11:59 ` [PATCH 3/9] readahead: auto detection of sequential mmap reads Fengguang Wu
2007-12-16 11:59   ` Fengguang Wu
2007-12-16 11:59 ` [PATCH 4/9] readahead: quick startup on sequential mmap readahead Fengguang Wu
2007-12-16 11:59   ` Fengguang Wu
2007-12-16 11:59 ` [PATCH 5/9] readahead: make ra_submit() non-static Fengguang Wu
2007-12-16 11:59   ` Fengguang Wu
2007-12-16 11:59 ` [PATCH 6/9] readahead: save mmap read-around states in file_ra_state Fengguang Wu
2007-12-16 11:59   ` Fengguang Wu
2007-12-16 11:59 ` [PATCH 7/9] readahead: remove unused do_page_cache_readahead() Fengguang Wu
2007-12-16 11:59   ` Fengguang Wu
2007-12-16 11:59 ` [PATCH 8/9] readahead: move max_sane_readahead() calls into force_page_cache_readahead() Fengguang Wu
2007-12-16 11:59   ` Fengguang Wu
2007-12-16 11:59 ` [PATCH 9/9] readahead: call max_sane_readahead() in ondemand_readahead() Fengguang Wu
2007-12-16 11:59   ` Fengguang Wu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=397978379.15961@ustc.edu.cn \
    --to=wfg@mail.ustc.edu.cn \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=npiggin@suse.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.