From: Wu Fengguang <wfg@mail.ustc.edu.cn>
To: Linda Walsh <lkml@tlinx.org>
Cc: linux-kernel@vger.kernel.org
Subject: Re: [RFC] kernel facilities for cache prefetching
Date: Thu, 4 May 2006 20:12:12 +0800
Message-ID: <346744728.01465@ustc.edu.cn>
Message-ID: <20060504121212.GA6008@mail.ustc.edu.cn>
In-Reply-To: <44592491.4060503@tlinx.org>
On Wed, May 03, 2006 at 02:45:53PM -0700, Linda Walsh wrote:
> 1. As you mention; reading files "sequentially" through the file
> system is "bad" for several reasons. Areas of interest:
> a) don't go through the file system. Don't waste time doing
> directory lookups and following file-allocation maps; Instead,
> use raw-disk i/o and read sectors in using device & block number.
Sorry, that does not fit Linux's cache model, which caches file data by file and offset rather than by device and block number.
> b) Be "dynamic"; "Trace" (record (dev&blockno/range) blocks
> starting ASAP after system boot and continuing for some "configurable"
> number of seconds past reaching the desired "run-level" (coinciding with
> initial disk quiescence). Save as "configurable" (~6-8?) number of
> traces to allow finding the common initial subset of blocks needed.
It is an alternative way of doing the same job: more precise, but with
more complexity and more overhead. However, the 'blockno' approach is
not so tasteful.
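For illustration only, the "common initial subset" step from the quoted proposal could be sketched as a set intersection over the saved traces. This is a sketch under assumptions, not code from any posted patch: the function name `common_blocks` and the representation of a trace as a list of `(dev, blockno)` tuples are hypothetical.

```python
def common_blocks(traces):
    """Return the (dev, blockno) pairs present in every trace, sorted.

    `traces` is a list of traces; each trace is an iterable of
    (dev, blockno) tuples. Both the name and the layout are hypothetical.
    """
    if not traces:
        return []
    common = set(traces[0])
    for trace in traces[1:]:
        # Keep only the blocks that this boot also touched.
        common &= set(trace)
    return sorted(common)
```

Blocks that show up in only some of the saved boots drop out, so the result shrinks toward the set that every boot needs.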
> c) Allow specification of max# of blocks and max number of "sections"
> (discontiguous areas on disk);
Good point, I will do that in my work.
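One way such limits might look, as a rough sketch: merge the traced block numbers into contiguous extents ("sections") and stop once either cap is reached. The function name `coalesce`, its parameters, and the `(start, length)` extent layout are all assumptions made for this example.

```python
def coalesce(blocks, max_blocks, max_sections):
    """Merge block numbers into (start, length) extents, honoring two caps.

    Stops adding blocks once `max_blocks` blocks are covered, or once a
    new discontiguous extent would exceed `max_sections`.
    All names here are hypothetical.
    """
    extents = []
    total = 0
    for b in sorted(set(blocks)):
        if total >= max_blocks:
            break
        if extents and extents[-1][0] + extents[-1][1] == b:
            # Block directly follows the last extent: grow it by one.
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            if len(extents) >= max_sections:
                break
            extents.append((b, 1))
        total += 1
    return extents
```

For example, `coalesce([1, 2, 3, 10, 11, 20], max_blocks=100, max_sections=2)` keeps the two extents starting at blocks 1 and 10 and drops block 20, because it would open a third section.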
> d) "Ideally", would have a way to "defrag" the common set of blocks.
> I.e. -- moving the needed blocks from potentially disparate areas of
> files into 1 or 2 contiguous areas, hopefully near the beginning of
> the disk (or partition(s)).
> That's the area of "boot" pre-caching.
I guess a poor man's defrag would be good enough to tame the seek storm.
> Next is doing something similar for "application" starts. Start tracing
> when an application is loaded & observe what blocks are requested for
> that app for the first 20 ("configurable") seconds of execution. Store
> traces on a per-application basis. Again, it would be ideal if the
> different files (blocks, really), needed by an application could be
> grouped so that sequentially needed disk-blocks are stored sequentially
> on disk (this _could_ imply the containing files are not contiguous).
>
> Essentially, one wants to do for applications, the same thing one does
> for booting. On small applications, the benefit would likely be negligible,
> but on loading a large app like a windowing system, IDE, or database app,
> multiple configuration files could be read into the cache in one large
> read.
>
> That's "application" pre-caching.
Yes, it is a planned feature; I will do it.
> A third area -- that can't be easily done in the kernel, but would
> require a higher skill level on the part of application and library
> developers, is to move towards using "delay-loaded" libraries. In
> Windows, it seems common among system libraries to use this feature.
> An obvious benefit -- if certain features of a program are not used,
> the associated libraries are never loaded. Not loading unneeded parts
> of a program should speed up initial application load time, significantly.
Linux already does lazy loading for linked libraries. The one pitfall
is that /lib/ld-linux.so.2 seems to read the first 512 bytes of every
library before calling mmap():
% strace date
execve("/bin/date", ["date"], [/* 41 vars */]) = 0
uname({sys="Linux", node="lark", ...}) = 0
brk(0) = 0x8053000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f0e000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f0d000
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=77643, ...}) = 0
mmap2(NULL, 77643, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7efa000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/tls/librt.so.1", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0`\35\0\000"..., 512) = 512
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~HERE~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fstat64(3, {st_mode=S_IFREG|0644, st_size=30612, ...}) = 0
mmap2(NULL, 29264, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7ef2000
mmap2(0xb7ef8000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x6) = 0xb7ef8000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/tls/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\260O\1"..., 512) = 512
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~HERE~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fstat64(3, {st_mode=S_IFREG|0755, st_size=1270928, ...}) = 0
mmap2(NULL, 1276892, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7dba000
mmap2(0xb7ee8000, 32768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x12e) = 0xb7ee8000
mmap2(0xb7ef0000, 7132, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7ef0000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/tls/libpthread.so.0", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\340G\0"..., 512) = 512
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~HERE~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fstat64(3, {st_mode=S_IFREG|0755, st_size=85770, ...}) = 0
mmap2(NULL, 70104, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7da8000
mmap2(0xb7db6000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xd) = 0xb7db6000
mmap2(0xb7db8000, 4568, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7db8000
close(3) = 0
These small reads can lead to more seeks, and they also disturb the readahead logic considerably.
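To show what those 512-byte reads fetch, here is a minimal sketch that decodes the leading ELF header fields from such a buffer. It assumes a 32-bit little-endian ELF file, as in the trace above; the helper names `read_header` and `parse_elf_header` are hypothetical, and this is not how ld-linux.so.2 is implemented, only an illustration of the data it inspects.

```python
import struct

ELF_MAGIC = b"\x7fELF"


def read_header(path, size=512):
    """Read the first `size` bytes of a file, like the read() calls in the trace."""
    with open(path, "rb") as f:
        return f.read(size)


def parse_elf_header(header):
    """Decode the leading fields of a 32-bit little-endian ELF header."""
    if len(header) < 28 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    ei_class, ei_data = header[4], header[5]
    if ei_class != 1 or ei_data != 1:
        raise ValueError("sketch only handles ELFCLASS32, little-endian")
    # e_type, e_machine (16-bit each), e_version, e_entry (32-bit each)
    # follow the 16-byte e_ident array.
    e_type, e_machine, e_version, e_entry = struct.unpack_from("<HHII", header, 16)
    return {"type": e_type, "machine": e_machine,
            "version": e_version, "entry": e_entry}
```

Fed the bytes shown for librt.so.1 in the trace, this reports type 3 (ET_DYN, a shared object) and machine 3 (EM_386).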
Wu