From: Wu Fengguang <wfg@mail.ustc.edu.cn>
To: Linda Walsh <lkml@tlinx.org>
Cc: linux-kernel@vger.kernel.org
Subject: Re: [RFC] kernel facilities for cache prefetching
Date: Thu, 4 May 2006 20:12:12 +0800 [thread overview]
Message-ID: <346744728.01465@ustc.edu.cn> (raw)
Message-ID: <20060504121212.GA6008@mail.ustc.edu.cn> (raw)
In-Reply-To: <44592491.4060503@tlinx.org>
On Wed, May 03, 2006 at 02:45:53PM -0700, Linda Walsh wrote:
> 1. As you mention; reading files "sequentially" through the file
> system is "bad" for several reasons. Areas of interest:
> a) don't go through the file system. Don't waste time doing
> directory lookups and following file-allocation maps; Instead,
> use raw-disk i/o and read sectors in using device & block number.
Sorry, it does not fit in the linux's cache model.
> b) Be "dynamic"; "Trace" (record (dev&blockno/range) blocks
> starting ASAP after system boot and continuing for some "configurable"
> number of seconds past reaching the desired "run-level" (coinciding with
> initial disk quiescence). Save as "configurable" (~6-8?) number of
> traces to allow finding the common initial subset of blocks needed.
It is a alternative way of doing the same job: more precise, with more
complexity and more overhead. However the 'blockno' way is not so
tasteful.
> c) Allow specification of max# of blocks and max number of "sections"
> (discontiguous areas on disk);
Good point, will do it in my work.
> d) "Ideally", would have a way to "defrag" the common set of blocks.
> I.e. -- moving the needed blocks from potentially disparate areas of
> files into 1 or 2 contiguous areas, hopefully near the beginning of
> the disk (or partition(s)).
> That's the area of "boot" pre-caching.
I guess poor man's defrag would be good enough for the seeking storm.
> Next is doing something similar for "application" starts. Start tracing
> when an application is loaded & observe what blocks are requested for
> that app for the first 20 ("configurable") seconds of execution. Store
> traces on a per-application basis. Again, it would be ideal if the
> different files (blocks, really), needed by an application could be
> grouped so that sequentially needed disk-blocks are stored sequentially
> on disk (this _could_ imply the containing files are not contiguous).
>
> Essentially, one wants to do for applications, the same thing one does
> for booting. On small applications, the benefit would likely be negligible,
> but on loading a large app like a windowing system, IDE, or database app,
> multiple configuration files could be read into the cache in one large
> read.
>
> That's "application" pre-caching.
Yes, it is a planned feature, will do it.
> A third area -- that can't be easily done in the kernel, but would
> require a higher skill level on the part of application and library
> developers, is to move towards using "delay-loaded" libraries. In
> Windows, it seems common among system libraries to use this feature.
> An obvious benefit -- if certain features of a program are not used,
> the associated libraries are never loaded. Not loading unneeded parts
> of a program should speed up initial application load time, significantly.
Linux already does lazy loading for linked libs. The only one pitfall
is that /lib/ld-linux.so.2 seems to touch the first 512B data of every
libs before doing mmap():
% strace date
execve("/bin/date", ["date"], [/* 41 vars */]) = 0
uname({sys="Linux", node="lark", ...}) = 0
brk(0) = 0x8053000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f0e000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f0d000
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=77643, ...}) = 0
mmap2(NULL, 77643, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7efa000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/tls/librt.so.1", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0`\35\0\000"..., 512) = 512
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~HERE~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fstat64(3, {st_mode=S_IFREG|0644, st_size=30612, ...}) = 0
mmap2(NULL, 29264, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7ef2000
mmap2(0xb7ef8000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x6) = 0xb7ef8000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/tls/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\260O\1"..., 512) = 512
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~HERE~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fstat64(3, {st_mode=S_IFREG|0755, st_size=1270928, ...}) = 0
mmap2(NULL, 1276892, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7dba000
mmap2(0xb7ee8000, 32768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x12e) = 0xb7ee8000
mmap2(0xb7ef0000, 7132, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7ef0000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/tls/libpthread.so.0", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\340G\0"..., 512) = 512
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~HERE~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fstat64(3, {st_mode=S_IFREG|0755, st_size=85770, ...}) = 0
mmap2(NULL, 70104, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7da8000
mmap2(0xb7db6000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xd) = 0xb7db6000
mmap2(0xb7db8000, 4568, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7db8000
close(3) = 0
They can lead to more seeks, and also disturb the readahead logic much.
Wu
next prev parent reply other threads:[~2006-05-04 12:38 UTC|newest]
Thread overview: 48+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-05-02 7:50 [RFC] kernel facilities for cache prefetching Wu Fengguang
2006-05-02 7:50 ` Wu Fengguang
2006-05-02 12:46 ` Diego Calleja
2006-05-02 14:42 ` Wu Fengguang
2006-05-02 14:42 ` Wu Fengguang
2006-05-02 16:07 ` Diego Calleja
2006-05-03 6:45 ` Wu Fengguang
2006-05-03 6:45 ` Wu Fengguang
2006-05-03 18:14 ` Diego Calleja
2006-05-03 23:39 ` Zan Lynx
2006-05-04 1:37 ` Diego Calleja
2006-05-02 15:55 ` Linus Torvalds
2006-05-02 16:35 ` Andi Kleen
2006-05-03 4:11 ` Wu Fengguang
2006-05-03 4:11 ` Wu Fengguang
2006-05-03 17:28 ` Badari Pulavarty
[not found] ` <346733486.30800@ustc.edu.cn>
2006-05-04 15:03 ` Linus Torvalds
2006-05-04 16:57 ` Badari Pulavarty
2006-05-05 14:44 ` Wu Fengguang
2006-05-05 14:44 ` Wu Fengguang
2006-05-03 7:13 ` Wu Fengguang
2006-05-03 7:13 ` Wu Fengguang
2006-05-03 12:59 ` Nikita Danilov
2006-05-03 22:20 ` Rik van Riel
2006-05-06 1:11 ` Wu Fengguang
2006-05-06 1:11 ` Wu Fengguang
2006-05-04 0:28 ` Linda Walsh
2006-05-04 1:31 ` Linus Torvalds
2006-05-04 7:08 ` Ph. Marek
2006-05-04 7:33 ` Arjan van de Ven
2006-05-04 12:14 ` Wu Fengguang
2006-05-04 12:14 ` Wu Fengguang
2006-05-04 12:34 ` Arjan van de Ven
2006-05-03 21:45 ` Linda Walsh
2006-05-04 12:12 ` Wu Fengguang [this message]
2006-05-04 12:12 ` Wu Fengguang
2006-05-04 18:57 ` Linda Walsh
2006-05-05 15:20 ` Wu Fengguang
2006-05-05 15:20 ` Wu Fengguang
2006-05-04 9:02 ` Helge Hafting
2006-05-02 7:58 ` Arjan van de Ven
2006-05-02 8:06 ` Wu Fengguang
2006-05-02 8:06 ` Wu Fengguang
2006-05-02 8:30 ` Arjan van de Ven
2006-05-02 8:53 ` Wu Fengguang
2006-05-02 8:53 ` Wu Fengguang
2006-05-06 6:49 ` Denis Vlasenko
2006-05-02 8:55 ` Arjan van de Ven
2006-05-02 11:39 ` Jan Engelhardt
2006-05-02 11:48 ` Wu Fengguang
2006-05-02 11:48 ` Wu Fengguang
2006-05-02 22:03 ` Dave Jones
2006-05-02 8:09 ` Jens Axboe
2006-05-02 8:20 ` Wu Fengguang
2006-05-02 8:20 ` Wu Fengguang
2006-05-03 22:05 ` Benjamin LaHaise
2006-05-02 19:10 ` Pavel Machek
2006-05-02 23:36 ` Nigel Cunningham
2006-05-03 2:35 ` Wu Fengguang
2006-05-03 2:35 ` Wu Fengguang
2006-05-03 2:32 ` Wu Fengguang
2006-05-03 2:32 ` Wu Fengguang
2006-05-03 7:19 ` Wu Fengguang
2006-05-03 7:19 ` Wu Fengguang
2006-05-04 12:28 ` Wu Fengguang
2006-05-04 12:28 ` Wu Fengguang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=346744728.01465@ustc.edu.cn \
--to=wfg@mail.ustc.edu.cn \
--cc=linux-kernel@vger.kernel.org \
--cc=lkml@tlinx.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.