From: Wu Fengguang <wfg@mail.ustc.edu.cn>
To: Diego Calleja <diegocg@gmail.com>
Cc: linux-kernel@vger.kernel.org, torvalds@osdl.org, akpm@osdl.org,
axboe@suse.de, nickpiggin@yahoo.com.au, pbadari@us.ibm.com,
arjan@infradead.org
Subject: Re: [RFC] kernel facilities for cache prefetching
Date: Tue, 2 May 2006 22:42:03 +0800
Message-ID: <346580906.19175@ustc.edu.cn>
Message-ID: <20060502144203.GA10594@mail.ustc.edu.cn>
In-Reply-To: <20060502144641.62df9c18.diegocg@gmail.com>
Diego,
On Tue, May 02, 2006 at 02:46:41PM +0200, Diego Calleja wrote:
> On Tue, 2 May 2006 15:50:49 +0800,
> Wu Fengguang <wfg@mail.ustc.edu.cn> wrote:
>
> > 2) kernel module to query the file cache
>
> Can't mincore() + /proc/$PID/* stuff be used to replace that ?
Nope. mincore() only provides info about files that are currently
open in the calling process itself. The majority of files in the file
cache are closed.
> Improving boot time is nice and querying the file cache would work
> for that, but improving the boot time of some programs once the system
> is running (ie: running openoffice 6 hours after booting) is something
> that other preloaders do in other OSes as well, querying the full file
> cache wouldn't be that useful for such cases.
Yes, it can still be useful after booting :) One can get the cache
footprint of any task started at any time by taking snapshots of the
cache before and after the task runs, and subtracting the first set
from the second.
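
A minimal sketch of the snapshot-and-subtract idea, assuming each
snapshot has already been parsed into a mapping from file path to the
set of cached page indices. That representation, and the names
Snapshot and cache_footprint, are hypothetical; this message does not
specify what the proposed module would actually export.

```python
from typing import Dict, Set

# Hypothetical snapshot format: file path -> indices of pages in the cache.
Snapshot = Dict[str, Set[int]]


def cache_footprint(before: Snapshot, after: Snapshot) -> Snapshot:
    """Return the pages present in `after` but absent from `before`."""
    footprint: Snapshot = {}
    for path, pages in after.items():
        # Files missing from `before` contribute all of their pages.
        new_pages = pages - before.get(path, set())
        if new_pages:
            footprint[path] = new_pages
    return footprint


# Made-up example data, for illustration only.
before = {"/lib/libc.so.6": {0, 1, 2}}
after = {"/lib/libc.so.6": {0, 1, 2, 3}, "/usr/bin/app": {0, 1}}
footprint = cache_footprint(before, after)
```

A plain subtraction like this only captures pages that became cached
between the two snapshots. Pages the task touched that were already
cached beforehand would not show up in the result.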
> The main reason why I believe that the pure userspace (preload.sf.net)
> solution slows down in some cases is because it uses bayesian heuristics
> (!) as a magic ball to guess the future, which is a flawed idea IMHO.
> I started (but didn't finish) a preloader which uses the process event
> connector to get notifications of what processes are being launched,
> then it profiles it (using mincore(), /proc/$PID/* stuff, etc) and
> preloads things optimally the next time it gets a notification of the
> same app.
My thought is that any prefetcher that ignores the thousands of small
data files is incomplete. Only the kernel can provide that info.
> Mac OS X has a program that implements your idea, available (the sources)
> at http://darwinsource.opendarwin.org/projects/apsl/BootCache-25/
Ah, thanks for mentioning it.
It seems to do the job mostly in the kernel, comprising 2400+ LOC.
Whereas my plan is to write a module of ~300 LOC to _only_ provide the
necessary info, and leave other jobs to smart user-land tools.
Wu