From: Wu Fengguang <wfg@mail.ustc.edu.cn>
To: Linus Torvalds <torvalds@osdl.org>
Cc: linux-kernel@vger.kernel.org, Andrew Morton <akpm@osdl.org>,
Jens Axboe <axboe@suse.de>, Nick Piggin <nickpiggin@yahoo.com.au>,
Badari Pulavarty <pbadari@us.ibm.com>
Subject: Re: [RFC] kernel facilities for cache prefetching
Date: Wed, 3 May 2006 15:13:25 +0800
Message-ID: <346640383.02545@ustc.edu.cn>
Message-ID: <20060503071325.GC4781@mail.ustc.edu.cn>
In-Reply-To: <Pine.LNX.4.64.0605020832570.4086@g5.osdl.org>
On Tue, May 02, 2006 at 08:55:06AM -0700, Linus Torvalds wrote:
> So I would _seriously_ claim that the place to do all the statistics
> allocation is in anything that ends up having to call "->readpage()", and
> do it all on a virtual mapping level.
>
> Yes, it isn't perfect either (I'll mention some problems), but it's a
> _lot_ better. It means that when you gather the statistics, you can see
> the actual _files_ and offsets being touched. You can even get the
> filenames by following the address space -> inode -> i_dentry list.
>
> This is important for several reasons:
> (a) it makes it a hell of a lot more readable, and the user gets a
> lot more information that may make him see the higher-level issues
> involved.
> (b) it's in the form that we cache things, so if you read-ahead in
> that form, you'll actually get real information.
> (c) it's in a form where you can actually _do_ something about things
> like fragmentation etc ("Oh, I could move these files all to a
> separate area")
I have been weighing two alternatives:
1) a static/passive interface, i.e. the /proc/filecache querier
   - user-land tools request a dump of the cache contents on demand
2) a dynamic/active interface, i.e. the readpage() logger
   - a user-land daemon receives live page access and I/O events
> Now, admittedly it has a few downsides:
>
> - right now "readpage()" is called in several places, and you'd have to
> create some kind of nice wrapper for the most common
> "mapping->a_ops->readpage()" thing and hook into there to avoid
> duplicating the effort.
>
> Alternatively, you could decide that you only want to do this at the
> filesystem level, which actually simplifies some things. If you
> instrument "mpage_readpage[2]()", you'll already get several of the
> ones you care about, and you could do the others individually.
>
> [ As a third alternative, you might decide that the only thing you
> actually care about is when you have to wait on a locked page, and
> instrument the page wait-queues instead. ]
>
> - it will miss any situation where a filesystem does a read some other
> way. Notably, in many loads, the _directory_ accesses are the important
> ones, and if you want statistics for those you'd often have to do that
> separately (not always - some of the filesystems just use the same
> page reading stuff).
>
> The downsides basically boil down to the fact that it's not as clearly
> just one single point. You can't just look at the request queue and see
> what physical requests go out.
Good insights.
The idea of logging readpage() activity has been appealing to me.
We might even go further and log mark_page_accessed() calls as well,
for more information.
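To make the hook point concrete, here is a rough user-space mock of a logging wrapper around "mapping->a_ops->readpage()". It is only a sketch under stated assumptions: the structures are simplified stand-ins rather than the kernel definitions, the readpage() signature is reduced for illustration, and logged_readpage(), nr_logged and dummy_readpage() are hypothetical names.

```c
#include <stdio.h>

/* Simplified, hypothetical stand-ins for the kernel structures. */
struct page {
    unsigned long index;        /* page offset within the file */
};

struct address_space;

struct address_space_operations {
    /* Reduced signature, for illustration only. */
    int (*readpage)(struct address_space *mapping, struct page *page);
};

struct address_space {
    unsigned long ino;          /* owning inode number (assumed field) */
    const struct address_space_operations *a_ops;
};

static unsigned long nr_logged; /* number of readpage() calls seen */

/* Hypothetical wrapper: record file and offset, then call the real method. */
static int logged_readpage(struct address_space *mapping, struct page *page)
{
    nr_logged++;
    fprintf(stderr, "readpage ino=%lu index=%lu\n",
            mapping->ino, page->index);
    return mapping->a_ops->readpage(mapping, page);
}

/* Dummy filesystem method, used only to exercise the wrapper. */
static int dummy_readpage(struct address_space *mapping, struct page *page)
{
    (void)mapping;
    (void)page;
    return 0;
}
```

Callers that go through the wrapper are seen by the logger; a filesystem that reads pages some other way is still missed, which is the downside noted in the quoted text.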
The logging approach is more precise, and it provides process/page
correlations and timing info that the /proc/filecache interface
cannot. However, it involves more complexity and overhead (which, for
me, mean a higher chance of being rejected :).
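As a sketch of what one logged event might carry, the user-space mock below keeps pid, timestamp, inode and page index in a small ring buffer. The record layout, LOG_CAPACITY, log_access() and drain_log() are all assumptions made for illustration; none of this is an existing kernel interface.

```c
#include <stddef.h>
#include <stdint.h>

#define LOG_CAPACITY 8          /* deliberately small for the example */

/* One logged page access: who touched which page, and when. */
struct access_event {
    uint64_t time_ns;           /* timestamp supplied by the caller */
    int32_t  pid;               /* process that triggered the access */
    uint64_t ino;               /* inode number of the file */
    uint64_t index;             /* page offset within the file */
};

struct access_log {
    struct access_event ev[LOG_CAPACITY];
    size_t head;                /* total number of events ever recorded */
};

/* Record one event, overwriting the oldest one when the buffer is full. */
static void log_access(struct access_log *log, uint64_t time_ns,
                       int32_t pid, uint64_t ino, uint64_t index)
{
    struct access_event *e = &log->ev[log->head % LOG_CAPACITY];

    e->time_ns = time_ns;
    e->pid = pid;
    e->ino = ino;
    e->index = index;
    log->head++;
}

/* Copy the retained events, oldest first, into out; return their count. */
static size_t drain_log(const struct access_log *log,
                        struct access_event *out)
{
    size_t n = log->head < LOG_CAPACITY ? log->head : LOG_CAPACITY;
    size_t start = log->head - n;

    for (size_t i = 0; i < n; i++)
        out[i] = log->ev[(start + i) % LOG_CAPACITY];
    return n;
}
```

A snapshot of the cache contents has no per-event pid or timestamp, so it cannot recover this ordering; the price is the per-access bookkeeping.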
Wu