From: Daniel Phillips <phillips@bonn-fries.net>
To: Richard Gooch <rgooch@ras.ucalgary.ca>,
Linus Torvalds <torvalds@transmeta.com>
Cc: Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: Getting FS access events
Date: Mon, 14 May 2001 15:04:31 +0200
Message-ID: <01051415043103.02742@starship>
In-Reply-To: <200105140224.f4E2OiE08257@vindaloo.ras.ucalgary.ca> <Pine.LNX.4.21.0105132139590.21224-100000@penguin.transmeta.com> <200105140515.f4E5FwP10245@vindaloo.ras.ucalgary.ca>
In-Reply-To: <200105140515.f4E5FwP10245@vindaloo.ras.ucalgary.ca>

On Monday 14 May 2001 07:15, Richard Gooch wrote:
> Linus Torvalds writes:
> > But sure, you can use bmap if you want. It would be interesting to
> > hear whether it makes much of a difference..
>
> I doubt bmap() would make any difference if there is a way of
> controlling when the I/O starts.
>
> However, this still doesn't address the issue of indirect blocks. If
> the indirect block has a higher bnum than the data blocks it points
> to, you've got a costly seek. This is why I'm still attracted to the
> idea of doing this at the block device layer. It's easy to capture
> *all* accesses and then warm the buffer cache.
>
> So, why can't the page cache check if a block is in the buffer cache?

That's not quite what you want, if only because there won't be anything
in the buffer cache pretty soon. What we really want is a block cache,
tightly integrated with the page cache. Readahead with a block cache
would be more effective than our current file-based readahead. For
example, it handles the case where blocks of two files are interleaved.

Since we know that the page cache maps each block at most once, the
optimal thing to do would be to just move a pointer from the block
cache to the page cache whenever we can. Unfortunately the layering in
the VFS as it stands isn't friendly to this: typically we allocate a
page in generic_file_read long before we ask the filesystem to map it.
To test this zero-copy idea we'd need to replace generic_file_read and
for mmap, filemap_nopage.
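
To make the pointer-move idea concrete, here is a toy user-space model in Python. It is only a sketch, not kernel code: the class names, the bmap and read_block callbacks, and the one-block-per-page assumption are all invented for illustration. The point it shows is the ordering: map the page to a physical block first, then take the buffer from the block cache instead of allocating a page and copying into it.

```python
class BlockCache:
    """Toy block cache: physical block number -> buffer."""

    def __init__(self):
        self.blocks = {}

    def load(self, bnum, data):
        # Stand-in for readahead done at the block level.
        self.blocks[bnum] = bytearray(data)

    def take(self, bnum):
        # Give up ownership of the buffer; returns None on a miss.
        return self.blocks.pop(bnum, None)


class PageCache:
    """Toy page cache: (inode, page index) -> buffer."""

    def __init__(self, block_cache, bmap, read_block):
        self.block_cache = block_cache
        self.bmap = bmap              # (inode, index) -> physical block
        self.read_block = read_block  # physical block -> fresh bytearray
        self.pages = {}

    def get_page(self, inode, index):
        key = (inode, index)
        if key in self.pages:
            return self.pages[key]
        # Map first, so we know which block we want before any page exists.
        bnum = self.bmap(inode, index)
        page = self.block_cache.take(bnum)   # move the reference, no copy
        if page is None:
            page = self.read_block(bnum)     # fall back to a real read
        self.pages[key] = page
        return page


# Hypothetical layout: page 0 of inode 1 lives in physical block 7.
layout = {(1, 0): 7}
bc = BlockCache()
bc.load(7, b"boot data")
buf = bc.blocks[7]
pc = PageCache(bc, bmap=lambda inode, index: layout[(inode, index)],
               read_block=lambda bnum: bytearray(b"from disk"))
page = pc.get_page(1, 0)
print(page is buf, 7 in bc.blocks)
```

After get_page returns, the page is the very buffer the block cache held, and the block cache no longer has an entry for block 7, so each block is still mapped at most once.
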
But we don't need anything so fancy to try out your idea; we just need
an lvm-like device that can:
- Maintain a block cache
- Remap logical to physical blocks
- Record the block accesses
- Physically reorder the blocks according to the recorded order
- Load a given region of disk into the block cache on command

None of this has to be particularly general to get to the benchmarking
stage. E.g., the 'block cache' only needs to cache one physical region.

The central idea here is that you obviously can't do any better than to
have all the blocks you want to read at boot physically together on
disk.
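
The five capabilities listed above can be modelled in a few dozen lines of user-space Python. This is a sketch under stated assumptions, not a driver: RemapDevice and all of its method names are hypothetical, the "disk" is a Python list, and the block cache holds a single physical region as suggested above.

```python
class RemapDevice:
    """Toy model of the lvm-like device described above."""

    def __init__(self, nblocks):
        self.nblocks = nblocks
        self.l2p = list(range(nblocks))  # logical -> physical, identity at first
        self.disk = [None] * nblocks     # physical storage
        self.trace = []                  # logical blocks in first-access order
        self.seen = set()
        self.cache = {}                  # one preloaded physical region
        self.hits = 0
        self.misses = 0

    def write(self, lblock, data):
        p = self.l2p[lblock]
        self.disk[p] = data
        if p in self.cache:
            self.cache[p] = data         # keep the cached copy coherent

    def read(self, lblock):
        if lblock not in self.seen:      # record each block once, in order
            self.seen.add(lblock)
            self.trace.append(lblock)
        p = self.l2p[lblock]
        if p in self.cache:
            self.hits += 1
            return self.cache[p]
        self.misses += 1
        return self.disk[p]

    def reorder(self):
        """Move recorded blocks to physical 0..n-1, in recorded order."""
        rest = [b for b in range(self.nblocks) if b not in self.seen]
        order = self.trace + rest
        new_disk = [self.disk[self.l2p[lb]] for lb in order]  # uses old map
        for p, lb in enumerate(order):
            self.l2p[lb] = p
        self.disk = new_disk
        self.cache = {}                  # old physical numbers are stale
        return len(self.trace)

    def preload(self, start, count):
        """Load one contiguous physical region into the block cache."""
        self.cache = {p: self.disk[p] for p in range(start, start + count)}


dev = RemapDevice(8)
for lb in range(8):
    dev.write(lb, f"data{lb}")
for lb in (5, 2, 7, 2):                  # the recorded "boot" accesses
    dev.read(lb)
n = dev.reorder()
dev.preload(0, n)                        # one sequential read on a real disk
dev.hits = dev.misses = 0
boot = [dev.read(lb) for lb in (5, 2, 7, 2)]
print(boot, dev.hits, dev.misses)
```

After reorder(), logical blocks 5, 2 and 7 sit at physical blocks 0, 1 and 2, so the replayed boot sequence is served entirely from the single preloaded region.
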
The advantage of using this lvm-style remapping is, it will work for
any filesystem. The disadvantage is that the ordering is then cast in
stone - after the system is up it might not like the ordering you chose
for the boot, and the elevator will be completely confused ;-) But the
thing is, everything you need to measure the boot performance is
together in one place, just one device driver to write. Then once you
know what the perfect result is you have a yardstick to measure the
effectiveness of other, less intrusive approaches.
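
One way to put a number on that yardstick is to replay a recorded trace against a layout and add up the head movement. The sketch below does that in Python. The function names and the sample trace are made up for illustration, and seek distance in blocks is only a crude proxy for real cost, since it ignores rotational latency and whatever the elevator does.

```python
def seek_distance(trace, l2p, start=0):
    """Total head movement, in blocks, to read `trace` in order."""
    pos = start
    total = 0
    for lb in trace:
        p = l2p[lb]
        total += abs(p - pos)
        pos = p + 1          # the head ends just past the block it read
    return total


def packed_layout(trace, nblocks):
    """Logical -> physical map that packs traced blocks first, in order."""
    order = list(dict.fromkeys(trace))            # first access wins
    traced = set(order)
    order += [b for b in range(nblocks) if b not in traced]
    l2p = [0] * nblocks
    for p, lb in enumerate(order):
        l2p[lb] = p
    return l2p


nblocks = 100
trace = [40, 3, 97, 12]                           # hypothetical boot trace
identity = list(range(nblocks))
before = seek_distance(trace, identity)
after = seek_distance(trace, packed_layout(trace, nblocks))
print(before, after)
```

With the identity layout this trace costs 257 blocks of head movement; with the packed layout it costs 0, which is the "perfect result" any less intrusive scheme would be compared against.
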
I took a look at the lvm and md code to see if there's a quick way to
press them into service for this test, and there probably is, but the
complexity there is daunting. I think starting with a clean sheet and
writing a new driver would be easier.
--
Daniel