From: torvalds@transmeta.com (Linus Torvalds)
To: linux-kernel@vger.kernel.org
Subject: Re: [reiserfs-dev] Re: Ext2 directory index: ALS paper and benchmarks
Date: Sat, 8 Dec 2001 07:19:32 +0000 (UTC)
Message-ID: <9useu4$f4o$1@penguin.transmeta.com>
In-Reply-To: <E16BjYc-0000hS-00@starship.berlin> <20011207174726.B6640@vestdata.no> <E16CP0X-0000uE-00@starship.berlin> <3C110B3F.D94DDE62@zip.com.au>
In article <3C110B3F.D94DDE62@zip.com.au>,
Andrew Morton <akpm@zip.com.au> wrote:
>Daniel Phillips wrote:
>>
>> Because Ext2 packs multiple entries onto a single inode table block, the
>> major effect is not due to lack of readahead but to partially processed inode
>> table blocks being evicted.
>
>Inode and directory lookups are satisfied direct from the icache/dcache,
>and the underlying fs is not informed of a lookup, which confuses the VM.
>
>Possibly, implementing a d_revalidate() method which touches the
>underlying block/page when a lookup occurs would help.
Well, the multi-level caching thing is very much "separate levels" on
purpose: one of the whole points of the icache/dcache being accessed
without going to any lower levels is that going all the way to the lower
levels is slow.
And there are cases where it is better to throw away the low-level
information, and keep the high-level cache, if that really is the access
pattern. For example, if we really always hit in the dcache, there is no
reason to keep any backing store around.
For inodes in particular, though, I suspect that we're just wasting
memory copying the ext2 data from the disk block to the "struct inode".
We might be much better off with
 - get rid of the duplication between "ext2_inode_info" (in struct
   inode) and "ext2_inode" (on-disk representation)
 - add "struct ext2_inode *" and a "struct buffer_head *" pointer to
   "ext2_inode_info".
 - do all inode ops "in place" directly in the buffer cache.
This might actually _improve_ memory usage (avoid duplicate data), and
would make the buffer cache a "slave cache" of the inode cache, which in
turn would improve inode IO (ie writeback) noticeably. It would get rid
of a lot of horrible stuff in "ext2_update_inode()", and we'd never have
to read in a buffer block in order to write out an inode (right now,
because inodes are only partial blocks, write-out becomes a read-modify-
write cycle if the buffer has been evicted).
So "ext2_write_inode()" would basically become something like

	struct ext2_inode *raw_inode = inode->u.ext2_i.i_raw_inode;
	struct buffer_head *bh = inode->u.ext2_i.i_raw_bh;

	/* Update the stuff we've brought into the generic part of the inode */
	raw_inode->i_size = cpu_to_le32(inode->i_size);
	...
	mark_buffer_dirty(bh);

with part of the data already in the right place (ie the current
"inode->u.ext2_i.i_data[block]" wouldn't exist, it would just exist as
"raw_inode->i_block[block]" directly in the buffer block).
Linus
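For readers who want to experiment with the idea, here is a minimal userspace sketch of the "in place" scheme described in the message. It is not kernel code and not the author's implementation: every name prefixed with "sketch_" (and the put_le32/get_le32 helpers, which stand in for cpu_to_le32/le32_to_cpu) is hypothetical, the raw inode is cut down to two fields, and the block size and slot layout are assumptions made only so the example is self-contained. It shows the in-memory inode holding a pointer into its backing buffer, so that write-out only updates the generic fields and marks the buffer dirty, with no read-modify-write step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
#define SKETCH_BLOCK_SIZE 1024
#define SKETCH_N_BLOCKS   15

struct sketch_raw_inode {                 /* stand-in for on-disk "ext2_inode" */
    uint8_t i_size[4];                    /* stored little-endian */
    uint8_t i_block[SKETCH_N_BLOCKS][4];  /* block pointers, little-endian */
};

struct sketch_buffer {                    /* stand-in for "struct buffer_head" */
    unsigned char data[SKETCH_BLOCK_SIZE];
    int dirty;
};

struct sketch_inode {                     /* stand-in for the in-memory inode */
    uint32_t i_size;                      /* generic, CPU-endian copy */
    struct sketch_raw_inode *i_raw_inode; /* points into i_raw_bh->data */
    struct sketch_buffer *i_raw_bh;       /* assumed pinned while inode is cached */
};

/* Store v as four little-endian bytes (plays the role of cpu_to_le32). */
static void put_le32(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v & 0xff);
    out[1] = (uint8_t)((v >> 8) & 0xff);
    out[2] = (uint8_t)((v >> 16) & 0xff);
    out[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Load four little-endian bytes (plays the role of le32_to_cpu). */
static uint32_t get_le32(const uint8_t in[4])
{
    return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
           ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}

/* Bind an in-memory inode to slot number "slot" of an inode table block. */
static void sketch_attach(struct sketch_inode *inode,
                          struct sketch_buffer *bh, size_t slot)
{
    size_t offset = slot * sizeof(struct sketch_raw_inode);

    assert(offset + sizeof(struct sketch_raw_inode) <= SKETCH_BLOCK_SIZE);
    inode->i_raw_bh = bh;
    inode->i_raw_inode = (struct sketch_raw_inode *)(bh->data + offset);
    inode->i_size = get_le32(inode->i_raw_inode->i_size);
}

/* fs-private data has no in-memory duplicate: update it in the buffer. */
static void sketch_set_block(struct sketch_inode *inode,
                             unsigned int idx, uint32_t blocknr)
{
    assert(idx < SKETCH_N_BLOCKS);
    put_le32(inode->i_raw_inode->i_block[idx], blocknr);
    inode->i_raw_bh->dirty = 1;
}

/* Write-out: copy back only the generic fields, then mark the buffer dirty. */
static void sketch_write_inode(struct sketch_inode *inode)
{
    put_le32(inode->i_raw_inode->i_size, inode->i_size);
    inode->i_raw_bh->dirty = 1;
}
```

Because the inode keeps a reference to its buffer, the buffer cannot be evicted underneath it in this model, which is what removes the need to re-read the block before writing the inode back. A real implementation would also have to handle buffer reference counting and locking, which this sketch ignores.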