public inbox for linux-ext4@vger.kernel.org
From: Nathan Roberts <nroberts@yahoo-inc.com>
To: Eric Sandeen <sandeen@redhat.com>
Cc: linux-ext4@vger.kernel.org
Subject: Re: Storing inodes in a separate block device?
Date: Thu, 22 May 2008 11:58:41 -0500
Message-ID: <4835A641.20909@yahoo-inc.com>
In-Reply-To: <48358F95.4070900@redhat.com>


>> I've run some basic tests using ext4 on a SATA array plus a USB thumb 
>> drive for the inodes. Even with the slowness of a thumb drive, I was 
>> able to see encouraging results ( >50% read throughput improvement for a 
>> mixture of 4K-8K files).
> 
> How'd you test this, do you have a patch?  Sounds interesting.

Right now I have only changed enough code to be able to test the theory. 
It's in no way a presentable patch at this point. With some simplifying 
assumptions, the code changes were pretty easy:
- Parse a new "idev=" mount option
- Store bdev information for the inode block device in the sb_info struct
- Change __ext4_get_inode_loc() to recalculate the block offset in the 
case of a separate device and issue __getblk() to the alternate device

- A simple utility which copies inodes from one block device to another 
is the only other thing that's needed. (This was simpler than modifying 
the tools. It also allowed me to easily perform BEFORE/AFTER comparisons 
with the only real variable being where the inodes are located.)
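
To illustrate the offset recalculation mentioned above, here is a rough 
user-space sketch in Python (not the kernel change itself). The main-device 
arithmetic follows the usual ext4 scheme: group = (ino - 1) / inodes per 
group, then block and byte offset within that group's inode table. The 
layout on the separate device is an ASSUMPTION made only for illustration: 
each group's inode table is packed back to back starting at block 0. The 
names Geometry and locate_inode are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Geometry:
    block_size: int           # bytes per filesystem block
    inode_size: int           # bytes per on-disk inode
    inodes_per_group: int     # inodes in each block group
    inode_table_start: tuple  # first inode-table block of each group (main device)


def locate_inode(geo, ino, separate_device=False):
    """Return (block number, byte offset within block) for inode `ino`."""
    if ino < 1:
        raise ValueError("inode numbers start at 1")
    inodes_per_block = geo.block_size // geo.inode_size
    # Which block group holds the inode, and its index inside that group.
    group, index = divmod(ino - 1, geo.inodes_per_group)
    # Which block of the group's inode table, and which slot in that block.
    block_in_table, slot = divmod(index, inodes_per_block)
    if separate_device:
        # ASSUMPTION: inode tables are packed contiguously on the inode
        # device, one table after another, in group order.
        blocks_per_table = -(-geo.inodes_per_group // inodes_per_block)
        base = group * blocks_per_table
    else:
        # Main device: the table start comes from the group descriptor.
        base = geo.inode_table_start[group]
    return base + block_in_table, slot * geo.inode_size
```

With 4K blocks and 256-byte inodes there are 16 inodes per block, so only 
the base block changes between the two cases; the offset within the block 
is the same on either device.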

So, to get a file system going:
- mke2fs as usual
- copy inodes from original blkdev to inode_blkdev (yes, there are 2 
copies of the inodes; space conservation was not my objective.)
- mount using idev=<inode block device> option
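
For the copy step, a minimal sketch of what such a utility might look like 
is below. It is not the actual tool. It ASSUMES the caller already knows the 
first block of each group's inode table (for example from dumpe2fs output) 
and that the tables are written back to back on the inode device. The 
function name copy_inode_tables and its parameters are hypothetical.

```python
import os


def copy_inode_tables(src_path, dst_path, table_starts, blocks_per_table,
                      block_size=4096):
    """Copy each group's inode table from src to dst; return bytes copied.

    table_starts[g] is the first inode-table block of group g on the
    source device. ASSUMPTION: the destination holds the tables packed
    contiguously in group order, starting at offset 0.
    """
    table_bytes = blocks_per_table * block_size
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        for group, start_block in enumerate(table_starts):
            src.seek(start_block * block_size)
            data = src.read(table_bytes)
            if len(data) != table_bytes:
                raise OSError("short read for group %d" % group)
            dst.seek(group * table_bytes)
            dst.write(data)
        # Make sure the copy reaches the device before it is mounted.
        dst.flush()
        os.fsync(dst.fileno())
    return len(table_starts) * table_bytes
```

The source is opened read-only, so the original inodes stay in place and 
the same filesystem can still be mounted without idev= for comparison.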


To run the test:
- mkfs
- mount WITHOUT idev= option
- Create 10 million files
- copy inodes to inode_blkdev

SEQ1
-----
- umount, mount readonly, WITHOUT idev
- echo 3 > /proc/sys/vm/drop_caches
- Read 5000 random files using 500 threads, record average read time

SEQ2
-----
- umount, mount readonly, WITH idev
- echo 3 > /proc/sys/vm/drop_caches
- Read 5000 random files using 500 threads, record average read time

- Repeat SEQ1 and then SEQ2 to verify no unexpected caching is going on 
(should see same results as original run).
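
The read step in SEQ1/SEQ2 could be driven by something like the following 
Python sketch. This is not the harness that was actually used, just one way 
to do it: pick a random sample of files, read each one fully from a pool of 
threads, and report the mean per-file read time. The function names and 
parameters are hypothetical. Dropping caches and remounting are left to the 
caller.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


def timed_read(path):
    """Read one file to the end and return the elapsed time in seconds."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(1 << 16):
            pass
    return time.perf_counter() - start


def average_read_time(paths, sample_size=5000, threads=500, seed=None):
    """Mean per-file read time over a random sample of `paths` (a list)."""
    if not paths:
        raise ValueError("no files to read")
    rng = random.Random(seed)
    sample = rng.sample(paths, min(sample_size, len(paths)))
    # One task per file; the pool limits concurrency to `threads`.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        durations = list(pool.map(timed_read, sample))
    return sum(durations) / len(durations)
```

Passing the same seed for the SEQ1 and SEQ2 runs makes both read the same 
set of files, which keeps the inode location as the only variable.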

--

The filesystem features reported by dumpe2fs were:
Filesystem features:      has_journal ext_attr resize_inode dir_index 
filetype needs_recovery extents sparse_super large_file


Thanks,
Nathan

Thread overview: 4+ messages
2008-05-22 14:53 Storing inodes in a separate block device? Nathan Roberts
2008-05-22 15:21 ` Eric Sandeen
2008-05-22 16:58   ` Nathan Roberts [this message]
2008-05-22 16:03 ` Andreas Dilger