From: Ric Wheeler <rwheeler@redhat.com>
To: Andreas Dilger <adilger@sun.com>
Cc: nicholas.dokos@hp.com, linux-fsdevel@vger.kernel.org,
Christoph Hellwig <hch@infradead.org>,
Douglas Shakshober <dshaks@redhat.com>,
Joshua Giles <jgiles@redhat.com>,
Valerie Aurora <vaurora@redhat.com>,
Eric Sandeen <esandeen@redhat.com>,
Steven Whitehouse <swhiteho@redhat.com>,
Edward Shishkin <edward@redhat.com>,
Josef Bacik <jbacik@redhat.com>, Jeff Moyer <jmoyer@redhat.com>,
Chris Mason <chris.mason@oracle.com>,
"Whitney, Eric" <eric.whitney@hp.com>,
Theodore Tso <tytso@mit.edu>
Subject: Re: large fs testing
Date: Tue, 26 May 2009 18:17:21 -0400
Message-ID: <4A1C6A71.7010300@redhat.com>
In-Reply-To: <20090526212132.GE3218@webber.adilger.int>
On 05/26/2009 05:21 PM, Andreas Dilger wrote:
> On May 26, 2009 13:47 -0400, Ric Wheeler wrote:
>> These runs were without lazy init, so I would expect to be a little more
>> than twice as slow as your second run (not the three times I saw)
>> assuming that it scales linearly.
>
> Making lazy_itable_init the default formatting option for ext4 is/was
> dependent upon the kernel doing the zeroing of the inode table blocks
> at first mount time. I'm not sure if that was implemented yet.
>
>> This run was with limited DRAM on the
>> box (6GB) and only a single HBA, but I am afraid that I did not get any
>> good insight into what was the bottleneck during my runs.
>
> For a very large array (80TB) this could be 1TB or more of inode tables
> that are being zeroed out at format time. After 64TB the default mke2fs
> options will cap out at 4B inodes in the filesystem. 1TB/90min ~= 200MB/s
> so this is probably your bottleneck.
>
>> Do you have any access to even larger storage, say the mythical 100TB :-)
>> ? Any insight on interesting workloads?
>
> I would definitely be most interested in e2fsck performance at this scale
> (RAM usage and elapsed time) because this will in the end be the defining
> limit on how large a usable filesystem can actually be in practise.
>
> Cheers, Andreas
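
The arithmetic in the quoted estimate can be reproduced with a short sketch. The 256-byte inode size, the 16384 bytes-per-inode ratio, and the 2^32 inode cap are assumptions taken from common mke2fs defaults, not figures stated in this thread; the 90-minute format time is the one quoted above.

```python
# Assumptions (not stated in the thread): typical mke2fs defaults for ext4.
INODE_SIZE_BYTES = 256      # assumed on-disk inode size
BYTES_PER_INODE = 16384     # assumed default inode ratio
MAX_INODES = 2**32          # assumed cap behind the "4B inodes" remark


def inode_count(fs_bytes):
    """Inodes created at format time, capped at MAX_INODES."""
    return min(fs_bytes // BYTES_PER_INODE, MAX_INODES)


def inode_table_bytes(fs_bytes):
    """Total size of the inode tables that a non-lazy mkfs zeroes out."""
    return inode_count(fs_bytes) * INODE_SIZE_BYTES


def zeroing_rate_mb_s(total_bytes, minutes):
    """Average write rate (MB/s, 10^6 bytes) needed to zero total_bytes."""
    return total_bytes / (minutes * 60) / 1e6


TIB = 2**40
table = inode_table_bytes(80 * TIB)
print(table / TIB, "TiB of inode tables on an 80 TiB filesystem")
print(round(zeroing_rate_mb_s(table, 90)), "MB/s to zero them in 90 minutes")
```

Under these assumptions the cap is reached at exactly 64 TiB (64 TiB / 16384 = 2^32 inodes), which matches the "after 64TB" remark, and the resulting rate lands close to the quoted ~200MB/s.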
Not sure why, but the box rebooted (crashed?) a couple of hours into the run (no
hints in the logs pointed at anything suspicious).
What I did get was the following from the fsck run:
You have new mail in /var/spool/mail/root
[root@l82bi250 redhat]# time /sbin/fsck.ext4 -tt -y /dev/mapper/Big_boy-Big_boy
e2fsck 1.41.4 (27-Jan-2009)
Pass 1: Checking inodes, blocks, and sizes
Pass 1: Memory used: 1596k/1177752k (1447k/150k), time: 1184.73/514.16/344.38
Pass 1: I/O read: 50655MB, write: 0MB, rate: 42.76MB/s
Pass 2: Checking directory structure
Entry '4a1590dc~~~~~~~~O4A0SMJ1VC34YQ1PD3B5DL9Q' in /da (188378) references
inode 196988 in group 30 where _INODE_UNINIT is set.
Fix? yes
Restarting e2fsck from the beginning...
Group descriptor 15 checksum is invalid. Fix? yes
Pass 1: Checking inodes, blocks, and sizes
Pass 1: Memory used: 120396k/-1389015k (120134k/263k), time: 1134.71/522.48/323.65
Pass 1: I/O read: 50656MB, write: 0MB, rate: 44.64MB/s
Pass 2: Checking directory structure
Entry '4a15910c~~~~~~~~H8099TRM701Q29CSTCWBVIHJ' in /0b (404925) references
inode 413100 in group 62 where _INODE_UNINIT is set.
Fix? yes
Restarting e2fsck from the beginning...
Group descriptor 31 checksum is invalid. Fix? yes
Pass 1: Checking inodes, blocks, and sizes
Pass 1: Memory used: 231360k/246272k (231083k/278k), time: 1140.48/521.00/334.74
Pass 1: I/O read: 50658MB, write: 0MB, rate: 44.42MB/s
Pass 2: Checking directory structure
Pass 2: Memory used: 231360k/1290436k (231083k/278k), time: 538.22/264.56/83.49
Pass 2: I/O read: 13749MB, write: 0MB, rate: 25.55MB/s
Pass 3: Checking directory connectivity
Peak memory: Memory used: 231360k/1789000k (231083k/278k), time: 4221.57/1947.37/1116.21
Pass 3A: Memory used: 231360k/1789000k (231083k/278k), time: 0.00/ 0.00/ 0.00
Pass 3A: I/O read: 0MB, write: 0MB, rate: 0.00MB/s
Pass 3: Memory used: 231360k/1290436k (231083k/278k), time: 9.99/ 0.26/ 1.37
Pass 3: I/O read: 1MB, write: 0MB, rate: 0.10MB/s
Pass 4: Checking reference counts
Pass 4: Memory used: 231360k/-1481575k (231082k/279k), time: 147.16/139.87/ 1.94
Pass 4: I/O read: 0MB, write: 0MB, rate: 0.00MB/s
Pass 5: Checking group summary information
Inode bitmap differences: -(98404--98405)
Note that it got truncated in Pass 5 - just after writing out some values that
look like they sign wrapped?
-(103650--103655) -(103659--103660) -103663 -103665 -103667 -(103669--103670)
-(103673--103676) -103679 -103684 -103687 -10
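
Two things in the resource lines above can be sanity-checked with a small sketch. First, the negative "Memory used" figures (for example -1389015k) are what a 32-bit signed byte counter would print after passing 2 GiB; that the counters are 32-bit is an assumption here, not something the output confirms. Second, the reported I/O rate appears to be the read volume divided by the first of the three "time:" fields. The helper names below are hypothetical.

```python
import re

# Assumption: the memory figures come from 32-bit signed byte counters,
# so a value past 2 GiB shows up negative once converted to KiB.
WRAP_KIB = 2**32 // 1024  # 4 GiB expressed in KiB


def unwrap_kib(reported_kib):
    """Undo a single 32-bit wrap; non-negative values pass through."""
    return reported_kib + WRAP_KIB if reported_kib < 0 else reported_kib


def read_rate_mb_s(read_mb, elapsed_s):
    """Read throughput, taking the first 'time:' field as elapsed seconds."""
    return read_mb / elapsed_s


MEM_RE = re.compile(r"Memory used: (-?\d+)k/(-?\d+)k")

line = ("Pass 4: Memory used: 231360k/-1481575k (231082k/279k), "
        "time: 147.16/139.87/ 1.94")
first, second = (int(v) for v in MEM_RE.search(line).groups())
print(unwrap_kib(first), unwrap_kib(second))

# Cross-check the first Pass 1 I/O line: 50655MB read in 1184.73 seconds.
print(round(read_rate_mb_s(50655, 1184.73), 2))
```

If the single-wrap assumption holds, -1389015k and -1481575k would correspond to roughly 2.7 GiB and 2.6 GiB. The recomputed rate agrees with the 42.76MB/s printed for the first Pass 1.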
ric