linux-ext4.vger.kernel.org archive mirror
From: Theodore Tso <tytso@mit.edu>
To: Curt Wohlgemuth <curtw@google.com>
Cc: Andreas Dilger <adilger@sun.com>,
	ext4 development <linux-ext4@vger.kernel.org>
Subject: Re: Question on block group allocation
Date: Sun, 26 Apr 2009 22:14:11 -0400
Message-ID: <20090427021411.GA9059@mit.edu>
In-Reply-To: <6601abe90904231502y393155dbrf8913b728c704320@mail.gmail.com>

On Thu, Apr 23, 2009 at 03:02:05PM -0700, Curt Wohlgemuth wrote:
> > This is likely the "uninit_bg" feature that is causing the allocations
> > to skip groups which are marked BLOCK_UNINIT.  In some sense the benefit
> > of skipping the block bitmap read during e2fsck is probably not at all
> > beneficial compared to the cost of the extra seeking during IO.  As the
> > filesystem gets more full, the BLOCK_UNINIT flags would be cleared anyways,
> > so we might as well just keep the early allocations contiguous.

Well, I tried out Andreas' patch, by doing an rsync copy from my SSD
root partition to a 5400 rpm laptop drive, and then ran e2fsck and
dumpe2fs.  The results were interesting:

                 Before Patch                     After Patch
               Time in seconds                  Time in seconds
           Real / User / Sys     MB/s      Real / User / Sys     MB/s
Pass 1     8.52 / 2.21 / 0.46   20.43      8.84 / 4.97 / 1.11   19.68
Pass 2    21.16 / 1.02 / 1.86   11.30      6.54 / 1.77 / 1.78   36.39
Pass 3     0.01 / 0.00 / 0.00  139.00      0.01 / 0.01 / 0.00  128.90
Pass 4     0.16 / 0.15 / 0.00    0.00      0.17 / 0.17 / 0.00    0.00
Pass 5     2.52 / 1.99 / 0.09    0.79      2.31 / 1.78 / 0.06    0.86
Total     32.40 / 5.11 / 2.49   12.81     17.99 / 8.75 / 2.98   23.01

The surprise is in the gross inspection of the dumpe2fs results:

                               Before Patch    After Patch
# of non-contig files               762            779
# of non-contig directories         571            570
# of BLOCK_UNINIT bg's              307            293
# of INODE_UNINIT bg's              503            503
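
For anyone who wants to collect the same kind of counts, here is a
minimal Python sketch that tallies per-group flags from saved dumpe2fs
output.  It is not the method used for the numbers above; the function
name count_group_flags and the sample text are made up for
illustration, and it assumes each group header line starts with
"Group N:" and lists its flags in square brackets.

```python
import re
from collections import Counter

# Assumption: dumpe2fs prints one header line per block group, such as
#   Group 2: (Blocks 65536-98303) [INODE_UNINIT, BLOCK_UNINIT]
GROUP_RE = re.compile(r"^Group \d+:")
FLAGS_RE = re.compile(r"\[([^\]]*)\]")


def count_group_flags(dumpe2fs_text):
    """Return a Counter of flag names seen on group header lines.

    The special key "groups" holds the number of header lines found.
    """
    counts = Counter()
    for line in dumpe2fs_text.splitlines():
        if not GROUP_RE.match(line):
            continue  # indented detail lines and the superblock summary
        counts["groups"] += 1
        m = FLAGS_RE.search(line)
        if m is None:
            continue  # group with no flags listed
        for flag in m.group(1).split(","):
            flag = flag.strip()
            if flag:
                counts[flag] += 1
    return counts


# Hypothetical sample, not taken from the filesystem discussed here.
SAMPLE = """\
Group 0: (Blocks 0-32767)
  Primary superblock at 0, Group descriptors at 1-1
Group 1: (Blocks 32768-65535) [INODE_UNINIT]
Group 2: (Blocks 65536-98303) [INODE_UNINIT, BLOCK_UNINIT]
"""

counts = count_group_flags(SAMPLE)
print(counts["groups"], counts["BLOCK_UNINIT"], counts["INODE_UNINIT"])
```

In practice the text would come from a file holding the output of
dumpe2fs run on the device, read with open(...).read().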

So the interesting thing is that the patch only "broke open" an
additional 14 block groups (out of 333 block groups in use when the
filesystem was created with the unpatched kernel).  However, this
allowed the pass 2 directory time to go *down* by over a factor of
three (from 21.2 seconds with the unpatched ext4 code to 6.5 seconds
with the patch).
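
As a quick cross-check of that arithmetic, the following Python sketch
recomputes the ratios from the figures in the two tables.  The variable
names are only for illustration.

```python
# Figures copied from the tables in this message.
pass2_before, pass2_after = 21.16, 6.54    # e2fsck pass 2 real time, seconds
total_before, total_after = 32.40, 17.99   # e2fsck total real time, seconds
uninit_before, uninit_after = 307, 293     # block groups flagged BLOCK_UNINIT

pass2_speedup = pass2_before / pass2_after
total_speedup = total_before / total_after
groups_opened = uninit_before - uninit_after

print(f"pass 2 speedup: {pass2_speedup:.2f}x")
print(f"total speedup:  {total_speedup:.2f}x")
print(f"additional block groups initialized: {groups_opened}")
```

Pass 2 comes out a bit over 3.2x faster and the whole e2fsck run about
1.8x faster, for 14 additional block groups opened.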

I think what the patch did was to diminish allocation pressure on the
first block group in the flex_bg, so we weren't mixing directory and
regular file contents.  This eliminated seeks during pass 2 of e2fsck,
which was actually a Very Good Thing.
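
To make the seek argument concrete, here is a toy Python model.  It is
only an illustration under strong assumptions: the disk is a flat
string of blocks, pass 2 reads directory blocks in ascending block
order, and every gap between consecutive directory blocks costs one
seek.  The layouts and the count_seeks helper are invented for the
example and are not measurements from the filesystem above.

```python
def count_seeks(layout):
    """Count gaps between consecutive directory blocks.

    layout is a string in on-disk order, where 'D' is a directory block
    and 'F' is a regular-file block.  A seek is counted whenever the
    next directory block does not immediately follow the previous one.
    """
    dir_positions = [i for i, kind in enumerate(layout) if kind == "D"]
    return sum(
        1
        for prev, cur in zip(dir_positions, dir_positions[1:])
        if cur != prev + 1
    )


mixed = "DFFDFDFFFDFD"      # directory blocks interleaved with file data
separated = "DDDDDFFFFFFF"  # same block counts, directories kept together

print(count_seeks(mixed), count_seeks(separated))
```

With the same number of directory and file blocks, the interleaved
layout needs a seek between every pair of directory blocks, while the
separated layout needs none.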

> > A simple change to verify this would be something like the following,
> > but it hasn't actually been tested.
> 
> Tell you what:  I'll try this out and see if it helps out my test case.

Let me know what this does for your test case.  Hopefully the patch
also makes things better, since this patch is looking very interesting
right now.

Andreas, can I get a Signed-off-by from you for this patch? 

Thanks,

						- Ted
