linux-ext4.vger.kernel.org archive mirror
From: Theodore Ts'o <tytso@mit.edu>
To: Benjamin LaHaise <bcrl@kvack.org>
Cc: linux-ext4@vger.kernel.org
Subject: Re: ext4: first write to large ext3 filesystem takes 96 seconds
Date: Mon, 7 Jul 2014 23:54:05 -0400
Message-ID: <20140708035405.GA27440@thunk.org>
In-Reply-To: <20140708013510.GB12478@kvack.org>

On Mon, Jul 07, 2014 at 09:35:11PM -0400, Benjamin LaHaise wrote:
> 
> Sure -- I put a copy at http://www.kvack.org/~bcrl/mb_groups as it's a bit 
> too big for the mailing list.  The filesystem in question has a couple of 
> 11GB files on it, with the remainder of the space being taken up by files 
> 7200016 bytes in size.  

Right, so looking at mb_groups we can see a number of the problems.
There are a large number of block groups which look like this:

#group: free  frags first [ 2^0   2^1   2^2   2^3   2^4   2^5   2^6   2^7   2^8   2^9   2^10  2^11  2^12  2^13  ]
#288  : 1540  7     13056 [ 0     0     1     0     0     0     0     0     6     0     0     0     0     0     ]

It would be very interesting to see what allocation pattern resulted
in so many block groups with this layout.  Before we read in the
allocation bitmap, all we know from the block group descriptors is
that there are 1540 free blocks.  What we don't know is that they are
broken up into six 256-block free regions, plus one 4-block region.
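
To make the arithmetic concrete, here is a minimal sketch that parses
one mb_groups line and sums the buddy counts back into a free-block
total.  The helper names (parse_mb_groups_line, free_from_buddy) are
made up for illustration and are not kernel or e2fsprogs interfaces;
the sketch assumes the line format shown above, where column 2^k
counts free chunks of 2^k blocks.

```python
def parse_mb_groups_line(line):
    """Parse one data line of mb_groups (hypothetical helper).

    Returns (group, free, frags, first, counts), where counts[k] is the
    number of free chunks of 2**k blocks.
    """
    head, _, tail = line.partition("[")
    group_part, _, stats = head.partition(":")
    group = int(group_part.strip().lstrip("#"))
    free, frags, first = (int(x) for x in stats.split())
    counts = [int(x) for x in tail.rstrip().rstrip("]").split()]
    return group, free, frags, first, counts


def free_from_buddy(counts):
    """Total free blocks implied by the per-order chunk counts."""
    return sum(n << order for order, n in enumerate(counts))


line = ("#288  : 1540  7     13056 [ 0     0     1     0     0     0     "
        "0     0     6     0     0     0     0     0     ]")
group, free, frags, first, counts = parse_mb_groups_line(line)
print(group, free, free_from_buddy(counts))  # prints: 288 1540 1540
```

The sum is 1*4 + 6*256 = 1540, which matches the free count in the
block group descriptor, but only the bitmap tells us about the split.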

If we try to allocate a 1024 block region, we'll end up searching a
large number of these block groups before we find one which is suitable.
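
A toy model of that search cost, as a sketch only: it is not the
mballoc code, and it ignores the real allocator's goal group, multiple
passes, and other heuristics.  It assumes the only thing known up
front is the free count from the descriptor, and that the largest free
chunk is learned only by reading the bitmap.  The function name and
the figure of 1000 groups are invented for the example.

```python
def groups_examined(groups, want_order):
    """Count bitmap reads needed to find a free chunk of 2**want_order blocks.

    groups is a list of (free_blocks, largest_free_order) tuples.  The
    free count comes from the descriptor; largest_free_order stands in
    for what we only learn after reading the group's bitmap.
    Returns None if no group is suitable.
    """
    examined = 0
    for free_blocks, largest_order in groups:
        if free_blocks < (1 << want_order):
            continue  # the descriptor alone rules this group out
        examined += 1  # have to read the bitmap to see the layout
        if largest_order >= want_order:
            return examined
    return None


# 1000 groups like #288 (1540 free, largest chunk 2^8), then one good group.
groups = [(1540, 8)] * 1000 + [(4900, 10)]
print(groups_examined(groups, 10))  # prints: 1001
```

Every one of the groups like #288 passes the free-count check for a
1024 block request, so each costs a bitmap read before it can be
rejected.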

Or there is a large collection of block groups that look like this:

#834  : 4900  39    514   [ 0     20    5     5     16    6     4     8     6     1     1     0     0     0     ]

Similarly, we could try to look for a contiguous 2048 block range, but
even though there are 4900 blocks available, we can't tell the
difference between a free block layout which looks like the above,
versus one that looks like this:

#834  : 4900  39    514   [ 0     6     0     1     3     5     1     4     0     0     0     2     0     0     ]
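
As a quick check that the two layouts really are indistinguishable by
free count alone, here is a small sketch comparing them.  The function
names are made up for illustration.  Note that it reports the largest
power-of-two buddy chunk; a free extent which is not buddy-aligned
could be longer than that, so treat the result as a lower bound on the
largest contiguous range.

```python
def free_blocks(counts):
    """Total free blocks implied by counts[k] chunks of 2**k blocks."""
    return sum(n << order for order, n in enumerate(counts))


def largest_chunk(counts):
    """Size in blocks of the largest buddy chunk, or 0 if none is free."""
    for order in range(len(counts) - 1, -1, -1):
        if counts[order]:
            return 1 << order
    return 0


layouts = {
    "above":       [0, 20, 5, 5, 16, 6, 4, 8, 6, 1, 1, 0, 0, 0],
    "alternative": [0, 6, 0, 1, 3, 5, 1, 4, 0, 0, 0, 2, 0, 0],
}
for name, counts in layouts.items():
    print(name, free_blocks(counts), largest_chunk(counts))
# prints:
# above 4900 1024
# alternative 4900 2048
```

Both add up to 4900 free blocks, but only the second has a 2048 block
buddy chunk.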

We could try going straight for the largely empty block groups, but
that's likely to fragment the file system more quickly, and once
those largely empty block groups are partially used, we'll end up
taking a long time scanning all of the block groups anyway.

       	      	     	  	   	- Ted


