linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Freemyer <greg.freemyer@gmail.com>
To: Peng Tao <bergwolf@gmail.com>
Cc: linux-ext4@vger.kernel.org, Theodore Tso <tytso@mit.edu>,
	Akira Fujita <a-fujita@rs.jp.nec.com>
Subject: Re: [PATCH] e4defrag: fallocate donor file only once
Date: Thu, 3 Sep 2009 05:30:35 -0400	[thread overview]
Message-ID: <87f94c370909030230w6ab49265yf6e689cbae1d458c@mail.gmail.com> (raw)
In-Reply-To: <6149e97b0909030100p1930c0fra28663724e51114@mail.gmail.com>

On Thu, Sep 3, 2009 at 4:00 AM, Peng Tao<bergwolf@gmail.com> wrote:
> On Thu, Sep 3, 2009 at 6:09 AM, Greg Freemyer<greg.freemyer@gmail.com> wrote:
>> On Wed, Sep 2, 2009 at 11:35 AM, Peng Tao<bergwolf@gmail.com> wrote:
>>> If we allocate the donor file once for all, it will have a better chance
>>> to be continuous.
>>>
>>> Signed-off-by: "Peng Tao" <bergwolf@gmail.com>
>>
>> Seems like an improvement, but I'm not seeing any special handling for
>> sparse files.  (Not before or after this patch.)
>>
>> Seems like there should be an outer loop that identifies contiguous
>> data block sets in a sparse file and defrags them individually as
>> opposed to trying to defrag the entire file at once.
>>
>> My impression is that with a large sparse file, e4defrag currently
>> (with or without this patch) would fallocate a full non-sparse donor
>> set of blocks the full size of the original file, then swap in just
>> the truly allocated blocks?
> Thanks for the reminder. The original code takes good care of sparse
> files in join_extents(). Please ignore my patch.
>
> Sorry for the noise.

RFC from a more ext4 knowledgeable person than me:

The code in e4defrag still looks way to complex.  I don't see why it
needs to know so much about extents and groups.

I just looked at util/copy_sparse.c

It simply loops through all the blocks in the source file and calls
ioctl(fd, FIBMAP, &b) to see if they are allocated vs. sparse,

If allocated it copies the block from src to dest.  Pretty straight
forward and has none of the complexity of e4defrag.

Seems to me e4defrag should have the actual defrag_file() rewritten to
be something like:

defrag_file()  {
    loop through the blocks looking for the contiguous set of data blocks.
          defrag_contiguous_data(start_block, num_blocks)
}

defrag_contiguous_data(start_block, num_blocks) {
    // allocate one full extent at a time and donate the blocks to orig file
    for(start=start_block; start < start_block, num_blocks; start+=chunk) {
          fallocate(chunk);
          move_ext(orig, donor, start, 0, chunk);
      }
}

And then set chunk to be the max size of one extent.  Maybe the
"chunk" should be bigger than one extent?

Also, I did not put any logic in above to show testing to see if the
new file is less fragmented than the original.  That will add to the
complexity, but hopefully the actual defrag logic can be as relatively
simple as the above instead of what it is now.

Anyway, t would be a major change to e4defrag, but it seems that it
would give ext4 a much better chance to reorganize itself by calling
fallocate on full extent size chunks at minimum, instead of what the
code currently does.

Greg
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2009-09-03  9:30 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-09-02 15:35 [PATCH] e4defrag: fallocate donor file only once Peng Tao
2009-09-02 22:09 ` Greg Freemyer
2009-09-02 22:24   ` Andreas Dilger
2009-09-03  8:00   ` Peng Tao
2009-09-03  9:30     ` Greg Freemyer [this message]
2009-09-04  3:08       ` Peng Tao
2009-09-04 12:36         ` Greg Freemyer
2009-09-04 16:56           ` Peng Tao
2009-09-04 19:10             ` Greg Freemyer
2009-09-05 16:29               ` Peng Tao
2009-09-08 15:41                 ` Greg Freemyer
2009-09-09  1:24                   ` Peng Tao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87f94c370909030230w6ab49265yf6e689cbae1d458c@mail.gmail.com \
    --to=greg.freemyer@gmail.com \
    --cc=a-fujita@rs.jp.nec.com \
    --cc=bergwolf@gmail.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).