All of lore.kernel.org
 help / color / mirror / Atom feed
From: Xiaoguang Wang <wangxg.fnst@cn.fujitsu.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: <linux-ext4@vger.kernel.org>, "Theodore Ts'o" <tytso@mit.edu>,
	<a-fujita@rs.jp.nec.com>, <t-sato@yk.jp.nec.com>,
	<sandeen@redhat.com>
Subject: Re: Question about e2fsprogs/e4defrag
Date: Thu, 17 Jul 2014 11:29:26 +0800	[thread overview]
Message-ID: <53C74316.1050902@cn.fujitsu.com> (raw)
In-Reply-To: <20140716192236.GA8628@birch.djwong.org>

Hi,

On 07/17/2014 03:22 AM, Darrick J. Wong wrote:
> On Wed, Jul 16, 2014 at 12:16:37PM +0800, Xiaoguang Wang wrote:
>> Hi,
>>
>> When I run xfstests/tests/generic/018 for ext4 file system in RHEL7.0GA,
>> sometimes it fails and sometime it succeeds. After looking into this case, I
>> think it's not a kernel ext4 bug, it maybe an e4dfrag bug. I compiled the
>> newest e2fsprogs to have test, it seems this issue still exits, so I still
>> send a mail to this list to look for some help, thanks.
>>
>> The issue is that sometimes e4defrag does not defrag file correctly.
>> Steps to reproduce this issue:
>> 1. cd mntpoint
>> 2. rm -f lege
>> 3. for I in `seq 9 -1 0`;
>>        do dd if=/dev/zero of=lege bs=4k count=1 conv=notrunc seek=$I oflag=sync &>/dev/null;
>>    done;
>> 4. e4defrag -c -v lege
>>
>> Repeatedly execute the 2, 3, 4 steps until you get a file which have the similar extent layout like below:
>> ################################################################
>> [root@localhost test_e2fsprogs]# e4defrag -c -v lege
>> <File>
>> [ext 1]:        start 49365571: logical 0:      len 1
>> [ext 2]:        start 49365570: logical 1:      len 1
>> [ext 3]:        start 49365569: logical 2:      len 1
>> [ext 4]:        start 49365568: logical 3:      len 1
>> [ext 5]:        start 49365567: logical 4:      len 1
>> [ext 6]:        start 49365566: logical 5:      len 1
>> [ext 7]:        start 49365565: logical 6:      len 1
>> [ext 8]:        start 49365564: logical 7:      len 1
>> [ext 9]:        start 49365563: logical 8:      len 1
>> [ext 10]:       start 49365562: logical 9:      len 1
>>
>>  Total/best extents                             10/1
>>  Average size per extent                        4 KB
>>  Fragmentation score                            98
>>  [0-30 no problem: 31-55 a little bit fragmented: 56- needs defrag]
>>  This file (lege) needs defragmentation.
>>  Done.
>> ################################################################
>> The physical blocks are continuous but reversed.
>>
>> If we call e4defrag against this file, the output would be:
>> ################################################################
>> [root@localhost test_e2fsprogs]# /tmp/e4defrag -v lege 
>> ext4 defragmentation for lege
>> [1/1]lege:      100%  extents: 10 -> 10 [ OK ]
>>  Success:                       [1/1]
>> [root@localhost test_e2fsprogs]# /tmp/e4defrag -v -c lege 
>> <File>
>> [ext 1]:        start 49365571: logical 0:      len 1
>> [ext 2]:        start 49365570: logical 1:      len 1
>> [ext 3]:        start 49365569: logical 2:      len 1
>> [ext 4]:        start 49365568: logical 3:      len 1
>> [ext 5]:        start 49365567: logical 4:      len 1
>> [ext 6]:        start 49365566: logical 5:      len 1
>> [ext 7]:        start 49365565: logical 6:      len 1
>> [ext 8]:        start 49365564: logical 7:      len 1
>> [ext 9]:        start 49365563: logical 8:      len 1
>> [ext 10]:       start 49365562: logical 9:      len 1
>>
>>  Total/best extents                             10/1
>>  Average size per extent                        4 KB
>>  Fragmentation score                            98
>>  [0-30 no problem: 31-55 a little bit fragmented: 56- needs defrag]
>>  This file (lege) needs defragmentation.
>>  Done.
>> ################################################################
>> According to my understanding, this file is not defraged correctly and should
>> be convert into one extent.  Or because if the physical blocks are continuous
>> though reversed, we do not need to do defragment?
> 
> Oh, I think we /do/ need to defragment.  Granted, file readahead might paper
> over the symptoms, but since the user explicitly ran e4defrag we can try
> to do better.

Yeah, agree.
> 
>> I have checked the e4defrag source code, whether to do real defragment
>> depends on some conditions, please
>> see this code(e4defrag.c).
>> --main
>>   --file_defrag
>>     
>> In file_defrag(), there is such a judgement:
>> "if (file_frags_start <= best || orig_physical_cnt <= donor_physical_cnt)", If this returns true, the e4defrag will
>> not call call_defrag() to do real defragment work.
>>
>> Here file_frags_start: number of file fragments before defrag
>>      orig_physical_cnt: number of original file's continuous physical region
>>      donor_physical_cnt: number of donor file's continuous physical region
>>
>> In this "lege" file, the orig_physical_cnt is 1, and donor_physical_cnt is also 1, so the "if" is satisfied and
>> call_defrag() won't be called.
> 
> This is a curious corner case of e4defrag -- if you look in get_file_extents(),
> the list of extents is insertion-sorted by physical block, which means that
> get_physical_count() (stupidly) looks only for gaps in the runs of physical
> blocks.  Therefore, e4defrag thinks that this "lege" file has one physical
> extent.  Ignoring logical block ordering, this is true, but as you point out,
> this leaves the "file written backwards" case in a fragmented state.  So let's
> not ignore the logical block ordering:
> 
> What I think we really need to do here is make get_physical_count() smarter --
> if there's a gap either in the physical or logical offsets of extents, then we
> need to increment *_physical_cnt so that we later decide to defragment the
> file.
> 
> (Please keep reading)
I checked the code again, you are right, thanks for your explanation
 
> 
>> Here I'd like to know the comparison "orig_physical_cnt <=
>> donor_physical_cnt" is useful? According to my understanding, what should we
>> have comparison are number of extents or average extent size.
>>
>> When I have this change:
>> diff --git a/misc/e4defrag.c b/misc/e4defrag.c
>> index a204793..cd95698 100644
>> --- a/misc/e4defrag.c
>> +++ b/misc/e4defrag.c
>> @@ -1598,8 +1598,7 @@ check_improvement:
>>                 extents_before_defrag += file_frags_start;
>>         }
>>  
>> -       if (file_frags_start <= best ||
>> -                       orig_physical_cnt <= donor_physical_cnt) {
>> +       if (file_frags_start <= best) { 
> 
> This is incorrect, since the point of the "orig_physical_cnt <=
> donor_physical_cnt" check is to ensure that we don't increase the fragmentation
> of a file by swapping it with pieces from a donor file whose contents are
> spread out over a larger number of runs of physical blocks.

Ah, I see. I hadn't realized that, thanks.
> 
> (It does, however, force defragmentation for all files, so you get the results
> you wanted.)
> 
> Please try the patch at the end of this message on for size.  It fixes things
> on my test VM; does it fix yours?

Yeah, it works, thanks.
Would you send a new version patch to fix this issue, or should I do it?

Regards,
Xiaoguang Wang
> 
> --D
> 
>>                 printf("\033[79;0H\033[K[%u/%u]%s:\t%3d%%",
>>                         defraged_file_count, total_count, file, 100);
>>                 if (mode_flag & DETAIL)
>>
>> Then the "lege" file could be defraged correctly.
>> ##################################################################
>> [root@localhost test_e2fsprogs]# /tmp/e4defrag -v  lege
>> ext4 defragmentation for lege
>> [1/1]lege:      100%  extents: 10 -> 1  [ OK ]
>>  Success:                       [1/1]
>> [root@localhost test_e2fsprogs]# /tmp/e4defrag -v -c lege
>> <File>
>> [ext 1]:        start 49366583: logical 0:      len 10
>>
>>  Total/best extents                             1/1
>>  Average size per extent                        40 KB
>>  Fragmentation score                            0
>>  [0-30 no problem: 31-55 a little bit fragmented: 56- needs defrag]
>>  This file (lege) does not need defragmentation.
>>  Done.
>> ##################################################################
>>
>> Any opinion or suggestions will be appreciated!
>> If I'm wrong, please correct me, thanks!
>>
>> Regards,
>> Xiaoguang Wang
> 
> diff --git a/misc/e4defrag.c b/misc/e4defrag.c
> index a204793..d0eac60 100644
> --- a/misc/e4defrag.c
> +++ b/misc/e4defrag.c
> @@ -888,7 +888,9 @@ static int get_physical_count(struct fiemap_extent_list *physical_list_head)
>  
>  	do {
>  		if ((ext_list_tmp->data.physical + ext_list_tmp->data.len)
> -				!= ext_list_tmp->next->data.physical) {
> +				!= ext_list_tmp->next->data.physical ||
> +		    (ext_list_tmp->data.logical + ext_list_tmp->data.len)
> +				!= ext_list_tmp->next->data.logical) {
>  			/* This extent and next extent are not continuous. */
>  			ret++;
>  		}
> .
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2014-07-17  3:30 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-16  4:16 Question about e2fsprogs/e4defrag Xiaoguang Wang
2014-07-16 19:22 ` Darrick J. Wong
2014-07-17  3:29   ` Xiaoguang Wang [this message]
2014-07-17 18:09     ` Darrick J. Wong
2014-07-18  2:05       ` Andreas Dilger
2014-07-18  3:19         ` Xiaoguang Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53C74316.1050902@cn.fujitsu.com \
    --to=wangxg.fnst@cn.fujitsu.com \
    --cc=a-fujita@rs.jp.nec.com \
    --cc=darrick.wong@oracle.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=sandeen@redhat.com \
    --cc=t-sato@yk.jp.nec.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.