linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Btrfs-progs: fsck: fix wrong return value in check_block()
@ 2014-02-24 11:55 Wang Shilong
  2014-02-24 23:03 ` Mitch Harder
  0 siblings, 1 reply; 5+ messages in thread
From: Wang Shilong @ 2014-02-24 11:55 UTC (permalink / raw)
  To: linux-btrfs

We found btrfsck will output backrefs mismatch while the filesystem
is defenitely ok.

The problem is that check_block() don't return right value,which
makes btrfsck won't walk all tree blocks thus we don't get a consistent
filesystem, we will fail to check extent refs etc.

Reported-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
---
 cmds-check.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cmds-check.c b/cmds-check.c
index a2afae6..253569f 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -2477,7 +2477,7 @@ static int check_block(struct btrfs_trans_handle *trans,
 	struct cache_extent *cache;
 	struct btrfs_key key;
 	enum btrfs_tree_block_status status;
-	int ret = 1;
+	int ret = 0;
 	int level;
 
 	cache = lookup_cache_extent(extent_cache, buf->start, buf->len);
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] Btrfs-progs: fsck: fix wrong return value in check_block()
  2014-02-24 11:55 [PATCH] Btrfs-progs: fsck: fix wrong return value in check_block() Wang Shilong
@ 2014-02-24 23:03 ` Mitch Harder
  2014-02-25  1:38   ` Wang Shilong
  0 siblings, 1 reply; 5+ messages in thread
From: Mitch Harder @ 2014-02-24 23:03 UTC (permalink / raw)
  To: Wang Shilong; +Cc: linux-btrfs

On Mon, Feb 24, 2014 at 5:55 AM, Wang Shilong
<wangsl.fnst@cn.fujitsu.com> wrote:
> We found btrfsck will output backrefs mismatch while the filesystem
> is defenitely ok.
>
> The problem is that check_block() don't return right value,which
> makes btrfsck won't walk all tree blocks thus we don't get a consistent
> filesystem, we will fail to check extent refs etc.
>
> Reported-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
> Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
> ---
>  cmds-check.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/cmds-check.c b/cmds-check.c
> index a2afae6..253569f 100644
> --- a/cmds-check.c
> +++ b/cmds-check.c
> @@ -2477,7 +2477,7 @@ static int check_block(struct btrfs_trans_handle *trans,
>         struct cache_extent *cache;
>         struct btrfs_key key;
>         enum btrfs_tree_block_status status;
> -       int ret = 1;
> +       int ret = 0;
>         int level;
>
>         cache = lookup_cache_extent(extent_cache, buf->start, buf->len);
> --

I tried this fix on a broken btrfs volume I've been trying to repair,
and it seemed to put me in an infinite loop.

I agree that something seems wrong with the way the caller of
check_block uses the return value, and I also noticed that it seemed
to exit before walking all the tree blocks.

But I think the problem is more subtle than flipping the default ret
value from 1 to 0.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Btrfs-progs: fsck: fix wrong return value in check_block()
  2014-02-24 23:03 ` Mitch Harder
@ 2014-02-25  1:38   ` Wang Shilong
  2014-02-25 21:39     ` Mitch Harder
  0 siblings, 1 reply; 5+ messages in thread
From: Wang Shilong @ 2014-02-25  1:38 UTC (permalink / raw)
  To: Mitch Harder; +Cc: linux-btrfs

Hi Mitch,

On 02/25/2014 07:03 AM, Mitch Harder wrote:
> On Mon, Feb 24, 2014 at 5:55 AM, Wang Shilong
> <wangsl.fnst@cn.fujitsu.com> wrote:
>> We found btrfsck will output backrefs mismatch while the filesystem
>> is defenitely ok.
>>
>> The problem is that check_block() don't return right value,which
>> makes btrfsck won't walk all tree blocks thus we don't get a consistent
>> filesystem, we will fail to check extent refs etc.
>>
>> Reported-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
>> Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
>> ---
>>   cmds-check.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/cmds-check.c b/cmds-check.c
>> index a2afae6..253569f 100644
>> --- a/cmds-check.c
>> +++ b/cmds-check.c
>> @@ -2477,7 +2477,7 @@ static int check_block(struct btrfs_trans_handle *trans,
>>          struct cache_extent *cache;
>>          struct btrfs_key key;
>>          enum btrfs_tree_block_status status;
>> -       int ret = 1;
>> +       int ret = 0;
>>          int level;
>>
>>          cache = lookup_cache_extent(extent_cache, buf->start, buf->len);
>> --
> I tried this fix on a broken btrfs volume I've been trying to repair,
> and it seemed to put me in an infinite loop.
>
> I agree that something seems wrong with the way the caller of
> check_block uses the return value, and I also noticed that it seemed
> to exit before walking all the tree blocks.
>
> But I think the problem is more subtle than flipping the default ret
> value from 1 to 0.
No, not really even though i know there are other problems with fsck 
repair mode.
But this problem should be fixed and pushed into btrfs-progsv3.13.(Notice,
the below problem did not exist in btrfs-progsv3.12)

An easy way to trigger this problem:

# mkfs.btrfs -f /dev/sda9
# mount /dev/sda9 /mnt
# dd if=/dev/zero of=/mnt/data bs=4k count=10240 oflag=direct
# btrfs sub snapshot /mnt /mnt/snap1
# btrfs sub snapshot /mnt /mnt/snap2
# umount /mnt
# btrfs check /dev/sda9

After applying this patch, the above problems did not exist.
Feel free to correct me if i miss something here.^_^


Thanks,
Wang
>


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Btrfs-progs: fsck: fix wrong return value in check_block()
  2014-02-25  1:38   ` Wang Shilong
@ 2014-02-25 21:39     ` Mitch Harder
  2014-02-26  1:19       ` Wang Shilong
  0 siblings, 1 reply; 5+ messages in thread
From: Mitch Harder @ 2014-02-25 21:39 UTC (permalink / raw)
  To: Wang Shilong; +Cc: linux-btrfs

On Mon, Feb 24, 2014 at 7:38 PM, Wang Shilong
<wangsl.fnst@cn.fujitsu.com> wrote:
> Hi Mitch,
>
>
> On 02/25/2014 07:03 AM, Mitch Harder wrote:
>>
>> On Mon, Feb 24, 2014 at 5:55 AM, Wang Shilong
>> <wangsl.fnst@cn.fujitsu.com> wrote:
>>>
>>> We found btrfsck will output backrefs mismatch while the filesystem
>>> is defenitely ok.
>>>
>>> The problem is that check_block() don't return right value,which
>>> makes btrfsck won't walk all tree blocks thus we don't get a consistent
>>> filesystem, we will fail to check extent refs etc.
>>>
>>> Reported-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
>>> Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
>>> ---
>>>   cmds-check.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/cmds-check.c b/cmds-check.c
>>> index a2afae6..253569f 100644
>>> --- a/cmds-check.c
>>> +++ b/cmds-check.c
>>> @@ -2477,7 +2477,7 @@ static int check_block(struct btrfs_trans_handle
>>> *trans,
>>>          struct cache_extent *cache;
>>>          struct btrfs_key key;
>>>          enum btrfs_tree_block_status status;
>>> -       int ret = 1;
>>> +       int ret = 0;
>>>          int level;
>>>
>>>          cache = lookup_cache_extent(extent_cache, buf->start, buf->len);
>>> --
>>
>> I tried this fix on a broken btrfs volume I've been trying to repair,
>> and it seemed to put me in an infinite loop.
>>
>> I agree that something seems wrong with the way the caller of
>> check_block uses the return value, and I also noticed that it seemed
>> to exit before walking all the tree blocks.
>>
>> But I think the problem is more subtle than flipping the default ret
>> value from 1 to 0.
>
> No, not really even though i know there are other problems with fsck repair
> mode.
> But this problem should be fixed and pushed into btrfs-progsv3.13.(Notice,
> the below problem did not exist in btrfs-progsv3.12)
>
> An easy way to trigger this problem:
>
> # mkfs.btrfs -f /dev/sda9
> # mount /dev/sda9 /mnt
> # dd if=/dev/zero of=/mnt/data bs=4k count=10240 oflag=direct
> # btrfs sub snapshot /mnt /mnt/snap1
> # btrfs sub snapshot /mnt /mnt/snap2
> # umount /mnt
> # btrfs check /dev/sda9
>
> After applying this patch, the above problems did not exist.
> Feel free to correct me if i miss something here.^_^
>

I took a closer look at the check_block function today, and it looks
to me like the problem is that the return value is not modified when
BTRFS_BLOCK_FLAG_FULL_BACKREF is set.

@@ -2521,14 +2521,17 @@ static int check_block(struct btrfs_trans_handle *trans,
         }
     } else {
         rec->content_checked = 1;
-        if (flags & BTRFS_BLOCK_FLAG_FULL_BACKREF)
+        if (flags & BTRFS_BLOCK_FLAG_FULL_BACKREF) {
             rec->owner_ref_checked = 1;
+            ret = 0;
+        }
         else {
             ret = check_owner_ref(root, rec, buf);
             if (!ret)
                 rec->owner_ref_checked = 1;
         }

For me, in this function I would lean towards an initial return value
that must be updated by having check_block() make an affirmative
PASS/FAIL decision on the block.

What do you think about something like this?

diff --git a/cmds-check.c b/cmds-check.c
index ffc5d3e..55070da 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -2477,7 +2477,7 @@ static int check_block(struct btrfs_trans_handle *trans,
     struct cache_extent *cache;
     struct btrfs_key key;
     enum btrfs_tree_block_status status;
-    int ret = 1;
+    int ret = -EINVAL;
     int level;

     cache = lookup_cache_extent(extent_cache, buf->start, buf->len);
@@ -2521,14 +2521,17 @@ static int check_block(struct btrfs_trans_handle *trans,
         }
     } else {
         rec->content_checked = 1;
-        if (flags & BTRFS_BLOCK_FLAG_FULL_BACKREF)
+        if (flags & BTRFS_BLOCK_FLAG_FULL_BACKREF) {
             rec->owner_ref_checked = 1;
+            ret = 0;
+        }
         else {
             ret = check_owner_ref(root, rec, buf);
             if (!ret)
                 rec->owner_ref_checked = 1;
         }
     }
+    BUG_ON(ret == -EINVAL);
     if (!ret)
         maybe_free_extent_rec(extent_cache, rec);
     return ret;
--

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] Btrfs-progs: fsck: fix wrong return value in check_block()
  2014-02-25 21:39     ` Mitch Harder
@ 2014-02-26  1:19       ` Wang Shilong
  0 siblings, 0 replies; 5+ messages in thread
From: Wang Shilong @ 2014-02-26  1:19 UTC (permalink / raw)
  To: Mitch Harder; +Cc: linux-btrfs

On 02/26/2014 05:39 AM, Mitch Harder wrote:
> On Mon, Feb 24, 2014 at 7:38 PM, Wang Shilong
> <wangsl.fnst@cn.fujitsu.com> wrote:
>> Hi Mitch,
>>
>>
>> On 02/25/2014 07:03 AM, Mitch Harder wrote:
>>> On Mon, Feb 24, 2014 at 5:55 AM, Wang Shilong
>>> <wangsl.fnst@cn.fujitsu.com> wrote:
>>>> We found btrfsck will output backrefs mismatch while the filesystem
>>>> is defenitely ok.
>>>>
>>>> The problem is that check_block() don't return right value,which
>>>> makes btrfsck won't walk all tree blocks thus we don't get a consistent
>>>> filesystem, we will fail to check extent refs etc.
>>>>
>>>> Reported-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
>>>> Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
>>>> ---
>>>>    cmds-check.c | 2 +-
>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/cmds-check.c b/cmds-check.c
>>>> index a2afae6..253569f 100644
>>>> --- a/cmds-check.c
>>>> +++ b/cmds-check.c
>>>> @@ -2477,7 +2477,7 @@ static int check_block(struct btrfs_trans_handle
>>>> *trans,
>>>>           struct cache_extent *cache;
>>>>           struct btrfs_key key;
>>>>           enum btrfs_tree_block_status status;
>>>> -       int ret = 1;
>>>> +       int ret = 0;
>>>>           int level;
>>>>
>>>>           cache = lookup_cache_extent(extent_cache, buf->start, buf->len);
>>>> --
>>> I tried this fix on a broken btrfs volume I've been trying to repair,
>>> and it seemed to put me in an infinite loop.
>>>
>>> I agree that something seems wrong with the way the caller of
>>> check_block uses the return value, and I also noticed that it seemed
>>> to exit before walking all the tree blocks.
>>>
>>> But I think the problem is more subtle than flipping the default ret
>>> value from 1 to 0.
>> No, not really even though i know there are other problems with fsck repair
>> mode.
>> But this problem should be fixed and pushed into btrfs-progsv3.13.(Notice,
>> the below problem did not exist in btrfs-progsv3.12)
>>
>> An easy way to trigger this problem:
>>
>> # mkfs.btrfs -f /dev/sda9
>> # mount /dev/sda9 /mnt
>> # dd if=/dev/zero of=/mnt/data bs=4k count=10240 oflag=direct
>> # btrfs sub snapshot /mnt /mnt/snap1
>> # btrfs sub snapshot /mnt /mnt/snap2
>> # umount /mnt
>> # btrfs check /dev/sda9
>>
>> After applying this patch, the above problems did not exist.
>> Feel free to correct me if i miss something here.^_^
>>
> I took a closer look at the check_block function today, and it looks
> to me like the problem is that the return value is not modified when
> BTRFS_BLOCK_FLAG_FULL_BACKREF is set.
Hm, i'd say no.

Let's see what is check_block() doing.

It firstly check if there exists next block to deal, if not, return 1 
directly.
and then we do some checks with that block, and we only explictly set @ret
with error when we found an error.

So why we got such a regression when josef changed codes, it was because 
firstly
we set @ret with a none-zero value. So we had to take care of error and 
success
case both for the following codes!

I was considering your suggestion when i was writting patch, but IMO 
this patch
makes codes less error-prone.

I won't change the patch unless i am really missing something here.

Thanks,
Wang
>
> @@ -2521,14 +2521,17 @@ static int check_block(struct btrfs_trans_handle *trans,
>           }
>       } else {
>           rec->content_checked = 1;
> -        if (flags & BTRFS_BLOCK_FLAG_FULL_BACKREF)
> +        if (flags & BTRFS_BLOCK_FLAG_FULL_BACKREF) {
>               rec->owner_ref_checked = 1;
> +            ret = 0;
> +        }
>           else {
>               ret = check_owner_ref(root, rec, buf);
>               if (!ret)
>                   rec->owner_ref_checked = 1;
>           }
>
> For me, in this function I would lean towards an initial return value
> that must be updated by having check_block() make an affirmative
> PASS/FAIL decision on the block.
>
> What do you think about something like this?
>
> diff --git a/cmds-check.c b/cmds-check.c
> index ffc5d3e..55070da 100644
> --- a/cmds-check.c
> +++ b/cmds-check.c
> @@ -2477,7 +2477,7 @@ static int check_block(struct btrfs_trans_handle *trans,
>       struct cache_extent *cache;
>       struct btrfs_key key;
>       enum btrfs_tree_block_status status;
> -    int ret = 1;
> +    int ret = -EINVAL;
>       int level;
>
>       cache = lookup_cache_extent(extent_cache, buf->start, buf->len);
> @@ -2521,14 +2521,17 @@ static int check_block(struct btrfs_trans_handle *trans,
>           }
>       } else {
>           rec->content_checked = 1;
> -        if (flags & BTRFS_BLOCK_FLAG_FULL_BACKREF)
> +        if (flags & BTRFS_BLOCK_FLAG_FULL_BACKREF) {
>               rec->owner_ref_checked = 1;
> +            ret = 0;
> +        }
>           else {
>               ret = check_owner_ref(root, rec, buf);
>               if (!ret)
>                   rec->owner_ref_checked = 1;
>           }
>       }
> +    BUG_ON(ret == -EINVAL);
>       if (!ret)
>           maybe_free_extent_rec(extent_cache, rec);
>       return ret;
> --
>


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-02-26  1:21 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-02-24 11:55 [PATCH] Btrfs-progs: fsck: fix wrong return value in check_block() Wang Shilong
2014-02-24 23:03 ` Mitch Harder
2014-02-25  1:38   ` Wang Shilong
2014-02-25 21:39     ` Mitch Harder
2014-02-26  1:19       ` Wang Shilong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).