public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Dmitrii Tcvetkov <demfloro@demfloro.ru>, linux-btrfs@vger.kernel.org
Subject: Re: 4.17-rc1 FS went read-only during balance
Date: Mon, 23 Apr 2018 16:23:04 +0800	[thread overview]
Message-ID: <d935c0e0-c2c8-a0ec-bb07-5c3879dd1be0@gmx.com> (raw)
In-Reply-To: <20180423105543.43f13e3a@job>


[-- Attachment #1.1.1: Type: text/plain, Size: 2697 bytes --]



On 2018年04月23日 16:04, Dmitrii Tcvetkov wrote:
>>>>> TL;DR It seems as regression in 4.17, but I managed to find a
>>>>> workaround to make filesystem rw mountable again.
>>>>>
>>>>> Kernel built from tag v4.17-rc1
>>>>> btrfs-progs 4.16
>>>>>
>>>>> Tonight two my machines (PC (ECC RAM) and laptop(non-ECC RAM)) were
>>>>> doing usual weekly balance with this command via cron:
>>>>> btrfs balance start -musage=50 -dusage=50 <mountpoint>
>>>>> Both machines run same kernel version. 
>>>>>
>>>>> On PC that caused root and "data" filesystems to go readonly. Root
>>>>> is on an SSD with data single and metadata DUP, "data" filesystem
>>>>> is on 2 HDDs with RAID1 for data and metadata.
>>>>>
>>>>> On laptop only /home went ro, it's on NVMe SSD with data single and
>>>>> metadata DUP. 
>>>>>
>>>>> Btrfs check of PC rootfs was without any errors in both modes, I did
>>>>> them once each before reboot on readonly filesystem with --force
>>>>> flag and then from live usb. Same output without any errors.
>>>>>
>>>>> After reboot kernel refused rw mount rootfs with the same error as
>>>>> during cron balance, ro mount was accepted, error during rw mount:
>>>>> BTRFS: error (device dm-17) in merge_reloc_roots:2465: errno=-117    
>>>   
>>>> 117 means EUCLEAN, which could be caused by the newly introduced
>>>> first_key and level check.  
>>>   
>>>> Please apply this hotfix to fix it.
>>>> btrfs: Only check first key for committed tree blocks
>>>> (Which is included in latest pull request)  
>>>   
>>>> Also, please consider enable CONFIG_BTRFS_DEBUG to provide extra
>>>> debug info.  
>>>   
>>>> Thanks,
>>>> Qu  
>>>
>>> I tried 4.17-rc2 (as the pull request was pulled) with
>>> CONFIG_BTRFS_DEBUG on LVM snapshot of laptop home partition (/dev/vdb)
>>> in a VM (VM kernel sees only snapshot so no UUID collisions). Dmesg
>>> attached.  
>>
>> Thanks for the info and your previous btrfs-image.
>>
>> The image itself shows nothing wrong, so it should be runtime problem.
>> Would you please apply these two debug patches?
>> https://patchwork.kernel.org/patch/10335133/
>> https://patchwork.kernel.org/patch/10335135/
>>
>> And the attached diff file?
>>
>> My guess is the parent node is not initialized correctly in this case.
>>
>> Thanks,
>> Qu
> 
> Dmesg from kernel with all three patches applied attached.
> 
Thanks for the debug info, it really helps a lot!

It turns out that I'm just a super idiot, a typo in replace_path()
caused this, and it could not be trigger unless we enter it from
relocation recovery.

Please try the attached patch to see if it solves the problem.

Thanks,
Qu

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.1.2: 0001-btrfs-Fix-wrong-first_key-parameter-in-replace_path.patch --]
[-- Type: text/x-patch; name="0001-btrfs-Fix-wrong-first_key-parameter-in-replace_path.patch", Size: 1546 bytes --]

From 4b70eb864192ec5cf54a7e67e2957ddf0e5c0f6f Mon Sep 17 00:00:00 2001
From: Qu Wenruo <wqu@suse.com>
Date: Mon, 23 Apr 2018 16:13:55 +0800
Subject: [PATCH] btrfs: Fix wrong first_key parameter in replace_path

Commit 581c1760415c ("btrfs: Validate child tree block's level and first
key") introduced new @first_key parameter for read_tree_block(), however
caller in replace_path() is parasing wrong key to read_tree_block().

It should use parameter @first_key other than @key.

Normally it won't expose problem as @key is normally initialzied to the
same value of @first_key we expect.
However in relocation recovery case, @key can be set to (0, 0, 0), and
since no valid key in relocation tree can be (0, 0, 0), it will cause
read_tree_block() to return -EUCLEAN and interrupt relocation recovery.

Fix it by setting @first_key correctly.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/relocation.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 00b7d3231821..b041b945a7ae 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1841,7 +1841,7 @@ int replace_path(struct btrfs_trans_handle *trans,
 		old_bytenr = btrfs_node_blockptr(parent, slot);
 		blocksize = fs_info->nodesize;
 		old_ptr_gen = btrfs_node_ptr_generation(parent, slot);
-		btrfs_node_key_to_cpu(parent, &key, slot);
+		btrfs_node_key_to_cpu(parent, &first_key, slot);
 
 		if (level <= max_level) {
 			eb = path->nodes[level];
-- 
2.17.0


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  parent reply	other threads:[~2018-04-23  8:23 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-21 14:55 4.17-rc1 FS went read-only during balance Dmitrii Tcvetkov
2018-04-22  8:12 ` Dmitrii Tcvetkov
2018-04-23  1:23 ` Qu Wenruo
     [not found]   ` <20180423080745.5a9dc6be@demfloro.ru>
2018-04-23  6:13     ` Qu Wenruo
     [not found]       ` <20180423105543.43f13e3a@job>
2018-04-23  8:23         ` Qu Wenruo [this message]
2018-04-23  8:40           ` Dmitrii Tcvetkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d935c0e0-c2c8-a0ec-bb07-5c3879dd1be0@gmx.com \
    --to=quwenruo.btrfs@gmx.com \
    --cc=demfloro@demfloro.ru \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox