public inbox for linux-btrfs@vger.kernel.org
From: Dmitrii Tcvetkov <demfloro@demfloro.ru>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>, linux-btrfs@vger.kernel.org
Subject: Re: 4.17-rc1 FS went read-only during balance
Date: Mon, 23 Apr 2018 11:40:16 +0300
Message-ID: <20180423114016.3cdc0ac1@job>
In-Reply-To: <d935c0e0-c2c8-a0ec-bb07-5c3879dd1be0@gmx.com>

> >>>>> TL;DR It seems to be a regression in 4.17, but I managed to find a
> >>>>> workaround to make the filesystem rw-mountable again.
> >>>>>
> >>>>> Kernel built from tag v4.17-rc1
> >>>>> btrfs-progs 4.16
> >>>>>
> >>>>> Tonight two of my machines (PC (ECC RAM) and laptop (non-ECC RAM))
> >>>>> were doing the usual weekly balance with this command via cron:
> >>>>> btrfs balance start -musage=50 -dusage=50 <mountpoint>
> >>>>> Both machines run same kernel version. 
> >>>>>
> >>>>> On PC that caused root and "data" filesystems to go readonly. Root
> >>>>> is on an SSD with data single and metadata DUP, "data" filesystem
> >>>>> is on 2 HDDs with RAID1 for data and metadata.
> >>>>>
> >>>>> On laptop only /home went ro, it's on NVMe SSD with data single and
> >>>>> metadata DUP. 
> >>>>>
> >>>>> Btrfs check of PC rootfs was without any errors in both modes, I did
> >>>>> them once each before reboot on readonly filesystem with --force
> >>>>> flag and then from live usb. Same output without any errors.
> >>>>>
> >>>>> After reboot kernel refused rw mount rootfs with the same error as
> >>>>> during cron balance, ro mount was accepted, error during rw mount:
> >>>>> BTRFS: error (device dm-17) in merge_reloc_roots:2465: errno=-117      
> >>>     
> >>>> 117 means EUCLEAN, which could be caused by the newly introduced
> >>>> first_key and level check.    
> >>>     
> >>>> Please apply this hotfix to fix it.
> >>>> btrfs: Only check first key for committed tree blocks
> >>>> (Which is included in latest pull request)    
> >>>     
> >>>> Also, please consider enable CONFIG_BTRFS_DEBUG to provide extra
> >>>> debug info.    
> >>>     
> >>>> Thanks,
> >>>> Qu    
> >>>
> >>> I tried 4.17-rc2 (as the pull request was pulled) with
> >>> CONFIG_BTRFS_DEBUG on LVM snapshot of laptop home partition (/dev/vdb)
> >>> in a VM (VM kernel sees only snapshot so no UUID collisions). Dmesg
> >>> attached.    
> >>
> >> Thanks for the info and your previous btrfs-image.
> >>
> >> The image itself shows nothing wrong, so it should be runtime problem.
> >> Would you please apply these two debug patches?
> >> https://patchwork.kernel.org/patch/10335133/
> >> https://patchwork.kernel.org/patch/10335135/
> >>
> >> And the attached diff file?
> >>
> >> My guess is the parent node is not initialized correctly in this case.
> >>
> >> Thanks,
> >> Qu  
> > 
> > Dmesg from kernel with all three patches applied attached.
> >   
> Thanks for the debug info, it really helps a lot!
> 
> It turns out that I'm just a super idiot: a typo in replace_path()
> caused this, and it could not be triggered unless we enter it from
> relocation recovery.
> 
> Please try the attached patch to see if it solves the problem.
> 
> Thanks,
> Qu
Glad to help. The patch solved the problem: the rw mount succeeds and
the balance finished with no errors or debug output. btrfs check is
clean in both modes.
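
As an aside on the error quoted above: the errno=-117 in the
merge_reloc_roots message is a negated Linux errno value, which Qu
identified as EUCLEAN. A minimal sketch for decoding such values,
assuming a Linux host (the helper name decode_kernel_errno is
hypothetical and not part of any btrfs tooling):

```python
import errno
import os


def decode_kernel_errno(value):
    """Map a kernel-style (negative) errno to its symbolic name and message."""
    code = abs(value)
    # errno.errorcode is platform-specific; 117 maps to EUCLEAN on Linux.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    name, message = decode_kernel_errno(-117)
    print(f"errno=-117 -> {name}: {message}")
```

On other platforms the number 117 may map to a different name or to
none at all, so the result is only meaningful on Linux.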

[    2.842718] BTRFS: device label home devid 1 transid 277952 /dev/vdb
[    2.924965] BTRFS: device label root devid 1 transid 84092 /dev/vda2
[    3.072271] BTRFS info (device vda2): use lzo compression, level 0
[    3.072897] BTRFS info (device vda2): enabling auto defrag
[    3.073476] BTRFS info (device vda2): using free space tree
[    3.074049] BTRFS info (device vda2): has skinny extents
[    5.411821] BTRFS info (device vda2): using free space tree
[   24.925293] BTRFS info (device vdb): using free space tree
[   24.925324] BTRFS info (device vdb): has skinny extents
[   31.711868] BTRFS info (device vdb): continuing balance
[   31.721658] BTRFS info (device vdb): checking UUID tree
[   31.822920] BTRFS info (device vdb): relocating block group 69889687552 flags data
[   33.730399] BTRFS info (device vdb): found 12 extents
[   36.950699] BTRFS info (device vdb): found 12 extents
[   37.030813] BTRFS info (device vdb): relocating block group 67742203904 flags metadata|dup
[   37.104174] BTRFS info (device vdb): relocating block group 67708649472 flags system|dup 
[   37.189843] BTRFS info (device vdb): found 1 extents
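
The relocation lines in the log above can be pulled out mechanically,
for example to see which block groups the resumed balance touched. A
rough sketch, assuming the message format seen in this dmesg (the regex
and the name parse_relocations are illustrative assumptions, not a
kernel-defined interface); the sample lines are copied from the log:

```python
import re

# Matches e.g. "relocating block group 67708649472 flags system|dup".
# \s* tolerates a missing space between the number and "flags".
RELOC_RE = re.compile(r"relocating block group (\d+)\s*flags (\S+)")


def parse_relocations(lines):
    """Return (block_group_number, [flags]) for each relocation message."""
    found = []
    for line in lines:
        match = RELOC_RE.search(line)
        if match:
            # Flags are printed joined with "|", e.g. "metadata|dup".
            found.append((int(match.group(1)), match.group(2).split("|")))
    return found


SAMPLE = [
    "[   31.822920] BTRFS info (device vdb): relocating block group 69889687552 flags data",
    "[   33.730399] BTRFS info (device vdb): found 12 extents",
    "[   37.030813] BTRFS info (device vdb): relocating block group 67742203904 flags metadata|dup",
    "[   37.104174] BTRFS info (device vdb): relocating block group 67708649472 flags system|dup",
]

if __name__ == "__main__":
    for number, flags in parse_relocations(SAMPLE):
        print(number, flags)
```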

