Linux Btrfs filesystem development
From: Nikolay Borisov <nborisov@suse.com>
To: "Agustín DallʼAlba" <agustin@dallalba.com.ar>,
	linux-btrfs@vger.kernel.org
Subject: Re: raid10 corruption while removing failing disk
Date: Mon, 10 Aug 2020 11:21:29 +0300
Message-ID: <4c967884-2252-a21d-a994-80df64a7e6ef@suse.com>
In-Reply-To: <3dc4d28e81b3336311c979bda35ceb87b9645606.camel@dallalba.com.ar>



On 10.08.20 at 10:03, Agustín DallʼAlba wrote:
> Hello!
> 
> The last quarterly scrub on our btrfs filesystem found a few bad
> sectors in one of its devices (/dev/sdd), and because there's nobody on
> site to replace the failing disk I decided to remove it from the array
> with `btrfs device remove` before the problem could get worse.
> 
> The removal was going relatively well (although slowly and I had to
> reboot a few times due to the bad sectors) until it had about 200 GB
> left to move. Now the filesystem turns read only when I try to finish
> the removal and `btrfs check` complains about wrong metadata checksums.
> However as far as I can tell none of the copies of the corrupt data are
> in the failing disk.
> 
> How could this happen? Is it possible to fix this filesystem?
> 
> I have refrained from trying anything so far, like upgrading to a newer
> kernel or disconnecting the failing drive, before confirming with you
> that it's safe.
> 
> Kind regards.
> 
> 
> # uname -a
> Linux susanita 4.15.0-111-generic #112-Ubuntu SMP Thu Jul 9 20:32:34
> UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
> 
> 
> # btrfs --version
> btrfs-progs v4.15.1
> 
> 
> # btrfs fi show
> Label: 'Susanita'  uuid: 4d3acf20-d408-49ab-b0a6-182396a9f27c
> 	Total devices 5 FS bytes used 4.90TiB
> 	devid    1 size 3.64TiB used 3.42TiB path /dev/sda
> 	devid    2 size 3.64TiB used 3.42TiB path /dev/sde
> 	devid    3 size 1.82TiB used 1.59TiB path /dev/sdb
> 	devid    5 size 0.00B used 185.50GiB path /dev/sdd
> 	devid    6 size 1.82TiB used 1.22TiB path /dev/sdc
> 
> 
> # btrfs fi df /
> Data, RAID1: total=4.90TiB, used=4.90TiB
> System, RAID10: total=64.00MiB, used=880.00KiB
> Metadata, RAID10: total=9.00GiB, used=7.57GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
> 
> 
> # btrfs check --force --readonly /dev/sda
> WARNING: filesystem mounted, continuing because of --force
> Checking filesystem on /dev/sda
> UUID: 4d3acf20-d408-49ab-b0a6-182396a9f27c
> checksum verify failed on 10919566688256 found BAB1746E wanted A8A48266
> checksum verify failed on 10919566688256 found BAB1746E wanted A8A48266
> bytenr mismatch, want=10919566688256, have=17196831625821864417
> ERROR: failed to repair root items: Input/output error
> 
> # btrfs-map-logical -l 10919566688256 /dev/sda
> mirror 1 logical 10919566688256 physical 394473357312 device /dev/sdc
> mirror 2 logical 10919566688256 physical 477218586624 device /dev/sda
> 
> 
> Relevant dmesg output:
> [    4.963420] Btrfs loaded, crc32c=crc32c-generic
> [    5.072878] BTRFS: device label Susanita devid 6 transid 4241535 /dev/sdc
> [    5.073165] BTRFS: device label Susanita devid 3 transid 4241535 /dev/sdb
> [    5.073713] BTRFS: device label Susanita devid 2 transid 4241535 /dev/sde
> [    5.073916] BTRFS: device label Susanita devid 5 transid 4241535 /dev/sdd
> [    5.074398] BTRFS: device label Susanita devid 1 transid 4241535 /dev/sda
> [    5.152479] BTRFS info (device sda): disk space caching is enabled
> [    5.152551] BTRFS info (device sda): has skinny extents
> [    5.332538] BTRFS info (device sda): bdev /dev/sdd errs: wr 0, rd 24, flush 0, corrupt 0, gen 0
> [   38.869423] BTRFS info (device sda): enabling auto defrag
> [   38.869490] BTRFS info (device sda): use lzo compression, level 0
> [   38.869547] BTRFS info (device sda): disk space caching is enabled
> 
> 
> After running btrfs device remove /dev/sdd /:
> [  193.684703] BTRFS info (device sda): relocating block group 10593404846080 flags metadata|raid10
> [  312.921934] BTRFS error (device sda): bad tree block start 10597444141056 10919566688256
> [  313.034339] BTRFS error (device sda): bad tree block start 17196831625821864417 10919566688256
> [  313.034595] BTRFS error (device sda): bad tree block start 10597444141056 10919566688256
> [  313.034621] BTRFS: error (device sda) in btrfs_run_delayed_refs:3083: errno=-5 IO failure
> [  313.034627] BTRFS info (device sda): forced readonly
> [  313.036328] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> [  313.036596] IP: merge_reloc_roots+0x19f/0x2c0 [btrfs]

This suggests you are hitting a known problem with reloc roots, which
has been fixed in the latest upstream and LTS (5.4) kernels:

044ca910276b btrfs: reloc: fix reloc root leak and NULL pointer dereference (3 months ago) <Qu Wenruo>
707de9c0806d btrfs: relocation: fix reloc_root lifespan and access (7 months ago) <Qu Wenruo>
1fac4a54374f btrfs: relocation: fix use-after-free on dead relocation roots (11 months ago) <Qu Wenruo>


So yes, try updating to the latest stable kernel and re-running the
device remove. Also update your btrfs-progs to the latest 5.6 version
and run check again (by default it is a read-only operation, so it
shouldn't cause any more damage).
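
As a rough illustration of the "is this kernel new enough" check, here is a
minimal shell sketch. The function names are hypothetical, and the 5.4
threshold is only a coarse test taken from the LTS series mentioned above: it
does not verify that a given point release actually contains the listed
commits. It assumes GNU `sort` with `-V` is available.

```shell
#!/bin/sh
# Sketch only: helper names are made up for this example.

# version_ge A B: succeeds if version A is greater than or equal to version B.
# Relies on GNU sort -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# kernel_base RELEASE: drop the distro suffix from a `uname -r` string,
# e.g. 4.15.0-111-generic becomes 4.15.0.
kernel_base() {
    printf '%s\n' "${1%%-*}"
}

# needs_kernel_update RELEASE: succeeds if RELEASE is older than the 5.4 series.
needs_kernel_update() {
    ! version_ge "$(kernel_base "$1")" "5.4"
}
```

Running `needs_kernel_update "$(uname -r)"` on the 4.15.0-111-generic kernel
from the report would succeed, i.e. the kernel should be updated before
retrying `btrfs device remove /dev/sdd /`.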


<snip>

