From: Nikolay Borisov <nborisov@suse.com>
To: "Agustín DallʼAlba" <agustin@dallalba.com.ar>,
linux-btrfs@vger.kernel.org
Subject: Re: raid10 corruption while removing failing disk
Date: Mon, 10 Aug 2020 11:21:29 +0300
Message-ID: <4c967884-2252-a21d-a994-80df64a7e6ef@suse.com>
In-Reply-To: <3dc4d28e81b3336311c979bda35ceb87b9645606.camel@dallalba.com.ar>
On 10.08.20 at 10:03, Agustín DallʼAlba wrote:
> Hello!
>
> The last quarterly scrub on our btrfs filesystem found a few bad
> sectors in one of its devices (/dev/sdd), and because there's nobody on
> site to replace the failing disk I decided to remove it from the array
> with `btrfs device remove` before the problem could get worse.
>
> The removal was going relatively well (although slowly and I had to
> reboot a few times due to the bad sectors) until it had about 200 GB
> left to move. Now the filesystem turns read only when I try to finish
> the removal and `btrfs check` complains about wrong metadata checksums.
> However as far as I can tell none of the copies of the corrupt data are
> in the failing disk.
>
> How could this happen? Is it possible to fix this filesystem?
>
> I have refrained from trying anything so far, like upgrading to a newer
> kernel or disconnecting the failing drive, before confirming with you
> that it's safe.
>
> Kind regards.
>
>
> # uname -a
> Linux susanita 4.15.0-111-generic #112-Ubuntu SMP Thu Jul 9 20:32:34
> UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
>
>
> # btrfs --version
> btrfs-progs v4.15.1
>
>
> # btrfs fi show
> Label: 'Susanita' uuid: 4d3acf20-d408-49ab-b0a6-182396a9f27c
> Total devices 5 FS bytes used 4.90TiB
> devid 1 size 3.64TiB used 3.42TiB path /dev/sda
> devid 2 size 3.64TiB used 3.42TiB path /dev/sde
> devid 3 size 1.82TiB used 1.59TiB path /dev/sdb
> devid 5 size 0.00B used 185.50GiB path /dev/sdd
> devid 6 size 1.82TiB used 1.22TiB path /dev/sdc
>
>
> # btrfs fi df /
> Data, RAID1: total=4.90TiB, used=4.90TiB
> System, RAID10: total=64.00MiB, used=880.00KiB
> Metadata, RAID10: total=9.00GiB, used=7.57GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
>
> # btrfs check --force --readonly /dev/sda
> WARNING: filesystem mounted, continuing because of --force
> Checking filesystem on /dev/sda
> UUID: 4d3acf20-d408-49ab-b0a6-182396a9f27c
> checksum verify failed on 10919566688256 found BAB1746E wanted A8A48266
> checksum verify failed on 10919566688256 found BAB1746E wanted A8A48266
> bytenr mismatch, want=10919566688256, have=17196831625821864417
> ERROR: failed to repair root items: Input/output error
>
> # btrfs-map-logical -l 10919566688256 /dev/sda
> mirror 1 logical 10919566688256 physical 394473357312 device /dev/sdc
> mirror 2 logical 10919566688256 physical 477218586624 device /dev/sda
>
>
> Relevant dmesg output:
> [ 4.963420] Btrfs loaded, crc32c=crc32c-generic
> [ 5.072878] BTRFS: device label Susanita devid 6 transid 4241535 /dev/sdc
> [ 5.073165] BTRFS: device label Susanita devid 3 transid 4241535 /dev/sdb
> [ 5.073713] BTRFS: device label Susanita devid 2 transid 4241535 /dev/sde
> [ 5.073916] BTRFS: device label Susanita devid 5 transid 4241535 /dev/sdd
> [ 5.074398] BTRFS: device label Susanita devid 1 transid 4241535 /dev/sda
> [ 5.152479] BTRFS info (device sda): disk space caching is enabled
> [ 5.152551] BTRFS info (device sda): has skinny extents
> [ 5.332538] BTRFS info (device sda): bdev /dev/sdd errs: wr 0, rd 24, flush 0, corrupt 0, gen 0
> [ 38.869423] BTRFS info (device sda): enabling auto defrag
> [ 38.869490] BTRFS info (device sda): use lzo compression, level 0
> [ 38.869547] BTRFS info (device sda): disk space caching is enabled
>
>
> After running btrfs device remove /dev/sdd /:
> [ 193.684703] BTRFS info (device sda): relocating block group 10593404846080 flags metadata|raid10
> [ 312.921934] BTRFS error (device sda): bad tree block start 10597444141056 10919566688256
> [ 313.034339] BTRFS error (device sda): bad tree block start 17196831625821864417 10919566688256
> [ 313.034595] BTRFS error (device sda): bad tree block start 10597444141056 10919566688256
> [ 313.034621] BTRFS: error (device sda) in btrfs_run_delayed_refs:3083: errno=-5 IO failure
> [ 313.034627] BTRFS info (device sda): forced readonly
> [ 313.036328] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> [ 313.036596] IP: merge_reloc_roots+0x19f/0x2c0 [btrfs]
This suggests you are hitting a known problem with reloc roots, which
has been fixed in the latest upstream and LTS (5.4) kernels:
044ca910276b btrfs: reloc: fix reloc root leak and NULL pointer
dereference (3 months ago) <Qu Wenruo>
707de9c0806d btrfs: relocation: fix reloc_root lifespan and access (7
months ago) <Qu Wenruo>
1fac4a54374f btrfs: relocation: fix use-after-free on dead relocation
roots (11 months ago) <Qu Wenruo>
So yes, try updating to the latest stable kernel and re-run the device
remove. Also update your btrfs-progs to the latest version (5.6) and run
check again (by default it's a read-only operation, so it shouldn't
cause any more damage).
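
A quick way to confirm that the machine is actually running a new enough
kernel before retrying could look like the sketch below. This is only an
illustration: version_ge and MIN_KERNEL are hypothetical names, and 5.4 is
used as a rough threshold because the exact stable point release carrying
the fixes is not stated here. It relies on GNU sort -V for the comparison.

```shell
#!/bin/sh
# version_ge A B: succeeds when version A >= version B.
# Hypothetical helper; relies on GNU sort -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: 5.4 is only a rough lower bound. The specific stable release
# that contains the reloc root fixes should be checked separately.
MIN_KERNEL="5.4"

# Strip the distro suffix, e.g. 4.15.0-111-generic -> 4.15.0
running="$(uname -r | cut -d- -f1)"

if version_ge "$running" "$MIN_KERNEL"; then
    echo "kernel $running: at or above $MIN_KERNEL"
else
    echo "kernel $running: older than $MIN_KERNEL, upgrade before retrying"
fi
```

Once the kernel and btrfs-progs are updated, the commands to repeat are the
ones already shown in the report: btrfs check --readonly /dev/sda, then
btrfs device remove /dev/sdd /.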
<snip>