linux-btrfs.vger.kernel.org archive mirror
* RAID6, errors at missing device replacement
From: Yauhen Kharuzhy @ 2016-04-15 19:49 UTC
  To: linux-btrfs

Hi.

I have discovered a case where replacing missing devices causes
metadata corruption. Does anybody know anything about this?

I am using a 4.4.5 kernel with the latest global spare patches.

If we have a RAID6 array (this may be reproducible on RAID5 too),
replace one missing drive with another, and then remove a second drive
and replace it as well, plenty of errors appear in the log:

[  748.641766] BTRFS error (device sdf): failed to rebuild valid logical 7366459392 for dev /dev/sde
[  748.678069] BTRFS error (device sdf): failed to rebuild valid logical 7381139456 for dev /dev/sde
[  748.693559] BTRFS error (device sdf): failed to rebuild valid logical 7290974208 for dev /dev/sde
[  752.039100] BTRFS error (device sdf): bad tree block start 13048831955636601734 6919258112
[  752.647869] BTRFS error (device sdf): bad tree block start 12819300352 6919290880
[  752.658520] BTRFS error (device sdf): bad tree block start 31618367488 6919290880
[  752.712633] BTRFS error (device sdf): bad tree block start 31618367488 6919290880

After the device replacement finishes, scrub reports uncorrectable
errors. btrfs check complains about errors too:
root@test:~/# btrfs check -p /dev/sdc
Checking filesystem on /dev/sdc
UUID: 833fef31-5536-411c-8f58-53b527569fa5
checksum verify failed on 9359163392 found E4E3BDB6 wanted 00000000
checksum verify failed on 9359163392 found E4E3BDB6 wanted 00000000
checksum verify failed on 9359163392 found 4D1F4197 wanted DE0E50EC
bytenr mismatch, want=9359163392, have=9359228928

Errors found in extent allocation tree or chunk allocation
checking free space cache [.]
checking fs roots [.]
checking csums
checking root refs
found 1049788420 bytes used err is 0
total csum bytes: 1024000
total tree bytes: 1179648
total fs tree bytes: 16384
total extent tree bytes: 16384
btree space waste bytes: 124962
file data blocks allocated: 1049755648
 referenced 1049755648

After the first replacement, metadata does not seem to be spread across
all devices:
Label: none  uuid: 3db39446-6810-47bf-8732-d5a8793500f3
        Total devices 4 FS bytes used 1002.00MiB
        devid    1 size 8.00GiB used 1.28GiB path /dev/sdc
        devid    2 size 8.00GiB used 1.28GiB path /dev/sdd
        devid    3 size 8.00GiB used 1.28GiB path /dev/sdf
        devid    4 size 8.00GiB used 1.25GiB path /dev/sdg

# btrfs device usage /mnt/
/dev/sdc, ID: 1
   Device size:             8.00GiB
   Data,RAID6:              1.00GiB
   Metadata,RAID6:        256.00MiB
   System,RAID6:           32.00MiB
   Unallocated:             6.72GiB

/dev/sdd, ID: 2
   Device size:             8.00GiB
   Data,RAID6:              1.00GiB
   Metadata,RAID6:        256.00MiB
   System,RAID6:           32.00MiB
   Unallocated:             6.72GiB

/dev/sdf, ID: 3
   Device size:             8.00GiB
   Data,RAID6:              1.00GiB
   Metadata,RAID6:        256.00MiB
   System,RAID6:           32.00MiB
   Unallocated:             6.72GiB

/dev/sdg, ID: 4
   Device size:             8.00GiB
   Data,RAID6:              1.00GiB
   Metadata,RAID6:        256.00MiB
   Unallocated:             6.75GiB
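
To make the uneven layout easier to spot, here is a small sketch that
parses 'btrfs device usage' text and lists, per device, the chunk types
that other devices have but it lacks. The function names and the
abbreviated sample are illustrative assumptions, not part of any btrfs
tool; the parsing relies only on the line layout shown above.

```python
import re

def parse_device_usage(text):
    """Map each device path to {field name: size string}."""
    devices = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"^(/dev/\S+), ID: (\d+)", line.strip())
        if header:
            current = header.group(1)
            devices[current] = {}
            continue
        field = re.match(r"^\s*([^:]+):\s+(\S+)\s*$", line)
        if field and current is not None:
            devices[current][field.group(1).strip()] = field.group(2)
    return devices

def missing_chunk_types(devices):
    """Return {device: [chunk types present elsewhere but absent here]}."""
    all_types = set()
    for fields in devices.values():
        # Chunk-type fields look like "Data,RAID6"; others have no comma.
        all_types.update(name for name in fields if "," in name)
    result = {}
    for dev, fields in devices.items():
        absent = sorted(all_types - set(fields))
        if absent:
            result[dev] = absent
    return result

# Abbreviated sample in the same layout as the output above.
SAMPLE = """\
/dev/sdc, ID: 1
   Device size:             8.00GiB
   Data,RAID6:              1.00GiB
   Metadata,RAID6:        256.00MiB
   System,RAID6:           32.00MiB
   Unallocated:             6.72GiB

/dev/sdg, ID: 4
   Device size:             8.00GiB
   Data,RAID6:              1.00GiB
   Metadata,RAID6:        256.00MiB
   Unallocated:             6.75GiB
"""

result = missing_chunk_types(parse_device_usage(SAMPLE))
print(result)
```

Run against the full listing above, this kind of check would flag only
/dev/sdg, which carries no System,RAID6 allocation.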


Steps to reproduce:
1) Create and mount a RAID6 filesystem.
2) Remove a drive belonging to the array, attempt a write, and let the
kernel close the device.
3) Replace the missing device with the 'btrfs replace start' command.
4) Remove the drive in another slot, attempt a write, and wait for the
kernel to close it.
5) Start replacing this second missing drive -> ERRORS.

If a full balance is run after step 3), no errors appear.
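
The steps above can be sketched as a dry-run plan that only builds the
command strings and runs nothing. Several details are assumptions that
the report does not state: the device names, the devids, the dd write
used for "try write", and the sysfs delete used to simulate pulling a
drive (the drives may have been removed physically). Use scratch devices
only.

```python
def build_repro_plan(members, pulls, spares, mnt="/mnt", full_balance=False):
    """Return the reproduction steps as shell command strings (not executed).

    members: initial RAID6 devices; members[0] is used for mounting.
    pulls:   two (device, devid) pairs to remove, in order (hypothetical).
    spares:  two replacement devices, in order (hypothetical).
    """
    def pull_and_write(dev):
        name = dev.rsplit("/", 1)[-1]
        return [
            # Assumption: simulate drive removal through the SCSI sysfs node.
            f"echo 1 > /sys/block/{name}/device/delete",
            # Assumption: any write will do; it lets the kernel notice the loss.
            f"dd if=/dev/zero of={mnt}/testfile bs=1M count=16 conv=fsync",
        ]

    (dev1, id1), (dev2, id2) = pulls
    plan = [
        "mkfs.btrfs -f -d raid6 -m raid6 " + " ".join(members),  # step 1
        f"mount {members[0]} {mnt}",
    ]
    plan += pull_and_write(dev1)                                  # step 2
    plan.append(f"btrfs replace start -B {id1} {spares[0]} {mnt}")  # step 3
    if full_balance:
        # Reported workaround: a full balance after step 3.
        plan.append(f"btrfs balance start {mnt}")
    plan += pull_and_write(dev2)                                  # step 4
    plan.append(f"btrfs replace start -B {id2} {spares[1]} {mnt}")  # step 5
    plan.append(f"btrfs scrub start -B {mnt}")
    return plan

plan = build_repro_plan(
    members=["/dev/sdc", "/dev/sdd", "/dev/sde", "/dev/sdf"],
    pulls=[("/dev/sde", 3), ("/dev/sdd", 2)],
    spares=["/dev/sdg", "/dev/sdh"],
)
for cmd in plan:
    print(cmd)
```

Passing full_balance=True adds the balance between the two replacements,
which corresponds to the variant where no errors appeared.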

