public inbox for linux-btrfs@vger.kernel.org
* btrfs: raid1C3 and raid1C4 fails to go ro when all but 1 drive removed.
@ 2022-03-25  4:44 peter brown
  2022-03-29  9:54 ` Nikolay Borisov
  0 siblings, 1 reply; 2+ messages in thread
From: peter brown @ 2022-03-25  4:44 UTC (permalink / raw)
  To: linux-btrfs

Hi,

  If I set up raid1C3 or raid1C4 and pull drives to simulate drive failures,
the fs does not go read-only.

  If I perform the same test on a raid1 setup, the fs goes read-only when the
second-to-last drive is removed, i.e. when the fs can no longer maintain a
two-copy mirror.
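
The expectation above can be written down as a small sketch. It assumes the
documented copy counts for the RAID1-family profiles (raid1 = 2, raid1c3 = 3,
raid1c4 = 4). The helper name full_copies_possible is hypothetical, and the
function models only the expectation stated in this report, not what the
kernel actually checks.

```python
# Copies kept per chunk for each RAID1-family profile.
COPIES = {"raid1": 2, "raid1c3": 3, "raid1c4": 4}


def full_copies_possible(profile: str, devices_present: int) -> bool:
    """Return True if enough devices remain to hold every copy separately.

    Hypothetical helper: it encodes the expectation in this report
    (go ro once the copy count can no longer be met), nothing more.
    """
    return devices_present >= COPIES[profile.lower()]


if __name__ == "__main__":
    # Walk a 4-device fs down to 1 device for each profile.
    for profile in COPIES:
        for present in range(4, 0, -1):
            ok = full_copies_possible(profile, present)
            state = "ok" if ok else "expect ro on write"
            print(f"{profile:8s} {present} device(s) present: {state}")
```

Under this model raid1 is expected to go ro with 1 device left, raid1c3 with
2 left, and raid1c4 with 3 left.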



kernel 5.16.17-gentoo
btrfs-progs v5.16.2

short version of the logs...

test1 ~ # mkfs.btrfs -f -L test -d raid1C3 -m raid1C3 /dev/sdb /dev/sdc /dev/sde /dev/sdf
test1 ~ # mount -t btrfs -o noatime /dev/sdb /mnt/btrfs/



RAID1C3
pull drive 1 (3 left)
pull drive 2 (2 left)

Pulling drive 2 should force the fs ro, i.e. we can no longer support 3 copies.
[ 1098.411396] BTRFS error (device sdb): bdev /dev/sdb errs: wr 0, rd 0, flush 1, corrupt 0, gen 0
[ 1098.430918] BTRFS warning (device sdb): lost page write due to IO error on /dev/sdb (-5)
[ 1098.430923] BTRFS error (device sdb): bdev /dev/sdb errs: wr 1, rd 0, flush 1, corrupt 0, gen 0
[ 1098.430936] BTRFS warning (device sdb): lost page write due to IO error on /dev/sdb (-5)
[ 1098.430939] BTRFS error (device sdb): bdev /dev/sdb errs: wr 2, rd 0, flush 1, corrupt 0, gen 0
[ 1098.430949] BTRFS warning (device sdb): lost page write due to IO error on /dev/sdb (-5)
[ 1098.430952] BTRFS error (device sdb): bdev /dev/sdb errs: wr 3, rd 0, flush 1, corrupt 0, gen 0
[ 1098.431101] BTRFS error (device sdb): error writing primary super block to device 1
[ 1111.722182] ata6: SATA link down (SStatus 0 SControl 300)
[ 1117.150299] ata6: SATA link down (SStatus 0 SControl 300)
[ 1122.782299] ata6: SATA link down (SStatus 0 SControl 300)
[ 1122.782308] ata6.00: disabled
[ 1122.782324] ata6.00: detaching (SCSI 5:0:0:0)
[ 1122.792186] sd 5:0:0:0: [sdf] Synchronizing SCSI cache
[ 1122.792218] sd 5:0:0:0: [sdf] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 1122.792222] sd 5:0:0:0: [sdf] Stopping disk
[ 1122.792231] sd 5:0:0:0: [sdf] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK

  btrfs device usage /mnt/btrfs/
/dev/sdb, ID: 1
    Device size:               0.00B   <---------
    Device slack:              0.00B
    Data,RAID1C3:            3.00GiB
    Unallocated:           462.76GiB

/dev/sdc, ID: 2
    Device size:             1.82TiB
    Device slack:              0.00B
    Data,RAID1C3:            4.00GiB
    Metadata,RAID1C3:        1.00GiB
    System,RAID1C3:          8.00MiB
    Unallocated:             1.81TiB

/dev/sde, ID: 3
    Device size:           465.76GiB
    Device slack:              0.00B
    Data,RAID1C3:            3.00GiB
    Metadata,RAID1C3:        1.00GiB
    System,RAID1C3:          8.00MiB
    Unallocated:           461.75GiB

/dev/sdf, ID: 4
    Device size:               0.00B <--------
    Device slack:              0.00B
    Data,RAID1C3:            2.00GiB
    Metadata,RAID1C3:        1.00GiB
    System,RAID1C3:          8.00MiB
    Unallocated:           462.75GiB
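
The pulled drives are the ones reporting a device size of 0.00B. A minimal
sketch for picking them out of saved `btrfs device usage` output follows. It
assumes the text layout shown above; the function name zero_size_devices is
made up for illustration and is not part of btrfs-progs.

```python
import re
from typing import List, Optional

# Matches a device header line such as "/dev/sdb, ID: 1".
HEADER = re.compile(r"^(/dev/\S+), ID: \d+")
# Matches the "Device size:" row; trailing annotations are ignored.
SIZE = re.compile(r"^Device size:\s+(\S+)")


def zero_size_devices(usage_text: str) -> List[str]:
    """Return device paths whose 'Device size' is reported as 0.00B."""
    found: List[str] = []
    current: Optional[str] = None
    for raw in usage_text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            continue
        size = SIZE.match(line)
        if size and current is not None and size.group(1) == "0.00B":
            found.append(current)
    return found


if __name__ == "__main__":
    sample = """/dev/sdb, ID: 1
    Device size:               0.00B   <---------
    Device slack:              0.00B

/dev/sdc, ID: 2
    Device size:             1.82TiB
    Device slack:              0.00B
"""
    print(zero_size_devices(sample))
```

Only the "Device size" row is inspected, so the 0.00B "Device slack" rows do
not produce false positives.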



I can read the fs.

Writing should force the fs ro.

touch k
[ 1381.545803] BTRFS error (device sdb): bdev /dev/sdf errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
[ 1381.546043] BTRFS error (device sdb): bdev /dev/sdf errs: wr 2, rd 0, flush 0, corrupt 0, gen 0
[ 1381.546186] BTRFS error (device sdb): bdev /dev/sdf errs: wr 3, rd 0, flush 0, corrupt 0, gen 0
[ 1381.547239] BTRFS error (device sdb): bdev /dev/sdb errs: wr 3, rd 0, flush 2, corrupt 0, gen 0
[ 1381.572225] BTRFS error (device sdb): bdev /dev/sdf errs: wr 3, rd 0, flush 1, corrupt 0, gen 0
[ 1381.572244] BTRFS warning (device sdb): lost page write due to IO error on /dev/sdb (-5)
[ 1381.572247] BTRFS error (device sdb): bdev /dev/sdb errs: wr 4, rd 0, flush 2, corrupt 0, gen 0
[ 1381.572257] BTRFS warning (device sdb): lost page write due to IO error on /dev/sdb (-5)
[ 1381.572260] BTRFS error (device sdb): bdev /dev/sdb errs: wr 5, rd 0, flush 2, corrupt 0, gen 0
[ 1381.572270] BTRFS warning (device sdb): lost page write due to IO error on /dev/sdb (-5)
[ 1381.572272] BTRFS error (device sdb): bdev /dev/sdb errs: wr 6, rd 0, flush 2, corrupt 0, gen 0
[ 1381.572376] BTRFS warning (device sdb): lost page write due to IO error on /dev/sdf (-5)
[ 1381.572380] BTRFS error (device sdb): bdev /dev/sdf errs: wr 4, rd 0, flush 1, corrupt 0, gen 0
[ 1381.572392] BTRFS warning (device sdb): lost page write due to IO error on /dev/sdf (-5)
[ 1381.572410] BTRFS error (device sdb): bdev /dev/sdf errs: wr 5, rd 0, flush 1, corrupt 0, gen 0
[ 1381.572423] BTRFS warning (device sdb): lost page write due to IO error on /dev/sdf (-5)
[ 1381.572427] BTRFS error (device sdb): error writing primary super block to device 1
[ 1381.613884] BTRFS error (device sdb): error writing primary super block to device 4

The write was successful.
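
The per-device error counters in these logs are easier to compare once
tallied. The sketch below keeps the last "errs:" record seen for each device
in a pasted dmesg excerpt. It assumes the message format shown above, and
latest_counters is a hypothetical helper name. It collapses whitespace first,
so records that a mail client wrapped across two lines still match.

```python
import re
from typing import Dict

# One btrfs device-stat record, e.g.
# "bdev /dev/sdb errs: wr 3, rd 0, flush 1, corrupt 0, gen 0"
ERRS = re.compile(
    r"bdev (\S+) errs: wr (\d+), rd (\d+), flush (\d+), "
    r"corrupt (\d+), gen (\d+)"
)
FIELDS = ("wr", "rd", "flush", "corrupt", "gen")


def latest_counters(log_text: str) -> Dict[str, Dict[str, int]]:
    """Return the last reported counters for each device in the log."""
    # Collapse all runs of whitespace (including newlines) to single
    # spaces so wrapped log lines are rejoined before matching.
    flat = " ".join(log_text.split())
    latest: Dict[str, Dict[str, int]] = {}
    for match in ERRS.finditer(flat):
        device = match.group(1)
        values = [int(v) for v in match.groups()[1:]]
        latest[device] = dict(zip(FIELDS, values))
    return latest


if __name__ == "__main__":
    excerpt = """[ 1381.546186] BTRFS error (device sdb): bdev /dev/sdf errs: wr 3, rd 0, flush
0, corrupt 0, gen 0
[ 1381.547239] BTRFS error (device sdb): bdev /dev/sdb errs: wr 3, rd 0, flush
2, corrupt 0, gen 0
"""
    for device, counters in latest_counters(excerpt).items():
        print(device, counters)
```

Because later records overwrite earlier ones, the result reflects the state
at the end of the excerpt rather than a sum of all messages.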



Pulling drive 3: we can no longer support 2 copies, just 1.

[Thu Mar 24 23:27:28 2022] BTRFS error (device sdb): bdev /dev/sde errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
[Thu Mar 24 23:27:28 2022] BTRFS error (device sdb): bdev /dev/sde errs: wr 2, rd 0, flush 0, corrupt 0, gen 0
[Thu Mar 24 23:27:28 2022] BTRFS error (device sdb): bdev /dev/sde errs: wr 3, rd 0, flush 0, corrupt 0, gen 0
[Thu Mar 24 23:27:28 2022] BTRFS error (device sdb): bdev /dev/sde errs: wr 4, rd 0, flush 0, corrupt 0, gen 0
[Thu Mar 24 23:27:28 2022] BTRFS error (device sdb): bdev /dev/sdb errs: wr 3, rd 0, flush 2, corrupt 0, gen 0
[Thu Mar 24 23:27:28 2022] BTRFS error (device sdb): bdev /dev/sde errs: wr 4, rd 0, flush 1, corrupt 0, gen 0
[Thu Mar 24 23:27:28 2022] BTRFS warning (device sdb): lost page write due to IO error on /dev/sdb (-5)
[Thu Mar 24 23:27:28 2022] BTRFS error (device sdb): bdev /dev/sdb errs: wr 4, rd 0, flush 2, corrupt 0, gen 0
[Thu Mar 24 23:27:28 2022] BTRFS warning (device sdb): lost page write due to IO error on /dev/sdb (-5)
[Thu Mar 24 23:27:28 2022] BTRFS error (device sdb): bdev /dev/sdb errs: wr 5, rd 0, flush 2, corrupt 0, gen 0
[Thu Mar 24 23:27:28 2022] BTRFS warning (device sdb): lost page write due to IO error on /dev/sdb (-5)
[Thu Mar 24 23:27:28 2022] BTRFS error (device sdb): bdev /dev/sdb errs: wr 6, rd 0, flush 2, corrupt 0, gen 0
[Thu Mar 24 23:27:28 2022] BTRFS warning (device sdb): lost page write due to IO error on /dev/sde (-5)
[Thu Mar 24 23:27:28 2022] BTRFS error (device sdb): bdev /dev/sde errs: wr 5, rd 0, flush 1, corrupt 0, gen 0
[Thu Mar 24 23:27:28 2022] BTRFS warning (device sdb): lost page write due to IO error on /dev/sde (-5)
[Thu Mar 24 23:27:28 2022] BTRFS warning (device sdb): lost page write due to IO error on /dev/sde (-5)
[Thu Mar 24 23:27:28 2022] BTRFS error (device sdb): error writing primary super block to device 1
[Thu Mar 24 23:27:28 2022] BTRFS error (device sdb): error writing primary super block to device 3
[Thu Mar 24 23:28:28 2022] ata6: SATA link down (SStatus 0 SControl 300)
[Thu Mar 24 23:28:33 2022] ata6: SATA link down (SStatus 0 SControl 300)
[Thu Mar 24 23:28:39 2022] ata6: SATA link down (SStatus 0 SControl 300)
[Thu Mar 24 23:28:39 2022] ata6.00: disabled
[Thu Mar 24 23:28:39 2022] ata6.00: detaching (SCSI 5:0:0:0)
[Thu Mar 24 23:28:39 2022] sd 5:0:0:0: [sdf] Synchronizing SCSI cache
[Thu Mar 24 23:28:39 2022] sd 5:0:0:0: [sdf] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[Thu Mar 24 23:28:39 2022] sd 5:0:0:0: [sdf] Stopping disk
[Thu Mar 24 23:28:39 2022] sd 5:0:0:0: [sdf] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK

btrfs device usage /mnt/btrfs/
/dev/sdb, ID: 1
    Device size:               0.00B  <-----------
    Device slack:              0.00B
    Data,RAID1C3:            3.00GiB
    Unallocated:           462.76GiB

/dev/sdc, ID: 2
    Device size:             1.82TiB
    Device slack:              0.00B
    Data,RAID1C3:            4.00GiB
    Metadata,RAID1C3:        1.00GiB
    System,RAID1C3:          8.00MiB
    Unallocated:             1.81TiB

/dev/sde, ID: 3
    Device size:               0.00B  <-----------
    Device slack:              0.00B
    Data,RAID1C3:            3.00GiB
    Metadata,RAID1C3:        1.00GiB
    System,RAID1C3:          8.00MiB
    Unallocated:           461.75GiB

/dev/sdf, ID: 4
    Device size:               0.00B  <-----------
    Device slack:              0.00B
    Data,RAID1C3:            2.00GiB
    Metadata,RAID1C3:        1.00GiB
    System,RAID1C3:          8.00MiB
    Unallocated:           462.75GiB



read from fs
find .
no errors and nothing logged

The fs should go ro when we write to it in this state.

write to fs
touch 3
no errors from touch; dmesg logs:
[Thu Mar 24 23:31:34 2022] btrfs_dev_stat_print_on_error: 2 callbacks suppressed
[Thu Mar 24 23:31:34 2022] BTRFS error (device sdb): bdev /dev/sde errs: wr 8, rd 0, flush 1, corrupt 0, gen 0
[Thu Mar 24 23:31:34 2022] BTRFS error (device sdb): bdev /dev/sdf errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
[Thu Mar 24 23:31:34 2022] BTRFS error (device sdb): bdev /dev/sde errs: wr 9, rd 0, flush 1, corrupt 0, gen 0
[Thu Mar 24 23:31:34 2022] BTRFS error (device sdb): bdev /dev/sdf errs: wr 2, rd 0, flush 0, corrupt 0, gen 0
[Thu Mar 24 23:31:34 2022] BTRFS error (device sdb): bdev /dev/sde errs: wr 10, rd 0, flush 1, corrupt 0, gen 0
[Thu Mar 24 23:31:34 2022] BTRFS error (device sdb): bdev /dev/sdf errs: wr 3, rd 0, flush 0, corrupt 0, gen 0
[Thu Mar 24 23:31:34 2022] BTRFS error (device sdb): bdev /dev/sdb errs: wr 6, rd 0, flush 3, corrupt 0, gen 0
[Thu Mar 24 23:31:34 2022] BTRFS error (device sdb): bdev /dev/sde errs: wr 10, rd 0, flush 2, corrupt 0, gen 0
[Thu Mar 24 23:31:34 2022] BTRFS error (device sdb): bdev /dev/sdf errs: wr 3, rd 0, flush 1, corrupt 0, gen 0
[Thu Mar 24 23:31:34 2022] BTRFS warning (device sdb): lost page write due to IO error on /dev/sdb (-5)
[Thu Mar 24 23:31:34 2022] BTRFS error (device sdb): bdev /dev/sdb errs: wr 7, rd 0, flush 3, corrupt 0, gen 0
[Thu Mar 24 23:31:34 2022] BTRFS warning (device sdb): lost page write due to IO error on /dev/sdb (-5)
[Thu Mar 24 23:31:34 2022] BTRFS warning (device sdb): lost page write due to IO error on /dev/sdb (-5)
[Thu Mar 24 23:31:34 2022] BTRFS warning (device sdb): lost page write due to IO error on /dev/sde (-5)
[Thu Mar 24 23:31:34 2022] BTRFS warning (device sdb): lost page write due to IO error on /dev/sde (-5)
[Thu Mar 24 23:31:34 2022] BTRFS warning (device sdb): lost page write due to IO error on /dev/sde (-5)
[Thu Mar 24 23:31:34 2022] BTRFS warning (device sdb): lost page write due to IO error on /dev/sdf (-5)
[Thu Mar 24 23:31:34 2022] BTRFS warning (device sdb): lost page write due to IO error on /dev/sdf (-5)
[Thu Mar 24 23:31:34 2022] BTRFS warning (device sdb): lost page write due to IO error on /dev/sdf (-5)
[Thu Mar 24 23:31:34 2022] BTRFS error (device sdb): error writing primary super block to device 1
[Thu Mar 24 23:31:34 2022] BTRFS error (device sdb): error writing primary super block to device 3
[Thu Mar 24 23:31:34 2022] BTRFS error (device sdb): error writing primary super block to device 4

There is no log entry about going ro, and the write should have failed.

raid1C4 behaves the same way. As 4 copies are required, it should go ro when
the first drive is pulled.


If I configure the fs as raid1:

mkfs.btrfs -f -L test -d raid1 -m raid1  /dev/sdb /dev/sdc /dev/sde /dev/sdf
mount -t btrfs -o noatime /dev/sdb /mnt/btrfs/

When I get to pulling the 3rd drive, the fs goes ro when I try to write to it,
as it should.

[Fri Mar 25 00:05:55 2022] BTRFS error (device sdb): bdev /dev/sde errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
[Fri Mar 25 00:05:55 2022] BTRFS error (device sdb): bdev /dev/sdf errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
[Fri Mar 25 00:05:55 2022] BTRFS error (device sdb): bdev /dev/sde errs: wr 2, rd 0, flush 0, corrupt 0, gen 0
[Fri Mar 25 00:05:55 2022] BTRFS error (device sdb): bdev /dev/sdf errs: wr 2, rd 0, flush 0, corrupt 0, gen 0
[Fri Mar 25 00:05:55 2022] BTRFS error (device sdb): bdev /dev/sde errs: wr 3, rd 0, flush 0, corrupt 0, gen 0
[Fri Mar 25 00:05:55 2022] BTRFS error (device sdb): bdev /dev/sdf errs: wr 3, rd 0, flush 0, corrupt 0, gen 0
[Fri Mar 25 00:05:55 2022] BTRFS error (device sdb): bdev /dev/sde errs: wr 4, rd 0, flush 0, corrupt 0, gen 0
[Fri Mar 25 00:05:55 2022] BTRFS error (device sdb): bdev /dev/sdf errs: wr 4, rd 0, flush 0, corrupt 0, gen 0
[Fri Mar 25 00:05:55 2022] BTRFS error (device sdb): bdev /dev/sde errs: wr 5, rd 0, flush 0, corrupt 0, gen 0
[Fri Mar 25 00:05:55 2022] BTRFS error (device sdb): bdev /dev/sdf errs: wr 5, rd 0, flush 0, corrupt 0, gen 0
[Fri Mar 25 00:05:55 2022] BTRFS: error (device sdb) in btrfs_commit_transaction:2437: errno=-5 IO failure (Error while writing out transaction)
[Fri Mar 25 00:05:55 2022] BTRFS info (device sdb): forced readonly
[Fri Mar 25 00:05:55 2022] BTRFS warning (device sdb): Skipping commit of aborted transaction.
[Fri Mar 25 00:05:55 2022] BTRFS: error (device sdb) in cleanup_transaction:2010: errno=-5 IO failure
[Fri Mar 25 00:06:01 2022] btrfs_dev_stat_print_on_error: 4 callbacks suppressed
[Fri Mar 25 00:06:01 2022] BTRFS error (device sdb): bdev /dev/sdb errs: wr 4, rd 0, flush 1, corrupt 0, gen 0
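
The visible difference between the raid1 run and the raid1C3 run is the
"forced readonly" line. A small sketch for checking a saved dmesg excerpt for
that line is given here. It assumes the exact message text seen in the raid1
log above, and forced_readonly_device is a hypothetical helper name.

```python
import re
from typing import Optional

# The message logged in the raid1 run when the fs was forced read-only.
FORCED_RO = re.compile(r"BTRFS info \(device (\S+)\): forced readonly")


def forced_readonly_device(log_text: str) -> Optional[str]:
    """Return the device named in a 'forced readonly' line, or None."""
    # Rejoin lines that were wrapped when the log was pasted into mail.
    flat = " ".join(log_text.split())
    match = FORCED_RO.search(flat)
    return match.group(1) if match else None


if __name__ == "__main__":
    raid1_excerpt = (
        "[Fri Mar 25 00:05:55 2022] BTRFS info (device sdb): forced readonly"
    )
    raid1c3_excerpt = (
        "[Thu Mar 24 23:31:34 2022] BTRFS error (device sdb): "
        "error writing primary super block to device 4"
    )
    print(forced_readonly_device(raid1_excerpt))
    print(forced_readonly_device(raid1c3_excerpt))
```

Run against the two excerpts, the first call finds the device and the second
returns None, matching the behaviour reported above.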

  Am I missing something here?








