* degraded raid scribbling upon wrong device
From: Adam Borowski @ 2017-07-13 6:40 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 741 bytes --]
Hi!
Here's a set of test cases, two of them in some cases seem to scribble upon
the wrong device:
* deg-mid-missing
* deg-last-replaced (not on the innocent "re")
* but never deg-last-missing
When all goes ok, there are no errors other than wrong generation on the
re-added disk (expected). When it goes bad, there's a lot of corruption.
In all cases, though, the "Device missing:" field is wrong.
I'm not yet sure how to trigger this reliably; perhaps someone has a clue?
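Since the "Device missing:" field is the symptom that shows up in every run, it may be worth checking it mechanically after the degraded mount. The following is only a sketch: the helper names `device_missing_field` and `reports_missing_device` are made up for illustration, and it assumes that `btrfs filesystem usage -b` prints the field as a raw byte count and that a non-zero value is what a correct degraded mount should report.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the original test scripts.

# Print the value of the "Device missing:" line from `btrfs filesystem usage`
# output supplied on stdin (third whitespace-separated field of that line).
device_missing_field() {
    awk '/^[[:space:]]*Device missing:/ { print $3; exit }'
}

# Succeed only if the field is present and not zero.
# "0" is the assumed raw (-b) form of zero, "0.00B" the human-readable form.
reports_missing_device() {
    missing=$(device_missing_field)
    case "$missing" in
        ""|0|0.00B) return 1 ;;
        *) return 0 ;;
    esac
}

# Example (needs root and the degraded mount from the scripts):
#   btrfs filesystem usage -b /mnt/vol1 | reports_missing_device \
#       || echo "Device missing: is zero although a device is absent"
```

Placed right after the `mount -odegraded` step, a check like this would flag the suspicious state before any data is written.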
8:30am, hitting the sack, will try again tomorrow.
Meow!
--
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ A dumb species has no way to open a tuna can.
⢿⡄⠘⠷⠚⠋⠀ A smart species invents a can opener.
⠈⠳⣄⠀⠀⠀⠀ A master species delegates.
[-- Attachment #2: deg-mid-missing --]
[-- Type: text/plain, Size: 818 bytes --]
#!/bin/sh
set -e
set -x
umount /mnt/vol1 ||:
losetup -D
dd if=/dev/zero bs=1048576 count=1 seek=4095 of=ra
dd if=/dev/zero bs=1048576 count=1 seek=4095 of=rb
dd if=/dev/zero bs=1048576 count=1 seek=4095 of=rc
dd if=/dev/zero bs=1048576 count=1 seek=4095 of=rd
mkfs.btrfs -draid1 -mraid1 ra rb rc rd
losetup -D
losetup -f ra
losetup -f rb
losetup -f rc
losetup -f rd
sleep 1
mount /dev/loop0 /mnt/vol1
cp -pr /bin /mnt/vol1
btrfs fi sync /mnt/vol1
btrfs fi us /mnt/vol1
umount /mnt/vol1
# Degraded step: re-attach everything except rc (the middle device).
losetup -D
losetup -f ra
losetup -f rb
losetup -f rd
sleep 1
mount -odegraded /dev/loop0 /mnt/vol1
btrfs fi us /mnt/vol1
dd if=/dev/zero of=/mnt/vol1/foo bs=1048576 count=2222
umount /mnt/vol1
losetup -D
losetup -f ra
losetup -f rb
losetup -f rc
losetup -f rd
sleep 1
mount /dev/loop0 /mnt/vol1
btrfs scrub start -B /mnt/vol1
[-- Attachment #3: deg-last-missing --]
[-- Type: text/plain, Size: 818 bytes --]
#!/bin/sh
set -e
set -x
umount /mnt/vol1 ||:
losetup -D
dd if=/dev/zero bs=1048576 count=1 seek=4095 of=ra
dd if=/dev/zero bs=1048576 count=1 seek=4095 of=rb
dd if=/dev/zero bs=1048576 count=1 seek=4095 of=rc
dd if=/dev/zero bs=1048576 count=1 seek=4095 of=rd
mkfs.btrfs -draid1 -mraid1 ra rb rc rd
losetup -D
losetup -f ra
losetup -f rb
losetup -f rc
losetup -f rd
sleep 1
mount /dev/loop0 /mnt/vol1
cp -pr /bin /mnt/vol1
btrfs fi sync /mnt/vol1
btrfs fi us /mnt/vol1
umount /mnt/vol1
# Degraded step: re-attach everything except rd (the last device).
losetup -D
losetup -f ra
losetup -f rb
losetup -f rc
sleep 1
mount -odegraded /dev/loop0 /mnt/vol1
btrfs fi us /mnt/vol1
dd if=/dev/zero of=/mnt/vol1/foo bs=1048576 count=2222
umount /mnt/vol1
losetup -D
losetup -f ra
losetup -f rb
losetup -f rc
losetup -f rd
sleep 1
mount /dev/loop0 /mnt/vol1
btrfs scrub start -B /mnt/vol1
[-- Attachment #4: deg-last-replaced --]
[-- Type: text/plain, Size: 883 bytes --]
#!/bin/sh
set -e
set -x
umount /mnt/vol1 ||:
losetup -D
dd if=/dev/zero bs=1048576 count=1 seek=4095 of=ra
dd if=/dev/zero bs=1048576 count=1 seek=4095 of=rb
dd if=/dev/zero bs=1048576 count=1 seek=4095 of=rc
dd if=/dev/zero bs=1048576 count=1 seek=4095 of=rd
dd if=/dev/zero bs=1048576 count=1 seek=4095 of=re
mkfs.btrfs -draid1 -mraid1 ra rb rc rd
losetup -D
losetup -f ra
losetup -f rb
losetup -f rc
losetup -f rd
sleep 1
mount /dev/loop0 /mnt/vol1
cp -pr /bin /mnt/vol1
btrfs fi sync /mnt/vol1
btrfs fi us /mnt/vol1
umount /mnt/vol1
# Degraded step: attach the blank file re in place of rd (the last device).
losetup -D
losetup -f ra
losetup -f rb
losetup -f rc
losetup -f re
sleep 1
mount -odegraded /dev/loop0 /mnt/vol1
btrfs fi us /mnt/vol1
dd if=/dev/zero of=/mnt/vol1/foo bs=1048576 count=2222
umount /mnt/vol1
losetup -D
losetup -f ra
losetup -f rb
losetup -f rc
losetup -f rd
sleep 1
mount /dev/loop0 /mnt/vol1
btrfs scrub start -B /mnt/vol1
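The three attached scripts are identical except for the set of backing files attached before the degraded mount (and the extra `re` file in the last one). As a rough sketch of that difference, the table below maps each test case to its degraded set and prints the loop commands as a dry run instead of executing them. The function names `degraded_set` and `print_attach_commands` are hypothetical and not part of the attachments.

```shell
#!/bin/sh
# Hypothetical helpers summarising the attached scripts.

# Print the backing files attached for the degraded mount in each test case.
degraded_set() {
    case "$1" in
        deg-mid-missing)   echo "ra rb rd" ;;     # rc left out
        deg-last-missing)  echo "ra rb rc" ;;     # rd left out
        deg-last-replaced) echo "ra rb rc re" ;;  # blank re instead of rd
        *) echo "unknown test case: $1" >&2; return 1 ;;
    esac
}

# Dry run: print the losetup commands for a case without running them.
print_attach_commands() {
    files=$(degraded_set "$1") || return 1
    echo "losetup -D"
    for f in $files; do
        echo "losetup -f $f"
    done
}

# Example:
#   print_attach_commands deg-mid-missing
```

Printing the commands rather than running them keeps the sketch usable without root; piping the output to `sh` would execute it.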
* Re: degraded raid scribbling upon wrong device
From: Adam Borowski @ 2017-07-22 20:36 UTC (permalink / raw)
To: linux-btrfs
On Thu, Jul 13, 2017 at 08:40:12AM +0200, Adam Borowski wrote:
> Here's a set of test cases, two of them in some cases seem to scribble upon
> the wrong device:
>
> * deg-mid-missing
> * deg-last-replaced (not on the innocent "re")
> * but never deg-last-missing
>
> When all goes ok, there are no errors other than wrong generation on the
> re-added disk (expected). When it goes bad, there's a lot of corruption.
> In all cases, though, the "Device missing:" field is wrong.
I have not explored this adequately yet, in good part because ENOSPC was
triggering a lot of the time for an unrelated reason that Omar just fixed
(thanks!). So, here's what I know so far:
* copying in, say, 2.2GB of /usr/share is a lot more likely to trigger it
than dd-ing 2.2GB of /dev/zero
* no "real" degrading is needed: in the original scripts, the missing device
is empty so all blocks are doubled anyway. The problem is not degraded
chunks but a bogus device.
* bogus output of "btrfs fi us" is a sure predictor that, with enough tries,
you'll get corruption -- if it shows something when it should say
"missing", shit is likely to happen
Meow!
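Because the corruption only shows up "with enough tries", a small retry wrapper may help when hunting for it. This is a minimal sketch, not something from the original scripts: `repeat_until_failure` is a made-up name, and it assumes the test script exits non-zero when a run goes bad, which may not hold if scrub only reports errors it was able to correct.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to N times and stop at the first
# failure. Prints the number of the failing attempt and returns 1; prints
# nothing and returns 0 if every attempt succeeded.
repeat_until_failure() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        if ! "$@"; then
            echo "$i"
            return 1
        fi
        i=$((i + 1))
    done
    return 0
}

# Example (needs root; script name as in the attachments):
#   repeat_until_failure 20 ./deg-mid-missing
```

The attached scripts print a lot themselves, so in practice their output would need redirecting for the attempt number to be easy to spot.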