Re: [PATCH v2 0/6] Chunk level degradable check

linux-btrfs.vger.kernel.org archive mirror

From: Adam Borowski <kilobyte@angband.pl>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>
Cc: linux-btrfs@vger.kernel.org, lists@colorremedies.com
Subject: Re: [PATCH v2 0/6] Chunk level degradable check
Date: Tue, 7 Mar 2017 03:23:41 +0100
Message-ID: <20170307022341.phpur3esp5fx5cmd@angband.pl>
In-Reply-To: <c07e3381-5a6e-4de6-0d9e-4a397f07726c@cn.fujitsu.com>

On Tue, Mar 07, 2017 at 09:35:56AM +0800, Qu Wenruo wrote:
> At 03/07/2017 08:36 AM, Adam Borowski wrote:
> > Not so for -draid5 -mraid1, unfortunately:
> 
> Unfortunately, for raid5 there are still unfixed bugs.
> In fact, some raid5/6 bugs are already fixed, but still not merged yet.
> 
> > [/mnt/btr2/scratch]# btrfs fi us /mnt/vol1
> > Data,RAID5: Size:2.02GiB, Used:1.21GiB
> >    /dev/loop0	   1.01GiB
> >    /dev/loop1	   1.01GiB
> >    /dev/loop2	   1.01GiB
> >
> > [/mnt/btr2/scratch]# umount /mnt/vol1
> > [/mnt/btr2/scratch]# losetup -D
> > [/mnt/btr2/scratch]# losetup -f rb
> > [/mnt/btr2/scratch]# losetup -f rc
> 
> So you're pulling out first device.
> In theory, it should be completely OK for RAID5.
> And the degradable check follows it.
> 
> > [/mnt/btr2/scratch]# mount -noatime,degraded /dev/loop0 /mnt/vol1
> > [/mnt/btr2/scratch]# btrfs fi us /mnt/vol1
> > Data,RAID5: Size:2.02GiB, Used:1.21GiB
> >    /dev/loop0	   1.01GiB
> >    /dev/loop0	   1.01GiB
> >    /dev/loop1	   1.01GiB
> 
> Two loop0 shows up here, which should be detected as missing.
> 
> So it should be a btrfs-progs bug, and it'll be much easier to fix than
> kernel.

Alas, it's not merely a display bug: a degraded mount alone is enough to
cause trouble.

> > Write something, mount degraded again.  Massive data corruption, both on
> > plain reads and on scrub, unrecoverable.
> 
> Yep, same thing here.
> And you'll be surprised that even 2 devices RAID5, which is the same as
> RAID1(parity is the same as data), can still cause the problem.
> 
> So, RAID5/6 definitely has problem in degraded mode.
> While I prefer to focus on normal RAID5/6 bug fix first, and until we solve
> all RAID5/6 normal mode bugs with enough test cases covering them.

Actually, it turns out that even the _first_ degraded mount goes bad, without
a single data byte being written.  So it's not related to our single chunks
bug.
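
On the quoted point that a 2-device RAID5 is effectively RAID1: RAID5 parity
is the XOR of the data stripes in a row, so with a single data stripe the
parity is a byte-for-byte copy of the data.  A minimal sketch of that
arithmetic follows; xor_parity is a hypothetical helper written for this
illustration and says nothing about how btrfs lays stripes out on disk.

```shell
# Sketch only: XOR parity over byte values, not btrfs code.

# xor_parity BYTE...: print the XOR of the given decimal byte values.
xor_parity() {
    local p=0 b
    for b in "$@"; do
        p=$(( p ^ b ))
    done
    printf '%d\n' "$p"
}

# 2 devices: one data byte, so the parity is identical to the data.
xor_parity 165              # 165 (0xA5), same as the input

# 3 devices: two data bytes; a lost byte is rebuilt from parity + survivor.
d0=60; d1=90                # 0x3C and 0x5A
p=$(xor_parity "$d0" "$d1")
xor_parity "$p" "$d1"       # 60, i.e. d0 recovered
```

So, on paper, a degraded 2-device RAID5 should read back as cleanly as a
degraded RAID1, which is what makes the corruption reported here notable.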

> > Obviously, this problem is somewhere with RAID5 rather than this patch set,
> > but the safety check can't be removed before that is fixed.
> 
> Do we have *safety check* in original behavior?
> 
> At least v4.11-rc1, btrfs still allows us to mount raid5/6 degraded.
> So the patchset itself is behaving just as old one.

Right.  Thus, there's no regression.

As it's a strict improvement over the previous state (i.e., it fixes the raid1 issues),
Tested-by: Adam Borowski <kilobyte@angband.pl> (if you don't mind spamming
commits with too many tags).

> I'm completely fine to add a new patch to prohibit raid5/6 degraded mount,
> but that would be a different enhancement though.

Yeah.  I guess it's more in the "don't use RAID5, there be dragons" land.


Thanks for these patches, they fix the #1 problem people have with RAID1.
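
For readers skimming the thread, the idea behind a chunk-level check can be
sketched roughly as below.  This is an illustration under assumptions, not the
kernel code from the patch set: the function names and the "PROFILE MISSING"
input format are invented here, and the per-profile tolerances are the
commonly documented ones (raid1/raid10/raid5 survive one missing device,
raid6 two, single/dup/raid0 none).

```shell
# Sketch of a per-chunk degradable check. Names and input format are
# hypothetical; this is not the code from the patch set.

# tolerated_failures PROFILE: print how many missing devices a profile survives.
tolerated_failures() {
    case "$1" in
        raid1|raid10|raid5) echo 1 ;;
        raid6)              echo 2 ;;
        *)                  echo 0 ;;   # single, dup, raid0
    esac
}

# fs_degradable: read "PROFILE MISSING" lines (one per chunk) from stdin and
# succeed only if every chunk is within its own profile's tolerance.
fs_degradable() {
    local profile missing
    while read -r profile missing; do
        if [ "$missing" -gt "$(tolerated_failures "$profile")" ]; then
            return 1
        fi
    done
    return 0
}

# RAID1 chunks that lost one device, plus single chunks that sit entirely on
# the surviving device: every chunk is within tolerance.
printf '%s\n' 'raid1 1' 'single 0' | fs_degradable && echo "degraded rw OK"
```

A filesystem-wide check that only compares the number of missing devices
against the weakest profile present would refuse this case, since single
chunks tolerate no missing device; checking each chunk against its own
devices avoids that.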


[Apologies for that "✔" crap on some lines, my exit code on prompt thingy
is very paste-unfriendly; I keep forgetting it so often that I'd better get
rid of it...]

-- 
⢀⣴⠾⠻⢶⣦⠀ Meow!
⣾⠁⢠⠒⠀⣿⡁
⢿⡄⠘⠷⠚⠋⠀ Collisions shmolisions, let's see them find a collision or second
⠈⠳⣄⠀⠀⠀⠀ preimage for double rot13!

