From: Adam Borowski <kilobyte@angband.pl>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>
Cc: linux-btrfs@vger.kernel.org, lists@colorremedies.com
Subject: Re: [PATCH v2 0/6] Chunk level degradable check
Date: Tue, 7 Mar 2017 03:23:41 +0100
Message-ID: <20170307022341.phpur3esp5fx5cmd@angband.pl>
In-Reply-To: <c07e3381-5a6e-4de6-0d9e-4a397f07726c@cn.fujitsu.com>

On Tue, Mar 07, 2017 at 09:35:56AM +0800, Qu Wenruo wrote:
> At 03/07/2017 08:36 AM, Adam Borowski wrote:
> > Not so for -draid5 -mraid1, unfortunately:
> 
> Unfortunately, for raid5 there are still unfixed bugs.
> In fact, some raid5/6 bugs are already fixed, but still not merged yet.
> 
> > [/mnt/btr2/scratch]# btrfs fi us /mnt/vol1
> > Data,RAID5: Size:2.02GiB, Used:1.21GiB
> >    /dev/loop0	   1.01GiB
> >    /dev/loop1	   1.01GiB
> >    /dev/loop2	   1.01GiB

> > [/mnt/btr2/scratch]# umount /mnt/vol1
> > [/mnt/btr2/scratch]# losetup -D
> > [/mnt/btr2/scratch]# losetup -f rb
> > [/mnt/btr2/scratch]# losetup -f rc
> 
> So you're pulling out first device.
> In theory, it should be completely OK for RAID5.
> And the degradable check follows it.
> 
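
To make "the degradable check follows it" concrete, here is a minimal
sketch of a per-chunk check: count the missing devices among each chunk's
stripes and compare against what that chunk's profile can tolerate.  The
names (MAX_MISSING, chunk_is_degradable, fs_is_degradable) and the
tolerance table are assumptions for illustration, not the code from the
patch set.

```python
# Assumed number of missing devices each profile can survive (illustrative).
MAX_MISSING = {
    "single": 0, "dup": 0, "raid0": 0,
    "raid1": 1, "raid10": 1, "raid5": 1, "raid6": 2,
}

def chunk_is_degradable(profile, stripe_devices, present):
    """True if this chunk is still usable with only `present` devices."""
    missing = sum(1 for dev in stripe_devices if dev not in present)
    return missing <= MAX_MISSING[profile]

def fs_is_degradable(chunks, present):
    """Degraded rw mount is considered safe only if every chunk passes."""
    return all(chunk_is_degradable(profile, devs, present)
               for profile, devs in chunks)

# Hypothetical layout matching -draid5 -mraid1 on three loop devices.
chunks = [
    ("raid5", ["loop0", "loop1", "loop2"]),
    ("raid1", ["loop1", "loop2"]),
]
print(fs_is_degradable(chunks, {"loop1", "loop2"}))
```

By this logic, losing one device out of a three-device RAID5 chunk passes
the check, which is why the mount is allowed at all.
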
> > [/mnt/btr2/scratch]# mount -noatime,degraded /dev/loop0 /mnt/vol1
> > [/mnt/btr2/scratch]# btrfs fi us /mnt/vol1
> > Data,RAID5: Size:2.02GiB, Used:1.21GiB
> >    /dev/loop0	   1.01GiB
> >    /dev/loop0	   1.01GiB
> >    /dev/loop1	   1.01GiB
> 
> Two loop0 shows up here, which should be detected as missing.
> 
> So it should be a btrfs-progs bug, and it'll be much easier to fix than
> kernel.

Alas, it's not merely a display bug; mounting alone is enough to trigger it.

> > Write something, mount degraded again.  Massive data corruption, both on
> > plain reads and on scrub, unrecoverable.
> 
> Yep, same thing here.
> And you'll be surprised that even a 2-device RAID5, which is the same as
> RAID1 (parity is the same as data), can still cause the problem.
> 
> So, RAID5/6 definitely has problems in degraded mode.
> But I prefer to focus on fixing normal-mode RAID5/6 bugs first, until we
> have solved all of them with enough test cases covering them.
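
For the "parity is the same as data" remark, a small sketch of plain XOR
parity, which is the usual RAID5 scheme.  With two devices there is one
data stripe per full stripe, so the XOR of that single stripe is the
stripe itself.  The helper names xor_parity and rebuild_missing are made
up for this illustration and have nothing to do with the kernel code.

```python
from functools import reduce

def xor_parity(stripes):
    """Return the bytewise XOR of equally sized data stripes."""
    if not stripes:
        raise ValueError("need at least one data stripe")
    length = len(stripes[0])
    if any(len(s) != length for s in stripes):
        raise ValueError("stripes must have equal length")
    # zip(*stripes) walks the stripes column by column (one int per stripe).
    return bytes(reduce(lambda a, b: a ^ b, column)
                 for column in zip(*stripes))

def rebuild_missing(surviving, parity):
    """Recover the one missing data stripe from the survivors plus parity."""
    return xor_parity(list(surviving) + [parity])

# 2-device RAID5: one data stripe, so parity is a copy of the data.
print(xor_parity([b"btrfs"]) == b"btrfs")

# 3-device RAID5: two data stripes, either one can be rebuilt.
d0, d1 = b"\x0f\xf0", b"\xff\x00"
p = xor_parity([d0, d1])
print(rebuild_missing([d1], p) == d0)
```

So in the 2-device case a degraded read should need no reconstruction at
all, which makes corruption there all the more surprising.
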

Actually, it turns out even the _first_ mount goes bad, even without writing
a single byte of data.  So it's not related to our single chunks bug.

> > Obviously, this problem is somewhere with RAID5 rather than this patch set,
> > but the safety check can't be removed before that is fixed.
> 
> Do we have *safety check* in original behavior?
> 
> At least v4.11-rc1, btrfs still allows us to mount raid5/6 degraded.
> So the patchset itself is behaving just as old one.

Right.  Thus, there's no regression.

As it's a strict improvement over the previous state (i.e., it fixes the
raid1 issues),
Tested-by: Adam Borowski <kilobyte@angband.pl> (if you don't mind spamming
commits with too many tags).

> I'm completely fine to add a new patch to prohibit raid5/6 degraded mount,
> but that would be a different enhancement though.

Yeah.  I guess it's more in the "don't use RAID5, there be dragons" land.


Thanks for these patches, they fix the #1 problem people have with RAID1.


[Apologies for that "✔" crap on some lines, my exit code on prompt thingy
is very paste-unfriendly; I keep forgetting it so often that I'd better get
rid of it...]

-- 
⢀⣴⠾⠻⢶⣦⠀ Meow!
⣾⠁⢠⠒⠀⣿⡁
⢿⡄⠘⠷⠚⠋⠀ Collisions shmolisions, let's see them find a collision or second
⠈⠳⣄⠀⠀⠀⠀ preimage for double rot13!
