Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

linux-btrfs.vger.kernel.org archive mirror

From: Andrei Borzenkov <arvidjaar@gmail.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: remi@georgianit.com, Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files
Date: Thu, 28 Jun 2018 11:16:24 +0300
Message-ID: <CAA91j0VTewHifyxBisMgiOw0ufmSWM7m-Cm4=5d0DP4kppSK6w@mail.gmail.com>
In-Reply-To: <fb26f3cc-fa80-222e-e403-0eead6f5e575@gmx.com>

On Thu, Jun 28, 2018 at 8:39 AM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
> On 2018-06-28 11:14, remi@georgianit.com wrote:
>>
>>
>> On Wed, Jun 27, 2018, at 10:55 PM, Qu Wenruo wrote:
>>
>>>
>>> Please be clear about what other RAID1 implementations actually do.
>>
>> A drive failure, where the drive is still there when the computer reboots, is a situation that *any* RAID 1 (or for that matter RAID 5, RAID 6, anything but RAID 0) will recover from perfectly without breaking a sweat. Some will rebuild the array automatically,
>
> WOW, that's black magic, at least for RAID1.
> Plain RAID1 has no idea which copy is correct, unlike btrfs, which
> has data checksums.
>
> Never mind the other things, just tell me how to determine which one is
> correct?
>

When one drive fails, the failure is recorded in metadata on the
remaining drives; most likely a configuration generation number is
incremented. On the next assembly, a drive with an older generation is
not incorporated. Hardware controllers also keep this information in
NVRAM, so they do not even depend on scanning the other disks.
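
A minimal sketch of that generation check, in Python for brevity. It is
not the code of any real RAID implementation; the names Member,
record_failure and assemble are hypothetical and exist only for this
illustration.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    generation: int  # hypothetical per-device configuration generation


def record_failure(survivors):
    """Bump the generation on every device that is still in the array."""
    for member in survivors:
        member.generation += 1


def assemble(members):
    """Split members into (current, stale) by comparing generations."""
    newest = max(m.generation for m in members)
    current = [m for m in members if m.generation == newest]
    stale = [m for m in members if m.generation < newest]
    return current, stale


# Both mirrors start in sync; sdb then drops out temporarily.
sda = Member("sda", generation=5)
sdb = Member("sdb", generation=5)
record_failure([sda])

# After reboot sdb is visible again, but its older generation gives it away.
current, stale = assemble([sda, sdb])
```

The stale member would then need a resync from a current one before it
is read from again.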

> The only possibility is that the misbehaving device missed several
> superblock updates, so we have a chance to detect that it is out of date.
> But that does not always work.
>

Why would it not work, as long as any write to the array is suspended
until the superblock on the remaining devices has been updated?
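
The ordering I mean can be sketched as follows (Python, purely
illustrative; the Mirror class and its fields are hypothetical and do
not model any particular implementation). The point is that no data
write reaches the survivors before their generation has been bumped, so
a dropped device can never hold the newest generation together with
old data.

```python
class Mirror:
    """Toy two-way mirror that orders the superblock update before data."""

    def __init__(self, names):
        self.generation = {n: 1 for n in names}  # per-device superblock
        self.data = {n: [] for n in names}       # per-device block log
        self.active = set(names)
        self.pending_update = False

    def drop(self, name):
        # The device disappears; writes stay suspended until the
        # survivors' superblocks have been updated.
        self.active.discard(name)
        self.pending_update = True

    def write(self, block):
        if self.pending_update:
            # Step 1: record the failure on the remaining devices.
            for n in self.active:
                self.generation[n] += 1
            self.pending_update = False
        # Step 2: only now let the data write go through.
        for n in self.active:
            self.data[n].append(block)


m = Mirror(["sda", "sdb"])
m.write("a")
m.drop("sdb")
m.write("b")
```

After this sequence sdb holds only the old block and the old
generation, so it is detectable as out of date when it reappears.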

> If you're talking about the missing generation check in btrfs, that's
> valid, but it's far from a "major design flaw", as there are a lot of
> cases where other RAID1 implementations (mdraid or LVM mirrored) can
> also be affected (the split-brain case).
>

That's different. Yes, with software-based RAID there is usually no
way to detect an outdated copy if no other copies are present. But
having older valid data is still very different from corrupting newer data.

>> others will automatically kick out the misbehaving drive. *None* of them will take back the drive with old data and start commingling that data with the good copy. This behaviour from BTRFS is completely abnormal and defeats even the most basic expectations of RAID.
>
> RAID1 can only tolerate 1 missing device; it has nothing to do with
> error detection.
> And it's impossible to detect such a case without extra help.
>
> Your expectation is completely wrong.
>

Well ... somehow it is my experience as well ... :)

>>
>> I'm not the one who has to clear his expectations here.
>>
>
