Re: [RFC] Checksum of the parity

linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Chris Murphy <lists@colorremedies.com>
To: Goffredo Baroncelli <kreijack@inwind.it>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [RFC] Checksum of the parity
Date: Sun, 13 Aug 2017 12:45:08 -0600	[thread overview]
Message-ID: <CAJCQCtToFj4BowawgYPT-GiUnZPAXsjtuZO2=imcoyOZmaQzug@mail.gmail.com> (raw)
In-Reply-To: <5d703b4c-e8ac-21dc-e327-ff1d8e232ee9@inwind.it>

On Sun, Aug 13, 2017 at 8:16 AM, Goffredo Baroncelli <kreijack@inwind.it> wrote:
> Hi all,
>
> in the BTRFS wiki, in the status page, in the "line" RAID5/6 it is reported that the parity is not checksummed. This was reported several time in the ML and also on other site (e.g. phoronix) as a BTRFS defect.
>
> However I was unable to understand it, and I am supposing that this is a false mith.
>
> So my question is: the fact that in the BTRFS5/6 the parity is not checksummed could be considered a defect ?
>
> My goal is to verify if there is a rationale to require the parity checksummed, and if no I would like to remove this from the wiki.

It is not a per se defect. If parity is corrupt, and parity is needed
for reconstruction, reconstruction will be corrupt, but is then
detected and we get EIO [1]

Further, the error detection of corrupt reconstruction is why I say
Btrfs is not subject *in practice* to the write hole problem. [2]

[1]
I haven't tested the raid6 normal read case where a stripe contains
corrupt data strip and corrupt P strip, and Q strip is good. I expect
instead of EIO, we get a reconstruction from Q, and then both data and
P get fixed up, but I can't find it in comments or code.

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/tree/fs/btrfs/raid56.c?h=v4.12.7
line 1851, I'm not sure where we're at exactly at this line; seems
like it must be a scrub because P & Q are not relevant if data is
good.

[2]
Is Btrfs subject to the write hole problem manifesting on disk? I'm
not sure, sadly I don't read the code well enough. But if all Btrfs
raid56 writes are full stripe CoW writes, and if the prescribed order
guarantees still happen: data CoW to disk > metadata CoW to disk >
superblock update, then I don't see how the write hole happens. Write
hole requires: RMW of a stripe, which is a partial stripe overwrite,
and a crash during the modification of the stripe making that stripe
inconsistent as well as still pointed to by metadata.

-- 
Chris Murphy

next prev parent reply	other threads:[~2017-08-13 18:45 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-13 14:16 [RFC] Checksum of the parity Goffredo Baroncelli
2017-08-13 18:45 ` Chris Murphy [this message]
2017-08-13 23:40   ` Janos Toth F.
2017-08-14 14:12   ` Goffredo Baroncelli
2017-08-14 19:28     ` Chris Murphy
2017-08-14 20:18       ` Goffredo Baroncelli
2017-08-14 21:10         ` Chris Murphy
2017-08-14 13:23 ` Austin S. Hemmelgarn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJCQCtToFj4BowawgYPT-GiUnZPAXsjtuZO2=imcoyOZmaQzug@mail.gmail.com' \
    --to=lists@colorremedies.com \
    --cc=kreijack@inwind.it \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).