From: Chris Murphy <lists@colorremedies.com>
To: Goffredo Baroncelli <kreijack@inwind.it>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [RFC] Checksum of the parity
Date: Sun, 13 Aug 2017 12:45:08 -0600 [thread overview]
Message-ID: <CAJCQCtToFj4BowawgYPT-GiUnZPAXsjtuZO2=imcoyOZmaQzug@mail.gmail.com> (raw)
In-Reply-To: <5d703b4c-e8ac-21dc-e327-ff1d8e232ee9@inwind.it>
On Sun, Aug 13, 2017 at 8:16 AM, Goffredo Baroncelli <kreijack@inwind.it> wrote:
> Hi all,
>
> in the BTRFS wiki, in the status page, in the "line" RAID5/6 it is reported that the parity is not checksummed. This was reported several time in the ML and also on other site (e.g. phoronix) as a BTRFS defect.
>
> However I was unable to understand it, and I am supposing that this is a false mith.
>
> So my question is: the fact that in the BTRFS5/6 the parity is not checksummed could be considered a defect ?
>
> My goal is to verify if there is a rationale to require the parity checksummed, and if no I would like to remove this from the wiki.
It is not a per se defect. If parity is corrupt, and parity is needed
for reconstruction, reconstruction will be corrupt, but is then
detected and we get EIO [1]
Further, the error detection of corrupt reconstruction is why I say
Btrfs is not subject *in practice* to the write hole problem. [2]
[1]
I haven't tested the raid6 normal read case where a stripe contains
corrupt data strip and corrupt P strip, and Q strip is good. I expect
instead of EIO, we get a reconstruction from Q, and then both data and
P get fixed up, but I can't find it in comments or code.
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/tree/fs/btrfs/raid56.c?h=v4.12.7
line 1851, I'm not sure where we're at exactly at this line; seems
like it must be a scrub because P & Q are not relevant if data is
good.
[2]
Is Btrfs subject to the write hole problem manifesting on disk? I'm
not sure, sadly I don't read the code well enough. But if all Btrfs
raid56 writes are full stripe CoW writes, and if the prescribed order
guarantees still happen: data CoW to disk > metadata CoW to disk >
superblock update, then I don't see how the write hole happens. Write
hole requires: RMW of a stripe, which is a partial stripe overwrite,
and a crash during the modification of the stripe making that stripe
inconsistent as well as still pointed to by metadata.
--
Chris Murphy
next prev parent reply other threads:[~2017-08-13 18:45 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-08-13 14:16 [RFC] Checksum of the parity Goffredo Baroncelli
2017-08-13 18:45 ` Chris Murphy [this message]
2017-08-13 23:40 ` Janos Toth F.
2017-08-14 14:12 ` Goffredo Baroncelli
2017-08-14 19:28 ` Chris Murphy
2017-08-14 20:18 ` Goffredo Baroncelli
2017-08-14 21:10 ` Chris Murphy
2017-08-14 13:23 ` Austin S. Hemmelgarn
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAJCQCtToFj4BowawgYPT-GiUnZPAXsjtuZO2=imcoyOZmaQzug@mail.gmail.com' \
--to=lists@colorremedies.com \
--cc=kreijack@inwind.it \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).