Re: RAID56 - 6 parity raid - Andrei Borzenkov

public inbox for linux-btrfs@vger.kernel.org

From: Andrei Borzenkov <arvidjaar@gmail.com>
To: waxhead@dirtcellar.net, kreijack@inwind.it,
	Duncan <1i5t5.duncan@cox.net>,
	linux-btrfs@vger.kernel.org
Subject: Re: RAID56 - 6 parity raid
Date: Wed, 2 May 2018 21:50:25 +0300
Message-ID: <06986e21-28cc-a58e-b16c-129d9f5a845d@gmail.com>
In-Reply-To: <52465207-6c06-ef4b-b212-94b2199811a0@dirtcellar.net>

02.05.2018 21:17, waxhead wrote:
> Goffredo Baroncelli wrote:
>> On 05/02/2018 06:55 PM, waxhead wrote:
>>>>
>>>> So again, which problem would having the parity checksummed solve?
>>>> To the best of my knowledge, none. In any case the data is
>>>> checksummed, so it is impossible to return corrupted data (modulo
>>>> bugs :-) ).
>>>>
>>> I am not a BTRFS dev, but this should be quite easy to answer.
>>> Unless you checksum the parity there is no way to verify that
>>> the data (parity) you use to reconstruct other data is correct.
>>
>> In any case you could catch that the computed data is wrong, because
>> the data is always checksummed. And in any case you must check the
>> data against its checksum.
>>
> What if you lost an entire disk? 

How exactly does that matter? RAID is per chunk anyway.

> or had corruption for both data AND checksum?

By the same logic you may have corrupted parity and its checksum.

> How do you plan to safely reconstruct that without checksummed
> parity?
> 

Define "safely". The main problem of the current RAID56 implementation is
that a stripe is not updated atomically (at least, that is what I
understood from past discussions), and an extra parity checksum does not
solve this. So how exactly is "safety" improved here? You still need the
overall checksum to verify the result of reconstruction, so what exactly
does an extra parity checksum buy you?
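
To make the point about non-atomic stripe updates concrete, here is a
minimal sketch of a toy single-parity stripe. It is not btrfs code: the
two-strip layout, the CRC32 checksums and the names xor_strips and csum
are assumptions made purely for illustration. It models a crash after a
data strip and its checksum are written but before the parity is
rewritten, followed by the loss of the other data strip.

```python
import zlib
from functools import reduce

def xor_strips(strips):
    """XOR equal-length byte strings together (toy single-parity stripe)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))

def csum(block):
    """Stand-in checksum; CRC32 is an assumption, chosen for brevity."""
    return zlib.crc32(block)

# A consistent stripe: two data strips plus parity, each with a checksum.
d0, d1 = b"AAAA", b"BBBB"
parity = xor_strips([d0, d1])
data_csums = [csum(d0), csum(d1)]
parity_csum = csum(parity)          # the hypothetical extra parity checksum

# Non-atomic update: d0 and its checksum reach disk, then a crash happens
# before the parity (and therefore its checksum) is rewritten.
d0 = b"CCCC"
data_csums[0] = csum(d0)

# Later the device holding d1 is lost and d1 must be rebuilt.
parity_checksum_ok = csum(parity) == parity_csum      # stale but self-consistent
rebuilt_d1 = xor_strips([d0, parity])
data_checksum_ok = csum(rebuilt_d1) == data_csums[1]  # the data checksum catches it

print(parity_checksum_ok, data_checksum_ok)
```

In this model the stale parity still matches its own checksum, so the
parity checksum does not flag the problem; only the data checksum shows
that the rebuilt strip is wrong.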

>> My point is that storing the checksum is a cost that you pay *every
>> time*. Every time you update a part of a stripe you need to update the
>> parity, and then in turn the parity checksum. It is not a problem of
>> space occupied nor a computational problem. It is a problem of write
>> amplification...
> How much of a problem is this? No benchmarks have been run, since the
> feature is not there yet, I suppose.
> 
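
The write-amplification argument quoted above can be sketched as a
simple count of logical updates per partial-stripe write. This is a toy
model under stated assumptions, not a measurement and not how btrfs
accounts for I/O: each item is counted as one update, metadata
copy-on-write is ignored, and the function name
writes_per_partial_update is hypothetical.

```python
def writes_per_partial_update(parity_strips=1, checksum_parity=False):
    """Count logical updates for rewriting one data strip (toy model)."""
    writes = {
        "data strip": 1,
        "data checksum": 1,
        "parity strip": parity_strips,
    }
    if checksum_parity:
        # Assumption: one extra checksum update per parity strip.
        writes["parity checksum"] = parity_strips
    return writes

raid5_now = writes_per_partial_update(parity_strips=1)
raid5_csum = writes_per_partial_update(parity_strips=1, checksum_parity=True)
raid6_csum = writes_per_partial_update(parity_strips=2, checksum_parity=True)

totals = [sum(w.values()) for w in (raid5_now, raid5_csum, raid6_csum)]
print(totals)
```

The count only shows that the extra update is paid on every partial
write; how much it costs in practice would still need the benchmarks
asked for above.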
>>
>> The only gain is to avoid trying to use the parity when
>> a) you need it (i.e. when the data is missing and/or corrupted)
> I'm not sure I can make out your argument here, but with RAID5/6 you
> don't have another copy to restore from. You *have* to use the parity to
> reconstruct data and it is a good thing if this data is trusted.
> 

Again - please describe when having a parity checksum would be beneficial
over the current implementation. You do not reconstruct anything as long as
all data strips are there, so the parity checksum will not be used. If one
data strip fails (including checksum) it will be reconstructed and
verified. If the parity itself is corrupted, checksum verification of the
reconstructed data fails (hopefully). How is that different from verifying
a parity checksum before reconstructing? In both cases the data cannot be
reconstructed, end of story.
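
A minimal sketch of the comparison above, assuming a toy two-strip
single-parity stripe with CRC32 checksums. The function names
rebuild_current and rebuild_with_parity_csum are hypothetical and do not
correspond to btrfs functions. It rebuilds a lost strip once with good
parity and once with corrupted parity, under both schemes.

```python
import zlib

def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def rebuild_current(surviving, parity, expected_csum):
    """Reconstruct first, then verify the result against the data checksum."""
    candidate = xor_bytes(surviving, parity)
    return candidate if zlib.crc32(candidate) == expected_csum else None

def rebuild_with_parity_csum(surviving, parity, parity_csum, expected_csum):
    """Hypothetical scheme: verify parity first, then still verify the data."""
    if zlib.crc32(parity) != parity_csum:
        return None  # fails one step earlier, with the same outcome
    return rebuild_current(surviving, parity, expected_csum)

d0, d1 = b"AAAA", b"BBBB"
parity = xor_bytes(d0, d1)
d1_csum, parity_csum = zlib.crc32(d1), zlib.crc32(parity)

# d1 is lost; parity is intact.
good = (rebuild_current(d0, parity, d1_csum),
        rebuild_with_parity_csum(d0, parity, parity_csum, d1_csum))

# d1 is lost and the first parity byte is corrupted.
bad_parity = b"\x00" + parity[1:]
bad = (rebuild_current(d0, bad_parity, d1_csum),
       rebuild_with_parity_csum(d0, bad_parity, parity_csum, d1_csum))

print(good, bad)
```

Both schemes return the same strip when the parity is good and both
give up when it is corrupted; the parity checksum only moves the failure
ahead of the XOR pass.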

>> and b) it is corrupted.
>> But the likelihood of this case is very low. And you can catch it
>> during the data checksum check (which has to be performed in any case!).
>>
>> So on one side you have a *cost every time* (the write
>> amplification); on the other side you have a gain (CPU time) *only in
>> the case* that the parity is corrupted and you need it (e.g. scrub or
>> corrupted data).
>>
>> IMHO the cost is much higher than the gain, and the likelihood of the
>> gain is much lower than the likelihood (=100%, i.e. always) of the
>> cost.
>>
> Then run benchmarks and consider making parity checksums optional
> (but pretty please dipped in syrup with sugar on top - keep it on by
> default).
> 
