Re: ditto blocks on ZFS

From: Martin <m_btrfs@ml1.co.uk>
To: linux-btrfs@vger.kernel.org
Subject: Re: ditto blocks on ZFS
Date: Mon, 19 May 2014 21:36:06 +0100
Message-ID: <lldpvm$s7c$1@ger.gmane.org>
In-Reply-To: <10946613.XrCytCZfuu@xev>

On 18/05/14 17:09, Russell Coker wrote:
> On Sat, 17 May 2014 13:50:52 Martin wrote:
[...]
>> Do you see or measure any real advantage?
> 
> Imagine that you have a RAID-1 array where both disks get ~14,000 read errors.  
> This could happen due to a design defect common to drives of a particular 
> model or some shared environmental problem.  Most errors would be corrected by 
> RAID-1 but there would be a risk of some data being lost due to both copies 
> being corrupt.  Another possibility is that one disk could entirely die 
> (although total disk death seems rare nowadays) and the other could have 
> corruption.  If metadata was duplicated in addition to being on both disks 
> then the probability of data loss would be reduced.
> 
> Another issue is the case where all drive slots are filled with active drives 
> (a very common configuration).  To replace a disk you have to physically 
> remove the old disk before adding the new one.  If the array is a RAID-1 or 
> RAID-5 then ANY error during reconstruction loses data.  Using dup for 
> metadata on top of the RAID protections (IE the ZFS ditto idea) means that 
> case doesn't lose you data.

Your example there is for the case where, in effect, there is no RAID. How
is that case any better than what btrfs already does by duplicating
metadata?

So...

What real-world failure modes do the ditto blocks usefully protect against?

And how does that compare, in terms of failure rates, against what is
already done?

For example, we have RAID1 and RAID5 to protect against any one RAID
chunk being corrupted or against the total loss of any one device.

There is a second part to that: another failure cannot be tolerated
until the RAID is rebuilt.

Hence, we have RAID6, which protects against any two failures for a chunk
or device. With just one failure, you can still tolerate a second
failure whilst rebuilding the RAID.
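
To make the single-failure case concrete, here is a minimal sketch of
RAID5-style XOR parity in Python. It is a toy model only: the helper name
xor_blocks and the three-device "stripe" are made up for illustration and
say nothing about how btrfs or ZFS actually lay out data on disk.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe spread over three hypothetical data devices plus one parity device.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Lose any ONE device: XOR of the survivors and the parity rebuilds it.
rebuilt = xor_blocks([data[0], data[2], parity])

# Lose TWO devices: the survivors only give the XOR of the two missing
# blocks, so neither block can be recovered on its own.
ambiguous = xor_blocks([data[0], parity])
```

With one device gone, rebuilt equals the lost block. With two gone, all that
is left is the XOR of the two missing blocks, which is why single parity
cannot tolerate a second failure during the rebuild.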

And then we supposedly have safety-by-design where the filesystem itself
is using a journal and barriers/sync to ensure that the filesystem is
always kept in a consistent state, even after an interruption to any writes.

*What other failure modes* should we guard against?

There has been mention of fixing metadata keys from single bit flips...

Should Hamming codes be used instead of a CRC so that we get
multiple-bit error detection and single-bit error correction for all
data, both in RAM and on disk, on systems that do not use ECC RAM?

Would that be useful?...
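
As a rough illustration of what "single-bit correct, double-bit detect"
means, here is a minimal sketch of an extended Hamming(8,4) code in Python.
The function names are hypothetical, the code protects only a 4-bit value,
and nothing here reflects how btrfs or ZFS checksums actually work.

```python
def hamming84_encode(nibble):
    """Encode 4 data bits as an 8-bit extended Hamming codeword (list of bits).

    Index 0 holds the overall parity bit; indices 1..7 are the classic
    Hamming(7,4) positions, with parity bits at positions 1, 2 and 4.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]   # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]   # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]   # covers positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2             # makes the total parity even
    return code

def hamming84_decode(code):
    """Return (nibble, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                 # XOR of the positions of all set bits
    overall = sum(code) % 2
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd parity: treat it as a single flipped bit. The syndrome names the
        # position (0 means the overall parity bit itself was hit).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even overall parity: two bits flipped.
        return None, "uncorrectable"
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status
```

Any one flipped bit in the codeword is repaired, and any two flipped bits are
reported as uncorrectable rather than silently "fixed" to the wrong value.
Three or more flips can still be miscorrected, and the cost here is 4 check
bits per 4 data bits; a CRC, by contrast, only detects errors.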

Regards,
Martin
