From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: btrfs and ECC RAM
Date: Mon, 20 Jan 2014 16:13:18 +0000 (UTC)
Message-ID: <pan$36f1b$8300c6a3$5960d52c$7e050399@cox.net>
In-Reply-To: 96091903-E1B0-4455-9EDC-EF94EE2E5110@aei.mpg.de
Ian Hinder posted on Mon, 20 Jan 2014 15:57:42 +0100 as excerpted:
> In hxxp://forums.freenas.org/threads/ecc-vs-non-ecc-ram-and-zfs.15449,
> they talk about reconstructing corrupted data from parity information:
>
>> Ok, no problem. ZFS will check against its parity. Oops, the parity
>> failed since we have a new corrupted bit. Remember, the checksum data
>> was calculated after the corruption from the first memory error
>> occurred. So now the parity data is used to "repair" the bad data. So
>> the data is "fixed" in RAM.
>
> i.e. that there is parity information stored with every piece of data,
> and ZFS will "correct" errors automatically from the parity information.
> I start to suspect that there is confusion here between checksumming
> for data integrity and parity information. If this is really how ZFS
> works, then if memory corruption interferes with this process, then I
> can see how a scrub could be devastating. I don't know if ZFS really
> works like this. It sounds very odd to do this without an additional
> checksum check.
Good point on the difference between parity and checksumming.
I've absolutely no confirmation of this, but privately I've begun to
wonder if this difference has anything at all to do with the delay in
getting "complete" raid5/6 support in btrfs, including scrub. Perhaps
once they actually started working with it, they realized that the
traditional parity solution of raid5/6 didn't work out so well with
checksumming: in an ungraceful shutdown/crash situation, if the crash
happened at just the wrong point, the parity and checksumming could
actually fight each other, such that restoring one (presumably parity,
since it'd be the lower level, closer to the metal) triggered a failure
of the other (presumably the checksumming, existing above the parity).
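
To make the worry concrete, here is a toy sketch of that "crash at just
the wrong point" case. It is only an illustration under loud assumptions:
a three-device stripe with plain XOR parity, a CRC32 per data block, and
a hypothetical helper simulate_stripe_update() invented for this example.
It does not reflect how btrfs (or zfs) actually lays out or orders its
writes.

```python
import zlib
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks byte by byte (RAID5-style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def simulate_stripe_update(crash_before_parity: bool) -> dict:
    """Hypothetical toy model: update one data block, lose the other device,
    then rebuild the lost block from parity and verify it by checksum."""
    # Initial stripe: two data blocks, one parity block, one checksum per block.
    d0, d1 = b"AAAA", b"BBBB"
    parity = xor_blocks(d0, d1)
    csum_d1 = zlib.crc32(d1)

    # Overwrite d0. The new data reaches disk first.
    d0 = b"CCCC"
    if not crash_before_parity:
        # Clean path: parity is rewritten to match the new d0.
        parity = xor_blocks(d0, d1)
    # Crash path: parity on disk still describes the OLD d0 (it is stale).

    # The device holding d1 now fails; rebuild d1 from what is left.
    d1_rebuilt = xor_blocks(d0, parity)

    return {
        "rebuilt": d1_rebuilt,
        "matches_original": d1_rebuilt == d1,
        # Parity alone cannot tell that the rebuild is wrong; the checksum can.
        "checksum_ok": zlib.crc32(d1_rebuilt) == csum_d1,
    }


if __name__ == "__main__":
    print("clean shutdown:", simulate_stripe_update(crash_before_parity=False))
    print("torn write:    ", simulate_stripe_update(crash_before_parity=True))
```

In the torn-write case the parity rebuild "succeeds" but hands back a
block that fails its checksum, and in this toy model nothing is left to
repair it from. The checksum only detects the problem; the parity only
reconstructs, and it reconstructs from whatever it was last given.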
That could trigger all sorts of issues that I suppose to be solvable in
theory, but said theory is well beyond me, and it could well invite
complex coding issues that are incredibly difficult to resolve in a
satisfactory way. Thus the hiccup in getting /complete/ btrfs raid5/6
support, even though the basic parity calculation and write-out has been
in-kernel for several kernel cycles already, and was available as
patches from well before that.
Again, I've absolutely nothing but the delay and personal intuition to
back this up, and the delay in itself means little: most btrfs features
seem to have taken longer to complete than originally planned, such that
btrfs as a whole is now years behind the original (and, as it turned
out, wildly optimistic) plan. So it's simply personal intuition, and as
I'm not a dev that should mean approximately nothing to anyone else.
Take it for what it's worth...

But if that's correct, then ultimately we may end up with btrfs raid5/6
modes that are declared usable, but that come with lower integrity and
checksumming guarantees (particularly across device failure and replace)
than those that normally apply to btrfs in other configurations, at
least for the btrfs initially considered stable. Perhaps down the road a
few years a more advanced btrfs raid5/6 implementation, with better
integrity/checksumming guarantees, would become available.
Perhaps zfs has a similar parity mode, as opposed to real checksumming,
but has real checksumming in other modes?
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman