From: Zoiled <zoiled@online.no>
To: Duncan <1i5t5.duncan@cox.net>, linux-btrfs@vger.kernel.org
Subject: Re: Btrfs/RAID5 became unmountable after SATA cable fault
Date: Fri, 6 Nov 2015 04:19:11 +0100
Message-ID: <563C1C2F.3030503@online.no>
In-Reply-To: <pan$f33d7$d1be1dd2$81273c9e$505ff3bb@cox.net>

Duncan wrote:
> Austin S Hemmelgarn posted on Wed, 04 Nov 2015 13:45:37 -0500 as
> excerpted:
>
>> On 2015-11-04 13:01, Janos Toth F. wrote:
>>> But the worst part is that there are some ISO files which were
>>> seemingly copied without errors but their external checksums (the ones
>>> which I can calculate with md5sum and compare to the ones supplied by
>>> the publisher of the ISO file) don't match!
>>> Well... this, I cannot understand.
>>> How could these files become corrupt from a single disk failure? And
>>> more importantly: how could these files be copied without errors? Why
>>> didn't Btrfs give a read error when the checksums didn't add up?
>> If you can prove that there was a checksum mismatch and BTRFS returned
>> invalid data instead of a read error or going to the other disk, then
>> that is a very serious bug that needs to be fixed.  You need to keep in
>> mind also however that it's completely possible that the data was bad
>> before you wrote it to the filesystem, and if that's the case, there's
>> nothing any filesystem can do to fix it for you.
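
As an aside, one quick way to narrow down where the corruption enters is to
re-run the publisher's checksum list against both the original download
location and the copies on btrfs. This is only a minimal sketch, assuming the
publisher ships an MD5SUMS or SHA256SUMS file; all paths below are
placeholders:

  # in the directory the ISOs were originally downloaded to
  md5sum -c MD5SUMS            # or: sha256sum -c SHA256SUMS

  # and again against the copies on the btrfs filesystem
  cd /mnt/btrfs/isos && md5sum -c /path/to/MD5SUMS

If the originals still verify but the btrfs copies don't, the damage happened
somewhere on the write path; if neither verifies, the files were already bad
before btrfs ever saw them.
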
> As Austin suggests, if btrfs is returning data, and you haven't turned
> off checksumming with nodatasum or nocow, then it's almost certainly
> returning the data it was given to write out in the first place.  Whether
> that data it was given to write out was correct, however, is an
> /entirely/ different matter.
>
> If ISOs are failing their external checksums, then something is going
> on.  Had you verified the external checksums when you first got the
> files?  That is, are you sure the files were correct as downloaded and/or
> ripped?
>
> Where were the ISOs stored between original procurement/validation and
> writing to btrfs?  Is it possible you still have some/all of them on that
> media?  Do they still external-checksum-verify there?
>
> Basically, assuming btrfs checksums are validating, there are three other
> likely possibilities for where the corruption could have come from before
> writing to btrfs.  Either the files were bad as downloaded or otherwise
> procured -- which is why I asked whether you verified them upon receipt
> -- or you have memory that's going bad, or your temporary storage is
> going bad, before the files ever got written to btrfs.
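
For completeness: whether btrfs itself still considers the on-disk copies
valid can be checked with a scrub plus the per-device error counters. A rough
sketch, with /mnt standing in for the real mount point:

  btrfs scrub start -B /mnt    # -B: stay in the foreground, print a summary
  btrfs scrub status /mnt      # checksum/read errors found (and corrected, where redundancy allowed)
  btrfs device stats /mnt      # per-device write/read/corruption error counters

Zero corruption errors there, combined with failing external checksums, would
point at the data having been bad before it was ever written.
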
>
> The memory going bad is a particularly worrying possibility,
> considering...
>
>>> Now I am really considering moving from Linux to Windows and from
>>> Btrfs RAID-5 to Storage Spaces RAID-1 + ReFS (the only limitation is
>>> that ReFS is only "self-healing" on RAID-1, not RAID-5, so I need a new
>>> motherboard with more native SATA connectors and an extra HDD). That
>>> one seemed to actually do what it promises (abort any read operations
>>> upon checksum errors [which always happens seamlessly on every read]
>>> but look at the redundant data first and seamlessly "self-heal" if
>>> possible). The only thing which made Btrfs look like a better
>>> alternative was the RAID-5 support. But I recently experienced two
>>> cases of 1 drive out of 3 failing, and it always turned out to be a
>>> smaller or bigger disaster (completely lost data or inconsistent data).
>> Have you considered looking into ZFS?  I hate to suggest it as an
>> alternative to BTRFS, but it's a much more mature and well tested
>> technology than ReFS, and has many of the same features as BTRFS (and
>> even has the option for triple parity instead of the double you get with
>> RAID6).  If you do consider ZFS, make a point to look at FreeBSD in
>> addition to the Linux version, the BSD one was a much better written
>> port of the original Solaris drivers, and has better performance in many
>> cases (and as much as I hate to admit it, BSD is way more reliable than
>> Linux in most use cases).
>>
>> You should also seriously consider whether the convenience of having a
>> filesystem that fixes internal errors itself with no user intervention
>> is worth the risk of it corrupting your data.  Returning correct data
>> whenever possible is one thing, being 'self-healing' is completely
>> different.  When you start talking about things that automatically fix
>> internal errors without user intervention is when most seasoned system
>> administrators start to get really nervous.  Self correcting systems
>> have just as much chance to make things worse as they do to make things
>> better, and most of them depend on the underlying hardware working
>> correctly to actually provide any guarantee of reliability.
> I too would point you at ZFS, but there's one VERY BIG caveat, and one
> related smaller one!
>
> The people who have a lot of ZFS experience say it's generally quite
> reliable, but gobs of **RELIABLE** memory are *absolutely* *critical*!
> The self-healing works well, *PROVIDED* memory isn't producing errors.
> Absolutely reliable memory is in fact *so* critical, that running ZFS on
> non-ECC memory is severely discouraged as a very real risk to your data.
>
> Which is why the above hints that your memory may be bad are so
> worrying.  Don't even *THINK* about ZFS, particularly its self-healing
> features, if you're not absolutely sure your memory is 100% reliable,
> because apparently, based on the comments I've seen, if it's not, you
> WILL have data loss, likely far worse than btrfs under similar
> circumstances, because when btrfs detects a checksum error it tries
> another copy if it has one (raid1/10 mode), and simply fails the read if
> it doesn't, while apparently, zfs with self-healing activated will give
> you what it thinks is the corrected data, writing it back to repair the
> problem as well, but if memory is bad, it'll be self-damaging instead of
> self-healing, and from what I've read, that's actually a reasonably
> common experience with non-ecc RAM, the reason they so severely
> discourage attempts to run zfs on non-ecc.  But people still keep doing
> it, and still keep getting burned as a result.
>
> (The smaller, in-context caveat is that zfs works best with /lots/ of
> RAM, particularly when run on Linux, since it is designed to work with a
> different cache system than Linux uses, and won't work without it, so in
> effect with ZFS on Linux everything must be cached twice, upping the
> memory requirements dramatically.)
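
On the double-caching point: ZFS on Linux keeps its own ARC next to the page
cache, so the extra memory use is at least visible and can be capped. This is
a sketch only, with the 4 GiB figure picked arbitrarily; the exact paths
depend on the ZFS-on-Linux version:

  # current ARC size and ceiling, in bytes
  grep -E '^(size|c_max) ' /proc/spl/kstat/zfs/arcstats

  # cap the ARC, applied the next time the zfs module is loaded
  echo 'options zfs zfs_arc_max=4294967296' >> /etc/modprobe.d/zfs.conf
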
>
>
> (Tho I should mention, while not on zfs, I've actually had my own
> problems with ECC RAM too.  In my case, the RAM was certified to run at
> speeds faster than it could actually handle reliably: the stored data,
> which is what the ECC protects, was fine, but data was getting
> damaged in transit to/from the RAM.  On a lightly loaded system, such as
> one running many memory tests or under normal desktop usage conditions,
> the RAM was generally fine, no problems.  But on a heavily loaded system,
> such as when doing parallel builds (I run gentoo, which builds from
> sources in order to get the higher level of option flexibility that
> comes only when you can toggle build-time options), I'd often have memory
> faults and my builds would fail.
>
> The most common failure, BTW, was on tarball decompression, bunzip2 or
> the like, since the tarballs contained checksums that were verified on
> data decompression, and often they'd fail to verify.
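
Incidentally, those decompression-time failures can be reproduced on demand,
since bzip2 and tar can test an archive without unpacking it. A small sketch;
foo.tar.bz2 is just a placeholder name:

  bzip2 -tv foo.tar.bz2              # verify the compressed stream's CRCs
  tar -tjf foo.tar.bz2 > /dev/null   # additionally walk the whole archive
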
>
> Once I updated the BIOS to one that would let me set the memory speed
> instead of using the speed the modules themselves reported, and I
> declocked the memory just one notch (this was DDR1, IIRC I declocked from
> the PC3200 it was rated at, to PC3000 speeds), not only was the memory then
> 100% reliable, but I could and did actually reduce the number of wait-
> states for various operations, and it was STILL 100% reliable.  It simply
> couldn't handle the raw speeds it was certified to run, is all, tho it
> did handle it well enough, enough of the time, to make the problem far
> more difficult to diagnose and confirm than it would have been had the
> problem appeared at low load as well.
>
> As it happens, I was running reiserfs at the time, and it handled both
> that hardware issue, and a number of others I've had, far better than I'd
> have expected of /any/ filesystem, when the memory feeding it is simply
> not reliable.  Reiserfs metadata, in particular, seems incredibly
> resilient in the face of hardware issues, and I lost far less data than I
> might have expected, tho without checksums and with bad memory, I imagine
> I had occasional undetected bitflip corruption in files here or there,
> but generally nothing I detected.  I still use reiserfs on my spinning
> rust today, but it's not well suited to SSD, which is where I run btrfs.
>
> But the point for this discussion is that just because it's ECC RAM
> doesn't mean you can't have memory-related errors, just that if you do,
> they're likely to be different errors, "transit errors", that will tend
> to be undetected by many memory checkers, at least the ones that don't
> run at full memory bandwidth because they're simply checking that
> what was stored in a cell can be read back unchanged.)
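
For catching that kind of transit error, a userspace tester that hammers a
large allocation while the rest of the machine is busy tends to do better
than an idle pass. A sketch using memtester, with the size and iteration
count chosen arbitrarily:

  # run alongside a heavy parallel build so the memory bus stays loaded
  memtester 2048M 5
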
>
I just want to point out: please don't forget about your hard drive
controller's memory. Your mainboard might have ECC RAM, but your controller
might not.
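
A cheap check on that path (drive electronics, cable, controller) is the SATA
CRC counter most drives expose via SMART, plus the kernel log. A rough sketch,
with /dev/sdX as a placeholder for the suspect drive:

  smartctl -A /dev/sdX | grep -i crc   # UDMA_CRC_Error_Count climbs on cable/controller transfer errors
  dmesg | grep -i 'ata.*error'         # libata link resets and I/O errors

These won't catch bad RAM inside the controller itself, but they do show the
transfer errors a flaky SATA link produces, which is what started this thread.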
