From: Gregory Maxwell <gmaxwell@gmail.com>
To: Tracy Reed <treed@ultraviolet.org>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>,
Jens Axboe <jens.axboe@oracle.com>,
linux-btrfs@vger.kernel.org
Subject: Re: btrfs csum failed on git .pack file
Date: Wed, 9 Sep 2009 03:28:01 -0400
Message-ID: <e692861c0909090028x53faa467g7e91bfaf6ad5139@mail.gmail.com>
In-Reply-To: <20090908215357.GK6779@tracyreed.org>
On Tue, Sep 8, 2009 at 5:53 PM, Tracy Reed <treed@ultraviolet.org> wrote:
> On Tue, Sep 08, 2009 at 10:22:11PM +0200, Markus Trippelsdorf spake thusly:
>> I've already deleted the file in question unfortunately.
>> On IRC Chris decided that either bad RAM or a hard drive error was the
>> most likely reason for this checksum mismatch.
>
> Which raises an interesting point: I know reiserfs had its problems
> but it also turned up a lot of machines with bad RAM which contributed
> to giving the fs a bad name. With more and more complicated and memory
> consuming filesystem datastructures being stored in RAM, larger volumes
> of RAM in systems, and RAM not really getting any more reliable will
> we ever see a day where something like btrfs is not recommended for
> use in any machine that doesn't have ECC? Does the filesystem do
> anything to protect itself from bad hardware?
Such as the checksums that started this thread? That *is* a feature
that protects against bad hardware.
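As a rough illustration of what that protection amounts to, here is a
minimal sketch of checksum-on-write, verify-on-read. It is not btrfs
code: the block size and the helper names (write_block, read_block,
flip_bit) are made up for the example, and it uses zlib's CRC-32 from
the Python standard library as a stand-in for the CRC32C that btrfs
uses.

```python
import zlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for this sketch


def write_block(data):
    """Return the block together with the checksum stored alongside it."""
    return data, zlib.crc32(data)


def read_block(data, stored_csum):
    """Recompute the checksum on read and refuse to return mismatching data."""
    actual = zlib.crc32(data)
    if actual != stored_csum:
        raise IOError(
            "csum failed: expected %#010x, got %#010x" % (stored_csum, actual)
        )
    return data


def flip_bit(data, bit_index):
    """Simulate a single-bit hardware error (bad RAM, cable, or disk)."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


if __name__ == "__main__":
    block, csum = write_block(bytes(range(256)) * (BLOCK_SIZE // 256))
    read_block(block, csum)  # clean data passes silently
    try:
        read_block(flip_bit(block, 12345), csum)
    except IOError as err:
        print(err)  # the corruption is reported instead of being returned
```

The point of the sketch is only that the filesystem notices and
complains; a filesystem without checksums would have handed back the
flipped bit without comment.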
A large part of reiserfs' problem was a religious degree of "panic on
inconsistency!" so failures of identical severity that might slip by
unnoticed on other file systems were more likely to be noticed. Sadly,
shooting the messenger is still a popular sport, and the qualities of
BTRFS which make it more resistant to bad hardware may well give it a
bad reputation. I don't know that there is much that can be done about
that.
On Wed, Sep 9, 2009 at 3:01 AM, Jens Axboe <jens.axboe@oracle.com> wrote:
> On Wed, Sep 09 2009, Markus Trippelsdorf wrote:
>> What a strange coincidence that it affected git pack files in both cases.
>> It's almost too improbable...
>
> Probably more than a coincidence I think, the question is what though...
Could this have been the same data in both cases? Either way, if the
hardware was randomly corrupting high-entropy blocks with very low
probability, it's quite possible that you two would have seen it while
anyone else who did chalked it up to some other problem.
I've encountered telecom equipment where particular packet data
interacted poorly with the clock recovery hardware. "Any file
transfers fine, except for this one. This one stalls and never
finishes, but if I unzip it, it's fine!" Ugh. Or it could be some
busted ECC that always 'corrects' a particular class of perfectly
valid blocks to something wrong... or it could be a million other
things. At the end of the day you just need to accept that the
hardware is junk. Blacklist it, give the vendor the best black eye
that you can, and move on.
I can only expect that this is going to get worse over time. I really
wish that it had become the norm for drive makers to expose an
optional raw interface to the flash. Alas, we're stuck with the
equivalent of running Linux on a hypervisor provided by Microsoft...
except the SSD makers are less experienced.