From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: Gareth Pye <gareth@cerberos.id.au>
Cc: Goffredo Baroncelli <kreijack@inwind.it>,
Qu Wenruo <quwenruo@cn.fujitsu.com>,
linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH] btrfs: raid56: Use correct stolen pages to calculate P/Q
Date: Fri, 25 Nov 2016 00:07:40 -0500
Message-ID: <20161125050740.GH8685@hungrycats.org>
In-Reply-To: <CA+WRLO_M=HkDox6acxxLtu9rhcK1cXu03d=cXNtvMxYfvhC3WA@mail.gmail.com>
On Fri, Nov 25, 2016 at 03:40:36PM +1100, Gareth Pye wrote:
> On Fri, Nov 25, 2016 at 3:31 PM, Zygo Blaxell
> <ce3g8jdj@umail.furryterror.org> wrote:
> >
> > This risk mitigation measure does rely on admins taking a machine in this
> > state down immediately, and also somehow knowing not to start a scrub
> > while their RAM is failing...which is kind of an annoying requirement
> > for the admin.
>
> Attempting to detect if RAM is bad when scrub starts is both time
> consuming and not very reliable, right?

RAM, like all hardware, could fail at any time, and a scrub could already
be running when it happens. This is annoying but also a fact of life that
admins have to deal with.

Testing RAM before scrub starts is not more beneficial than testing RAM
at random intervals--but if you are testing RAM at random intervals,
why not do it at the same intervals as scrub?

If I see corruption errors showing up in stats, I will do a basic sanity
test to make sure they're coming from the storage layer and not somewhere
closer to the CPU. If all errors come from one device and there are clear
log messages showing SCSI device errors and the SMART log matches the
other data, RAM is probably not the root cause of failures, so scrub away.

If normally reliable programs like /bin/sh start randomly segfaulting,
there's smoke pouring out of the back of the machine, all the disks are
full of csum failures, and the BIOS welcome message has spelling errors
that weren't there before, I would *not* start a scrub. More like
turn the machine off, take it apart, test all the pieces separately,
and only do a scrub after everything above the storage layer had been
replaced or recertified. I certainly wouldn't want the filesystem to
try to fix the csum failures it finds in such situations.
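
The sanity test above can be written down as a rough decision rule. This is
only an illustrative sketch, not anything btrfs or scrub actually does: the
Evidence fields and the safe_to_scrub name are hypothetical, and the inputs
are assumed to be filled in by hand after looking at the device stats, the
kernel log, and the SMART log. It also assumes corruption errors have already
been seen somewhere.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """Hypothetical hand-filled summary of what the admin observed."""
    devices_with_errors: int      # number of devices reporting corruption errors
    kernel_log_blames_disk: bool  # clear SCSI device errors in the kernel log
    smart_matches: bool           # SMART log agrees with the other data
    host_unstable: bool           # e.g. normally reliable programs segfaulting


def safe_to_scrub(e: Evidence) -> bool:
    """Return True only when the errors clearly point at the storage layer."""
    if e.host_unstable:
        # Something closer to the CPU looks broken: power off and test the
        # hardware before letting the filesystem repair anything.
        return False
    # All errors on one device, with the logs and SMART data in agreement.
    return (
        e.devices_with_errors == 1
        and e.kernel_log_blames_disk
        and e.smart_matches
    )
```

Anything short of the clear single-device case returns False here, which
errs on the side of investigating first rather than scrubbing.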