From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: philip@philip-seeger.de
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Monitoring not working as "dev stats" returns 0 after read error occurred
Date: Wed, 15 Jan 2020 20:16:56 -0500 [thread overview]
Message-ID: <20200116011656.GL13306@hungrycats.org> (raw)
In-Reply-To: <3283de40c2750cd62d020ed71430cd35@philip-seeger.de>
[-- Attachment #1: Type: text/plain, Size: 3330 bytes --]
On Wed, Jan 08, 2020 at 06:41:36PM +0100, philip@philip-seeger.de wrote:
> A bad drive caused i/o errors, but no notifications were sent out because
> "btrfs dev stats" fails to increase the error counter.
>
> When checking dmesg, looking for something completely different, there were
> some error messages indicating that this drive is causing i/o errors and may
> die soon:
>
> BTRFS info (device sda3): read error corrected: ino 194473 off 2170880
>
> But checking the stats (as generally recommended to get notified about such
> errors) claims that no errors have occurred (nothing to see here, keep
> moving):
>
> # btrfs dev stats / | grep sda3 | grep read
> [/dev/sda3].read_io_errs 0
>
> Why?
> Isn't that the most important feature of a RAID system - to get notified
> about errors, to be able to replace such a drive...?
>
> The fs is mirrored, so those errors didn't cause any data loss.
This seems to be a result of commit
0cc068e6ee59c1fffbfa977d8bf868b7551d80ac "btrfs: don't report readahead
errors and don't update statistics" which assumes that readahead errors
are ignored and have no side effects. It turns out this is not true: in
the event of a recoverable read or csum failure, the filesystem overwrites
the bad block with good data recovered by using mirror copies or parity,
in both normal reads and readahead reads. Later, when a non-readahead
read comes along to read the block, there's no error on the device any
more because btrfs already corrected it, so no counters are updated.
This is very bad--the whole point of dev stats is to make failure events
observable so failing hardware can be identified and replaced, and commit
0cc068e6ee makes some of those events not count.
This commit appears in kernels 5.1-rc3 and later. A simple workaround
is to revert the commit, and let printk ratelimiting deal with the dmesg
spam issue. I don't know if there are other errors that readahead can
generate that are not IO errors (e.g. running out of memory). If there
are, then there will be some false positive read error counts. These
are still better than false negative read error counts.
An alternative fix would be to have readahead not correct the data,
so readahead truly has no side-effects. This would make the rationale
of this commit more supportable--except we'd still want to know if a
device was intermittently failing to read, so we should still count the
readahead errors anyway.
It would be nice to have a dev stats counter for corrections too. It's a
pretty significant event for monitoring if we overwrite committed data
for any reason, but especially so if we are doing it in response to a
detected device failure. We should definitely count those.
It would also be nice to throw some data into a tree every time an error
occurs (extent bytenr, offset, generation, error type, event count,
device or mirror ID) every time an error occurs. A tool can read this
tree to pull out the extent bytenrs, map them back to paths (if the
generation matches; otherwise, it's an old error and the file has since
been overwritten), and produce something like the zfs status output.
But that's way beyond a bug fix.
>
> # uname -sr
> Linux 5.2.7-100.fc29.x86_64
> # btrfs --version
> btrfs-progs v5.1
>
>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]
prev parent reply other threads:[~2020-01-16 1:16 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-01-08 17:41 Monitoring not working as "dev stats" returns 0 after read error occurred philip
2020-01-08 19:35 ` Graham Cobb
2020-01-09 10:33 ` Philip Seeger
2020-01-09 12:04 ` Nikolay Borisov
2020-01-09 14:34 ` Philip Seeger
2020-01-09 23:50 ` Philip Seeger
2020-01-11 7:42 ` Andrei Borzenkov
2020-01-12 11:42 ` Philip Seeger
2020-01-12 17:39 ` waxhead
2020-01-12 20:27 ` Chris Murphy
2020-01-12 22:45 ` Philip Seeger
2020-01-08 21:47 ` Chris Murphy
2020-01-09 10:49 ` Philip Seeger
2020-01-16 1:16 ` Zygo Blaxell [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200116011656.GL13306@hungrycats.org \
--to=ce3g8jdj@umail.furryterror.org \
--cc=linux-btrfs@vger.kernel.org \
--cc=philip@philip-seeger.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox