From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: linux-btrfs@vger.kernel.org
Subject: spurious I/O errors from btrfs...at the caching layer?
Date: Sat, 24 Jan 2015 13:06:01 -0500
Message-ID: <20150124180601.GA15018@hungrycats.org>
I am seeing a lot of spurious I/O errors that look like they come from
the cache-facing side of btrfs. While running a heavy load with some
extent-sharing (e.g. building 20 Linux kernels at once from source trees
copied with 'cp -a --reflink=always'), some files will return spurious
EIO on read. It happens often enough that about one in three kernel
builds fails.
I believe the I/O errors to be spurious because:
- there is no kernel message of any kind during the event
- scrub detects 0 errors
- device stats report 0 errors
- the drive firmware reports nothing wrong through SMART
- there seems to be no attempt to read the disk when the error
  is reported
- "sysctl vm.drop_caches={1,2}" makes the I/O error go away.
Files become unreadable at random, and stay unreadable indefinitely;
however, any time I discover a file that gives EIO on read, I can
poke vm.drop_caches and make the EIO go away. The file can then be
read normally and has correct contents. The disk does not seem to be
involved in the I/O error return.
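For anyone trying to reproduce this, the check described above can be
sketched roughly as follows. This is only an illustration under stated
assumptions, not the procedure actually used for the observations in this
message: the function names (read_fully, fails_with_eio, find_eio_files,
drop_caches) are made up for the example, and drop_caches assumes a Linux
system, root privileges, and the standard /proc/sys/vm/drop_caches
interface behind "sysctl vm.drop_caches".

```python
import errno
import os


def read_fully(path, chunk_size=1 << 20):
    """Read the whole file and discard the data; raises OSError on failure."""
    with open(path, "rb") as f:
        while f.read(chunk_size):
            pass


def fails_with_eio(path, reader=read_fully):
    """Return True if reading 'path' fails with EIO, False otherwise."""
    try:
        reader(path)
    except OSError as e:
        # Only EIO is of interest here; other errors (EACCES, ENOENT, ...)
        # are treated as "not this problem".
        return e.errno == errno.EIO
    return False


def find_eio_files(root, reader=read_fully):
    """Return a sorted list of regular files under 'root' that give EIO on read."""
    failed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            if fails_with_eio(path, reader):
                failed.append(path)
    return sorted(failed)


def drop_caches(mode=1, proc_path="/proc/sys/vm/drop_caches"):
    """Rough equivalent of 'sysctl vm.drop_caches=<mode>'; needs root."""
    os.sync()  # flush dirty pages first so clean cache can be dropped
    with open(proc_path, "w") as f:
        f.write(f"{mode}\n")
```

A typical use would be to call find_eio_files() on a build tree, call
drop_caches(1) or drop_caches(2) if the list is non-empty, and then run
fails_with_eio() again on each listed path. If the errors are spurious in
the sense described here, the second pass should report no failures.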
This seems to happen more often when snapshots are being deleted;
however, it occurs on systems with no snapshots as well (though
in these cases the system had snapshots in the past).
When a file returns EIO on read, other snapshots of the same file also
return EIO on read. I have not been able to test whether this affects
reflink copies (clones) as well.
Observed on kernels 3.17 through 3.18.3. All affected filesystems use
skinny-metadata. No filesystem without skinny-metadata seems to have
this problem.