From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: linux-btrfs@vger.kernel.org
Subject: spurious I/O errors from btrfs...at the caching layer?
Date: Sat, 24 Jan 2015 13:06:01 -0500
Message-ID: <20150124180601.GA15018@hungrycats.org>

I am seeing a lot of spurious I/O errors that look like they come from
the cache-facing side of btrfs. While running a heavy load with some
extent-sharing (e.g. building 20 Linux kernels at once from source trees
copied with 'cp -a --reflink=always'), some files will return spurious
EIO on read. It happens often enough to make a Linux kernel build fail
about 1/3 of the time.

I believe the I/O errors to be spurious because:

    - there is no kernel message of any kind during the event
    - scrub detects 0 errors
    - device stats report 0 errors
    - the drive firmware reports nothing wrong through SMART
    - there seems to be no attempt to read the disk when the error
      is reported
    - "sysctl vm.drop_caches={1,2}" makes the I/O error go away.

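For anyone who wants to repeat the "device stats report 0 errors" check
from the list above, here is a minimal sketch in Python. It assumes the
output of 'btrfs device stats <mountpoint>' has lines of the form
"[/dev/sdX].counter_name  N"; the function name nonzero_device_stats is
made up for illustration and is not part of btrfs-progs.

```python
import re

# Assumed line format: "[/dev/sdb].read_io_errs    0"
STAT_LINE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<name>\w+)\s+(?P<value>\d+)$")


def nonzero_device_stats(text):
    """Return {(device, counter): value} for every counter that is not 0.

    `text` is the captured output of 'btrfs device stats <mountpoint>'.
    Lines that do not match the assumed format are ignored.
    """
    bad = {}
    for line in text.splitlines():
        m = STAT_LINE.match(line.strip())
        if m and int(m.group("value")) != 0:
            bad[(m.group("dev"), m.group("name"))] = int(m.group("value"))
    return bad
```

An empty result corresponds to the "0 errors" case described above; it
says nothing about errors that never reach the device counters.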
Files become unreadable at random, and stay unreadable indefinitely;
however, any time I discover a file that gives EIO on read, I can
poke vm.drop_caches and make the EIO go away. The file can then be
read normally and has correct contents. The disk does not seem to be
involved in the I/O error return.
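
The workaround described above can be sketched as follows. This is only
an illustration in Python, not a fix: the helper names (read_all,
drop_caches, read_with_cache_drop) are hypothetical, and writing to
/proc/sys/vm/drop_caches is assumed to be equivalent to the sysctl
call and needs root.

```python
import errno


def read_all(path, chunk_size=1 << 20):
    """Read the whole file and return its size in bytes (raises OSError)."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return total
            total += len(chunk)


def drop_caches(mode=1):
    """Same effect as 'sysctl vm.drop_caches=<mode>'; requires root."""
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write(str(mode))


def read_with_cache_drop(path, reader=read_all, dropper=drop_caches):
    """Read `path`; on EIO, drop caches once and try again.

    Returns (bytes_read, retried). Any error other than EIO, or a second
    EIO after the cache drop, is raised to the caller.
    """
    try:
        return reader(path), False
    except OSError as e:
        if e.errno != errno.EIO:
            raise
    dropper()
    return reader(path), True
```

If the second read succeeds with retried set to True, that matches the
behaviour reported here: the error went away without the on-disk data
changing. The reader and dropper parameters exist only so the logic can
be exercised without root.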

This seems to happen more often when snapshots are being deleted;
however, it occurs on systems with no snapshots as well (though
in these cases the system had snapshots in the past).

When a file returns EIO on read, other snapshots of the same file also
return EIO on read. I have not been able to test whether this affects
reflink copies (clones) as well.

Observed on kernels 3.17 through 3.18.3. All affected filesystems use
skinny-metadata. No filesystem without skinny-metadata seems to have
this problem.