linux-btrfs.vger.kernel.org archive mirror
From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: linux-btrfs@vger.kernel.org
Subject: spurious I/O errors from btrfs...at the caching layer?
Date: Sat, 24 Jan 2015 13:06:01 -0500
Message-ID: <20150124180601.GA15018@hungrycats.org>

I am seeing a lot of spurious I/O errors that look like they come from
the cache-facing side of btrfs.  While running a heavy load with some
extent-sharing (e.g. building 20 Linux kernels at once from source trees
copied with 'cp -a --reflink=always'), some files return spurious
EIO on read.  It happens often enough to make a Linux kernel build
fail about 1/3 of the time.

I believe the I/O errors to be spurious because:

	- there is no kernel message of any kind during the event

	- scrub detects 0 errors

	- device stats report 0 errors

	- the drive firmware reports nothing wrong through SMART

	- there seems to be no attempt to read the disk when the error
	is reported

	- "sysctl vm.drop_caches={1,2}" makes the I/O error go away.
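The checks in the list above could be gathered in one pass with something
like the following.  This is only a sketch: the function names, the
mountpoint and the device path are placeholders, not anything from the
affected systems, and 'btrfs scrub status' only reports the result of the
most recent scrub rather than starting a new one.

```python
import subprocess


def evidence_commands(mountpoint, device):
    """Build the argv lists for the checks; both arguments are placeholders."""
    return [
        ["dmesg"],                                 # kernel messages around the event
        ["btrfs", "scrub", "status", mountpoint],  # result of the most recent scrub
        ["btrfs", "device", "stats", mountpoint],  # per-device error counters
        ["smartctl", "-H", "-A", device],          # drive health and SMART attributes
    ]


def collect(mountpoint, device):
    """Run each command; return (argv, exit status, combined output) tuples."""
    results = []
    for argv in evidence_commands(mountpoint, device):
        try:
            proc = subprocess.run(argv, capture_output=True, text=True)
            results.append((argv, proc.returncode, proc.stdout + proc.stderr))
        except FileNotFoundError:
            # The tool is not installed on this machine.
            results.append((argv, None, "command not found"))
    return results
```

Most of these commands need root.  A call such as
collect("/mnt/test", "/dev/sdX") would be made right after a read returns
EIO, so that the output reflects the state at the time of the error.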

Files become unreadable at random, and stay unreadable indefinitely;
however, any time I discover a file that gives EIO on read, I can
poke vm.drop_caches and make the EIO go away.  The file can then be
read normally and has correct contents.  The disk does not seem to be
involved in the I/O error return.
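A minimal sketch of that check-and-retry procedure, assuming a Linux
system and root privileges for the cache drop.  The helper names are
invented for illustration; writing to /proc/sys/vm/drop_caches is the
same operation as 'sysctl vm.drop_caches=N'.

```python
import errno
import os
import sys


def read_errno(path, chunk_size=1 << 20):
    """Read a file to the end; return 0 on success, else the errno of the failure."""
    try:
        with open(path, "rb") as f:
            while f.read(chunk_size):
                pass
    except OSError as e:
        return e.errno
    return 0


def find_eio_files(root):
    """Return the regular files under root whose read fails with EIO."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            if read_errno(path) == errno.EIO:
                bad.append(path)
    return bad


def drop_caches(mode=1):
    """Same effect as 'sysctl vm.drop_caches=<mode>'; requires root."""
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write(f"{mode}\n")


def main(root):
    bad = find_eio_files(root)
    if not bad:
        print("no EIO on read")
        return
    drop_caches(1)
    for path in bad:
        status = read_errno(path)
        result = "readable again" if status == 0 else f"still failing (errno {status})"
        print(f"{path}: {result}")


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

Run against a source tree, this lists each file that returned EIO and
whether it became readable after the cache drop.  It does not compare
file contents, so it cannot confirm that the data read back is correct.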

This seems to happen more often when snapshots are being deleted;
however, it occurs on systems with no snapshots as well (though
in these cases the system had snapshots in the past).

When a file returns EIO on read, other snapshots of the same file also
return EIO on read.  I have not been able to test whether this affects
reflink copies (clones) as well.

Observed on kernels 3.17 through 3.18.3.  All filesystems affected use skinny-metadata.
No filesystems that are not using skinny-metadata seem to have this
problem.

Thread overview: 4+ messages
2015-01-24 18:06 Zygo Blaxell [this message]
2015-01-25 16:50 ` spurious I/O errors from btrfs...at the caching layer? Zygo Blaxell
2015-01-26  4:22   ` Resolved...ish. was: Re: spurious " Zygo Blaxell
2015-01-26 12:39     ` Austin S Hemmelgarn
