public inbox for linux-kernel@vger.kernel.org
From: Josh Triplett <josh@joshtriplett.org>
To: "Theodore Ts'o" <tytso@mit.edu>,
	Lukas Czerner <lczerner@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: Re: Nature of ext4 corruption fixed by recent patch?
Date: Tue, 19 May 2015 09:37:40 -0700
Message-ID: <20150519163739.GC2598@x>
In-Reply-To: <20150519134005.GB20421@thunk.org>

On Tue, May 19, 2015 at 09:40:05AM -0400, Theodore Ts'o wrote:
> On Mon, May 18, 2015 at 03:58:24PM -0700, josh@joshtriplett.org wrote:
> > 
> > I recently had my server's filesystem implode, and I'm currently in the
> > process of cleaning it up.  It had widespread corruption in files and
> > directories scattered across the filesystem, though all vaguely recently
> > changed.  Directories appeared corrupted or truncated, various files
> > showed up as piles of NULs, and 5000+ files and directories ended up in
> > lost+found.  I observed this corruption shortly after a reboot into
> > 4.0.2 (from a previous kernel of 3.16), with ext4 noticing an
> > inconsistency and mounting the filesystem read-only.  The underlying
> > disks had no errors.
> > 
> > Reading about the corruption issue fixed by
> > d2dc317d564a46dfc683978a2e5a4f91434e9711 ("ext4: fix data corruption
> > caused by unwritten and delayed extents"), it sounds plausible.  Can
> > that strike both file data and directory data, assuming all of that data
> > ended up grouped with a delayed extent?  Would that bug manifest as
> > corrupted directories and files filled with NULs?  The system is a
> > 72-way server on which I was doing piles of parallel git pulls and
> > builds, so hitting a race seems plausible.
> 
> Unfortunately, I don't think you can blame all of your problems on the
> bug fixed by this particular patch.  First of all, it doesn't apply to
> directories at all; secondly, it's been around for a long time.  I'd
> have to check and see whether or not 3.16 had the problem, but it
> wouldn't surprise me at all.  Finally, git pulls and builds are not
> at all likely to hit the problem.
>
> It requires the combination of (a) writing to a portion of a file that
> was not previously allocated using buffered I/O, (b) an fallocate of a
> region of the file which is a superset of the region written in (a)
> before it has a chance to be written to disk, (c) waiting for the file data in
> (a) to be written out to disk (either via fsync or via the writeback
> daemons), and then (d) before the extent status cache gets pushed out
> of memory, another random write to a portion of the file covered by
> (a) -- in which case that specific portion of (a) could be replaced by
> all zeros.
> 
> Even most database workloads or torrent downloads are not likely to hit
> this pattern, since it requires an fallocate of a previously (and
> very recently) allocated region of a file using a buffered write.
> Torrent downloads will tend to fallocate the whole file in advance,
> and while Oracle or DB2 might intermix writes and fallocates, they
> don't fallocate previously written regions of the file, and they use
> direct I/O in any case.
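
For reference, the order of operations in (a)-(d) above can be sketched
roughly as below.  This is only an illustration of the syscall sequence,
not a reproducer: the offsets, sizes, the BLOCK value, and the helper names
run_pattern() and expected() are all made up for the example, and the
read-back at the end may well be served from the page cache, so it would
not necessarily show on-disk damage even on an affected kernel.

```python
import os
import tempfile

BLOCK = 4096  # assumed block size for illustration; the real one depends on the fs


def run_pattern(path):
    """Issue steps (a)-(d) once on `path`; return the bytes read back from region (a)."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        # (a) buffered write into a part of the file with no blocks allocated yet
        os.pwrite(fd, b"A" * BLOCK, BLOCK)
        # (b) fallocate a superset of that region before writeback has happened
        os.posix_fallocate(fd, 0, 4 * BLOCK)
        # (c) force the data from (a) out to disk
        os.fsync(fd)
        # (d) another write that lands inside the region written in (a)
        os.pwrite(fd, b"B" * 512, BLOCK + 1024)
        os.fsync(fd)
        return os.pread(fd, BLOCK, BLOCK)
    finally:
        os.close(fd)


def expected():
    """What region (a) should contain if nothing was replaced by zeros."""
    buf = bytearray(b"A" * BLOCK)
    buf[1024:1024 + 512] = b"B" * 512
    return bytes(buf)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        data = run_pattern(os.path.join(tmp, "testfile"))
        print("intact" if data == expected() else "region (a) was altered")
```

Nothing in a parallel git pull or a build issues an fallocate over freshly
written data like step (b) does, which matches the point that those
workloads are unlikely to hit this pattern.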

Ah, thanks for the clarification. :(

In particular, I didn't realize this was *only* the data of the
delayed-extent-based files.  The bug here seems to have struck various
recently-written files and directories.  (Recent in days, not seconds,
as far as I can tell; and it isn't universal based on age.) The initial
symptom was ext4 noticing that a directory was corrupt (truncated, IIRC)
and immediately marking the whole filesystem read-only.

> So it's pretty hard to hit this bug by accident, unless you happen to
> be using fsx, and even then, the only files that would get corrupted
> would be the files being written using fsx.  So I'm afraid you'll have
> to look farther afield, and consider other bugs as well as potential
> hardware problems before trusting the system again.

I'm quite skeptical of hardware problems.  The system is a few months
old, well past infant-mortality and too young for burnout.  And I've
tested the disks carefully.

Are there any other known bugs that seem likely to fit the symptoms and
circumstances?

Note that since I saw this after rebooting from 3.16 into 4.0.2, I don't
know whether the corruption was more likely caused by 3.16 or 4.0.2.

> P.S.  Bugs like these are why I'm always amused by people
> who think that just because a file system is safely being used by
> its developers, it's safe to throw production workloads on
> it.

Heh.  Yeah, I like exciting new software in most areas, but not in
filesystems.  In filesystems I prefer boring. :)

> These sorts of subtle data corruptors tend to be highly timing
> dependent, and very hard to find.  Sometimes these bugs can hang around
> for years before they are found and fixed.  The flip side is that
> fortunately, they tend to strike very rarely.

...lucky me.

> It's also why I'm very
> grateful for developers like Jan and Lukas.  :-)

Indeed.

- Josh Triplett
