From: Josh Triplett <josh@joshtriplett.org>
To: "Theodore Ts'o" <tytso@mit.edu>,
	Lukas Czerner <lczerner@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: Re: Nature of ext4 corruption fixed by recent patch?
Date: Tue, 19 May 2015 09:37:40 -0700
Message-ID: <20150519163739.GC2598@x>
In-Reply-To: <20150519134005.GB20421@thunk.org>

On Tue, May 19, 2015 at 09:40:05AM -0400, Theodore Ts'o wrote:
> On Mon, May 18, 2015 at 03:58:24PM -0700, josh@joshtriplett.org wrote:
> > 
> > I recently had my server's filesystem implode, and I'm currently in the
> > process of cleaning it up.  It had widespread corruption in files and
> > directories scattered across the filesystem, though all vaguely recently
> > changed.  Directories appeared corrupted or truncated, various files
> > showed up as piles of NULs, and 5000+ files and directories ended up in
> > lost+found.  I observed this corruption shortly after a reboot into
> > 4.0.2 (from a previous kernel of 3.16), with ext4 noticing an
> > inconsistency and mounting the filesystem read-only.  The underlying
> > disks had no errors.
> > 
> > Reading about the corruption issue fixed by
> > d2dc317d564a46dfc683978a2e5a4f91434e9711 ("ext4: fix data corruption
> > caused by unwritten and delayed extents"), it sounds plausible.  Can
> > that strike both file data and directory data, assuming all of that data
> > ended up grouped with a delayed extent?  Would that bug manifest as
> > corrupted directories and files filled with NULs?  The system is a
> > 72-way server on which I was doing piles of parallel git pulls and
> > builds, so hitting a race seems plausible.
> 
> Unfortunately, I don't think you can blame all of your problems on the
> bug fixed by this particular patch.  First of all, it doesn't apply to
> directories at all; secondly, it's been around for a long time.  I'd
> have to check and see whether or not 3.16 had the problem, but it
> wouldn't surprise me at all.  Finally, git pulls and builds are not
> at all likely to hit the problem.
>
> It requires the combination of (a) writing to a portion of a file that
> was not previously allocated using buffered I/O, (b) an fallocate of a
> region of the file which is a superset of the region written in (a) before
> it has a chance to be written to disk, (c) waiting for the file data in
> (a) to be written out to disk (either via fsync or via the writeback
> daemons), and then (d) before the extent status cache gets pushed out
> of memory, another random write to a portion of the file covered by
> (a) -- in which case that specific portion of (a) could be replaced by
> all zeros.
> 
> Even most database workloads or torrent downloads are not likely to hit
> this pattern, since it requires an fallocate of a previously (and
> very recently) allocated region of a file using a buffered write.
> Torrent downloads will tend to fallocate the whole file in advance,
> and while Oracle or DB2 might intermix writes and fallocates, they
> don't fallocate previously written regions of the file, and they use
> direct I/O in any case.
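
For illustration, the (a) through (d) sequence quoted above could be
sketched roughly as follows.  This is only a sketch under assumptions:
Python's os.posix_fallocate stands in for the fallocate call, and the
offsets, sizes, block size, and the name run_pattern are all made up for
the example.  It is not claimed to reproduce the bug, which depends on
timing and on the state of the extent status cache.

```python
import os
import tempfile

BLOCK = 4096  # assumed block size, chosen only for illustration


def run_pattern(path):
    """Issue the (a)-(d) I/O sequence on `path`.

    Returns True if the region written in (a) reads back as expected.
    All offsets and sizes are hypothetical.
    """
    first = b"A" * BLOCK
    second = b"B" * 512
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        # (a) buffered write to a part of the file never allocated before
        os.pwrite(fd, first, 4 * BLOCK)
        # (b) fallocate a superset of that region before writeback runs
        os.posix_fallocate(fd, 0, 16 * BLOCK)
        # (c) wait for the data from (a) to reach the disk
        os.fsync(fd)
        # (d) another write inside the region covered by (a)
        os.pwrite(fd, second, 4 * BLOCK + 1024)
        os.fsync(fd)
        # Ask the kernel to drop cached pages so the read is more likely
        # to come from disk; this is only a hint and may be ignored.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        expected = first[:1024] + second + first[1024 + 512:]
        return os.pread(fd, BLOCK, 4 * BLOCK) == expected
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        intact = run_pattern(os.path.join(d, "testfile"))
        print("data intact" if intact else "region read back differently")
```

On a kernel with the fix applied, the region should read back intact; a
single pass like this says little either way about an affected kernel.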

Ah, thanks for the clarification. :(

In particular, I didn't realize this was *only* the data of the
delayed-extent-based files.  The bug here seems to have struck various
recently-written files and directories.  (Recent in days, not seconds,
as far as I can tell; and it isn't universal based on age.) The initial
symptom was ext4 noticing that a directory was corrupt (truncated, IIRC)
and immediately marking the whole filesystem read-only.

> So it's pretty hard to hit this bug by accident, unless you happen to
> be using fsx, and even then, the only files that would get corrupted
> would be the files being written using fsx.  So I'm afraid you'll have
> to look farther afield, and consider other bugs as well as potential
> hardware problems before trusting the system again.

I'm quite skeptical of hardware problems.  The system is a few months
old, well past infant-mortality and too young for burnout.  And I've
tested the disks carefully.

Are there any other known bugs that seem likely to fit the symptoms and
circumstances?

Note that since I saw this after rebooting from 3.16 into 4.0.2, I don't
know whether the corruption was more likely caused by 3.16 or 4.0.2.

> P.S.  Bugs like these are why I'm always amused by people
> who think that just because a file system is safely being used by
> its developers, it's safe to throw production workloads on
> it.

Heh.  Yeah, I like exciting new software in most areas, but not in
filesystems.  In filesystems I prefer boring. :)

> These sorts of subtle data corruptors tend to be highly timing
> dependent, and very hard to find.  Sometimes these bugs can hang around
> for years before they are found and fixed.  The flip side is that
> fortunately, they tend to strike very rarely.

...lucky me.

> It's also why I'm very
> grateful for developers like Jan and Lukas.  :-)

Indeed.

- Josh Triplett
