From: Nix <nix@esperi.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: Linux 3.7-rc4
Date: Thu, 08 Nov 2012 15:06:14 +0000
Message-ID: <87625gqgm1.fsf@spindle.srvr.nix>
In-Reply-To: <CA+55aFxGm+2-OiVL1JRYH12BZiMBjiaST+9jrQCtHeU-iVvmyA@mail.gmail.com> (Linus Torvalds's message of "Sun, 4 Nov 2012 11:43:33 -0800")

On 4 Nov 2012, Linus Torvalds stated:

> Perhaps notable just because of the noise it caused in certain
> circles, there's the ext4 bitmap journaling fix for the issue that
> caused such a ruckus. It's a tiny patch and despite all the noise
> about it you couldn't actually trigger the problem unless you were
> doing crazy things with special mount options.

It also helps if you reboot during umount. Which is also crazy (says the
man who's still doing it). But the *real* problem is the way
journal_async_commit uses a journal checksum failure as an indication
that the commit was interrupted as long as there is no following commit
block. As the comment in
fs/jbd2/recovery.c:do_one_pass():JBD2_COMMIT_BLOCK says, this is going
to lead to an incorrect conclusion of an interrupted commit, and a
successful remount, whenever commit N is corrupt and commit N+1 is
interrupted (e.g. by some loony rebooting or powerfailing during umount).
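
To make the failure mode concrete, here is a toy model of the decision
described above. It is only a sketch of the logic as stated in this
mail, not the code in fs/jbd2/recovery.c: the enum, the function name
and its parameters are all hypothetical and exist purely for
illustration.

```c
#include <stdbool.h>

/* Hypothetical verdicts; these are not kernel identifiers. */
enum replay_verdict {
    REPLAY_COMMIT,      /* checksum matched: replay the transaction */
    REPLAY_STOP_CLEAN,  /* assumed interrupted commit: stop quietly, mount succeeds */
    REPLAY_FAIL         /* checksum failure reported: the user learns fsck is needed */
};

/*
 * Toy model of how a checksum result on commit N is interpreted.
 * Assumption: with async commit, a checksum mismatch is only treated
 * as real corruption if a later commit block (N+1) was found.
 */
enum replay_verdict judge_commit(bool checksum_ok,
                                 bool next_commit_block_found,
                                 bool async_commit)
{
    if (checksum_ok)
        return REPLAY_COMMIT;

    /* Mismatch and nothing after it: async commit assumes "interrupted". */
    if (async_commit && !next_commit_block_found)
        return REPLAY_STOP_CLEAN;

    /* Otherwise the mismatch is treated as a genuine failed commit. */
    return REPLAY_FAIL;
}
```

In this model the bad case is judge_commit(false, false, true): commit N
is genuinely corrupt, commit N+1 was interrupted so its commit block is
missing, and the result is REPLAY_STOP_CLEAN rather than REPLAY_FAIL.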

This problem seems to me to be intrinsic to journal_async_commit, since
it repurposes journal checksums to do a second job of
missing-commit-block detection. That pretty much means that *actual*
checksum failures, i.e. kernel bugs or corruption at writeout time, go
undetected, just as they do when journal checksumming is off -- but it
*also* means that errors computing the checksum can go undetected. And
since journal checksumming is rarely used, such bugs can persist for a
relatively long time.

All of this means that journal_async_commit is *more* likely to cause a
no-warnings remount of a corrupted filesystem that really needs fscking
than is a filesystem using a normal non-checksummed journal. And that,
to me, is the really dangerous part. If you know the fs is corrupt, you
can fsck it and all is well, after a bit of flak: you won't overlook it.
If you don't know the fs is corrupt, you run a substantial risk of
making things much much worse before the problem escalates from ext4
errors in a log you never read into -EIO. (I happened to be reading that
log because I was trying to reproduce the nsm lockd bug. But normally?
Yeah, I spend all my time reading the kernel log, doesn't everyone?)


I'd apologise for causing all the fuss, but it wasn't me who decided to
submit it to Phoronix (actually I suspect Michael Larabel just read the
list and everything snowballed from there).

-- 
NULL && (void)
