linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Eric Sandeen <sandeen@redhat.com>
To: "Theodore Ts'o" <tytso@mit.edu>, Nix <nix@esperi.org.uk>,
	linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org,
	"J. Bruce Fields" <bfields@fieldses.org>,
	"Bryan Schumaker" <bjschuma@netapp.com>,
	"Peng Tao" <bergwolf@gmail.com>,
	Trond.Myklebust@netapp.com, gregkh@linuxfoundation.org,
	"Toralf Förster" <toralf.foerster@gmx.de>
Subject: Re: Apparent serious progressive ext4 data corruption bug in 3.6.3 (and other stable branches?)
Date: Wed, 24 Oct 2012 12:22:28 -0500	[thread overview]
Message-ID: <508823D4.4010407@redhat.com> (raw)
In-Reply-To: <20121024052351.GB21714@thunk.org>

On 10/24/2012 12:23 AM, Theodore Ts'o wrote:
> On Tue, Oct 23, 2012 at 11:27:09PM -0500, Eric Sandeen wrote:
>>
>> Ok, fair enough.  If the BBU is working, nobarrier is ok; I don't trust
>> journal_async_commit, but that doesn't mean this isn't a regression.
> 
> Note that Toralf has reported almost exactly the same set of symptoms,
> but he's using an external USB stick --- and as far as I know he
> wasn't using nobarrier and/or the journal_async_commit.  Toralf, can
> you confirm what, if any, mount options you were using when you saw
> it.
> 
> I've been looking at this some more, and there's one other thing that
> the short circuit code does, which is neglects setting the
> JBD2_FLUSHED flag, which is used by the commit code to know when it
> needs to reset the s_start fields in the superblock when we make our
> next commit.  However, this would only happen if the short circuit
> code is getting hit some time other than when the file system is
> getting unmounted --- and that's what Eric and I can't figure out how
> it might be happening.  Journal flushes outside of an unmount does
> happen as part of online resizing, the FIBMAP ioctl, or when the file
> system is frozen.  But it didn't sound like Toralf or Nix was using
> any of those features.  (Toralf, Nix, please correct me if my
> assumptions here is wrong).

If I freeze w/ anything in the log, then s_start !=0 and we proceed
normally.  If I re-freeze w/o anything in the log, it's already set to
FLUSHED (which makes sense) so not re-setting it doesn't matter.  So I
don't see that that's an issue.

As for FIBMAP I think we only do journal_flush if it's data=journal.

In other news, Phoronix is on the case, so expect escalating freakouts ;)

-Eric

  parent reply	other threads:[~2012-10-24 17:22 UTC|newest]

Thread overview: 90+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-10-22 16:17 Heads-up: 3.6.2 / 3.6.3 NFS server panic: 3.6.2+ regression? Nix
2012-10-23  1:33 ` J. Bruce Fields
2012-10-23 14:07   ` Nix
2012-10-23 14:30     ` J. Bruce Fields
2012-10-23 16:32       ` Heads-up: 3.6.2 / 3.6.3 NFS server oops: 3.6.2+ regression? (also an unrelated ext4 data loss bug) Nix
2012-10-23 16:46         ` J. Bruce Fields
2012-10-23 16:54           ` J. Bruce Fields
2012-10-23 16:56           ` Myklebust, Trond
2012-10-23 17:05             ` Nix
2012-10-23 17:36               ` Nix
2012-10-23 17:43                 ` J. Bruce Fields
2012-10-23 17:44                 ` Myklebust, Trond
2012-10-23 17:57                   ` Myklebust, Trond
     [not found]                   ` <1351015039.4622.23.camel@lade.trondhjem.org>
2012-10-23 18:23                     ` Myklebust, Trond
2012-10-23 19:49                       ` Nix
2012-10-24 10:18                         ` [PATCH] lockd: fix races in per-net NSM client handling Stanislav Kinsbursky
2012-10-23 20:57         ` Apparent serious progressive ext4 data corruption bug in 3.6.3 (and other stable branches?) Nix
2012-10-23 22:19           ` Theodore Ts'o
2012-10-23 22:47             ` Nix
2012-10-23 23:16               ` Theodore Ts'o
2012-10-23 23:06             ` Nix
2012-10-23 23:28               ` Theodore Ts'o
2012-10-23 23:34                 ` Nix
2012-10-24  0:57             ` Eric Sandeen
2012-10-24 20:17               ` Jan Kara
2012-10-26 15:25                 ` Eric Sandeen
2012-10-24 19:13             ` Jannis Achstetter
2012-10-24 21:31               ` Theodore Ts'o
2012-10-24 22:05                 ` Jannis Achstetter
2012-10-24 23:47                 ` Nix
2012-10-25 17:02                 ` Felipe Contreras
2012-10-24 21:04             ` Jannis Achstetter
2012-10-24  1:13           ` Eric Sandeen
2012-10-24  4:15             ` Nix
2012-10-24  4:27               ` Eric Sandeen
2012-10-24  5:23                 ` Theodore Ts'o
2012-10-24  7:00                   ` Hugh Dickins
2012-10-24 11:46                     ` Nix
2012-10-24 11:45                   ` Nix
2012-10-24 17:22                   ` Eric Sandeen [this message]
2012-10-24 19:49                   ` Nix
2012-10-24 19:54                     ` Nix
2012-10-24 20:30                     ` Eric Sandeen
2012-10-24 20:34                       ` Nix
2012-10-24 20:45                     ` Nix
2012-10-24 21:08                     ` Theodore Ts'o
2012-10-24 23:27                       ` Apparent serious progressive ext4 data corruption bug in 3.6 (when rebooting during umount) Nix
2012-10-24 23:42                         ` Nix
2012-10-25  1:10                         ` Theodore Ts'o
2012-10-25  1:45                           ` Nix
2012-10-25 14:12                             ` Theodore Ts'o
2012-10-25 14:15                               ` Nix
2012-10-25 17:39                                 ` Nix
2012-10-25 11:06                           ` Nix
2012-10-26  0:22                           ` Apparent serious progressive ext4 data corruption bug in 3.6 (when rebooting during umount) (possibly blockdev / arcmsr at fault??) Nix
2012-10-26  0:11               ` Apparent serious progressive ext4 data corruption bug in 3.6.3 (and other stable branches?) Ric Wheeler
2012-10-26  0:43                 ` Theodore Ts'o
2012-10-26 12:12                   ` Nix
2012-10-26 20:35           ` Eric Sandeen
2012-10-26 20:37             ` Nix
2012-10-26 20:56               ` Theodore Ts'o
2012-10-26 20:59                 ` Nix
2012-10-26 21:15                   ` Theodore Ts'o
2012-10-26 21:19                     ` Nix
2012-10-27  0:22                       ` Theodore Ts'o
2012-10-27 12:45                         ` Nix
2012-10-27 17:55                           ` Theodore Ts'o
2012-10-27 18:47                             ` Nix
2012-10-27 21:19                               ` Eric Sandeen
2012-10-27 22:42                                 ` Eric Sandeen
2012-10-29  1:00                                   ` Theodore Ts'o
2012-10-29  1:04                                     ` Nix
2012-10-29  2:24                                     ` Eric Sandeen
2012-10-29  2:34                                       ` Theodore Ts'o
2012-10-29  2:35                                         ` Eric Sandeen
2012-10-29  2:42                                           ` Theodore Ts'o
2012-10-27 18:30                           ` Eric Sandeen
2012-10-27  3:11                     ` Jim Rees
2012-10-28  4:23           ` [PATCH] ext4: fix unjournaled inode bitmap modification Eric Sandeen
2012-10-28 13:59             ` Nix
2012-10-29  2:30             ` [PATCH -v3] " Theodore Ts'o
2012-10-29  3:24               ` Eric Sandeen
2012-10-29 17:08               ` Darrick J. Wong
     [not found] <jXsTo-5lW-13@gated-at.bofh.it>
     [not found] ` <jXBDk-7vn-13@gated-at.bofh.it>
     [not found]   ` <jXNl8-5m5-13@gated-at.bofh.it>
     [not found]     ` <jXNOa-5MR-23@gated-at.bofh.it>
     [not found]       ` <jXPGh-87s-5@gated-at.bofh.it>
     [not found]         ` <jXTJW-4CH-55@gated-at.bofh.it>
     [not found]           ` <jXUZj-6mo-13@gated-at.bofh.it>
     [not found]             ` <jXVLH-7kO-5@gated-at.bofh.it>
     [not found]               ` <jXW53-7CC-5@gated-at.bofh.it>
     [not found]                 ` <jXWeJ-7Lk-1@gated-at.bofh.it>
2012-10-24 17:38                   ` Apparent serious progressive ext4 data corruption bug in 3.6.3 (and other stable branches?) Martin
2012-10-26 20:13                     ` Martin
2012-10-26 20:24                       ` Nix
2012-10-26 20:44                         ` Martin
2012-10-26 20:47                           ` Nix
2012-10-26 21:10                       ` Theodore Ts'o
2012-10-26 23:15                         ` Martin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=508823D4.4010407@redhat.com \
    --to=sandeen@redhat.com \
    --cc=Trond.Myklebust@netapp.com \
    --cc=bergwolf@gmail.com \
    --cc=bfields@fieldses.org \
    --cc=bjschuma@netapp.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nix@esperi.org.uk \
    --cc=toralf.foerster@gmx.de \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).