Re: Apparent serious progressive ext4 data corruption bug in 3.6.3 (and other stable branches?)

From: Eric Sandeen <sandeen@redhat.com>
To: Jan Kara <jack@suse.cz>
Cc: "Theodore Ts'o" <tytso@mit.edu>, Nix <nix@esperi.org.uk>,
	linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org,
	"J. Bruce Fields" <bfields@fieldses.org>,
	"Bryan Schumaker" <bjschuma@netapp.com>,
	"Peng Tao" <bergwolf@gmail.com>,
	Trond.Myklebust@netapp.com, gregkh@linuxfoundation.org,
	"Toralf Förster" <toralf.foerster@gmx.de>,
	stable@vger.kernel.org
Subject: Re: Apparent serious progressive ext4 data corruption bug in 3.6.3 (and other stable branches?)
Date: Fri, 26 Oct 2012 10:25:44 -0500
Message-ID: <508AAB78.5030505@redhat.com>
In-Reply-To: <20121024201717.GA5572@quack.suse.cz>

On 10/24/12 3:17 PM, Jan Kara wrote:
> On Tue 23-10-12 19:57:09, Eric Sandeen wrote:
>> On 10/23/12 5:19 PM, Theodore Ts'o wrote:
>>> On Tue, Oct 23, 2012 at 09:57:08PM +0100, Nix wrote:
>>>>
>>>> It is now quite clear that this is a bug introduced by one or more of
>>>> the post-3.6.1 ext4 patches (which have all been backported at least to
>>>> 3.5, so the problem is probably there too).
>>>>
>>>> [   60.290844] EXT4-fs error (device dm-3): ext4_mb_generate_buddy:741: group 202, 1583 clusters in bitmap, 1675 in gd
>>>> [   60.291426] JBD2: Spotted dirty metadata buffer (dev = dm-3, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
>>>>
>>>
>>> I think I've found the problem.  I believe the commit at fault is commit
>>> 14b4ed22a6 (upstream commit eeecef0af5e):
>>>
>>>     jbd2: don't write superblock when if its empty
>>>
>>> which first appeared in v3.6.2.
>>>
>>> The reason why the problem happens rarely is that the effect of the
>>> buggy commit is that if the journal's starting block is zero, we fail
>>> to truncate the journal when we unmount the file system.  This can
>>> happen if we mount and then unmount the file system fairly quickly,
>>> before the log has a chance to wrap.  After the first time this has
>>> happened, it's not a disaster, since when we replay the journal, we'll
>>> just replay some extra transactions.  But if this happens twice, the
>>> oldest valid transaction will still not have gotten updated, but some
>>> of the newer transactions from the last mount session will have gotten
>>> written by the very latest transactions, and when we then try to do
>>> the extra transaction replays, the metadata blocks can end up getting
>>> very scrambled indeed.
>>
>> I'm stumped by this; maybe Ted can see if I'm missing something.
>>
>> (and Nix, is there anything special about your fs?  Any nondefault
>> mkfs or mount options, external journal, inordinately large fs, or
>> anything like that?)
>>
>> The suspect commit added this in jbd2_mark_journal_empty():
>>
>>         /* Is it already empty? */
>>         if (sb->s_start == 0) {
>>                 read_unlock(&journal->j_state_lock);
>>                 return;
>>         }
>>
>> thereby short circuiting the function.
>>
>> But Ted's suggestion that mounting the fs, doing a little work, and
>> unmounting before we wrap would lead to this doesn't make sense to
>> me.  When I do a little work, s_start is at 1, not 0.  We start
>> the journal at s_first:
>>
>> load_superblock()
>> 	journal->j_first = be32_to_cpu(sb->s_first);
>>
>> And when we wrap the journal, we wrap back to j_first:
>>
>> jbd2_journal_next_log_block():
>>         if (journal->j_head == journal->j_last)
>>                 journal->j_head = journal->j_first;
>>
>> and j_first comes from s_first, which is set at journal creation
>> time to be "1" for an internal journal.
>>
>> So s_start == 0 sure looks special to me; so far I can only see that
>> we get there if we've been through jbd2_mark_journal_empty() already,
>> though I'm eyeballing jbd2_journal_get_log_tail() as well.
>>
>> Ted's proposed patch seems harmless but so far I don't understand
>> what problem it fixes, and I cannot recreate getting to
>> jbd2_mark_journal_empty() with a dirty log and s_start == 0.
>   Agreed. I rather think we might miss journal->j_flags |= JBD2_FLUSHED
> when shortcircuiting jbd2_mark_journal_empty(). But I still don't exactly
> see how that would cause the corruption...

Agreed, except so far I cannot see any way to get here with s_start == 0
without ALREADY having JBD2_FLUSHED set.  Can you?
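
The wrap argument quoted above can be checked with a toy userspace model.
This is only a sketch under stated assumptions, not kernel code: struct
toy_journal and the toy_* helpers are hypothetical names made up for
illustration, and only the wrap test mirrors the quoted
jbd2_journal_next_log_block() fragment.  It shows that with first == 1 the
head cycles through [first, last) and never lands on block 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model; the fields loosely echo j_first, j_last
 * and j_head, but this is not the kernel's journal_t. */
struct toy_journal {
    uint32_t first;  /* first usable log block (1 for an internal journal) */
    uint32_t last;   /* one past the last usable log block */
    uint32_t head;   /* next log block to be written */
};

/* Hand out the current head block, then advance, wrapping back to
 * `first` in the same way as the quoted fragment. */
static uint32_t toy_next_log_block(struct toy_journal *j)
{
    uint32_t blk = j->head;

    j->head++;
    if (j->head == j->last)
        j->head = j->first;
    return blk;
}

/* Write n blocks; return 1 if the head ever landed on block 0. */
static int toy_head_ever_zero(struct toy_journal *j, unsigned int n)
{
    int seen_zero = 0;

    for (unsigned int i = 0; i < n; i++) {
        toy_next_log_block(j);
        /* The head must stay inside the usable log area. */
        assert(j->head >= j->first && j->head < j->last);
        if (j->head == 0)
            seen_zero = 1;
    }
    return seen_zero;
}
```

Starting from { .first = 1, .last = 8, .head = 1 }, toy_head_ever_zero()
returns 0 however many blocks are written.  That fits the point that
s_start == 0 should only appear once the journal has already been marked
empty, though the model says nothing about jbd2_journal_get_log_tail().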

Anyway, I think the problem is still poorly understood; lots of random facts
floating about, and a pretty weird use case with nonstandard/dangerous mount
options.  I do want to figure out what regressed (if anything) but so far
this investigation doesn't seem very methodical.

-Eric

> 								Honza
>
