* [Bug 15576] New: Data Loss (flex_bg and ext4_mb_generate_buddy errors)
@ 2010-03-19  1:05 bugzilla-daemon
  2010-03-21 23:43 ` tytso
  0 siblings, 1 reply; 2+ messages in thread
From: bugzilla-daemon @ 2010-03-19  1:05 UTC (permalink / raw)
  To: linux-ext4

http://bugzilla.kernel.org/show_bug.cgi?id=15576

           Summary: Data Loss (flex_bg and ext4_mb_generate_buddy errors)
           Product: File System
           Version: 2.5
    Kernel Version: Linux elitemx-desktop 2.6.31-20-generic #58-Ubuntu SMP
                    Fri Mar 12 05:23:09 UTC 2010 i686 GNU/Linux
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: ext4
        AssignedTo: fs_ext4@kernel-bugs.osdl.org
        ReportedBy: xpenev@gmail.com
        Regression: No


# create a 484 cylinder disk [3.7 GB]
dd of=disk.bin bs=512 count=0 seek=$((484*255*63))

# associate with loop device
losetup /dev/loop0 disk.bin

# generate bad blocks file [600 MB]
for((i=360491;i<=497992;i++)); do echo $i; done > omit

# format disk with ext4
mkfs.ext4 -l omit /dev/loop0

# mount disk
mkdir foobar; mount /dev/loop0 foobar

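# create a 2 GB file (the output name "zeros.bin" is a placeholder; the
# command as originally posted had no of= and so wrote to stdout)
cd foobar; dd if=/dev/zero of=zeros.bin bs=1024 count=$((1024 * 1024 * 2))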

# check dmesg
[ 9200.006021] EXT4-fs error (device loop0): ext4_mb_generate_buddy: EXT4-fs:
group 12: 0 blocks in bitmap, 2 in gd
[ 9200.010311] EXT4-fs error (device loop0): ext4_mb_generate_buddy: EXT4-fs:
group 13: 0 blocks in bitmap, 2 in gd
[ 9200.010359] EXT4-fs error (device loop0): ext4_mb_generate_buddy: EXT4-fs:
group 14: 0 blocks in bitmap, 2 in gd
[ 9200.010683] EXT4-fs error (device loop0): ext4_mb_generate_buddy: EXT4-fs:
group 15: 9911 blocks in bitmap, 9913 in gd
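
The group numbers in these messages line up with the bad-block range
passed to mkfs.ext4. A rough sketch of that arithmetic follows. It
assumes the mke2fs defaults of 4 KiB blocks and 32768 blocks per group,
neither of which is stated in the report, and the helper name group_of
is made up for illustration.

```python
# Assumed mke2fs defaults (not stated in the report): 4 KiB blocks,
# which gives 32768 blocks per block group.
BLOCKS_PER_GROUP = 32768

# Range written to the "omit" bad blocks file in the reproducer.
first_bad, last_bad = 360491, 497992


def group_of(block):
    """Return the block group that holds a given block number."""
    return block // BLOCKS_PER_GROUP


first_group = group_of(first_bad)
last_group = group_of(last_bad)

# Groups whose whole block range lies inside the bad-block range.
fully_bad = [
    g
    for g in range(first_group, last_group + 1)
    if g * BLOCKS_PER_GROUP >= first_bad
    and (g + 1) * BLOCKS_PER_GROUP - 1 <= last_bad
]

print(first_group, last_group, fully_bad)
```

Under those assumptions the bad range starts in group 11 and ends in
group 15, and groups 12 through 14 are covered entirely, which would be
consistent with the "0 blocks in bitmap" lines for those three groups.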

Worse off, however, if rather than creating a 2 GB file, you use this partition
as the target root partition for installation using the latest [32-bit] Ubuntu
installer ... consistently at 57 percent of the install ext4 reports data loss.

[ 1129.344600] EXT4-fs error (device sda1): ext4_mb_generate_buddy: EXT4-fs:
group 12: 0 blocks in bitmap, 2 in gd
[ 1129.344626] Aborting journal on device sda1:8.
[ 1129.380671] EXT4-fs error (device sda1): ext4_journal_start_sb: Detected
aborted journal
[ 1129.380697] EXT4-fs (sda1): Remounting filesystem read-only
[ 1129.492154] EXT4-fs (sda1): Remounting filesystem read-only
[ 1129.542049] EXT4-fs error (device sda1): ext4_mb_generate_buddy: EXT4-fs:
group 13: 0 blocks in bitmap, 2 in gd
[ 1129.554043] EXT4-fs error (device sda1): ext4_mb_generate_buddy: EXT4-fs:
group 14: 0 blocks in bitmap, 2 in gd
[ 1129.574283] EXT4-fs error (device sda1): ext4_mb_generate_buddy: EXT4-fs:
group 15: 9911 blocks in bitmap, 9913 in gd
[ 1129.574343] mpage_da_map_blocks block allocation failed for inode 41510 at
logical offset 0 with max blocks 6 with error -30
[ 1129.574352] This should not happen.!! Data will be lost
[ 1129.574393] ext4_da_writepages: jbd2_start: 1000 pages, ino 41510; err -30
[ 1129.574406] Pid: 11796, comm: pdflush Not tainted 2.6.31-14-generic
#48-Ubuntu
[ 1129.574414] Call Trace:
[ 1129.574440]  [<c056e41c>] ? printk+0x18/0x1c
[ 1129.574456]  [<c0266162>] ext4_da_writepages+0x452/0x490
[ 1129.574474]  [<c01ba551>] do_writepages+0x21/0x40
[ 1129.574489]  [<c02033fe>] writeback_single_inode+0x16e/0x3d0
[ 1129.574503]  [<c0150510>] ? process_timeout+0x0/0x10
[ 1129.574515]  [<c0203afd>] generic_sync_sb_inodes+0x38d/0x4a0
[ 1129.574528]  [<c0203ced>] writeback_inodes+0x4d/0xe0
[ 1129.574539]  [<c01b9432>] wb_kupdate+0xa2/0x110
[ 1129.574551]  [<c01bac27>] __pdflush+0xf7/0x1f0
[ 1129.574562]  [<c01bad20>] ? pdflush+0x0/0x40
[ 1129.574573]  [<c01bad20>] ? pdflush+0x0/0x40
[ 1129.574583]  [<c01bad59>] pdflush+0x39/0x40
[ 1129.574594]  [<c01b9390>] ? wb_kupdate+0x0/0x110
[ 1129.574606]  [<c015bf8c>] kthread+0x7c/0x90
[ 1129.574616]  [<c015bf10>] ? kthread+0x0/0x90
[ 1129.574630]  [<c0104007>] kernel_thread_helper+0x7/0x10

-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching the assignee of the bug.


* Re: [Bug 15576] New: Data Loss (flex_bg and ext4_mb_generate_buddy errors)
  2010-03-19  1:05 [Bug 15576] New: Data Loss (flex_bg and ext4_mb_generate_buddy errors) bugzilla-daemon
@ 2010-03-21 23:43 ` tytso
  0 siblings, 0 replies; 2+ messages in thread
From: tytso @ 2010-03-21 23:43 UTC (permalink / raw)
  To: bugzilla-daemon; +Cc: linux-ext4

On Fri, Mar 19, 2010 at 01:05:23AM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> # create a 484 cylinder disk [3.7 GB]
> dd of=disk.bin bs=512 count=0 seek=$((484*255*63))
> 
> # associate with loop device
> losetup /dev/loop0 disk.bin
> 
> # generate bad blocks file [600 MB]
> for((i=360491;i<=497992;i++)); do echo $i; done > omit
> 
> # format disk with ext4
> mkfs.ext4 -l omit /dev/loop0

This is an e2fsprogs bug.  If you run e2fsck at this point, it will
report pass 5 errors that correspond exactly to what the kernel ends
up complaining about:

Free blocks count wrong for group #12 (2, counted=0).

Free blocks count wrong for group #13 (2, counted=0).

Free blocks count wrong for group #14 (2, counted=0).

Free blocks count wrong for group #15 (9913, counted=9911).

Free blocks count wrong (800730, counted=800722).
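
As a sanity check on the numbers above, the per-group differences should
add up to the difference in the filesystem-wide total. A small sketch
that parses the e2fsck lines quoted in this message (the regular
expressions are written for these five lines only, not for e2fsck
output in general):

```python
import re

fsck_output = """\
Free blocks count wrong for group #12 (2, counted=0).
Free blocks count wrong for group #13 (2, counted=0).
Free blocks count wrong for group #14 (2, counted=0).
Free blocks count wrong for group #15 (9913, counted=9911).
Free blocks count wrong (800730, counted=800722).
"""

group_re = re.compile(r"group #(\d+) \((\d+), counted=(\d+)\)")
total_re = re.compile(r"^Free blocks count wrong \((\d+), counted=(\d+)\)", re.M)

# Recorded (group descriptor) count minus counted (bitmap) count, per group.
per_group = {
    int(group): int(recorded) - int(counted)
    for group, recorded, counted in group_re.findall(fsck_output)
}

# Same difference for the filesystem-wide summary line.
recorded_total, counted_total = map(int, total_re.search(fsck_output).groups())
total_diff = recorded_total - counted_total

print(per_group, sum(per_group.values()), total_diff)
```

Each of the four groups is over by 2 blocks, and the sum matches the
8-block difference in the last line.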

> Worse off, however, if rather than creating a 2 GB file, you use
> this partition as the target root partition for installation using
> the latest [32-bit] Ubuntu installer ... consistently at 57 percent
> of the install ext4 reports data loss.

That's because the file system is getting remounted read-only when
the file system corruption is detected:

> [ 1129.344600] EXT4-fs error (device sda1): ext4_mb_generate_buddy: EXT4-fs:
> group 12: 0 blocks in bitmap, 2 in gd
> [ 1129.380697] EXT4-fs (sda1): Remounting filesystem read-only

The basic idea behind this is that when there is a discrepancy between
the summary statistics in the group descriptors (the free block counts
that e2fsck checks in pass 5) and the block allocation bitmap, the
problem could be in the block allocation bitmap.  (In this case it is
the summary statistics, but there's no way for the code to know that.)
If the block allocation bitmap is bogus, it's very dangerous to
continue writing into the file system, since we may end up allocating
blocks that are already in use by other files, and this would cause
data loss when those data blocks get overwritten.
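
The kind of check described here can be sketched as follows. This is
only an illustration and not the kernel's ext4_mb_generate_buddy code:
the function names count_free and check_group are hypothetical, and the
bitmap is assumed to use one bit per block, least significant bit
first, with a set bit meaning "in use".

```python
def count_free(bitmap, nblocks):
    """Count clear (free) bits among the first nblocks bits of the bitmap."""
    free = 0
    for i in range(nblocks):
        if not (bitmap[i // 8] >> (i % 8)) & 1:
            free += 1
    return free


def check_group(group, bitmap, nblocks, gd_free):
    """Compare the bitmap's free count with the group descriptor's count.

    Returns a message in the style of the kernel log on a mismatch,
    or None when the two agree.
    """
    counted = count_free(bitmap, nblocks)
    if counted != gd_free:
        return f"group {group}: {counted} blocks in bitmap, {gd_free} in gd"
    return None


# A group whose bitmap marks every block in use, while the group
# descriptor still claims 2 free blocks (as for groups 12-14 above).
full_bitmap = bytes([0xFF]) * (32768 // 8)
print(check_group(12, full_bitmap, 32768, 2))
print(check_group(12, full_bitmap, 32768, 0))
```

The check can only tell that the two numbers disagree. It cannot tell
which one is wrong, which is why treating the bitmap as suspect and
stopping writes is the safe choice.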

Once the file system is marked as read-only, data written just before
the file system was remounted read-only can't be pushed out to disk,
which is the reason for the warning message:

> [ 1129.574343] mpage_da_map_blocks block allocation failed for inode 41510 at
> logical offset 0 with max blocks 6 with error -30
> [ 1129.574352] This should not happen.!! Data will be lost

(Error -30 is "EROFS".)
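
For anyone who wants to decode such a number themselves, Python's
standard errno module gives a quick lookup. The kernel reports errors
as negative values, so the sign is flipped first; the numeric values
are those of the platform the script runs on, which on Linux match the
kernel's.

```python
import errno
import os

kernel_err = -30  # value from the mpage_da_map_blocks message

# errno.errorcode maps positive errno numbers to their symbolic names.
name = errno.errorcode[-kernel_err]
description = os.strerror(-kernel_err)

print(name, "-", description)
```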

We should probably improve the error messages here, but there's not
much else we can do.

The real core issue is the fact that mke2fs isn't doing the right
thing when there are bad blocks and flex_bg is specified.  It's
something we don't test for, since in practice it never happens with
modern disk drives. 

						- Ted

