Incompatibility between mballoc and online resize

linux-ext4.vger.kernel.org archive mirror

* Incompatibility between mballoc and online resize
From: Theodore Ts'o @ 2008-06-10  3:46 UTC
  To: Alex Tomas, Andreas Dilger; +Cc: linux-ext4

I've been trying to track down the problems in ext4's online resizing,
and one of the most noticeable is that mballoc has some specific data
structures which need to be enlarged when the number of block groups in
the filesystem grows dynamically.

Specifically, the s_group_info array needs to grow; in the current ext4
patch queue this isn't happening.  As a result, when the filesystem is
unmounted after an online resize, ext4_put_super() calls
ext4_mb_release(), which iterates over the s_group_info array and
triggers a kernel oops.
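
A minimal userspace sketch of the failure mode described above, not the
kernel code: a per-group array sized at mount time is walked at teardown
using the group count, which a resize has since increased.  The names
sb_info, sb_init, grow_group_info and release_all are hypothetical
stand-ins for illustration; only s_group_info, ext4_put_super() and
ext4_mb_release() come from this thread.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the kernel's structures. */
struct group_info {
	int free_blocks;
};

struct sb_info {
	unsigned int groups_count;      /* group count per the superblock */
	unsigned int group_info_len;    /* slots actually allocated */
	struct group_info **group_info; /* plays the role of s_group_info */
};

/* Mount time: allocate one entry per existing block group. */
static int sb_init(struct sb_info *sbi, unsigned int ngroups)
{
	unsigned int i;

	sbi->groups_count = 0;
	sbi->group_info_len = 0;
	sbi->group_info = calloc(ngroups, sizeof(*sbi->group_info));
	if (!sbi->group_info)
		return -1;
	for (i = 0; i < ngroups; i++) {
		sbi->group_info[i] = calloc(1, sizeof(struct group_info));
		if (!sbi->group_info[i])
			return -1;
		sbi->group_info_len = i + 1;
	}
	sbi->groups_count = ngroups;
	return 0;
}

/* Resize time: grow the array before publishing the new group count. */
static int grow_group_info(struct sb_info *sbi, unsigned int new_count)
{
	struct group_info **arr;
	unsigned int i;

	if (new_count > sbi->group_info_len) {
		arr = realloc(sbi->group_info, new_count * sizeof(*arr));
		if (!arr)
			return -1;
		sbi->group_info = arr;
		for (i = sbi->group_info_len; i < new_count; i++) {
			arr[i] = calloc(1, sizeof(struct group_info));
			if (!arr[i])
				return -1;
			sbi->group_info_len = i + 1;
		}
	}
	sbi->groups_count = new_count;
	return 0;
}

/* Unmount time: walks groups_count entries, so the array must be that long. */
static unsigned int release_all(struct sb_info *sbi)
{
	unsigned int i, freed = 0;

	assert(sbi->groups_count <= sbi->group_info_len);
	for (i = 0; i < sbi->groups_count; i++) {
		free(sbi->group_info[i]);
		freed++;
	}
	free(sbi->group_info);
	sbi->group_info = NULL;
	sbi->groups_count = 0;
	sbi->group_info_len = 0;
	return freed;
}
```

In this sketch, bumping groups_count without calling grow_group_info()
would make release_all() read past the end of the array, which is the
userspace analogue of the oops at unmount.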

Is clusterfs running with mballoc in production?  If so, how was this
problem fixed?  Did we miss a patch to make sure that on-line resizing
worked with mballoc enabled?

Thanks, regards,

						- Ted

* Re: Incompatibility between mballoc and online resize
From: Andreas Dilger @ 2008-06-10  6:24 UTC
  To: Theodore Ts'o; +Cc: Alex Tomas, linux-ext4

On Jun 09, 2008  23:46 -0400, Theodore Ts'o wrote:
> I've been trying to track down the problems in ext4's online resizing,
> and one of the most noticeable is that mballoc has some specific data
> structures which need to be enlarged when the number of block groups in
> the filesystem grows dynamically.
> 
> Specifically, the s_group_info array needs to grow; in the current ext4
> patch queue this isn't happening.  As a result, when the filesystem is
> unmounted after an online resize, ext4_put_super() calls
> ext4_mb_release(), which iterates over the s_group_info array and
> triggers a kernel oops.
> 
> Is clusterfs running with mballoc in production?  If so, how was this
> problem fixed?  Did we miss a patch to make sure that on-line resizing
> worked with mballoc enabled?

When Lustre is mounting the backing filesystem on the server, there is
no ext3 mountpoint visible to userspace, hence no access to the underlying
filesystem to pass the resize ioctl to, so we haven't had this problem
yet.  We filed a bug on it, for the time that we can pass an ioctl through:

https://bugzilla.lustre.org/show_bug.cgi?id=15208

We have another open bug related to resize2fs and uninit_bg, but that
is for offline resizing:

https://bugzilla.lustre.org/show_bug.cgi?id=12002

Both of these bugs are mere placeholders, they don't have any patches.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


* Re: Incompatibility between mballoc and online resize
From: Theodore Tso @ 2008-06-10 12:36 UTC
  To: Andreas Dilger; +Cc: Alex Tomas, linux-ext4

On Tue, Jun 10, 2008 at 12:24:45AM -0600, Andreas Dilger wrote:
> When Lustre is mounting the backing filesystem on the server, there is
> no ext3 mountpoint visible to userspace, hence no access to the underlying
> filesystem to pass the resize ioctl to, so we haven't had this problem
> yet.  We filed a bug on it, for the time that we can pass an ioctl through:
> 
> https://bugzilla.lustre.org/show_bug.cgi?id=15208
> 
> We have another open bug related to resize2fs and uninit_bg, but that
> is for offline resizing:
> 
> https://bugzilla.lustre.org/show_bug.cgi?id=12002
> 
> Both of these bugs are mere placeholders, they don't have any patches.

There is a third (and possibly fourth) problem, which is that online
resizing with ext4dev (even without any patches from the ext4 patch
queue) is corrupting the filesystem, by not properly initializing the
block group descriptors:

Group 8: (Blocks 65537-73728)
  Block bitmap at 0, Inode bitmap at 0
  Inode table at 0-255
  0 free blocks, 0 free inodes, 0 directories
  Free blocks: 
  Free inodes: 
Group 9: (Blocks 73729-79999)
  Backup superblock at 73729, Group descriptors at 73730-73730
  Reserved GDT blocks at 73731-73985
  Block bitmap at 0, Inode bitmap at 0
  Inode table at 0-255
  0 free blocks, 0 free inodes, 0 directories
  Free blocks: 
  Free inodes: 
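
The dump above can be checked with a little arithmetic.  The sketch
below is illustrative only: it assumes a first data block of 1,
8192 blocks per group and 80000 blocks in total (all inferred from the
group ranges shown), and it assumes each group's bitmaps and inode
table live inside that group's own block range.  The helper names
group_blocks and desc_looks_initialized are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct group_range {
	uint64_t first; /* first block of the group */
	uint64_t last;  /* last block of the group, inclusive */
};

/* Block range covered by a group; the last group may be short. */
static struct group_range group_blocks(uint64_t first_data_block,
				       uint64_t blocks_per_group,
				       uint64_t blocks_count,
				       uint64_t group)
{
	struct group_range r;

	r.first = first_data_block + group * blocks_per_group;
	r.last = r.first + blocks_per_group - 1;
	if (r.last > blocks_count - 1)
		r.last = blocks_count - 1;
	return r;
}

static bool in_range(struct group_range r, uint64_t blk)
{
	return blk >= r.first && blk <= r.last;
}

/*
 * Assumption: metadata sits inside its own group.  Under that
 * assumption a bitmap or inode table at block 0 cannot be valid for
 * any group other than group 0.
 */
static bool desc_looks_initialized(struct group_range r,
				   uint64_t block_bitmap,
				   uint64_t inode_bitmap,
				   uint64_t inode_table)
{
	return in_range(r, block_bitmap) &&
	       in_range(r, inode_bitmap) &&
	       in_range(r, inode_table);
}
```

With those assumed parameters, group 8 spans 65537-73728 and group 9
spans 73729-79999, matching the dump, and a descriptor whose bitmaps
and inode table all start at block 0 fails the check for both groups.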

Furthermore, if the filesystem is grown to the point where a second
set of blocks needs to be pulled from the resize inode, apparently the
resize inode is getting corrupted:

Performing an on-line resize of /dev/ubd16 to 12582912 (1k) blocks.
EXT4-fs warning (device ubdb): verify_reserved_gdb: reserved GDT 3 missing grp 1 (8195)
resize2fs: Invalid argument While trying to add group #25

I'm not sure if this is related to the third problem above, since
until that problem is fixed it is hard to determine what is
going on with the fourth.  They may end up having the same root cause.

I'm looking into it, but it seems pretty clear to me no one has really
tested online resizing on ext4 in quite a while, and the code has
bitrotted.  Hopefully it won't be too hard to fix it.  In the meantime,
it really makes me wonder how on earth Josef Bacik actually
tested this patch:

commit 944600930a37aa725ba6f93c3244e2d77a1e3581
Author: Josef Bacik <jbacik@redhat.com>
Date:   Fri Jun 6 18:05:52 2008 -0400

    ext4: fix online resize bug
    
    There is a bug when we are trying to verify that the reserve inode's
    double indirect blocks point back to the primary gdt blocks.  The fix is
    obvious, we need to mod the gdb count by the addr's per block.  This was
    verified using the same testcase as with the ext3 equivalent of this
    patch.
    
    Signed-off-by: Josef Bacik <jbacik@redhat.com>
    Signed-off-by: Mingming Cao <cmm@us.ibm.com>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
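
A minimal sketch of the arithmetic the commit message describes, under
stated assumptions: block addresses are 32 bits wide, so a 1k block
holds 256 of them, and the slot in the double indirect block is the
group descriptor block count taken modulo that number.  The helper
names addrs_per_block and dind_slot are hypothetical and are not taken
from the patch itself.

```c
#include <assert.h>
#include <stdint.h>

/* Number of 32-bit block addresses that fit in one block. */
static uint32_t addrs_per_block(uint32_t block_size)
{
	return block_size / (uint32_t)sizeof(uint32_t);
}

/*
 * Slot within the double indirect block for a given descriptor block
 * count.  Without the modulo, a count at or above addrs_per_block()
 * would index past the end of the block.
 */
static uint32_t dind_slot(uint32_t gdb_count, uint32_t block_size)
{
	return gdb_count % addrs_per_block(block_size);
}
```

For a 1k-block filesystem this keeps the slot in the range 0-255, so a
count of 259 maps to slot 3 rather than off the end of the block.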

							- Ted
