public inbox for linux-xfs@vger.kernel.org
* mount failed after xfs_growfs beyond 16 TB
@ 2006-11-02 17:26 Christian Guggenberger
  2006-11-02 18:38 ` Eric Sandeen
  2006-11-03  0:41 ` David Chinner
  0 siblings, 2 replies; 12+ messages in thread
From: Christian Guggenberger @ 2006-11-02 17:26 UTC (permalink / raw)
  To: xfs

Hi,

a colleague recently tried to grow a 16 TB filesystem (x86, 32bit) on
top of lvm2 to 17TB. (I am not even sure if that's supposed to work with
linux-2.6, 32bit)

used kernel seems to be debian sarge's 2.6.8

xfs_growfs seemed to succeed (AFAIK..)

however, the fs shut down:

XFS internal error
XFS_WANT_CORRUPTED_GOTO at line 1583 of file fs/xfs/xfs_alloc.c.  Caller
0xf89978a8
[__crc_pm_idle+550816/2056674] xfs_free_ag_extent+0x454/0x78a [xfs]
[__crc_pm_idle+555561/2056674] xfs_free_extent+0xea/0x10f [xfs]
[__crc_pm_idle+555561/2056674] xfs_free_extent+0xea/0x10f [xfs]
[__crc_pm_idle+553757/2056674] xfs_alloc_read_agf+0xbe/0x1e4 [xfs]
[__crc_pm_idle+764480/2056674] xfs_growfs_data_private+0xd80/0xec0 [xfs]
[pty_write+305/307] pty_write+0x131/0x133
[opost+154/428] opost+0x9a/0x1ac
[__crc_pm_idle+765024/2056674] xfs_growfs_data+0x3f/0x5e [xfs]
[__crc_pm_idle+972873/2056674] xfs_ioctl+0x256/0x860 [xfs]
[tty_write+436/788] tty_write+0x1b4/0x314
[write_chan+0/538] write_chan+0x0/0x21a
[__crc_pm_idle+968754/2056674] linvfs_ioctl+0x78/0x101 [xfs]
[sys_ioctl+315/675] sys_ioctl+0x13b/0x2a3
[syscall_call+7/11] syscall_call+0x7/0xb
xfs_force_shutdown(dm-1,0x8) called
from line 1088 of file fs/xfs/xfs_trans.c.  Return address = 0xf8a01c3c
Filesystem "dm-1": Corruption of
in-memory data detected.  Shutting down filesystem: dm-1
Please umount the filesystem, and
rectify the problem(s)
xfs_force_shutdown(dm-1,0x1) called
from line 353 of file fs/xfs/xfs_rw.c.  Return address = 0xf8a01c3c

mounting fails with:

XFS: SB sanity check 2 failed
Filesystem "dm-1": XFS internal error
xfs_mount_validate_sb(4) at line 277 of file fs/xfs/xfs_mount.c.  Caller
0xf89e568c
[__crc_pm_idle+872883/2056674] xfs_mount_validate_sb+0x21d/0x39a [xfs]
[__crc_pm_idle+874509/2056674] xfs_readsb+0xee/0x1f9 [xfs]
[__crc_pm_idle+874509/2056674] xfs_readsb+0xee/0x1f9 [xfs]
[__crc_pm_idle+908971/2056674] xfs_mount+0x282/0x5d4 [xfs]
[__crc_pm_idle+989973/2056674] vfs_mount+0x34/0x38 [xfs]
[__crc_pm_idle+989973/2056674] vfs_mount+0x34/0x38 [xfs]
[__crc_pm_idle+989534/2056674] linvfs_fill_super+0xa1/0x1ee [xfs]
[snprintf+39/43] snprintf+0x27/0x2b
[disk_name+169/171] disk_name+0xa9/0xab
[sb_set_blocksize+46/93] sb_set_blocksize+0x2e/0x5d
[get_sb_bdev+262/313] get_sb_bdev+0x106/0x139
[__crc_pm_idle+989914/2056674] linvfs_get_sb+0x2f/0x36 [xfs]
[__crc_pm_idle+989373/2056674] linvfs_fill_super+0x0/0x1ee [xfs]
[do_kern_mount+162/354] do_kern_mount+0xa2/0x162
[do_new_mount+115/181] do_new_mount+0x73/0xb5
[do_mount+370/446] do_mount+0x172/0x1be
[copy_mount_options+99/188] copy_mount_options+0x63/0xbc
[sys_mount+212/344] sys_mount+0xd4/0x158
[syscall_call+7/11] syscall_call+0x7/0xb
XFS: SB validate failed
XFS: SB sanity check 2 failed

and finally, xfs_repair stops at

bad primary superblock: inconsistent filesystem geometry information

found candidate secondary superblock...
superblock read failed, offset 10093861404672, size 2048, ag 0, rval 29

thanks in advance,

 - Christian

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: mount failed after xfs_growfs beyond 16 TB
  2006-11-02 17:26 mount failed after xfs_growfs beyond 16 TB Christian Guggenberger
@ 2006-11-02 18:38 ` Eric Sandeen
  2006-11-03  9:32   ` Christian Guggenberger
  2006-11-03  0:41 ` David Chinner
  1 sibling, 1 reply; 12+ messages in thread
From: Eric Sandeen @ 2006-11-02 18:38 UTC (permalink / raw)
  To: christian.guggenberger; +Cc: xfs

Christian Guggenberger wrote:
> Hi,
> 
> a colleague recently tried to grow a 16 TB filesystem (x86, 32bit) on
> top of lvm2 to 17TB. (I am not even sure if that's supposed to work with
> linux-2.6, 32bit)

If you have CONFIG_LBD enabled (do you?), it should in theory, barring
bugs :)

> used kernel seems to be debian sarge's 2.6.8

hmm old....

> xfs_growfs seemed to succeed (AFAIK..)

trace below looks like not...

> however, the fs shut down:
> 
> XFS internal error
> XFS_WANT_CORRUPTED_GOTO at line 1583 of file fs/xfs/xfs_alloc.c.  Caller
> 0xf89978a8
> [__crc_pm_idle+550816/2056674] xfs_free_ag_extent+0x454/0x78a [xfs]
> [__crc_pm_idle+555561/2056674] xfs_free_extent+0xea/0x10f [xfs]
> [__crc_pm_idle+555561/2056674] xfs_free_extent+0xea/0x10f [xfs]
> [__crc_pm_idle+553757/2056674] xfs_alloc_read_agf+0xbe/0x1e4 [xfs]

in the growfs thread here

> [__crc_pm_idle+764480/2056674] xfs_growfs_data_private+0xd80/0xec0 [xfs]
> [pty_write+305/307] pty_write+0x131/0x133
> [opost+154/428] opost+0x9a/0x1ac
> [__crc_pm_idle+765024/2056674] xfs_growfs_data+0x3f/0x5e [xfs]
> [__crc_pm_idle+972873/2056674] xfs_ioctl+0x256/0x860 [xfs]
> [tty_write+436/788] tty_write+0x1b4/0x314
> [write_chan+0/538] write_chan+0x0/0x21a
> [__crc_pm_idle+968754/2056674] linvfs_ioctl+0x78/0x101 [xfs]
> [sys_ioctl+315/675] sys_ioctl+0x13b/0x2a3
> [syscall_call+7/11] syscall_call+0x7/0xb
> xfs_force_shutdown(dm-1,0x8) called
> from line 1088 of file fs/xfs/xfs_trans.c.  Return address = 0xf8a01c3c
> Filesystem "dm-1": Corruption of
> in-memory data detected.  Shutting down filesystem: dm-1
> Please umount the filesystem, and
> rectify the problem(s)
> xfs_force_shutdown(dm-1,0x1) called
> from line 353 of file fs/xfs/xfs_rw.c.  Return address = 0xf8a01c3c
> 
> mounting fails with:
> 
> XFS: SB sanity check 2 failed

This is checking:

        if (unlikely(
            sbp->sb_dblocks == 0 ||
            sbp->sb_dblocks >
             (xfs_drfsbno_t)sbp->sb_agcount * sbp->sb_agblocks ||
            sbp->sb_dblocks < (xfs_drfsbno_t)(sbp->sb_agcount - 1) *
                              sbp->sb_agblocks + XFS_MIN_AG_BLOCKS)) {
                xfs_fs_mount_cmn_err(flags, "SB sanity check 2 failed");
                return XFS_ERROR(EFSCORRUPTED);
        }

can you point xfs_db -r /dev/dm-1 and then:

xfs_db> sb 0
xfs_db> p

let's see what you've got.

Also how big does /proc/partitions think your new device is?

> Filesystem "dm-1": XFS internal error
> xfs_mount_validate_sb(4) at line 277 of file fs/xfs/xfs_mount.c.  Caller
> 0xf89e568c
> [__crc_pm_idle+872883/2056674] xfs_mount_validate_sb+0x21d/0x39a [xfs]
> [__crc_pm_idle+874509/2056674] xfs_readsb+0xee/0x1f9 [xfs]
> [__crc_pm_idle+874509/2056674] xfs_readsb+0xee/0x1f9 [xfs]
> [__crc_pm_idle+908971/2056674] xfs_mount+0x282/0x5d4 [xfs]
> [__crc_pm_idle+989973/2056674] vfs_mount+0x34/0x38 [xfs]
> [__crc_pm_idle+989973/2056674] vfs_mount+0x34/0x38 [xfs]
> [__crc_pm_idle+989534/2056674] linvfs_fill_super+0xa1/0x1ee [xfs]
> [snprintf+39/43] snprintf+0x27/0x2b
> [disk_name+169/171] disk_name+0xa9/0xab
> [sb_set_blocksize+46/93] sb_set_blocksize+0x2e/0x5d
> [get_sb_bdev+262/313] get_sb_bdev+0x106/0x139
> [__crc_pm_idle+989914/2056674] linvfs_get_sb+0x2f/0x36 [xfs]
> [__crc_pm_idle+989373/2056674] linvfs_fill_super+0x0/0x1ee [xfs]
> [do_kern_mount+162/354] do_kern_mount+0xa2/0x162
> [do_new_mount+115/181] do_new_mount+0x73/0xb5
> [do_mount+370/446] do_mount+0x172/0x1be
> [copy_mount_options+99/188] copy_mount_options+0x63/0xbc
> [sys_mount+212/344] sys_mount+0xd4/0x158
> [syscall_call+7/11] syscall_call+0x7/0xb
> XFS: SB validate failed
> XFS: SB sanity check 2 failed
> 
> and finally, xfs_repair stops at
> 
> bad primary superblock: inconsistent filesystem geometry information
> 
> found candidate secondary superblock...
> superblock read failed, offset 10093861404672, size 2048, ag 0, rval 29

hmm that offset is about 9.4 terabytes.

any kernel messages when this happens?

rval 29 is ESPIPE / illegal seek.

-Eric

> thanks in advance,
> 
>  - Christian
> 
> 
> 


* Re: mount failed after xfs_growfs beyond 16 TB
  2006-11-02 17:26 mount failed after xfs_growfs beyond 16 TB Christian Guggenberger
  2006-11-02 18:38 ` Eric Sandeen
@ 2006-11-03  0:41 ` David Chinner
  2006-11-03 14:54   ` Eric Sandeen
  1 sibling, 1 reply; 12+ messages in thread
From: David Chinner @ 2006-11-03  0:41 UTC (permalink / raw)
  To: Christian Guggenberger; +Cc: xfs

On Thu, Nov 02, 2006 at 06:26:08PM +0100, Christian Guggenberger wrote:
> Hi,
> 
> a colleague recently tried to grow a 16 TB filesystem (x86, 32bit) on
> top of lvm2 to 17TB. (I am not even sure if that's supposed to work with
> linux-2.6, 32bit)

Not supported - any metadata access past 16TB will wrap the 32 bit page cache
index for the metadata address space and you'll corrupt the filesystem.

> used kernel seems to be debian sarge's 2.6.8
> 
> xfs_growfs seemed to succeed (AFAIK..)
> 
> however, the fs shut down:
> 
> XFS internal error
> XFS_WANT_CORRUPTED_GOTO at line 1583 of file fs/xfs/xfs_alloc.c.  Caller
> 0xf89978a8
> [__crc_pm_idle+550816/2056674] xfs_free_ag_extent+0x454/0x78a [xfs]
> [__crc_pm_idle+555561/2056674] xfs_free_extent+0xea/0x10f [xfs]
> [__crc_pm_idle+555561/2056674] xfs_free_extent+0xea/0x10f [xfs]
> [__crc_pm_idle+553757/2056674] xfs_alloc_read_agf+0xbe/0x1e4 [xfs]
> [__crc_pm_idle+764480/2056674] xfs_growfs_data_private+0xd80/0xec0 [xfs]

No, growfs failed trying to extend the data partition and shut
down the filesystem.

> mounting fails with:
> 
> XFS: SB sanity check 2 failed
.....
> and finally, xfs_repair stops at
> 
> bad primary superblock: inconsistent filesystem geometry information

Probably because growfs failed part way through and left inconsistent
state behind.

> found candidate secondary superblock...
> superblock read failed, offset 10093861404672, size 2048, ag 0, rval 29

Does LVM2 even support volumes larger than 16TB on 32 bit machines?

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


* Re: mount failed after xfs_growfs beyond 16 TB
  2006-11-02 18:38 ` Eric Sandeen
@ 2006-11-03  9:32   ` Christian Guggenberger
  2006-11-03 12:34     ` David Chinner
  0 siblings, 1 reply; 12+ messages in thread
From: Christian Guggenberger @ 2006-11-03  9:32 UTC (permalink / raw)
  To: Eric Sandeen, dgc; +Cc: christian.guggenberger, xfs

Eric, Dave,

> 
> xfs_db> sb 0
> xfs_db> p
> 
> let's see what you've got.
> 

xfs_db: read failed: Invalid argument
xfs_db: data size check failed
xfs_db> sb 0
xfs_db> p
magicnum = 0x58465342
blocksize = 4096
dblocks = 18446744070056148512
rblocks = 0
rextents = 0
uuid = 27d35a50-724e-440b-ae1a-79f934f7915a
logstart = 2147483652
rootino = 128
rbmino = 129
rsumino = 130
rextsize = 16
agblocks = 84976608
agcount = 570
rbmblocks = 0
logblocks = 32768
versionnum = 0x30c4
sectsize = 512
inodesize = 256
inopblock = 16
fname = "\000\000\000\000\000\000\000\000\000\000\000\000"
blocklog = 12
sectlog = 9
inodelog = 8
inopblog = 4
agblklog = 27
rextslog = 0
inprogress = 0
imax_pct = 25
icount = 1298880
ifree = 376826
fdblocks = 18446744067363131928
frextents = 0
uquotino = 131
gquotino = null
qflags = 0x7
flags = 0
shared_vn = 0
inoalignmt = 2
unit = 0
width = 0
dirblklog = 0
logsectlog = 0
logsectsize = 0
logsunit = 0
features2 = 0
xfs_db> 

> Also how big does /proc/partitions think your new device is?
> 
it thinks it's 26983133184 blocks, which seems to be correct:

  --- Logical volume ---
    LV Name                /dev/data/project
    VG Name                data
    LV UUID                4RIXaW-QxWj-KOr5-CysS-TmLF-Jebu-lPyPOU
    LV Write Access        read/write
    LV Status              available
    # open                 1
    LV Size                25.13 TB
    Current LE             6587679
    Segments               4
    Allocation             inherit
    Read ahead sectors     0
    Block device           254:1

note, the fs was first grown with (originally mounted on /data/projects)

xfs_growfs -D 4294966000 /data/projects
which succeeded.

a further

xfs_growfs -D 4300000000 /data/projects

shut the fs down.

> > found candidate secondary superblock...
> > superblock read failed, offset 10093861404672, size 2048, ag 0, rval 29
> 
> hmm that offset is about 9.4 terabytes.
> 
> any kernel messages when this happens?
> 
> rval 29 is ESPIPE / illegal seek.

not that I know of, unfortunately.

As Dave already stated, >16TB is not supported on 32bits - is there
any way to step back?

cheers.
 - Christian
 


* Re: mount failed after xfs_growfs beyond 16 TB
  2006-11-03  9:32   ` Christian Guggenberger
@ 2006-11-03 12:34     ` David Chinner
  2006-11-03 15:44       ` Christian Guggenberger
  0 siblings, 1 reply; 12+ messages in thread
From: David Chinner @ 2006-11-03 12:34 UTC (permalink / raw)
  To: Christian Guggenberger; +Cc: Eric Sandeen, dgc, xfs

On Fri, Nov 03, 2006 at 10:32:03AM +0100, Christian Guggenberger wrote:
> Eric, Dave,
> 
> > 
> > xfs_db> sb 0
> > xfs_db> p
> > 
> > let's see what you've got.
> > 
> 
> xfs_db: read failed: Invalid argument
> xfs_db: data size check failed
> xfs_db> sb 0
> xfs_db> p
> magicnum = 0x58465342
> blocksize = 4096
> dblocks = 18446744070056148512

That looks like an overflow to me ;)

> fdblocks = 18446744067363131928

Free space gone kaboom too...

> frextents = 0
> uquotino = 131
> gquotino = null
> qflags = 0x7
> flags = 0
> shared_vn = 0
> inoalignmt = 2
> unit = 0
> width = 0
> dirblklog = 0
> logsectlog = 0
> logsectsize = 0
> logsunit = 0
> features2 = 0
> xfs_db> 
> 
> > Also how big does /proc/partitions think your new device is?
> > 
> it thinks it's 26983133184 blocks, which seems to be correct:
> 
>   --- Logical volume ---
>     LV Name                /dev/data/project
>     VG Name                data
>     LV UUID                4RIXaW-QxWj-KOr5-CysS-TmLF-Jebu-lPyPOU
>     LV Write Access        read/write
>     LV Status              available
>     # open                 1
>     LV Size                25.13 TB
>     Current LE             6587679
>     Segments               4
>     Allocation             inherit
>     Read ahead sectors     0
>     Block device           254:1
> 
> note, the fs was first grown with (originally mounted on /data/projects)
> 
> xfs_growfs -D 4294966000 /data/projects
> which succeeded.

Which is just less than 16TB: 0x1ffeffaf0000

> a further
> 
> xfs_growfs -D 4300000000 /data/projects

Which is just more than 16TB: 0x2008ccb00000

> shut the fs down.

Probably corrupted metadata in the first couple of AGs...

> > > found candidate secondary superblock...
> > > superblock read failed, offset 10093861404672, size 2048, ag 0, rval 29
> > 
> > hmm that offset is about 9.4 terabytes.

With a size of 25.13TiB in the LVM, 9.4TB is ~(25.13 - 16)TiB

That's a 32 bit overflow as well...

> As Dave already stated, >16TB is not supported on 32bits - is there
> any way to step back?

xfs_db mojo.... ;)

Note - no guarantee this will work - practise on an expendable
sparse loopback filessytem image by making a filesystem of slightly less
than 16TB then growing it to corrupt it the same way and then fixing it up
successfully.

Once it's corrupted, unmount and run xfs_db in expert mode.
The superblock:

blocksize = 4096
dblocks = 18446744070056148512
...
agblocks = 84976608
agcount = 570

An AG is ~43.5GB, so 570 AGs is 24.8TB. It's too big, and
we will only shrink by whole AGs. Hence we have to correct
agcount and dblocks.

So, 404 AGs gives:

dblocks = agblocks * agcount
	= 84976608 * 404 * 512 bytes
	= 0xFFC853B0000 bytes, which is under 16TiB
	= 4291318704 blocks

Now you need to zero fdblocks, and now you should be able to run
xfs_repair to fix it up. Don't be surprised if repair runs out of
memory - you'll have to hope Barry gets finished with the memory
reduction work he's doing soon or get a 64 bit machine to fix that
problem. A 64bit machine wouldn't have the 16TB limit, either ;)

Good luck....

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


* Re: mount failed after xfs_growfs beyond 16 TB
  2006-11-03  0:41 ` David Chinner
@ 2006-11-03 14:54   ` Eric Sandeen
  2006-11-06  1:15     ` Timothy Shimmin
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Sandeen @ 2006-11-03 14:54 UTC (permalink / raw)
  To: David Chinner; +Cc: Christian Guggenberger, xfs

David Chinner wrote:
> On Thu, Nov 02, 2006 at 06:26:08PM +0100, Christian Guggenberger wrote:
>> Hi,
>>
>> a colleague recently tried to grow a 16 TB filesystem (x86, 32bit) on
>> top of lvm2 to 17TB. (I am not even sure if that's supposed to work with
>> linux-2.6, 32bit)
> 
> Not supported - any metadata access past 16TB will wrap the 32 bit page cache
> index for the metadata address space and you'll corrupt the filesystem.


Ohhhh right.  I've been in x86_64 land for too long, sorry for the earlier false 
assertion....  :(

xfs guys, if it's not there already (and I don't see it from a quick look..) 
growfs -really- should refuse (in the kernel) to grow a filesystem past 16T on a 
32-bit machine, just as we refuse to mount one.  something like this in 
xfs_growfs_data_private:

#if XFS_BIG_BLKNOS     /* Limited by ULONG_MAX of page cache index */
         if (unlikely(
             (nb >> (PAGE_SHIFT - sbp->sb_blocklog)) > ULONG_MAX)) {
#else                  /* Limited by UINT_MAX of sectors */
         if (unlikely(
             (nb << (sbp->sb_blocklog - BBSHIFT)) > UINT_MAX)) {
#endif
                 cmn_err(CE_WARN,
                         "new filesystem size too large for this system.");
                 return XFS_ERROR(E2BIG);
         }

and something similar in xfs_growfs_rt ?

-Eric


* Re: mount failed after xfs_growfs beyond 16 TB
  2006-11-03 12:34     ` David Chinner
@ 2006-11-03 15:44       ` Christian Guggenberger
  2006-11-03 15:54         ` Eric Sandeen
                           ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Christian Guggenberger @ 2006-11-03 15:44 UTC (permalink / raw)
  To: David Chinner; +Cc: Christian Guggenberger, Eric Sandeen, xfs

> 
> xfs_db mojo.... ;)
> 
> Note - no guarantee this will work - practise on an expendable
>> sparse loopback filesystem image by making a filesystem of slightly less
> than 16TB then growing it to corrupt it the same way and then fixing it up
> successfully.
> 
> Once it's corrupted, unmount and run xfs_db in expert mode.
> The superblock:
> 
> blocksize = 4096
> dblocks = 18446744070056148512
> ...
> agblocks = 84976608
> agcount = 570
> 
> An AG is ~43.5GB, so 570 AGs is 24.8TB. It's too big, and
> we will only shrink by whole AGs. Hence we have to correct
> agcount and dblocks.

isn't the AG size 'agblocks * blocksize' == ~324 GB here ?

got further input on a secondary superblock from the colleague:
looks more reasonable, I'd say. Is there a way to manually recover sb0
from sb1 ?

(btw, I still hope they get access to a 64bit system with recent
xfsprogs and kernel, soon)

xfs_db: read failed: Invalid argument
xfs_db: data size check failed
xfs_db> sb 1
xfs_db> p
magicnum = 0x58465342
blocksize = 4096
dblocks = 4294966000
rblocks = 0
rextents = 0
uuid = 27d35a50-724e-440b-ae1a-79f934f7915a
logstart = 2147483652
rootino = 128
rbmino = 129
rsumino = 130
rextsize = 16
agblocks = 84976608
agcount = 51
rbmblocks = 0
logblocks = 32768
versionnum = 0x30c4
sectsize = 512
inodesize = 256
inopblock = 16
fname = "\000\000\000\000\000\000\000\000\000\000\000\000"
blocklog = 12
sectlog = 9
inodelog = 8
inopblog = 4
agblklog = 27
rextslog = 0
inprogress = 0
imax_pct = 25
icount = 1298880
ifree = 376828
fdblocks = 1601952378
frextents = 0
uquotino = 131
gquotino = null
qflags = 0x7
flags = 0
shared_vn = 0
inoalignmt = 2
unit = 0
width = 0
dirblklog = 0
logsectlog = 0
logsectsize = 0
logsunit = 0
features2 = 0

cheers.
 - Christian
 


* Re: mount failed after xfs_growfs beyond 16 TB
  2006-11-03 15:44       ` Christian Guggenberger
@ 2006-11-03 15:54         ` Eric Sandeen
  2006-11-06 13:41         ` Christian Guggenberger
  2006-11-07  8:17         ` David Chinner
  2 siblings, 0 replies; 12+ messages in thread
From: Eric Sandeen @ 2006-11-03 15:54 UTC (permalink / raw)
  To: christian.guggenberger; +Cc: David Chinner, xfs

Christian Guggenberger wrote:
>> xfs_db mojo.... ;)
>>
>> Note - no guarantee this will work - practise on an expendable
>> sparse loopback filessytem image by making a filesystem of slightly less
>> than 16TB then growing it to corrupt it the same way and then fixing it up
>> successfully.
>>
>> Once it's corrupted, unmount and run xfs_db in expert mode.
>> The superblock:
>>
>> blocksize = 4096
>> dblocks = 18446744070056148512
>> ...
>> agblocks = 84976608
>> agcount = 570
>>
>> An AG is ~43.5GB, so 570 AGs is 24.8TB. It's too big, and
>> we will only shrink by whole AGs. Hence we have to correct
>> agcount and dblocks.
> 
> isn't the AG size 'agblocks * blocksize' == ~324 GB here ?
> 
> got further input on a secondary superblock from the colleague:
> looks more reasonable, I'd say. Is there a way to manually recover sb0
> from sb1 ?

you can copy it over field-by-field.... not sure if there's an easier way.

-Eric


* Re: mount failed after xfs_growfs beyond 16 TB
  2006-11-03 14:54   ` Eric Sandeen
@ 2006-11-06  1:15     ` Timothy Shimmin
  2006-11-06  3:25       ` Eric Sandeen
  0 siblings, 1 reply; 12+ messages in thread
From: Timothy Shimmin @ 2006-11-06  1:15 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: David Chinner, Christian Guggenberger, xfs

Good idea, Eric.
I've created a pv.
I noticed this was taken from xfs_mount_validate_sb() for the dblocks test.
I guess it would be nice to abstract this test in a macro for use in multiple places.

Cheers,
Tim.

--On 3 November 2006 8:54:43 AM -0600 Eric Sandeen <sandeen@sandeen.net> wrote:

> David Chinner wrote:
>> On Thu, Nov 02, 2006 at 06:26:08PM +0100, Christian Guggenberger wrote:
>>> Hi,
>>>
>>> a colleague recently tried to grow a 16 TB filesystem (x86, 32bit) on
>>> top of lvm2 to 17TB. (I am not even sure if that's supposed to work with
>>> linux-2.6, 32bit)
>>
>> Not supported - any metadata access past 16TB will wrap the 32 bit page cache
>> index for the metadata address space and you'll corrupt the filesystem.
>
>
> Ohhhh right.  I've been in x86_64 land for too long, sorry for the earlier false assertion....  :(
>
> xfs guys, if it's not there already (and I don't see it from a quick look..) growfs -really-
> should refuse (in the kernel) to grow a filesystem past 16T on a 32-bit machine, just as we
> refuse to mount one.  something like this in xfs_growfs_data_private:
>
> #if XFS_BIG_BLKNOS     /* Limited by ULONG_MAX of page cache index */
>          if (unlikely(
>              (nb >> (PAGE_SHIFT - sbp->sb_blocklog)) > ULONG_MAX)) {
> #else                  /* Limited by UINT_MAX of sectors */
>          if (unlikely(
>              (nb << (sbp->sb_blocklog - BBSHIFT)) > UINT_MAX)) {
> #endif
>                  cmn_err(CE_WARN,
>                          "new filesystem size too large for this system.");
>                  return XFS_ERROR(E2BIG);
>          }
>
> and something similar in xfs_growfs_rt ?
>
> -Eric
>


* Re: mount failed after xfs_growfs beyond 16 TB
  2006-11-06  1:15     ` Timothy Shimmin
@ 2006-11-06  3:25       ` Eric Sandeen
  0 siblings, 0 replies; 12+ messages in thread
From: Eric Sandeen @ 2006-11-06  3:25 UTC (permalink / raw)
  To: Timothy Shimmin; +Cc: David Chinner, Christian Guggenberger, xfs

Timothy Shimmin wrote:
> Good idea, Eric.
> I've created a pv.
> I noticed this was taken from xfs_mount_validate_sb() for the dblocks test.

yep

> I guess it would be nice to abstract this test in a macro for use in 
> multiple places.

yep, it'd just need to be refactored a bit to support data only & rt only (for 
growfs), while mount wants to check both at the same time.

-Eric

> 
> Cheers,
> Tim.


* Re: mount failed after xfs_growfs beyond 16 TB
  2006-11-03 15:44       ` Christian Guggenberger
  2006-11-03 15:54         ` Eric Sandeen
@ 2006-11-06 13:41         ` Christian Guggenberger
  2006-11-07  8:17         ` David Chinner
  2 siblings, 0 replies; 12+ messages in thread
From: Christian Guggenberger @ 2006-11-06 13:41 UTC (permalink / raw)
  To: Christian Guggenberger; +Cc: David Chinner, Eric Sandeen, xfs

On Fri, Nov 03, 2006 at 04:44:48PM +0100, Christian Guggenberger wrote:
> > 
> > xfs_db mojo.... ;)
> > 
> > Note - no guarantee this will work - practise on an expendable
> > sparse loopback filessytem image by making a filesystem of slightly less
> > than 16TB then growing it to corrupt it the same way and then fixing it up
> > successfully.
> > 
...

> 
> (btw, I still hope they get access to a 64bit system with recent
> xfsprogs and kernel, soon)
>
for your info - with recent xfsprogs (2.8.11) repair (on a 32bit system)
succeeded. No xfs_db magic needed. 

thanks again for your help,

cheers.
 - Christian


* Re: mount failed after xfs_growfs beyond 16 TB
  2006-11-03 15:44       ` Christian Guggenberger
  2006-11-03 15:54         ` Eric Sandeen
  2006-11-06 13:41         ` Christian Guggenberger
@ 2006-11-07  8:17         ` David Chinner
  2 siblings, 0 replies; 12+ messages in thread
From: David Chinner @ 2006-11-07  8:17 UTC (permalink / raw)
  To: Christian Guggenberger; +Cc: David Chinner, Eric Sandeen, xfs

On Fri, Nov 03, 2006 at 04:44:48PM +0100, Christian Guggenberger wrote:
> > The superblock:
> > 
> > blocksize = 4096
> > dblocks = 18446744070056148512
> > ...
> > agblocks = 84976608
> > agcount = 570
> > 
> > An AG is ~43.5GB, so 570 AGs is 24.8TB. It's too big, and
> > we will only shrink by whole AGs. Hence we have to correct
> > agcount and dblocks.
> 
> isn't the AG size 'agblocks * blocksize' == ~324 GB here ?

Yes, you are right - I was thinking 512 byte blocks which
then gave the right size that you grew to. Otherwise 570*324GB
gives 200TB, which is somewhat larger than you apparently tried
to grow to...

Sorry for the misdirection, but I'm glad to see that you got
it fixed.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


end of thread, other threads:[~2006-11-07  8:18 UTC | newest]

Thread overview: 12+ messages
2006-11-02 17:26 mount failed after xfs_growfs beyond 16 TB Christian Guggenberger
2006-11-02 18:38 ` Eric Sandeen
2006-11-03  9:32   ` Christian Guggenberger
2006-11-03 12:34     ` David Chinner
2006-11-03 15:44       ` Christian Guggenberger
2006-11-03 15:54         ` Eric Sandeen
2006-11-06 13:41         ` Christian Guggenberger
2006-11-07  8:17         ` David Chinner
2006-11-03  0:41 ` David Chinner
2006-11-03 14:54   ` Eric Sandeen
2006-11-06  1:15     ` Timothy Shimmin
2006-11-06  3:25       ` Eric Sandeen
