public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* Repairing a possibly incomplete xfs_growfs command?
@ 2008-01-16 23:19 Mark Magpayo
  2008-01-17  2:31 ` Eric Sandeen
  2008-01-17  3:01 ` David Chinner
  0 siblings, 2 replies; 22+ messages in thread
From: Mark Magpayo @ 2008-01-16 23:19 UTC (permalink / raw)
  To: xfs

Hi,

So I have run across a strange situation which I hope some gurus out
there can help with.

The original setup was a logical volume of 8.9TB.  I extended the volume
to 17.7TB and attempted to run xfs_growfs.  I am not sure whether the
command actually finished, as after I ran the command, the metadata was
displayed, but there was nothing that stated that the number of data
blocks had changed.  I was just returned to the prompt, so I'm not sure
whether the command completed or not.

I was unable to write to the logical volume I had just created.  I tried to
remount it, but I kept getting an error saying the superblock could not
be read.  I tried running an xfs_repair on the filesystem, and got the
following:

Phase 1 - find and verify superblock...
superblock read failed, offset 19504058859520, size 2048, ag 64, rval 0

fatal error -- Invalid argument


I am not very experienced with xfs (I was following commands in some
documentation), and I was recommended to post to this mailing list.  If
anyone could provide some help, it would be greatly appreciated.  Also,
if there is any information I can provide to help, I will gladly provide
it.  Thanks in advance!

Sincerely,

Mark

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Repairing a possibly incomplete xfs_growfs command?
  2008-01-16 23:19 Repairing a possibly incomplete xfs_growfs command? Mark Magpayo
@ 2008-01-17  2:31 ` Eric Sandeen
  2008-01-17  3:01 ` David Chinner
  1 sibling, 0 replies; 22+ messages in thread
From: Eric Sandeen @ 2008-01-17  2:31 UTC (permalink / raw)
  To: Mark Magpayo; +Cc: xfs

Mark Magpayo wrote:
> Hi,
> 
> So I have run across a strange situation which I hope some gurus out
> there can help with.
> 
> The original setup was a logical volume of 8.9TB.  I extended the volume
> to 17.7TB and attempted to run xfs_growfs.  I am not sure whether the
> command actually finished, as after I ran the command, the metadata was
> displayed, but there was nothing that stated that the number of data
> blocks had changed.  I was just returned to the prompt, so I'm not sure
> whether the command completed or not.
> 
> I was unable to write to the logical volume I had just created.  I tried to
> remount it, but I kept getting an error saying the superblock could not
> be read.  I tried running an xfs_repair on the filesystem, and got the
> following:
> 
> Phase 1 - find and verify superblock...
> superblock read failed, offset 19504058859520, size 2048, ag 64, rval 0
> 
> fatal error -- Invalid argument

hm, how big is your block device for starters - look in /proc/partitions.

-Eric

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Repairing a possibly incomplete xfs_growfs command?
  2008-01-16 23:19 Repairing a possibly incomplete xfs_growfs command? Mark Magpayo
  2008-01-17  2:31 ` Eric Sandeen
@ 2008-01-17  3:01 ` David Chinner
  2008-01-17 17:29   ` Mark Magpayo
  1 sibling, 1 reply; 22+ messages in thread
From: David Chinner @ 2008-01-17  3:01 UTC (permalink / raw)
  To: Mark Magpayo; +Cc: xfs

On Wed, Jan 16, 2008 at 03:19:19PM -0800, Mark Magpayo wrote:
> Hi,
> 
> So I have run across a strange situation which I hope some gurus out
> there can help with.
> 
> The original setup was a logical volume of 8.9TB.  I extended the volume
> to 17.7TB and attempted to run xfs_growfs.  I am not sure whether the
> command actually finished, as after I ran the command, the metadata was
> displayed, but there was nothing that stated that the number of data
> blocks had changed.  I was just returned to the prompt, so I'm not sure
> whether the command completed or not.

Hmmm - what kernel and what version of xfsprogs are you using?
(xfs_growfs -V).

Also, can you post the output of the growfs command if you still
have it?

If not, the output of:

# xfs_db -r -c 'sb 0' -c p <device>
# xfs_db -r -c 'sb 1' -c p <device>

because:

> I was unable to write to the logical volume I had just created.  I tried to
> remount it, but I kept getting an error saying the superblock could not
> be read.  I tried running an xfs_repair on the filesystem, and got the
> following:
> 
> Phase 1 - find and verify superblock...
> superblock read failed, offset 19504058859520, size 2048, ag 64, rval 0

That's a weird size for a superblock, and I suspect you should only
have AGs numbered 0-63 in your filesystem. (an 8.9TB filesystem will
have 32 AGs (0-31) by default, and doubling the size will take it
up to 64).
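A quick sanity check of that arithmetic in shell, using the AG size
(74402080 blocks of 4KiB) and the pre-grow dblocks count (2380866560)
that xfs_db reports further down the thread:

```shell
# Back-of-envelope check of the AG arithmetic from the xfs_db output
# quoted later in this thread.
blocksize=4096
agblocks=74402080               # blocks per AG
old_dblocks=2380866560          # data blocks in the original ~8.9TB fs
new_dblocks=$((2 * old_dblocks))
echo "old AG count: $((old_dblocks / agblocks))"   # 32 AGs originally
echo "new AG count: $((new_dblocks / agblocks))"   # 64 AGs after doubling
echo "new size: $((new_dblocks * blocksize)) bytes"
```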

> I am not very experienced with xfs (I was following commands in some
> documentaion), and I was recommended to post to this mailing list.  If
> anyone could provide some help, it would be greatly appreciate.  Also,
> if there is any information I can provide to help, I will gladly provide
> it.  Thanks in advance!

Seeing as the filesystem has not mounted, I think this should be
recoverable if you don't try to mount or write anything to the
filesystem until we fix the geometry back up....

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: Repairing a possibly incomplete xfs_growfs command?
  2008-01-17  3:01 ` David Chinner
@ 2008-01-17 17:29   ` Mark Magpayo
  2008-01-17 19:10     ` Eric Sandeen
  2008-01-17 23:15     ` David Chinner
  0 siblings, 2 replies; 22+ messages in thread
From: Mark Magpayo @ 2008-01-17 17:29 UTC (permalink / raw)
  To: David Chinner; +Cc: xfs



> -----Original Message-----
> From: xfs-bounce@oss.sgi.com [mailto:xfs-bounce@oss.sgi.com] On Behalf
> Of David Chinner
> Sent: Wednesday, January 16, 2008 7:01 PM
> To: Mark Magpayo
> Cc: xfs@oss.sgi.com
> Subject: Re: Repairing a possibly incomplete xfs_growfs command?
> 
> On Wed, Jan 16, 2008 at 03:19:19PM -0800, Mark Magpayo wrote:
> > Hi,
> >
> > So I have run across a strange situation which I hope some gurus
> > out there can help with.
> >
> > The original setup was a logical volume of 8.9TB.  I extended the
> > volume to 17.7TB and attempted to run xfs_growfs.  I am not sure
> > whether the command actually finished, as after I ran the command,
> > the metadata was displayed, but there was nothing that stated that
> > the number of data blocks had changed.  I was just returned to the
> > prompt, so I'm not sure whether the command completed or not.
> 
> Hmmm - what kernel and what version of xfsprogs are you using?
> (xfs_growfs -V).
> 

xfs_growfs version 2.9.4

> Also, can you post the output of the growfs command if you still
> have it?
> 
> If not, the output of:
> 
> # xfs_db -r -c 'sb 0' -c p <device>

#xfs_db -r -c 'sb 0' -c p /dev/vg0/lv0
magicnum = 0x58465342
blocksize = 4096
dblocks = 11904332800
rblocks = 0
rextents = 0
uuid = 05d4f6ba-1e9c-4564-898b-98088c163fe1
logstart = 2147483652
rootino = 128
rbmino = 129
rsumino = 130
rextsize = 16
agblocks = 74402080
agcount = 160
rbmblocks = 0
logblocks = 32768
versionnum = 0x3094
sectsize = 512
inodesize = 256
inopblock = 16
fname = "\000\000\000\000\000\000\000\000\000\000\000\000"
blocklog = 12
sectlog = 9
inodelog = 8
inopblog = 4
agblklog = 27
rextslog = 0
inprogress = 0
imax_pct = 25
icount = 1335040
ifree = 55
fdblocks = 9525955616
frextents = 0
uquotino = 0
gquotino = 0
qflags = 0
flags = 0
shared_vn = 0
inoalignmt = 2
unit = 0
width = 0
dirblklog = 0
logsectlog = 0
logsectsize = 0
logsunit = 0
features2 = 0


> # xfs_db -r -c 'sb 1' -c p <device>
> 

#xfs_db -r -c 'sb 1' -c p /dev/vg0/lv0
magicnum = 0x58465342
blocksize = 4096
dblocks = 2380866560
rblocks = 0
rextents = 0
uuid = 05d4f6ba-1e9c-4564-898b-98088c163fe1
logstart = 2147483652
rootino = 128
rbmino = 129
rsumino = 130
rextsize = 16
agblocks = 74402080
agcount = 32
rbmblocks = 0
logblocks = 32768
versionnum = 0x3094
sectsize = 512
inodesize = 256
inopblock = 16
fname = "\000\000\000\000\000\000\000\000\000\000\000\000"
blocklog = 12
sectlog = 9
inodelog = 8
inopblog = 4
agblklog = 27
rextslog = 0
inprogress = 0
imax_pct = 25
icount = 1334912
ifree = 59
fdblocks = 2809815
frextents = 0
uquotino = 0
gquotino = 0
qflags = 0
flags = 0
shared_vn = 0
inoalignmt = 2
unit = 0
width = 0
dirblklog = 0
logsectlog = 0
logsectsize = 0
logsunit = 0
features2 = 0

> because:
> 
> > I was unable to write to the logical volume I had just created.  I
> > tried to remount it, but I kept getting an error saying the
> > superblock could not be read.  I tried running an xfs_repair on the
> > filesystem, and got the following:
> >
> > Phase 1 - find and verify superblock...
> > superblock read failed, offset 19504058859520, size 2048, ag 64,
> > rval 0
> 
> That's a weird size for a superblock, and I suspect you should only
> have AGs numbered 0-63 in your filesystem. (an 8.9TB filesystem will
> have 32 AGs (0-31) by default, and doubling the size will take it
> up to 64).
> 
> > I am not very experienced with xfs (I was following commands in some
> > documentation), and I was recommended to post to this mailing list.
> > If anyone could provide some help, it would be greatly appreciated.
> > Also, if there is any information I can provide to help, I will
> > gladly provide it.  Thanks in advance!
> 
> Seeing as the filesystem has not mounted, I think this should be
> recoverable if you don't try to mount or write anything to the
> filesystem until we fix the geometry back up....
> 
> Cheers,
> 
> Dave.
> --
> Dave Chinner
> Principal Engineer
> SGI Australian Software Group
> 


Someone else asked for the size of the block devices... here's the
output from /proc/partitions: 

152     0 9523468862 etherd/e1.0
152    16 9523468862 etherd/e0.0


I appreciate everyone's help!

Thanks,

Mark

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Repairing a possibly incomplete xfs_growfs command?
  2008-01-17 17:29   ` Mark Magpayo
@ 2008-01-17 19:10     ` Eric Sandeen
  2008-01-17 20:04       ` Mark Magpayo
  2008-01-17 23:15     ` David Chinner
  1 sibling, 1 reply; 22+ messages in thread
From: Eric Sandeen @ 2008-01-17 19:10 UTC (permalink / raw)
  To: Mark Magpayo; +Cc: David Chinner, xfs

Mark Magpayo wrote:

> Someone else asked for the size of the block devices... here's the
> output from /proc/partitions: 
> 
> 152     0 9523468862 etherd/e1.0
> 152    16 9523468862 etherd/e0.0


Are those two assembled into your actual block device?  They each look
to be about 8T.

Is the lvm device also in /proc/partitions?

-Eric

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: Repairing a possibly incomplete xfs_growfs command?
  2008-01-17 19:10     ` Eric Sandeen
@ 2008-01-17 20:04       ` Mark Magpayo
  2008-01-17 22:19         ` Eric Sandeen
  0 siblings, 1 reply; 22+ messages in thread
From: Mark Magpayo @ 2008-01-17 20:04 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: David Chinner, xfs


> -----Original Message-----
> From: Eric Sandeen [mailto:sandeen@sandeen.net]
> Sent: Thursday, January 17, 2008 11:11 AM
> To: Mark Magpayo
> Cc: David Chinner; xfs@oss.sgi.com
> Subject: Re: Repairing a possibly incomplete xfs_growfs command?
> 
> Mark Magpayo wrote:
> 
> > Someone else asked for the size of the block devices... here's the
> > output from /proc/partitions:
> >
> > 152     0 9523468862 etherd/e1.0
> > 152    16 9523468862 etherd/e0.0
> 
> 
> are those two assembled into your actual block device?  They look each
> about 8T.
> 
> Is the lvm device also in /proc/partitions?
> 
> -Eric

Here's the entire output:

major minor  #blocks  name

   3     0     512000 hda
   3     1     511528 hda1
 152     0 9523468862 etherd/e1.0
 152    16 9523468862 etherd/e0.0
 254     0 19046932480 dm-0


I believe dm-0 is the lvm device.

Thanks,

Mark

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Repairing a possibly incomplete xfs_growfs command?
  2008-01-17 20:04       ` Mark Magpayo
@ 2008-01-17 22:19         ` Eric Sandeen
  2008-01-17 22:47           ` Nathan Scott
  0 siblings, 1 reply; 22+ messages in thread
From: Eric Sandeen @ 2008-01-17 22:19 UTC (permalink / raw)
  To: Mark Magpayo; +Cc: David Chinner, xfs

Mark Magpayo wrote:

> Here's the entire output:
> 
> major minor  #blocks  name
> 
>    3     0     512000 hda
>    3     1     511528 hda1
>  152     0 9523468862 etherd/e1.0
>  152    16 9523468862 etherd/e0.0
>  254     0 19046932480 dm-0
> 
> 
> I believe dm-0 is the lvm device.


Yep, in 1k units, so:

19046932480*1024
19504058859520

and:

superblock read failed, offset 19504058859520, size 2048, ag 64, rval 0

so it's trying to read a 2k (?) superblock right at the end of the
device?  Hrm.  (Dave, Barry - isn't that 2048 the sector size, not the
block size?)

Also from your sb 0 printout:

blocksize = 4096
dblocks = 11904332800

is 48760147148800 bytes, exactly 2.5x the size of your device. Weird.
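Spelled out with shell arithmetic (bash evaluates these as 64-bit
integers):

```shell
# Device size: /proc/partitions reports in 1 KiB units.
dev_bytes=$((19046932480 * 1024))
echo "device: $dev_bytes bytes"      # 19504058859520 - the failed read offset
# sb 0 claims 11904332800 blocks of 4096 bytes:
fs_bytes=$((11904332800 * 4096))
echo "fs claim: $fs_bytes bytes"
# Ratio of claimed fs size to device size, in tenths:
echo "ratio x10: $((fs_bytes * 10 / dev_bytes))"
```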

-Eric

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Repairing a possibly incomplete xfs_growfs command?
  2008-01-17 22:19         ` Eric Sandeen
@ 2008-01-17 22:47           ` Nathan Scott
  0 siblings, 0 replies; 22+ messages in thread
From: Nathan Scott @ 2008-01-17 22:47 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Mark Magpayo, David Chinner, xfs

On Thu, 2008-01-17 at 16:19 -0600, Eric Sandeen wrote:
> 
> Yep, in 1k units, so:
> 
> 19046932480*1024
> 19504058859520
> 
> and:
> 
> superblock read failed, offset 19504058859520, size 2048, ag 64, rval
> 0
> 
> so it's trying to read a 2k (?) superblock right in the last 1k of the
> device?  Hrm.  (Dave, Barry - isn't that 2048 the sector size, not
> block
> size?)
> 
> Also from your sb 0 printout:
> 
> blocksize = 4096
> dblocks = 11904332800

sectsize = 512
sectlog = 9

So, the SB reckons it's a regular 512 byte sector size.  Perhaps the
device driver is reporting a 2K sector size from the BLKSSZGET ioctl?
That'd be weird, cos mkfs would have issued a warning when creating with
512 byte sectors.  *shrug*.

> is 48760147148800, exactly 2.5x bigger than your device is. Weird.

cheers.

-- 
Nathan

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Repairing a possibly incomplete xfs_growfs command?
  2008-01-17 17:29   ` Mark Magpayo
  2008-01-17 19:10     ` Eric Sandeen
@ 2008-01-17 23:15     ` David Chinner
  2008-01-17 23:29       ` Mark Magpayo
  1 sibling, 1 reply; 22+ messages in thread
From: David Chinner @ 2008-01-17 23:15 UTC (permalink / raw)
  To: Mark Magpayo; +Cc: xfs

On Thu, Jan 17, 2008 at 09:29:22AM -0800, Mark Magpayo wrote:
> > On Wed, Jan 16, 2008 at 03:19:19PM -0800, Mark Magpayo wrote:
> > > Hi,
> > >
> > > So I have run across a strange situation which I hope some
> > > gurus out there can help with.
> > >
> > > The original setup was a logical volume of 8.9TB.  I extended
> > > the volume to 17.7TB and attempted to run xfs_growfs.  I am
> > > not sure whether the command actually finished, as after I ran
> > > the command, the metadata was displayed, but there was nothing
> > > that stated that the number of data blocks had changed.
> > > I was just returned to the prompt, so I'm not sure whether the
> > > command completed or not.
> > 
> > Hmmm - what kernel and what version of xfsprogs are you using?
> > (xfs_growfs -V).
> 
> xfs_growfs version 2.9.4

Ok, that's recent - what kernel? (uname -a)

> > Also, can you post the output of the growfs command if you still
> > have it?
> > 
> > If not, the output of:
> > 
> > # xfs_db -r -c 'sb 0' -c p <device>
> 
> #xfs_db -r -c 'sb 0' -c p /dev/vg0/lv0
> magicnum = 0x58465342
> blocksize = 4096
> dblocks = 11904332800

	= 44TB?
....
> agblocks = 74402080
	= ~283GB

> agcount = 160

	160*283GB = 44TB.

Hold on - 160 AGs? I saw this exact same growfs failure signature
just before Christmas at a customer site on an old kernel and
xfsprogs.  I really need to know what kernel you are running to
determine if we may have fixed this bug or not....

But I did manage to recover that filesystem successfully, so I can
give you a simple recipe to fix it up, and this time it won't take
me 4 hours on IRC to understand the full scope of the damage.

BTW, if you wanted 18TB, that should be ~64 AGs at that AG size, so
my initial suspicion was confirmed....

> rbmblocks = 0
> logblocks = 32768
> versionnum = 0x3094
....
> icount = 1335040
> ifree = 55
> fdblocks = 9525955616
	= 35TB

So the free block count got updated as well.

Ok, that means once we've fixed up the number of AGs and block
count, we'll need to run xfs_repair to ensure all the accounting
is correct....


So the superblock in AG 1 should have the original (pre-grow)
geometry in it:

> #xfs_db -r -c 'sb 1' -c p /dev/vg0/lv0
> magicnum = 0x58465342
> blocksize = 4096
> dblocks = 2380866560

	= 8.9TB
....
> agblocks = 74402080
> agcount = 32

Yup, 32 AGs originally.

> rbmblocks = 0
> logblocks = 32768
> versionnum = 0x3094
....
> icount = 1334912
> ifree = 59
> fdblocks = 2809815

Yeah, you didn't have much free space, did you? ;)

FWIW: sb.0.fdblocks - (sb.0.dblocks - sb.1.dblocks)
	= 9525955616 - (11904332800 - 2380866560)
	= 2489376

Which means we can use simple subtraction to fix up the free
block count. You'll need to run xfs_repair to fix this after
we've fixed the geometry.
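That subtraction, spelled out in shell using the superblock values
quoted above:

```shell
# Free blocks should shrink by the amount the (bogus) grow added to
# dblocks: sb.0.fdblocks - (sb.0.dblocks - sb.1.dblocks)
sb0_fdblocks=9525955616
sb0_dblocks=11904332800
sb1_dblocks=2380866560
echo $((sb0_fdblocks - (sb0_dblocks - sb1_dblocks)))   # 2489376
```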

The way to fix this is to manually fix up the agcount
and dblocks in all the AGs. Seeing as you simply doubled the
volume size, that is relatively easy to do. dblocks should
be 2*2380866560 = 4761733120 blocks = 19,504,058,859,520 bytes.

Your device is 19,504,058,859,520 bytes in size, so this fits
exactly.

# for i in `seq 0 1 63`; do
> xfs_db -x -c "sb $i" -c 'write agcount 64' -c 'write dblock 4761733120' /dev/vg0/lv0
> done

Then run 'xfs_repair -n /dev/vg0/lv0' to check that phase 1 will
pass (i.e. it can read the last block of the filesystem). If phase
1 completes, then you can kill it and run xfs_repair again without
the '-n' flag.

Once that completes, you should have a mountable filesystem that is
~18TB in size.

If you want, once you've mounted it, run xfs_growfs again to extend
the filesystem completely to the end of the new device....

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: Repairing a possibly incomplete xfs_growfs command?
  2008-01-17 23:15     ` David Chinner
@ 2008-01-17 23:29       ` Mark Magpayo
  2008-01-17 23:46         ` David Chinner
  0 siblings, 1 reply; 22+ messages in thread
From: Mark Magpayo @ 2008-01-17 23:29 UTC (permalink / raw)
  To: David Chinner; +Cc: xfs

> -----Original Message-----
> From: David Chinner [mailto:dgc@sgi.com]
> Sent: Thursday, January 17, 2008 3:15 PM
> To: Mark Magpayo
> Cc: xfs@oss.sgi.com
> Subject: Re: Repairing a possibly incomplete xfs_growfs command?
> 
> On Thu, Jan 17, 2008 at 09:29:22AM -0800, Mark Magpayo wrote:
> > > On Wed, Jan 16, 2008 at 03:19:19PM -0800, Mark Magpayo wrote:
> > > > Hi,
> > > >
> > > > So I have run across a strange situation which I hope some
> > > > gurus out there can help with.
> > > >
> > > > The original setup was a logical volume of 8.9TB.  I extended
> > > > the volume to 17.7TB and attempted to run xfs_growfs.  I am
> > > > not sure whether the command actually finished, as after I ran
> > > > the command, the metadata was displayed, but there was nothing
> > > > that stated that the number of data blocks had changed.
> > > > I was just returned to the prompt, so I'm not sure whether the
> > > > command completed or not.
> > >
> > > Hmmm - what kernel and what version of xfsprogs are you using?
> > > (xfs_growfs -V).
> >
> > xfs_growfs version 2.9.4
> 
> Ok, that's recent - what kernel? (uname -a)
> 
> > > Also, can you post the output of the growfs command if you still
> > > have it?
> > >
> > > If not, the output of:
> > >
> > > # xfs_db -r -c 'sb 0' -c p <device>
> >
> > #xfs_db -r -c 'sb 0' -c p /dev/vg0/lv0
> > magicnum = 0x58465342
> > blocksize = 4096
> > dblocks = 11904332800
> 
> 	= 44TB?
> ....
> > agblocks = 74402080
> 	= ~283GB
> 
> > agcount = 160
> 
> 	160*283GB = 44TB.
> 
> Hold on - 160 AGs? I saw this exact same growfs failure signature
> just before Christmas at a customer site on an old kernel and
> xfsprogs.  I really need to know what kernel you are running to
> determine if we may have fixed this bug or not....
> 
> But, I did manage to recover that filesystem successfully,
> so I can give you a simple recipe to fix it up and it won't
> take me 4 hours on IRC to understand the scope of the damage
> completely.
> 
> BTW, if you wanted 18TB, that should be ~64AGs at that size AG
> so my initial suspicion was confirmed....
> 
> > rbmblocks = 0
> > logblocks = 32768
> > versionnum = 0x3094
> ....
> > icount = 1335040
> > ifree = 55
> > fdblocks = 9525955616
> 	= 35TB
> 
> So the free block count got updated as well.
> 
> Ok, that means once we've fixed up the number of AGs and block
> count, we'll need to run xfs_repair to ensure all the accounting
> is correct....
> 
> 
> So the superblock in AG 1 should have the original (pre-grow)
> geometry in it:
> 
> > #xfs_db -r -c 'sb 1' -c p /dev/vg0/lv0
> > magicnum = 0x58465342
> > blocksize = 4096
> > dblocks = 2380866560
> 
> 	= 8.9TB
> ....
> > agblocks = 74402080
> > agcount = 32
> 
> Yup, 32 AGs originally.
> 
> > rbmblocks = 0
> > logblocks = 32768
> > versionnum = 0x3094
> ....
> > icount = 1334912
> > ifree = 59
> > fdblocks = 2809815
> 
> Yeah, you didn't have much free space, did you? ;)
> 
> FWIW: sb.0.fdblocks - (sb.0.dblocks - sb.1.dblocks)
> 	= 9525955616 - (11904332800 - 2380866560)
> 	= 2489376
> 
> Which means we can use simple subtraction to fix up the free
> block count. You'll need to run xfs_repair to fix this after
> we've fixed the geometry.
> 
> The way to fix this is to manually fix up the agcount
> and dblocks in all the AGs. Seeing as you simply doubled the
> volume size, that is relatively easy to do. dblocks should
> be 2*2380866560 = 4761733120 blocks = 19,046,932,480 bytes.
> 
> Your device is 19,504,058,859,520 bytes in size, so this should
> fit just fine.
> 
> # for i in `seq 0 1 63`; do
> > xfs_db -x -c "sb $i" -c 'write agcount 64' -c 'write dblock
> > 4761733120' /dev/vg0/lv0
> > done
> 
> Then run 'xfs_repair -n /dev/vg0/lv0' to check that phase 1 will
> pass (i.e. it can read the last block of the filesystem). If phase
> 1 completes, then you can kill it and run xfs_repair again without
> the '-n' flag.
> 
> Once that completes, you should have a mountable filesystem that is
> ~18TB in size.
> 
> If you want, once you've mounted it run xfs_growfs again to extend
> the filesystem completely to the end of new device....
> 
> Cheers,
> 
> Dave.
> --
> Dave Chinner
> Principal Engineer
> SGI Australian Software Group

It's quite a relief to know that this is a fairly straightforward
fix!  What luck that you had encountered it recently; I really
appreciate the help.  Here's my uname output:

Linux purenas 2.6.16.55-c1 #1 SMP Fri Oct 19 16:45:15 EDT 2007 x86_64
GNU/Linux

Maybe you guys fixed the bug already?

IIRC, I may have run xfs_growfs with an older version of xfsprogs, then
was advised to update to the newest and try it again.  Could I have run
it with a version that still contained the bug?

So is this all I need then prior to an xfs_repair?:

> # for i in `seq 0 1 63`; do
> > xfs_db -x -c "sb $i" -c 'write agcount 64' -c 'write dblock
> > 4761733120' /dev/vg0/lv0

I really appreciate all of the help everyone has given. =)

Thanks,

Mark

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Repairing a possibly incomplete xfs_growfs command?
  2008-01-17 23:29       ` Mark Magpayo
@ 2008-01-17 23:46         ` David Chinner
  2008-01-18 17:50           ` Mark Magpayo
  0 siblings, 1 reply; 22+ messages in thread
From: David Chinner @ 2008-01-17 23:46 UTC (permalink / raw)
  To: Mark Magpayo; +Cc: xfs

On Thu, Jan 17, 2008 at 03:29:17PM -0800, Mark Magpayo wrote:
> This is quite a relief to know that this is a fairly straightforward
> fix!  What luck that you had encountered it recently, I really
> appreciate the help.  Here's my uname output:
> 
> Linux purenas 2.6.16.55-c1 #1 SMP Fri Oct 19 16:45:15 EDT 2007 x86_64
> GNU/Linux
> 
> Maybe you guys fixed the bug already?

/me breathes a sigh of relief

I think we have:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=20f4ebf2bf2f57c1a9abb3655391336cc90314b3

[XFS] Make growfs work for amounts greater than 2TB
 
> iirc, I may have run xfs_growfs with an older version of xfsprogs, then
> was advised to update to the newest and try it again.  I may have run it
> on a version that still contained the bug?

Kernel bug, not userspace bug, AFAICT.

> So is this all I need then prior to an xfs_repair?:
> 
> > # for i in `seq 0 1 63`; do
> > > xfs_db -x -c "sb $i" -c 'write agcount 64' -c 'write dblock 4761733120'
> > /dev/vg0/lv0

Yes, I think that is all that is necessary (that+repair was what fixed
the problem at the customer site successfully).

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: Repairing a possibly incomplete xfs_growfs command?
  2008-01-17 23:46         ` David Chinner
@ 2008-01-18 17:50           ` Mark Magpayo
  2008-01-18 18:34             ` Eric Sandeen
  2008-01-19  0:40             ` David Chinner
  0 siblings, 2 replies; 22+ messages in thread
From: Mark Magpayo @ 2008-01-18 17:50 UTC (permalink / raw)
  To: David Chinner; +Cc: xfs

> > So is this all I need then prior to an xfs_repair?:
> >
> > > # for i in `seq 0 1 63`; do
> > > > xfs_db -x -c "sb $i" -c 'write agcount 64' -c 'write dblock
> > > > 4761733120' /dev/vg0/lv0
> 
> Yes, I think that is all that is necessary (that+repair was what fixed
> the problem at the customer site successfully).
> 

Is this the expected output from the command above?

purenas:~# for i in `seq 0 1 63`; do xfs_db -x -c "sb $i" -c 'write
agcount 64' -c 'write dblock 4761733120' /dev/vg0/lv0; done
agcount = 64
field dblock not found
parsing error
agcount = 64
field dblock not found
parsing error
[... the same three lines repeated, once for each of the 64 superblocks ...]

Hopefully I just mistyped something?

Thanks,

Mark

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Repairing a possibly incomplete xfs_growfs command?
  2008-01-18 17:50           ` Mark Magpayo
@ 2008-01-18 18:34             ` Eric Sandeen
  2008-01-19  0:40             ` David Chinner
  1 sibling, 0 replies; 22+ messages in thread
From: Eric Sandeen @ 2008-01-18 18:34 UTC (permalink / raw)
  To: Mark Magpayo; +Cc: David Chinner, xfs

Mark Magpayo wrote:
>>> So is this all I need then prior to an xfs_repair?:
>>>
>>>> # for i in `seq 0 1 63`; do
>>>>> xfs_db -x -c "sb $i" -c 'write agcount 64' -c 'write dblock
>>>>> 4761733120' /dev/vg0/lv0
>> Yes, I think that is all that is necessary (that+repair was what fixed
>> the problem at the customer site successfully).
>>
> 
> Is this supposed to be the proper output to the command above?
> 
> purenas:~# for i in `seq 0 1 63`; do xfs_db -x -c "sb $i" -c 'write
> agcount 64' -c 'write dblock 4761733120' /dev/vg0/lv0; done
> agcount = 64
> field dblock not found

...

I think Dave had a typo; it should be "dblocks" with an "s" on the end.

Feel free to wait for his confirmation, though, since this is surgery,
after all :)
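For the record, the corrected loop would presumably look like the
following; shown here as a dry run that only prints the commands
(drop the leading echo to actually write):

```shell
# Dry run of the corrected loop: the echo prints each xfs_db command
# instead of executing it. Remove the echo to do the real writes.
# /dev/vg0/lv0 is this particular system's device - substitute your own.
DEV=/dev/vg0/lv0
for i in $(seq 0 63); do
  echo xfs_db -x -c "sb $i" -c "write agcount 64" \
       -c "write dblocks 4761733120" "$DEV"
done
```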

-eric

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Repairing a possibly incomplete xfs_growfs command?
  2008-01-18 17:50           ` Mark Magpayo
  2008-01-18 18:34             ` Eric Sandeen
@ 2008-01-19  0:40             ` David Chinner
  2008-01-22 19:40               ` Mark Magpayo
  1 sibling, 1 reply; 22+ messages in thread
From: David Chinner @ 2008-01-19  0:40 UTC (permalink / raw)
  To: Mark Magpayo; +Cc: David Chinner, xfs

On Fri, Jan 18, 2008 at 09:50:37AM -0800, Mark Magpayo wrote:
> > > So is this all I need then prior to an xfs_repair?:
> > >
> > > > # for i in `seq 0 1 63`; do
> > > > > xfs_db -x -c "sb $i" -c 'write agcount 64' -c 'write dblock
> > > > > 4761733120' /dev/vg0/lv0
> > 
> > Yes, I think that is all that is necessary (that+repair was what fixed
> > the problem at the customer site successfully).
> > 
> 
> Is this supposed to be the proper output to the command above?
> 
> purenas:~# for i in `seq 0 1 63`; do xfs_db -x -c "sb $i" -c 'write
> agcount 64' -c 'write dblock 4761733120' /dev/vg0/lv0; done
> agcount = 64
> field dblock not found
> parsing error

Ah - as Eric pointed out, that should be "dblocks".

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: Repairing a possibly incomplete xfs_growfs command?
  2008-01-19  0:40             ` David Chinner
@ 2008-01-22 19:40               ` Mark Magpayo
  2008-01-22 21:13                 ` David Chinner
  2008-01-23  2:57                 ` Barry Naujok
  0 siblings, 2 replies; 22+ messages in thread
From: Mark Magpayo @ 2008-01-22 19:40 UTC (permalink / raw)
  To: David Chinner; +Cc: xfs


> -----Original Message-----
> From: David Chinner [mailto:dgc@sgi.com]
> Sent: Friday, January 18, 2008 4:40 PM
> To: Mark Magpayo
> Cc: David Chinner; xfs@oss.sgi.com
> Subject: Re: Repairing a possibly incomplete xfs_growfs command?
> 
> On Fri, Jan 18, 2008 at 09:50:37AM -0800, Mark Magpayo wrote:
> > > > So is this all I need then prior to an xfs_repair?:
> > > >
> > > > > # for i in `seq 0 1 63`; do
> > > > > > xfs_db -x -c "sb $i" -c 'write agcount 64' -c 'write dblock
> > > 4761733120'
> > > > > /dev/vg0/lv0
> > >
> > > Yes, I think that is all that is necessary (that+repair was what
fixed
> > > the problem at the customer site successfully).
> > >
> >
> > Is this supposed to be the proper output to the command above?
> >
> > purenas:~# for i in `seq 0 1 63`; do xfs_db -x -c "sb $i" -c 'write
> > agcount 64' -c 'write dblock 4761733120' /dev/vg0/lv0; done
> > agcount = 64
> > field dblock not found
> > parsing error
> 
> Ah - As eric pointed out, that should be "dblocks".
> 
> Cheers,
> 
> Dave.
> --
> Dave Chinner
> Principal Engineer
> SGI Australian Software Group

Any ideas on how long the xfs_repair is supposed to take on 18TB?  I
started it Friday night, and it's now Tuesday afternoon.  It's stuck
here:

Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...

I figure traversing a filesystem of 18TB takes a while, but does 4 days
sound right?

Thanks,

Mark

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Repairing a possibly incomplete xfs_growfs command?
  2008-01-22 19:40               ` Mark Magpayo
@ 2008-01-22 21:13                 ` David Chinner
  2008-01-22 21:46                   ` Mark Magpayo
  2008-01-23  2:57                 ` Barry Naujok
  1 sibling, 1 reply; 22+ messages in thread
From: David Chinner @ 2008-01-22 21:13 UTC (permalink / raw)
  To: Mark Magpayo; +Cc: xfs

On Tue, Jan 22, 2008 at 11:40:52AM -0800, Mark Magpayo wrote:
> Any ideas on how long the xfs_repair is supposed to take on 18TB?  I
> started it Friday nite, and it's now Tuesday afternoon.  It's stuck
> here:
> 
> Phase 5 - rebuild AG headers and trees...
>         - reset superblock...
> Phase 6 - check inode connectivity...
>         - resetting contents of realtime bitmap and summary inodes
>         - traversing filesystem ...
> 
> I figure traversing a filesystem of 18TB takes a while, but does 4 days
> sound right?

Yes, it can if it's swapping like mad because you don't have enough
RAM in the machine. Runtime is also determined by how many inodes there
are in the filesystem - do you know how many there are? Also, more
recent xfs_repair versions tend to be faster - what version are you
using again?
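
If the filesystem can't be mounted to check with df -i, xfs_db can read
the count straight from superblock 0; a sketch, read-only, with the
device path from earlier in this thread:

```shell
# Read the allocated-inode count (icount) from superblock 0 without
# mounting. xfs_db -r opens the device read-only, so it cannot change
# anything. Printed rather than executed here, since the device path is
# specific to this thread.
echo xfs_db -r -c "sb 0" -c "p icount" /dev/vg0/lv0
```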

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: Repairing a possibly incomplete xfs_growfs command?
  2008-01-22 21:13                 ` David Chinner
@ 2008-01-22 21:46                   ` Mark Magpayo
  2008-01-22 22:48                     ` Mark Goodwin
  0 siblings, 1 reply; 22+ messages in thread
From: Mark Magpayo @ 2008-01-22 21:46 UTC (permalink / raw)
  To: David Chinner; +Cc: xfs


> -----Original Message-----
> From: xfs-bounce@oss.sgi.com [mailto:xfs-bounce@oss.sgi.com] On Behalf
Of
> David Chinner
> Sent: Tuesday, January 22, 2008 1:13 PM
> To: Mark Magpayo
> Cc: xfs@oss.sgi.com
> Subject: Re: Repairing a possibly incomplete xfs_growfs command?
> 
> On Tue, Jan 22, 2008 at 11:40:52AM -0800, Mark Magpayo wrote:
> > Any ideas on how long the xfs_repair is supposed to take on 18TB?  I
> > started it Friday nite, and it's now Tuesday afternoon.  It's stuck
> > here:
> >
> > Phase 5 - rebuild AG headers and trees...
> >         - reset superblock...
> > Phase 6 - check inode connectivity...
> >         - resetting contents of realtime bitmap and summary inodes
> >         - traversing filesystem ...
> >
> > I figure traversing a filesystem of 18TB takes a while, but does 4
days
> > sound right?
> 
> Yes, it can if it's swapping like mad because you don't have enough
> RAM in the machine. Runtime is also detemrined by how many inodes
there
> are in the filesystem - do you know how many there are? Also, more
> recent xfs_repair versions tend to be faster - what version are you
> using again?

Using version 2.9.4.  I may have forgotten to allocate more swap space
(as instructed in the manual given to me by the vendor), so would
breaking out of the repair and restarting with more swap help, or am I
too deep (4 days) into it and should I just let it run?

-Mark
 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Repairing a possibly incomplete xfs_growfs command?
  2008-01-22 21:46                   ` Mark Magpayo
@ 2008-01-22 22:48                     ` Mark Goodwin
  2008-01-22 22:50                       ` Mark Magpayo
  0 siblings, 1 reply; 22+ messages in thread
From: Mark Goodwin @ 2008-01-22 22:48 UTC (permalink / raw)
  To: Mark Magpayo; +Cc: David Chinner, xfs



Mark Magpayo wrote:
>> -----Original Message-----
>> From: xfs-bounce@oss.sgi.com [mailto:xfs-bounce@oss.sgi.com] On Behalf
> Of
>> David Chinner
>> Sent: Tuesday, January 22, 2008 1:13 PM
>> To: Mark Magpayo
>> Cc: xfs@oss.sgi.com
>> Subject: Re: Repairing a possibly incomplete xfs_growfs command?
>>
>> On Tue, Jan 22, 2008 at 11:40:52AM -0800, Mark Magpayo wrote:
>>> Any ideas on how long the xfs_repair is supposed to take on 18TB?  I
>>> started it Friday nite, and it's now Tuesday afternoon.  It's stuck
>>> here:
>>>
>>> Phase 5 - rebuild AG headers and trees...
>>>         - reset superblock...
>>> Phase 6 - check inode connectivity...
>>>         - resetting contents of realtime bitmap and summary inodes
>>>         - traversing filesystem ...
>>>
>>> I figure traversing a filesystem of 18TB takes a while, but does 4
> days
>>> sound right?
>> Yes, it can if it's swapping like mad because you don't have enough
>> RAM in the machine. Runtime is also detemrined by how many inodes
> there
>> are in the filesystem - do you know how many there are? Also, more
>> recent xfs_repair versions tend to be faster - what version are you
>> using again?
> 
> Using version 2.9.4.  I may have forgotten to allocate more swap space
> (as was told in the manual given to me by the vendor), so would breaking
> out of the repair and restarting with more swap help out, or am I too
> deep (4 days) into it and should just let it run?

cat /proc/meminfo for this machine and post it here. If it's swapping,
adding more swap won't speed it up. If it runs out of swap the repair
will stop anyway ;-)
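
A quick way to see whether the repair is actively swapping, rather than
posting meminfo snapshots, is to watch the kernel's swap counters; a
sketch, Linux-specific:

```shell
# Swap-in/swap-out page counters straight from the kernel. Sample this
# twice a few seconds apart while xfs_repair runs: climbing counters
# mean it is swapping (slow, but working); static counters plus no CPU
# use point at something else. "vmstat 5" shows the same data live in
# its si/so columns.
grep -E '^pswp(in|out) ' /proc/vmstat
```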

Cheers
-- 

  Mark Goodwin                                  markgw@sgi.com
  Engineering Manager for XFS and PCP    Phone: +61-3-99631937
  SGI Australian Software Group           Cell: +61-4-18969583
-------------------------------------------------------------

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: Repairing a possibly incomplete xfs_growfs command?
  2008-01-22 22:48                     ` Mark Goodwin
@ 2008-01-22 22:50                       ` Mark Magpayo
  0 siblings, 0 replies; 22+ messages in thread
From: Mark Magpayo @ 2008-01-22 22:50 UTC (permalink / raw)
  To: markgw; +Cc: David Chinner, xfs

> -----Original Message-----
> From: xfs-bounce@oss.sgi.com [mailto:xfs-bounce@oss.sgi.com] On Behalf
Of
> Mark Goodwin
> Sent: Tuesday, January 22, 2008 2:48 PM
> To: Mark Magpayo
> Cc: David Chinner; xfs@oss.sgi.com
> Subject: Re: Repairing a possibly incomplete xfs_growfs command?
> 
> 
> 
> Mark Magpayo wrote:
> >> -----Original Message-----
> >> From: xfs-bounce@oss.sgi.com [mailto:xfs-bounce@oss.sgi.com] On
Behalf
> > Of
> >> David Chinner
> >> Sent: Tuesday, January 22, 2008 1:13 PM
> >> To: Mark Magpayo
> >> Cc: xfs@oss.sgi.com
> >> Subject: Re: Repairing a possibly incomplete xfs_growfs command?
> >>
> >> On Tue, Jan 22, 2008 at 11:40:52AM -0800, Mark Magpayo wrote:
> >>> Any ideas on how long the xfs_repair is supposed to take on 18TB?
I
> >>> started it Friday nite, and it's now Tuesday afternoon.  It's
stuck
> >>> here:
> >>>
> >>> Phase 5 - rebuild AG headers and trees...
> >>>         - reset superblock...
> >>> Phase 6 - check inode connectivity...
> >>>         - resetting contents of realtime bitmap and summary inodes
> >>>         - traversing filesystem ...
> >>>
> >>> I figure traversing a filesystem of 18TB takes a while, but does 4
> > days
> >>> sound right?
> >> Yes, it can if it's swapping like mad because you don't have enough
> >> RAM in the machine. Runtime is also detemrined by how many inodes
> > there
> >> are in the filesystem - do you know how many there are? Also, more
> >> recent xfs_repair versions tend to be faster - what version are you
> >> using again?
> >
> > Using version 2.9.4.  I may have forgotten to allocate more swap
space
> > (as was told in the manual given to me by the vendor), so would
breaking
> > out of the repair and restarting with more swap help out, or am I
too
> > deep (4 days) into it and should just let it run?
> 
> cat /proc/meminfo for this machine and post it here. If it's swapping,
> adding more swap wont speed it up. If it runs out of swap the repair
> will stop anyway ;-)

Here you go:

purenas:~# cat /proc/meminfo
MemTotal:      1019732 kB
MemFree:        580920 kB
Buffers:          1720 kB
Cached:          17912 kB
SwapCached:      21712 kB
Active:          44016 kB
Inactive:         8488 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      1019732 kB
LowFree:        580920 kB
SwapTotal:    732574516 kB
SwapFree:     732548180 kB
Dirty:              12 kB
Writeback:           0 kB
Mapped:          36096 kB
Slab:            18016 kB
CommitLimit:  732778460 kB
Committed_AS:    55612 kB
PageTables:       1420 kB
VmallocTotal: 34359738367 kB
VmallocUsed:    358616 kB
VmallocChunk: 34359379631 kB


Never mind my previous comment about turning on swap; it looks like I
had it turned on after all.  Any reason to think it may have stopped?
Or does it just take that long to run?

Thanks,

Mark

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Repairing a possibly incomplete xfs_growfs command?
  2008-01-22 19:40               ` Mark Magpayo
  2008-01-22 21:13                 ` David Chinner
@ 2008-01-23  2:57                 ` Barry Naujok
  2008-01-23 17:24                   ` Mark Magpayo
  1 sibling, 1 reply; 22+ messages in thread
From: Barry Naujok @ 2008-01-23  2:57 UTC (permalink / raw)
  To: Mark Magpayo; +Cc: xfs

On Wed, 23 Jan 2008 06:40:52 +1100, Mark Magpayo <mmagpayo@purevideo.com>  
wrote:

>
>> -----Original Message-----
>> From: David Chinner [mailto:dgc@sgi.com]
>> Sent: Friday, January 18, 2008 4:40 PM
>> To: Mark Magpayo
>> Cc: David Chinner; xfs@oss.sgi.com
>> Subject: Re: Repairing a possibly incomplete xfs_growfs command?
>>
>> On Fri, Jan 18, 2008 at 09:50:37AM -0800, Mark Magpayo wrote:
>> > > > So is this all I need then prior to an xfs_repair?:
>> > > >
>> > > > > # for i in `seq 0 1 63`; do
>> > > > > > xfs_db -x -c "sb $i" -c 'write agcount 64' -c 'write dblock
>> > > 4761733120'
>> > > > > /dev/vg0/lv0
>> > >
>> > > Yes, I think that is all that is necessary (that+repair was what
> fixed
>> > > the problem at the customer site successfully).
>> > >
>> >
>> > Is this supposed to be the proper output to the command above?
>> >
>> > purenas:~# for i in `seq 0 1 63`; do xfs_db -x -c "sb $i" -c 'write
>> > agcount 64' -c 'write dblock 4761733120' /dev/vg0/lv0; done
>> > agcount = 64
>> > field dblock not found
>> > parsing error
>>
>> Ah - As eric pointed out, that should be "dblocks".
>>
>> Cheers,
>>
>> Dave.
>> --
>> Dave Chinner
>> Principal Engineer
>> SGI Australian Software Group
>
> Any ideas on how long the xfs_repair is supposed to take on 18TB?  I
> started it Friday nite, and it's now Tuesday afternoon.  It's stuck
> here:
>
> Phase 5 - rebuild AG headers and trees...
>         - reset superblock...
> Phase 6 - check inode connectivity...
>         - resetting contents of realtime bitmap and summary inodes
>         - traversing filesystem ...
>
> I figure traversing a filesystem of 18TB takes a while, but does 4 days
> sound right?

Was it stuck on Phase 6 all that time? With only 1GB of RAM (from your
meminfo output) and 18TB filesystem, Phases 3 and 4 will take a very
long time due to swapping.

Phase 6 in your scenario should be relatively quick and light on
memory usage (500MB as reported in your other email).

It is conceivable that it has deadlocked by trying to access a buffer
twice, or by accessing a buffer that was never released. This is an
unlikely scenario, but it is possible.
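
One rough way to tell a deadlocked repair from a merely slow one is to
sample the process state from /proc; a sketch, which assumes the binary
is named xfs_repair:

```shell
# Sample the scheduling state and context-switch counters of the running
# xfs_repair. If the counters never move between samples and the state
# stays S with no disk I/O, a deadlock becomes plausible; if they keep
# climbing, it is still doing work.
pid=$(pgrep -x xfs_repair || true)
if [ -n "$pid" ]; then
    grep -E '^(State|voluntary_ctxt_switches|nonvoluntary_ctxt_switches)' \
        "/proc/$pid/status"
else
    echo "xfs_repair is not running"
fi
```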

Regards,
Barry.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: Repairing a possibly incomplete xfs_growfs command?
  2008-01-23  2:57                 ` Barry Naujok
@ 2008-01-23 17:24                   ` Mark Magpayo
  2008-01-24  1:02                     ` Barry Naujok
  0 siblings, 1 reply; 22+ messages in thread
From: Mark Magpayo @ 2008-01-23 17:24 UTC (permalink / raw)
  To: Barry Naujok; +Cc: xfs

> 
> Was it stuck on Phase 6 all that time? With only 1GB of RAM (from your
> meminfo output) and 18TB filesystem, Phases 3 and 4 will take a very
> long time due to swapping.

It's been stuck on Phase 6 since I came back to check on it on Monday.  


> 
> Phase 6 in your scenario should be relatively quick and light on
> memory usage (500MB as reported in your other email).
> 
> It is feasible it is deadlocked by trying to double-access a buffer,
> or access a buffer that wasn't released. This is an unlikely scenario,
> but it is possible.

Could I break out of the process here?  Seems like most of the repair
work has been done...  Then again, I imagine traversing the filesystem
is a pretty important step.

Are there any more phases after this by the way?

Thanks,

Mark

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Repairing a possibly incomplete xfs_growfs command?
  2008-01-23 17:24                   ` Mark Magpayo
@ 2008-01-24  1:02                     ` Barry Naujok
  0 siblings, 0 replies; 22+ messages in thread
From: Barry Naujok @ 2008-01-24  1:02 UTC (permalink / raw)
  To: Mark Magpayo; +Cc: xfs

On Thu, 24 Jan 2008 04:24:17 +1100, Mark Magpayo <mmagpayo@purevideo.com>  
wrote:

>>
>> Was it stuck on Phase 6 all that time? With only 1GB of RAM (from your
>> meminfo output) and 18TB filesystem, Phases 3 and 4 will take a very
>> long time due to swapping.
>
> It's been stuck on Phase 6 since I came back to check on it on Monday.
>
>
>>
>> Phase 6 in your scenario should be relatively quick and light on
>> memory usage (500MB as reported in your other email).
>>
>> It is feasible it is deadlocked by trying to double-access a buffer,
>> or access a buffer that wasn't released. This is an unlikely scenario,
>> but it is possible.
>
> Could I break out of the process here?  Seems like most of the repair
> work has been done...  Then again, I imagine traversing the filesystem
> is a pretty important step.

Breaking out of repair at this point is fine.

> Are there any more phases after this by the way?

Checking nlink counts in Phase 7 is the last.

I would run xfs_check to see if there are any errors remaining.

The other thing I can suggest is to run an older repair from the
2.8.x series (2.8.21) with the options "-M -o bhash=512". This
should finish.
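
Spelled out, that fallback would look something like the following;
printed rather than executed here, since it should only be run after the
current repair has been stopped, and it assumes the 2.8.21 binary is the
one on the path:

```shell
# Barry's suggested fallback: xfs_repair from the 2.8.x series with the
# -M option and a fixed-size buffer cache hash (-o bhash=512). Shown as
# an echo so nothing runs by accident; drop the echo to actually run it.
echo xfs_repair -M -o bhash=512 /dev/vg0/lv0
```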

Regards,
Barry.

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2008-01-24  1:01 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-01-16 23:19 Repairing a possibly incomplete xfs_growfs command? Mark Magpayo
2008-01-17  2:31 ` Eric Sandeen
2008-01-17  3:01 ` David Chinner
2008-01-17 17:29   ` Mark Magpayo
2008-01-17 19:10     ` Eric Sandeen
2008-01-17 20:04       ` Mark Magpayo
2008-01-17 22:19         ` Eric Sandeen
2008-01-17 22:47           ` Nathan Scott
2008-01-17 23:15     ` David Chinner
2008-01-17 23:29       ` Mark Magpayo
2008-01-17 23:46         ` David Chinner
2008-01-18 17:50           ` Mark Magpayo
2008-01-18 18:34             ` Eric Sandeen
2008-01-19  0:40             ` David Chinner
2008-01-22 19:40               ` Mark Magpayo
2008-01-22 21:13                 ` David Chinner
2008-01-22 21:46                   ` Mark Magpayo
2008-01-22 22:48                     ` Mark Goodwin
2008-01-22 22:50                       ` Mark Magpayo
2008-01-23  2:57                 ` Barry Naujok
2008-01-23 17:24                   ` Mark Magpayo
2008-01-24  1:02                     ` Barry Naujok
