public inbox for linux-xfs@vger.kernel.org
* xfs_growfs doesn't resize
@ 2011-06-30 21:42 kkeller
  2011-07-03 15:59 ` Eric Sandeen
  0 siblings, 1 reply; 15+ messages in thread
From: kkeller @ 2011-06-30 21:42 UTC (permalink / raw)
  To: xfs

Hello kind XFS folks,

I am having a strange issue with xfs_growfs, and before I attempt to do
something potentially unsafe, I thought I would check in with the list
for advice.

Our fileserver had an ~11TB xfs filesystem hosted under linux lvm.  I
recently added more disks to create a new 9TB container, and used the
lvm tools to add the container to the existing volume group.  When I
went to xfs_growfs the filesystem, I had the first issue that this user
had, where the metadata was reported, but there was no message about
the new number of blocks:

http://oss.sgi.com/archives/xfs/2008-01/msg00085.html

Fortunately, I have not yet seen the other symptoms that the OP saw: I
can still read from and write to the original filesystem.  But the
filesystem size hasn't changed, and I'm not experienced enough to
interpret the xfs_info output properly.

I read through that thread (and others), but none seemed specific to my
issue.  Plus, since my filesystem still seems healthy, I'm hoping that
there's a graceful way to resolve the issue and add the new disk space.

Here's some of the information I've seen asked for in the past.  I
apologize for it being fairly long.

/proc/partitions:

major minor  #blocks  name

   8     0  244129792 sda
   8     1     104391 sda1
   8     2    8385930 sda2
   8     3   21205800 sda3
   8     4          1 sda4
   8     5   30876898 sda5
   8     6   51761398 sda6
   8     7   20555136 sda7
   8     8    8233281 sda8
   8     9   20603331 sda9
   8    16 11718684672 sdb
   8    17 11718684638 sdb1
 253     1 21484244992 dm-1
   8    48 9765570560 sdd
   8    49 9765568085 sdd1


sdb1 is the original member of the volume group.  sdd1 is the new PV.  I
believe dm-1 is the LV where the volume group is hosted (and all the LVM
tools report a 20TB logical volume).


# lvdisplay 
  --- Logical volume ---
  LV Name                /dev/saharaVG/saharaLV
  VG Name                saharaVG
  LV UUID                DjacPa-p9mk-mBmv-69c2-dmXF-LfxQ-wsRUOD
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                20.01 TB
  Current LE             5245177
  Segments               2
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:1
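As a quick sanity check on the LVM numbers (assuming the volume group
uses the common 4 MiB extent size, which lvdisplay above doesn't show),
the LE count implies the filesystem block count a successful grow
should reach:

```shell
# Derive the expected XFS block count from the LV geometry above.
# ASSUMPTION: 4 MiB physical extents (the lvdisplay output omits PE size).
le_count=5245177
blocks_per_extent=$((4 * 1024 * 1024 / 4096))   # 4 MiB extent / 4 KiB fs block
expected_blocks=$((le_count * blocks_per_extent))
echo "$expected_blocks"    # 5371061248, i.e. ~20 TiB at 4 KiB per block
```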

# uname -a
Linux sahara.xxx 2.6.18-128.1.6.el5 #1 SMP Wed Apr 1 09:10:25 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

Yes, it's not a completely current kernel.  This box is running CentOS 5
with some yum updates.

# xfs_growfs -V
xfs_growfs version 2.9.4

This xfs_info is from after the xfs_growfs attempt.  I regret that I
don't have one from before; I was actually thinking of it, but the
resize went so smoothly on my test machine (and went fine in the past
on other platforms) that I didn't give it much thought till it was too
late.

# xfs_info /export/
meta-data=/dev/mapper/saharaVG-saharaLV isize=256    agcount=32, agsize=91552192 blks
         =                       sectsz=512   attr=0
data     =                       bsize=4096   blocks=2929670144, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=32768, version=1
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0

I saw requests to run xfs_db, but I don't want to mess up the syntax, even if -r should be safe.

Thanks for any help you can provide!

--keith


-- 
kkeller@sonic.net


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: xfs_growfs doesn't resize
@ 2011-06-30 23:30 kkeller
  2011-07-01 10:46 ` Dave Chinner
  0 siblings, 1 reply; 15+ messages in thread
From: kkeller @ 2011-06-30 23:30 UTC (permalink / raw)
  To: xfs

Hello again all,

I apologize for following up my own post, but I found some new information.

On Thu 30/06/11  2:42 PM , kkeller@sonic.net wrote:

> http://oss.sgi.com/archives/xfs/2008-01/msg00085.html

I found a newer thread in the archives which might be more relevant to my issue:

http://oss.sgi.com/archives/xfs/2009-09/msg00206.html

But I haven't yet done a umount, and don't really wish to.  So, my followup questions are:

==Is there a simple way to figure out what xfs_growfs did, and whether it caused any problems?
==Will I be able to fix these problems, if any, without needing a umount?
==Assuming my filesystem is healthy, will a simple kernel update (and reboot of course!) allow me to resize the filesystem in one step, instead of in 2TB increments?
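For what it's worth, if growth really is limited to ~2TB steps on this
kernel, the stepping arithmetic would look something like this (the
target is illustrative, not a measured value, and the echo makes it a
dry run):

```shell
# Dry-run sketch: grow toward a target in 2 TiB steps, since very old
# kernels reportedly mishandled larger single grow operations.
# TARGET is hypothetical; xfs_growfs -D takes a size in fs blocks.
CUR=2929670144                                  # blocks, per xfs_info above
TARGET=5368709120                               # hypothetical ~20 TiB target
STEP=$((2 * 1024 * 1024 * 1024 * 1024 / 4096))  # 2 TiB in 4 KiB blocks

next=$CUR
while [ "$next" -lt "$TARGET" ]; do
  next=$((next + STEP))
  if [ "$next" -gt "$TARGET" ]; then next=$TARGET; fi
  echo xfs_growfs -D "$next" /export            # drop echo to run for real
done
```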

Again, many thanks!

--keith

-- 
kkeller@sonic.net


* Re: xfs_growfs doesn't resize
@ 2011-07-01 16:44 kkeller
  0 siblings, 0 replies; 15+ messages in thread
From: kkeller @ 2011-07-01 16:44 UTC (permalink / raw)
  To: xfs

Thanks for the response, Dave!  I have some additional questions inline.


On Fri 01/07/11  3:46 AM , Dave Chinner <david@fromorbit.com> wrote:

> So either way, you will have to unmount the filesystem.

Yikes!  I am guessing that may put the filesystem at risk of not being
able to re-mount without xfs_db commands, as happened to the other
posters I cited.  If umounting does leave the fs unmountable, is there
a way for me to look at the xfs_db output afterward and calculate any
new parameters myself, to minimize downtime?  Or is that generally
unwise, and does xfs_db output need an expert's eye?  I want to
minimize downtime, but I also want to minimize the risk of data loss,
so I wouldn't want to derive my own xfs_db commands unless it was very
safe.  (Even with backups available, it's more work to switch over or
restore if I do lose the filesystem; we're a small group, so we don't
have an automatic failover server.)

Are there any other docs concerning using xfs_db?  I saw a post from last year that said that there weren't, but I'm wondering if that's changed since then.  There is of course the man page, but that doesn't describe how to interpret what's going on from its output (or what the correct steps to take are if there's a problem).

> > ==Assuming my filesystem is healthy, will a simple kernel update
> > (and reboot of course!) allow me to resize the filesystem in one
> > step, instead of 2TB increments?
> 
> I'd upgrade both kernel and userspace.

Would you recommend upgrading userspace from source?  CentOS 5 still calls the version available (from their centosplus repo) 2.9.4, but I haven't investigated what sort of patches they may have applied.


--keith


-- 
kkeller@sonic.net



* Re: xfs_growfs doesn't resize
@ 2011-07-03 19:42 kkeller
  0 siblings, 0 replies; 15+ messages in thread
From: kkeller @ 2011-07-03 19:42 UTC (permalink / raw)
  To: xfs; +Cc: Eric Sandeen

On Sun, Jul 03, 2011 at 10:59:03AM -0500, Eric Sandeen wrote:
> On 6/30/11 4:42 PM, kkeller@sonic.net wrote:
> > # uname -a
> > Linux sahara.xxx 2.6.18-128.1.6.el5 #1 SMP Wed Apr 1 09:10:25 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
> > 
> > Yes, it's not a completely current kernel. This box is running CentOS 5
> > with some yum updates.
> 
> try
> 
> # rpm -qa | grep xfs
> 
> If you see anything with "kmod" you're running an exceptionally old xfs codebase.


Yes, I do have a kmod-xfs package, so clearly a kernel update is in
order. So my goals are twofold: 1) verify the current filesystem's
state--is it healthy, or does it need xfs_db voodoo? 2) once it's
determined healthy, again attempt to grow the filesystem. Here is
my current plan for reaching these goals:

0) get a nearer-term backup, just in case :) The filesystem still seems
perfectly normal, but without knowing what my first xfs_growfs did I
don't know if or how long this state will last.

1) umount the fs to run xfs_db

2) attempt a remount--is this safe, or is there risk of damaging the filesystem?

3) If a remount succeeds, then update the kernel and xfsprogs. If a remount
doesn't work, then revert to the near-term backup I took in 0) and attempt
to fix the issue (with the help of the list, I hope).

4) In either case, post my xfs_db output to the list and get your
opinions on the health of the fs.

5) If the fs seems correct, attempt xfs_growfs again.

Do all these steps seem reasonable? I am most concerned about step 2--
I really do want to be able to remount as quickly as possible, but I
do not know how to tell whether it's okay from xfs_db's output. So if a
remount attempt is reasonably nondestructive (i.e., it won't make an
already unhealthy XFS fs any worse), then I can try it and hope for the best.
(From the other threads I've seen it seems like it's not a good idea to
run xfs_repair.)

Would it make more sense to update the kernel and xfsprogs before
attempting a remount? If a remount fails under the original kernel,
what do people think the odds are that a new kernel would be able to
mount the original fs, or is that really unwise?

Again, many thanks for all your help.

--keith

-- 
kkeller@sonic.net


* Re: xfs_growfs doesn't resize
@ 2011-07-04  4:34 kkeller
  2011-07-04  4:41 ` Eric Sandeen
  0 siblings, 1 reply; 15+ messages in thread
From: kkeller @ 2011-07-04  4:34 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: xfs



On Sun 03/07/11  3:14 PM , Eric Sandeen <sandeen@sandeen.net> wrote:

[some rearranging]

> You're welcome but here's the obligatory plug in return - running RHEL5
> proper would have gotten you up to date, fully supported xfs, and you
> wouldn't have run into this mess. Just sayin' ... ;)

Yep, that's definitely a lesson learned.  Though I don't think I can
blame CentOS either--from what I can tell the fix has been available
from yum for some time now.  So it's pretty much entirely my own
fault.  :(

I also am sorry for not preserving threading--for some reason, the SGI mailserver rejected mail from my normal host (which is odd, as it's not in any blacklists I know of), so I am using an unfamiliar mail client.

> You probably hit this bug:
> http://oss.sgi.com/archives/xfs/2007-01/msg00053.html [1]
> 
> See also:
> http://oss.sgi.com/archives/xfs/2009-07/msg00087.html [2]
> 
> I can't remember how much damage the original bug did ...

If any?  I'm a bit amazed that, if there was damage, the filesystem is
still usable.  Perhaps if I were to fill it, it would show signs of
inconsistency?  Or would remounting read the now-incorrect values from
superblock 0?

> is it still mounted I guess?

Yes, it's still mounted, and as far as I can tell perfectly fine.  But
I won't really know till I can run xfs_repair -n and/or xfs_db against
it, and/or remount it; I'm choosing to get as much data off as I can
before I try these things, just in case.

How safe is running xfs_db with -r on my mounted filesystem?  I understand that results might not be consistent, but on the off chance that they are I am hoping that it might be at least a little helpful.

I was re-reading some of the threads I posted in my original messages, in particular these posts:

http://oss.sgi.com/archives/xfs/2009-09/msg00210.html
http://oss.sgi.com/archives/xfs/2009-09/msg00211.html

If I am reading those, plus the xfs_db man page, correctly, it seems
like what Russell suggested was to look at superblock 1 (or some other
one?) and use those values to correct superblock 0.  At what points (if
any) are the other superblocks updated?  I was testing on another
machine, on a filesystem that I had successfully grown using
xfs_growfs, and of the two values Russell suggested the OP change,
dblocks differs between sb 0 and sb 1, but agcount does not.  Could it
be that I did not grow the filesystem enough for agcount to need to
change?  That seems a bit counterintuitive, but (as should be obvious)
I don't know XFS all that well.  I ask because, in re-reading those
messages, I got a better idea of what those particular xfs_db commands
do; if I did run into problems remounting, I might be able to
determine the appropriate new values myself and reduce my downtime.
But I want to understand more about what I'm doing before I try that!

--keith

-- 
kkeller@sonic.net


* Re: xfs_growfs doesn't resize
@ 2011-07-06 22:51 kkeller
  2011-07-07 18:25 ` Keith Keller
  0 siblings, 1 reply; 15+ messages in thread
From: kkeller @ 2011-07-06 22:51 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: xfs

Hello again XFS folks,

I have finally made the time to revisit this, after copying most of my data
elsewhere.

On Sun 03/07/11  9:41 PM , Eric Sandeen <sandeen@sandeen.net> wrote:
> On 7/3/11 11:34 PM, kkeller@sonic.net wrote:

> > How safe is running xfs_db with -r on my mounted filesystem? I
> 
> it's safe. At worst it might read inconsistent data, but it's
> perfectly safe.

So, here is my xfs_db output.  This is still on a mounted filesystem.

# xfs_db -r -c 'sb 0' -c 'print' /dev/mapper/saharaVG-saharaLV
magicnum = 0x58465342
blocksize = 4096
dblocks = 5371061248
rblocks = 0
rextents = 0
uuid = 1bffcb88-0d9d-4228-93af-83ec9e208e88
logstart = 2147483652
rootino = 128
rbmino = 129
rsumino = 130
rextsize = 1
agblocks = 91552192
agcount = 59
rbmblocks = 0
logblocks = 32768
versionnum = 0x30e4
sectsize = 512
inodesize = 256
inopblock = 16
fname = "\000\000\000\000\000\000\000\000\000\000\000\000"
blocklog = 12
sectlog = 9
inodelog = 8
inopblog = 4
agblklog = 27
rextslog = 0
inprogress = 0
imax_pct = 25
icount = 19556544
ifree = 1036
fdblocks = 2634477046
frextents = 0
uquotino = 131
gquotino = 132
qflags = 0x7
flags = 0
shared_vn = 0
inoalignmt = 2
unit = 0
width = 0
dirblklog = 0
logsectlog = 0
logsectsize = 0
logsunit = 0
features2 = 0


#  xfs_db -r -c 'sb 1' -c 'print' /dev/mapper/saharaVG-saharaLV
magicnum = 0x58465342
blocksize = 4096
dblocks = 2929670144
rblocks = 0
rextents = 0
uuid = 1bffcb88-0d9d-4228-93af-83ec9e208e88
logstart = 2147483652
rootino = 128
rbmino = 129
rsumino = 130
rextsize = 1
agblocks = 91552192
agcount = 32
rbmblocks = 0
logblocks = 32768
versionnum = 0x30e4
sectsize = 512
inodesize = 256
inopblock = 16
fname = "\000\000\000\000\000\000\000\000\000\000\000\000"
blocklog = 12
sectlog = 9
inodelog = 8
inopblog = 4
agblklog = 27
rextslog = 0
inprogress = 0
imax_pct = 25
icount = 19528640
ifree = 15932
fdblocks = 170285408
frextents = 0
uquotino = 131
gquotino = 132
qflags = 0x7
flags = 0
shared_vn = 0
inoalignmt = 2
unit = 0
width = 0
dirblklog = 0
logsectlog = 0
logsectsize = 0
logsunit = 0
features2 = 0


I can immediately see with a diff that dblocks and agcount are
different.  Some other variables also differ, namely icount, ifree,
and fdblocks, which I am not sure how to interpret.  But judging from
the other threads I quoted, it seems that dblocks and agcount in sb 0
reflect a 20TB filesystem, and that therefore after a umount the
filesystem will become (at least temporarily) unmountable.
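The geometry fields in the two dumps do at least appear internally
consistent: in sb 1 all 32 AGs are full, so dblocks is exactly
agcount * agblocks, while sb 0's grown size ends in a partial AG.  A
quick check with the numbers printed above:

```shell
# Cross-check the superblock dumps above; agblocks is identical in both.
agblocks=91552192
sb1_dblocks=2929670144; sb1_agcount=32    # pre-grow (11TB) geometry
sb0_dblocks=5371061248; sb0_agcount=59    # post-grow (20TB) geometry

# sb 1: all AGs full, so the product matches dblocks exactly
echo $((sb1_agcount * agblocks))                      # 2929670144
# sb 0: the last AG is partial, so agcount is ceil(dblocks / agblocks)
echo $(( (sb0_dblocks + agblocks - 1) / agblocks ))   # 59
```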

I've seen two different routes for trying to correct this issue: either
use xfs_db to manipulate the values directly, or run xfs_repair on a
frozen, ro-mounted filesystem using a dump from xfs_metadump.  My worry
about the latter is twofold: will I even be able to do a remount?  And
will I have space for an xfs_metadump image of an 11TB filesystem?  I
have also seen advice in some of the other threads that xfs_repair can
actually make the damage worse (though presumably xfs_repair -n should
be safe).
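For reference, the metadump route as I understand it from the man pages
(the tool is xfs_metadump, it copies metadata only, so the image should
be far smaller than 11TB of data; all paths here are hypothetical, and
the helper only echoes, so nothing is executed):

```shell
# Dry-run sketch of the metadump/repair-on-a-copy route (paths hypothetical).
DEV=/dev/mapper/saharaVG-saharaLV
run() { echo "would run: $*"; }        # swap the body for "$@" to execute

run xfs_metadump "$DEV" /scratch/sahara.metadump    # metadata-only dump
run xfs_mdrestore /scratch/sahara.metadump /scratch/sahara.img
run xfs_repair -n /scratch/sahara.img               # no-modify check on the copy
```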

If xfs_db is a better way to go, and if the values xfs_db returns
after a umount don't change, would I simply do this?

# xfs_db -x /dev/mapper/saharaVG-saharaLV
sb 0 w dblocks = 2929670144 w agcount = 32

and then do an xfs_repair -n?
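(If I have the syntax right from the man page, each xfs_db command
wants its own -c argument, and write takes the field and value with no
equals sign.  A version that only prints the command, since the device
would have to be unmounted first:)

```shell
# Dry-run form of the proposed superblock fix (values from sb 1 above).
# xfs_db syntax: one -c per command; "write <field> <value>", no '='.
fix_sb() { echo xfs_db -x -c 'sb 0' \
                -c "write dblocks $1" -c "write agcount $2" "$3"; }

fix_sb 2929670144 32 /dev/mapper/saharaVG-saharaLV   # prints, does not run
```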

A route I used ages ago, on ext2 filesystems, was to specify
an alternate superblock when running e2fsck.  Can xfs_repair do this?

> Get a recent xfsprogs too, if you haven't already, it scales better
> than the really old versions.

I think I may have asked this in another post, but would you suggest
compiling 3.0 from source?  The version that CentOS distributes is marked
as 2.9.4, but I don't know what patches they've applied (if any).  Would 3.0
be more likely to help recover the fs?

Thanks all for your patience!

--keith

-- 
kkeller@sonic.net




end of thread, other threads:[~2011-07-07 22:30 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-06-30 21:42 xfs_growfs doesn't resize kkeller
2011-07-03 15:59 ` Eric Sandeen
2011-07-03 16:01   ` Eric Sandeen
     [not found]   ` <20110703193822.GA28632@wombat.san-francisco.ca.us>
2011-07-03 22:14     ` Eric Sandeen
  -- strict thread matches above, loose matches on Subject: below --
2011-06-30 23:30 kkeller
2011-07-01 10:46 ` Dave Chinner
2011-07-01 16:44 kkeller
2011-07-03 19:42 kkeller
2011-07-04  4:34 kkeller
2011-07-04  4:41 ` Eric Sandeen
2011-07-06 22:51 kkeller
2011-07-07 18:25 ` Keith Keller
2011-07-07 19:34   ` Eric Sandeen
2011-07-07 22:23     ` Keith Keller
2011-07-07 22:30       ` Eric Sandeen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox