public inbox for linux-xfs@vger.kernel.org
* xfs_repair memory usage and stopping on "Traversing filesystem..."
@ 2010-05-18 19:28 Colin Wilson
  2010-05-19  0:19 ` Dave Chinner
  0 siblings, 1 reply; 4+ messages in thread
From: Colin Wilson @ 2010-05-18 19:28 UTC (permalink / raw)
  To: xfs@oss.sgi.com

Hello all,
	I seem to be having the same problem as Tomasz had in this post to the mailing list: http://oss.sgi.com/archives/xfs/2009-07/msg00082.html .  Eric ultimately suggested running xfs_repair with the '-P' and '-o bhash=1024' flags to get past this problem, and described what he thought the underlying problem was as follows:

> "This looks like some of the caching that xfs_repair does is mis-sized,
> and it gets stuck when it's unable to find a slot for a new node to
> cache.  IMHO that's still a bug that I'd like to work out.  If it gets
> stuck this way, it'd probably be better to exit, and suggest a larger
> hash size."

	Currently my file system is ~50 TB in size with ~40 TB in use, and when I do the repair, memory usage ends up between 10 and 11 GB for most of the check.  The system currently has 12 GB of RAM, not including swap.  Is this expected behavior?  My concern is setting bhash too large and causing xfs_repair to swap for long periods of time.  It already takes a few days to get to Phase 6 in the repair.

	I am currently running Debian Lenny (5.0.4) with xfsprogs 2.9.8 on Linux kernel 2.6.26.  I've briefly looked through the change logs for newer versions of xfsprogs and noticed a few updates mentioning better memory performance or management, so upgrading to a newer version may be all I need.  Has the bug Eric mentions been fixed in a later version of xfsprogs?  What is your suggestion as to my best course of action to get this xfs_repair to complete in a timely manner without using up all the RAM in my system?  Thanks

xfs_info dump:
# xfs_info /u1/
meta-data=/dev/mapper/sangroup-sandisk isize=256    agcount=821, agsize=15258784 blks
         =                       sectsz=512   attr=0
data     =                       bsize=4096   blocks=12514290688, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096  
log      =internal               bsize=4096   blocks=32768, version=1
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=65536  blocks=0, rtextents=0
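The headline numbers in that dump cross-check with a little arithmetic (a quick sketch using only the values printed above):

```python
# Cross-checking the xfs_info figures above (values copied from the dump):
bsize = 4096                   # bytes per data block
blocks = 12514290688           # data blocks
agcount = 821
agsize = 15258784              # blocks per allocation group

total_bytes = bsize * blocks
print(total_bytes / 10**12)    # ~51.3 TB -- matches the "~50 TB" figure
print(agcount * agsize >= blocks)  # True: the AGs cover every data block
```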

--Colin

Colin Wilson
Linux Systems Administrator
T +1.781.810.1331
F +1.781.891.5145
cwilson@blackducksoftware.com
http://www.blackducksoftware.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: xfs_repair memory usage and stopping on "Traversing filesystem..."
  2010-05-18 19:28 xfs_repair memory usage and stopping on "Traversing filesystem..." Colin Wilson
@ 2010-05-19  0:19 ` Dave Chinner
  2010-05-19  2:15   ` Eric Sandeen
  0 siblings, 1 reply; 4+ messages in thread
From: Dave Chinner @ 2010-05-19  0:19 UTC (permalink / raw)
  To: Colin Wilson; +Cc: xfs@oss.sgi.com

On Tue, May 18, 2010 at 03:28:20PM -0400, Colin Wilson wrote:
> Hello all, I seem to be having the same problem as Tomasz had in
> this post to the mailing list:
> http://oss.sgi.com/archives/xfs/2009-07/msg00082.html .  Eric
> ultimately suggested running xfs_repair with the '-P' and '-o
> bhash=1024' flags to get past this problem and described what he
> thought the underlying problem was as follows:
> 
> > "This looks like some of the caching that xfs_repair does is
> > mis-sized, and it gets stuck when it's unable to find a slot for
> > a new node to cache.  IMHO that's still a bug that I'd like to
> > work out.  If it gets stuck this way, it'd probably be better to
> > exit, and suggest a larger hash size."
> 
> Currently my file system is ~50 TB in size with ~40TB in use and
> when I do the repair memory usage ends up between 10 and 11 GB
> used for most of the check.  The system currently has 12GB of ram
> not including swap.  Is this expected behavior?

Given you are running v2.9.8, I'd say yes, and one of your problems
is that repair is swapping: the base memory footprint is likely to
be on the order of 40-50GB of RAM for xfs_repair.

I just ran xfs_check on an empty 51TB filesystem w/ 821 AGs to get
an idea of how much RAM an older xfs_repair will use (as I have
3.1.2 installed on my test machines). It allocated about 115GB of
virtual memory space, consuming all the RAM+swap in the machine
before being OOM-killed.
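Scaled linearly (a rough back-of-the-envelope extrapolation, not a tuned rule of thumb), that measurement shows why a ~50TB filesystem is far beyond 12GB of RAM with the old tools:

```python
# Back-of-the-envelope scaling of the measurement above: ~115 GB of
# virtual memory for a 51 TB filesystem under the 2.9.x-era tools.
gb_per_tb = 115 / 51        # ~2.25 GB of address space per TB
fs_tb = 50                  # size of the filesystem in question
estimate = gb_per_tb * fs_tb
print(estimate)             # ~113 GB -- roughly 10x the 12 GB of RAM fitted
```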

> My concern is
> setting bhash too large and causing xfs_repair to swap for long
> periods of time.  It already takes a few days to get to Phase 6 in
> the repair.

Must be swapping, then...

> I am currently running Debian Lenny(5.0.4) with xfsprogs 2.9.8
> with linux kernel 2.6.26.  I've briefly looked through the change
> logs for newer version of xfsprogs and noticed that there were a
> few updates mentioning better memory performance or management  so
> upgrading to a newer version may be all I need.

Yup, there were major memory usage reductions in xfs_repair in
3.1.0.  Looking at the same empty filesystem as above, the base
xfs_repair memory footprint is a few tens of megabytes of RAM. That
will definitely balloon to a few GB as the filesystem metadata is
read in and cached, but I doubt it will get anywhere near what 2.9.8
requires, so it should be much faster.

Hence I'd start by upgrading to 3.1.2 and running with the default
options first to see whether it is faster and whether it hangs or
not before going any further.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com



* Re: xfs_repair memory usage and stopping on "Traversing filesystem..."
  2010-05-19  0:19 ` Dave Chinner
@ 2010-05-19  2:15   ` Eric Sandeen
  2010-05-24 20:30     ` Colin Wilson
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Sandeen @ 2010-05-19  2:15 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Colin Wilson, xfs@oss.sgi.com

Dave Chinner wrote:

> Hence I'd start by upgrading to 3.1.2 and running with the default
> options first to see whether it is faster and whether it hangs or
> not before going any further.

If it still hangs, collecting an xfs_metadump of the fs would be 
useful for investigating the problem.

But, I think I fixed that (the options you mentioned were workarounds
for the bug I eventually fixed, IIRC)

Thanks,
-Eric

> Cheers,
> 
> Dave.



* Re: xfs_repair memory usage and stopping on "Traversing filesystem..."
  2010-05-19  2:15   ` Eric Sandeen
@ 2010-05-24 20:30     ` Colin Wilson
  0 siblings, 0 replies; 4+ messages in thread
From: Colin Wilson @ 2010-05-24 20:30 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: xfs@oss.sgi.com



I got some downtime this weekend and tried another xfs_repair with the latest (3.1.2) version of the XFS tools.  This time the check ran much slower than before and used much more swap.  My system currently has 12 GB of RAM and 16 GB of swap space; do you have any rule of thumb for how much memory the system should need?  I am thinking adding more memory is the only way to fix my problem as it stands, since it's just slowness now.  I don't remember how much swap it ended up using, but the process ran until I killed it to bring the file system back online, without running out of total memory.

This may not be the biggest problem in the world, but I tried to take a metadata dump in case that was helpful.  The process ran to a certain point and then hung, with xfs_db using 100% of one of my cores.  I've confirmed the same outcome three runs in a row.  The output of xfs_metadump was:

:~# xfs_metadump -gw /dev/mapper/sangroup-sandisk ./metadata.dump
Copied 8192 of 1732067904 inodes (0 of 821 AGs)
xfs_metadump: suspicious count 1152 in bmap extent 89 in dir2 ino 12743
xfs_metadump: suspicious count 1455 in bmap extent 135 in dir2 ino 12743
xfs_metadump: suspicious count 1074 in bmap extent 2 in dir2 ino 12743
Copied 8151232 of 1732067904 inodes (0 of 821 AGs)

/usr/sbin/xfs_metadump: line 31:  5363 Terminated              xfs_db$DBOPTS -F -i -p xfs_metadump -c "metadump$OPTS $2" $1

The process would hang at "Copied 8151232 of 1732067904 inodes (0 of 821 AGs)", and the rest of the output is from me killing the xfs_db process.  Thanks for all the help.
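For scale, the progress counters in that output put the hang very early in the run (simple arithmetic on the figures above):

```python
# How far the metadump got before hanging (figures from the output above):
copied = 8151232
total = 1732067904
pct = 100 * copied / total
print(f"{pct:.2f}% of inodes copied")   # ~0.47%, still inside AG 0 of 821
```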

--Colin

Colin Wilson
Linux Systems Administrator
T +1.781.810.1331
F +1.781.891.5145
cwilson@blackducksoftware.com
http://www.blackducksoftware.com




On May 18, 2010, at 10:15 PM, Eric Sandeen wrote:

> Dave Chinner wrote:
> 
>> Hence I'd start by upgrading to 3.1.2 and running with the default
>> options first to see whether it is faster and whether it hangs or
>> not before going any further.
> 
> If it still hangs, collecting an xfs_metadump of the fs would be
> useful for investigating the problem.
> 
> But, I think I fixed that (the options you mentioned were workarounds
> for the bug I eventually fixed, IIRC)
> 
> Thanks,
> -Eric
> 
>> Cheers,
>> 
>> Dave.






end of thread, other threads:[~2010-05-24 20:28 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-05-18 19:28 xfs_repair memory usage and stopping on "Traversing filesystem..." Colin Wilson
2010-05-19  0:19 ` Dave Chinner
2010-05-19  2:15   ` Eric Sandeen
2010-05-24 20:30     ` Colin Wilson
