From: Eric Sandeen
Date: Sat, 17 Jan 2009 12:50:29 -0600
Message-ID: <49722875.90202@sandeen.net>
References: <4972166D.5000006@sandeen.net>
Subject: Re: help with xfs_repair on 10TB fs
List-Id: XFS Filesystem from SGI
To: Alberto Accomazzi
Cc: xfs@oss.sgi.com

Alberto Accomazzi wrote:
> On Sat, Jan 17, 2009 at 12:33 PM, Eric Sandeen wrote:
>
>> Alberto Accomazzi wrote:
>>> I need some help with figuring out how to repair a large XFS
>>> filesystem (10TB of data, 100+ million files). xfs_repair seems to
>>> have crapped out before finishing the job and now I'm not sure how to
>>> proceed.
>>
>> How did it "crap out"?
>
> Well, in the way I described below: namely, it ran for several hours and
> then died without completing. As you can see from the log (which captured
> both stdout and stderr), there's nothing that indicates what terminated
> the program. And it's definitely not running now.
>
>> the src.rpm from
>>
>> http://kojipkgs.fedoraproject.org/packages/xfsprogs/2.10.2/3.fc11/src/
>
> Ok, I guess it's worth giving it a shot. I assume I don't need to worry
> about kernel modules because the xfsprogs don't depend on that, right?

Right.
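[An aside on the "nothing indicates what terminated the program" point: the exit status usually does tell you, provided whatever writes the log also records it. A minimal sketch; the run_logged name and repair.log filename are made up here, not anything xfs_repair itself provides:]

```shell
# Illustrative wrapper: run a command, log stdout+stderr, and record how
# it terminated.  An exit status above 128 means death by signal
# (e.g. 137 = 128+9 = SIGKILL, typical of the OOM killer).
run_logged() {
    "$@" > repair.log 2>&1
    status=$?
    if [ "$status" -gt 128 ]; then
        echo "terminated by signal $((status - 128))" >> repair.log
    else
        echo "exited with status $status" >> repair.log
    fi
    return "$status"
}

# Harmless stand-in for a real "run_logged xfs_repair /dev/sdb1":
run_logged true
tail -n 1 repair.log   # -> exited with status 0
```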
>>> After bringing the system back, a mount of the fs reported problems:
>>>
>>> Starting XFS recovery on filesystem: sdb1 (logdev: internal)
>>> Filesystem "sdb1": XFS internal error xfs_btree_check_sblock at line
>>> 334 of file
>>> /home/buildsvn/rpmbuild/BUILD/xfs-kmod-0.4/_kmod_build_/xfs_btree.c.
>>> Caller 0xffffffff882fa8d2
>>
>> So log replay is failing now; but that indicates an unclean shutdown.
>> Something else must have happened between the xfs_repair and this mount
>> instance?
>
> Sorry, I wasn't clear: there was indeed an unclean shutdown (actually a
> couple), after which the mount would not succeed, presumably because of
> the dirty log. I was able to mount the filesystem read-only and take
> enough of a look to see that there was significant corruption of the
> data. Running xfs_repair -L at that point seemed the only option
> available. But do let me know if this line of thinking is incorrect.

Yes, if you have a dirty log that won't replay, zapping the log via
repair is about the only option.

I wonder what the first hint of trouble here was, though; what led to
all this misery... :)

-Eric

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
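[For anyone finding this thread later, the recovery sequence discussed above amounts to something like the sketch below. The device name comes from Alberto's log; the /mnt mountpoint, recover_xfs wrapper, and log filename are illustrative. Note that xfs_repair -L irrevocably discards the dirty log, so salvage data read-only first:]

```shell
# Sketch of the recovery sequence discussed in this thread.  Wrapped in
# a function so nothing runs until you call it on the real machine, as
# root, against the right device.
recover_xfs() {
    dev=${1:?usage: recover_xfs /dev/sdbX}
    # 1. Mount read-only, skipping log replay, and salvage what you can.
    mount -o ro,norecovery "$dev" /mnt
    # ... copy data off /mnt here ...
    umount /mnt
    # 2. Last resort: zero the dirty log and repair, keeping a full log
    #    of the repair run itself (stdout and stderr).
    xfs_repair -L "$dev" > repair-sdb1.log 2>&1
}
```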