From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from relay.sgi.com (relay3.corp.sgi.com [198.149.34.15])
	by oss.sgi.com (Postfix) with ESMTP id AD5E57FBA;
	Thu, 11 Apr 2013 04:55:51 -0500 (CDT)
Received: from cuda.sgi.com (cuda1.sgi.com [192.48.157.11])
	by relay3.corp.sgi.com (Postfix) with ESMTP id 4867BAC001;
	Thu, 11 Apr 2013 02:55:48 -0700 (PDT)
Received: from greer.hardwarefreak.com (mo-65-41-216-221.sta.embarqhsd.net [65.41.216.221])
	by cuda.sgi.com with ESMTP id VSyuo0wz1jXejvgj;
	Thu, 11 Apr 2013 02:55:47 -0700 (PDT)
Received: from [192.168.100.53] (gffx.hardwarefreak.com [192.168.100.53])
	by greer.hardwarefreak.com (Postfix) with ESMTP id BD6FA6C162;
	Thu, 11 Apr 2013 04:55:46 -0500 (CDT)
Message-ID: <516688A9.8050506@hardwarefreak.com>
Date: Thu, 11 Apr 2013 04:55:53 -0500
From: Stan Hoeppner
MIME-Version: 1.0
Subject: Re: xfs_repair breaks with assertion
References: <20130411062515.GH10481@dastard>
Reply-To: stan@hardwarefreak.com
List-Id: XFS Filesystem from SGI
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: xfs-bounces@oss.sgi.com
Sender: xfs-bounces@oss.sgi.com
To: xfs@oss.sgi.com

On 4/11/2013 1:34 AM, Victor K wrote:

> The raid array did not suffer, at least, not according to mdadm; it is now
> happily recovering the one disk that officially failed, but the whole thing
> assembled without a problem
> There was a similar crash several weeks ago on this same array, but had
> ext4 system back then.
> I was able to save some of the latest stuff, and decided to move to xfs as
> something more reliable.
> I suspect now I should also had replaced the disk controller then.

Rebuilds are *supposed* to be transparent to the filesystem but this is
not always the case. Sometimes due to bugs. In fact we just recently
saw an LVM bug wherein a pvmove operation was not transparent, and hosed
up an XFS.
This is but one of many reasons I prefer hardware-based RAID and volume
management. It isolates these functions and RAID memory structures from
the kernel, and thus prevents such bugs from causing problems.

This may or may not be the source of your apparent XFS corruption. We
don't have enough (log) data to ascertain the cause at this point.
Running repair on an 8/10TB filesystem while md is rebuilding the
underlying RAID6 array isn't something I'd put a lot of trust in. Wait
until the rebuild is finished and then run a non-destructive repair.
Compare the results to the previous repair.

-- 
Stan

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
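
For illustration only, the advice above (wait for the md rebuild to
finish, then run a non-destructive repair) could be scripted roughly as
follows. This is a minimal sketch, not something from the thread: the
array name "md0", the device path, and both helper function names are
hypothetical, and it assumes the usual /proc/mdstat layout in which a
rebuilding array shows a "recovery =", "resync =" or "reshape =" line
under its stanza. The only xfs_repair option used is -n (no-modify
mode), and the filesystem should be unmounted before running it.

```python
import re
import subprocess

# Progress/state markers md prints under an array's stanza in /proc/mdstat,
# e.g. "[==>....]  recovery = 12.6% (...)" or "resync=DELAYED".
_REBUILD_RE = re.compile(r"\b(recovery|resync|reshape)\s*=")


def md_rebuild_in_progress(mdstat_text, array="md0"):
    """Return True if the stanza for `array` shows a rebuild/resync marker.

    `mdstat_text` is the content of /proc/mdstat; `array` is a
    hypothetical array name used for illustration.
    """
    in_stanza = False
    for line in mdstat_text.splitlines():
        if re.match(r"^md\w+\s*:", line):
            # A new array stanza starts; track whether it is the one we want.
            in_stanza = line.split()[0] == array
        elif in_stanza and _REBUILD_RE.search(line):
            return True
    return False


def dry_run_repair(device):
    """Run `xfs_repair -n DEVICE` (no-modify mode); return (exit_code, output).

    The captured output can be saved to a file and diffed against the
    output of an earlier repair run.
    """
    proc = subprocess.run(
        ["xfs_repair", "-n", device],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout + proc.stderr
```

A caller would read /proc/mdstat, poll md_rebuild_in_progress() until it
returns False, and only then call dry_run_repair() on the (hypothetical)
device such as /dev/md0, keeping the output for comparison with the
earlier run.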