From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15])
	by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id n0J3NbiE024085
	for ; Sun, 18 Jan 2009 21:23:37 -0600
Received: from ipmail04.adl2.internode.on.net (localhost [127.0.0.1])
	by cuda.sgi.com (Spam Firewall) with ESMTP id B33A5181C059
	for ; Sun, 18 Jan 2009 19:23:34 -0800 (PST)
Received: from ipmail04.adl2.internode.on.net (ipmail04.adl2.internode.on.net [203.16.214.57])
	by cuda.sgi.com with ESMTP id MAIFTzreYo7jC1HB
	for ; Sun, 18 Jan 2009 19:23:34 -0800 (PST)
Date: Mon, 19 Jan 2009 14:17:43 +1100
From: Dave Chinner
Subject: Re: problems showing up as XFS problems on kernels after 2.6.28-git2
Message-ID: <20090119031743.GN8071@disturbed>
References: <20090109061043.GA31450@dth.net> <20090109194445.GA28759@infradead.org>
	<20090109195144.GA19857@dth.net> <20090109195852.GA6362@infradead.org>
	<20090109214206.GA2901@dth.net> <20090109220138.GA5282@infradead.org>
	<20090113200414.GA21013@dth.net> <20090116204346.GA5117@dth.net>
	<20090117073824.GK8071@disturbed> <20090117232511.GA8443@dth.net>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <20090117232511.GA8443@dth.net>
List-Id: XFS Filesystem from SGI
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: xfs-bounces@oss.sgi.com
Errors-To: xfs-bounces@oss.sgi.com
To: Danny ter Haar
Cc: Christoph Hellwig , xfs@oss.sgi.com

On Sun, Jan 18, 2009 at 12:25:11AM +0100, Danny ter Haar wrote:
> Quoting Dave Chinner (david@fromorbit.com):
> > Sorry for not getting back to you sooner.
>
> No problem. I initially posted to LKML, got redirected by Christoph to this
> list. I'm so stupid that i didn't check the other messages from this list.
> Sorry.
>
> > I think that Alexander tripped over this same problem during his bisect.
> > If you follow the thread from here:
> >
> > http://oss.sgi.com/archives/xfs/2009-01/msg00496.html
>
> Yep! [cheer] i'm not alone! :-)
> But why only us two ? there must be thousands of users out there using
> XFS. Why did it bite us ? large filesystem together with slow hardware ?

No idea - I can't reproduce it either, so there's some state that
your filesystem is getting into that trips over it.

> > You'll see that Alexander had the same problem and managed
> > to continue the bisect once he copied the xfs_btree_trace.h
> > header file from top-of-tree back into the broken commits.
>
> Great.
>
> > I hope this helps (and I hope that the bisect lands on the
> > same commit that it did for Alexander).
>
> Do you want me to still try it ?
> I think you already figured out where the culprit is ?!

Yes, I think we have, but it wasn't totally conclusive. Can you
continue your bisect to see if it narrows down to the same commit
on your machine? I'm still trying to reproduce it but I haven't
worked out what the initial state is.

One thing that might be useful is to put a printk into the kernel
on the failure path that prints the inode number out (e.g. at the
goto that the WANT_CORRUPTED_GOTO jumps to). Then we can use xfs_db
to find the file that is causing the problem and then use xfs_db
or xfs_bmap to look at the extent tree prior to the corruption.
That might help me set up the initial state needed to trip the
problem.....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
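
The printk suggestion above can be sketched roughly as follows. This is a
userspace illustration only, not the actual XFS code: fprintf(stderr, ...)
stands in for printk(), and struct demo_inode, DEMO_WANT_CORRUPTED_GOTO,
demo_check_extents() and demo_last_bad_ino are hypothetical names made up
for the example. The real macro and the function it fails in will look
different; the point is only where the extra print goes.

```c
/* Userspace sketch: fprintf(stderr, ...) stands in for the kernel's printk(). */
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the in-kernel inode; only the number matters here. */
struct demo_inode {
	unsigned long long ino;
};

/* Last inode number reported on the failure path, kept so the sketch can be checked. */
static unsigned long long demo_last_bad_ino;

/*
 * Simplified imitation of a WANT_CORRUPTED_GOTO-style check (assumption:
 * the real macro also reports the error). If the expected condition does
 * not hold, set an error code and jump to the given label.
 */
#define DEMO_WANT_CORRUPTED_GOTO(cond, label)	\
	do {					\
		if (!(cond)) {			\
			error = -1;		\
			goto label;		\
		}				\
	} while (0)

/* Hypothetical operation that validates a record count for one inode. */
static int demo_check_extents(const struct demo_inode *ip, int found, int expected)
{
	int error = 0;

	assert(ip != NULL);
	DEMO_WANT_CORRUPTED_GOTO(found == expected, error0);
	return 0;

error0:
	/* The added debug line: report which inode tripped the check. */
	fprintf(stderr, "corruption check failed: inode %llu (0x%llx)\n",
		ip->ino, ip->ino);
	demo_last_bad_ino = ip->ino;
	return error;
}
```

With the inode number from the log, something like
xfs_db -r -c "inode <number>" -c "print" <device> on the unmounted
filesystem should show that inode, with the device and number filled in
for the machine in question.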