From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from verein.lst.de ([213.95.11.211]:48576 "EHLO newverein.lst.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753119AbdKAPBh (ORCPT ); Wed, 1 Nov 2017 11:01:37 -0400
Date: Wed, 1 Nov 2017 16:01:36 +0100
From: Christoph Hellwig
Subject: Re: xfs: list corruption in xfs_setup_inode()
Message-ID: <20171101150136.GA26080@lst.de>
References: <20171031003358.GD5858@dastard> <20171101030536.GN5858@dastard> <20171101050701.GP5858@dastard>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20171101050701.GP5858@dastard>
Sender: linux-xfs-owner@vger.kernel.org
List-Id: xfs
To: Dave Chinner
Cc: Cong Wang, Dave Chinner, darrick.wong@oracle.com, linux-xfs@vger.kernel.org, LKML, Christoph Hellwig, Al Viro

On Wed, Nov 01, 2017 at 04:07:01PM +1100, Dave Chinner wrote:
> > We are trying to make kdump working, but even if kdump works
> > we still can't turn on panic_on_warn since this is production
> > machine.
>
> Hmmm. Ok, maybe you could leave a trace of the xfs_iget* trace
> points running and check the log tail for unusual events around the
> time of the next crash. e.g. xfs_iget_reclaim_fail events. That
> might point us to a potential interaction we can look at more
> closely. I'd also suggest slab poisoning as well, as that will
> catch other lifecycle problems that could be causing list
> corruptions such as use-after-free.

KASAN has also been really useful for these kinds of issues.
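
A minimal sketch of the log-tail check suggested in the quoted text, assuming
the trace was saved as plain text in the usual ftrace layout, where each
record has the event name followed by a colon. The function names and the
tail length are made up for illustration and are not part of any kernel tool.

```python
import re
from collections import deque

# Assumption: each trace line contains "<event_name>:" somewhere after the
# timestamp, e.g. "... 100.000003: xfs_iget_reclaim_fail: dev 8:1 ino 0x81".
EVENT_RE = re.compile(r"\b(xfs_iget\w*):")


def iget_events(lines, tail=1000):
    """Return (event_name, line) pairs for xfs_iget* events found in the
    last `tail` lines of the trace."""
    recent = deque(lines, maxlen=tail)  # keeps only the log tail
    found = []
    for line in recent:
        match = EVENT_RE.search(line)
        if match:
            found.append((match.group(1), line.rstrip("\n")))
    return found


def count_events(lines, tail=1000):
    """Count xfs_iget* events by name in the last `tail` lines."""
    counts = {}
    for name, _ in iget_events(lines, tail):
        counts[name] = counts.get(name, 0) + 1
    return counts
```

Passing an open file object for the saved trace as `lines` and looking at the
count for xfs_iget_reclaim_fail would be one way to spot the unusual events
mentioned above around the time of a crash.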