From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 15 Jul 2008 13:18:40 +1000
From: Dave Chinner
Subject: Re: xfs bug in 2.6.26-rc9
Message-ID: <20080715031840.GB29319@disturbed>
References: <20080711084248.GU29319@disturbed> <487B019B.9090401@sgi.com> <20080714121332.GX29319@disturbed> <487C07A4.70202@sgi.com>
In-Reply-To: <487C07A4.70202@sgi.com>
List-Id: xfs
To: Lachlan McIlroy
Cc: Mikael Abrahamsson, linux-kernel@vger.kernel.org, xfs@oss.sgi.com, linux-mm@kvack.org

On Tue, Jul 15, 2008 at 12:12:52PM +1000, Lachlan McIlroy wrote:
> Dave Chinner wrote:
>> On Mon, Jul 14, 2008 at 05:34:51PM +1000, Lachlan McIlroy wrote:
>>> This is a race between xfs_fsr and a mmap write. xfs_fsr acquires the
>>> iolock and then flushes the file and because it has the iolock it doesn't
>>> expect any new delayed allocations to occur. A mmap write can allocate
>>> delayed allocations without acquiring the iolock so is able to get in
>>> after the flush but before the ASSERT.
>>
>> Christoph and I were contemplating this problem with ->page_mkwrite
>> recently. The problem is that we can't, right now, return an
>> EAGAIN-like error to ->page_mkwrite() and have it retry the
>> page fault. Other parts of the page faulting code can do this,
>> so it seems like a solvable problem.
>>
>> The basic concept is that if we can return an EAGAIN result we can
>> try-lock the inode and hold the locks necessary to avoid this race
>> or prevent the page fault from dirtying the page until the
>> filesystem is unfrozen.
> Why do we need to try-lock the inode? Will we have an ABBA deadlock
> if we block on the iolock in ->page_mkwrite()?

Yes. With the mmap_sem. Look at the rules in mm/filemap.c and
replace i_mutex with iolock....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
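
For readers following the locking argument, here is a rough userspace
analogy of the try-lock idea, not kernel code. It assumes the ordering
described above: the fault path already holds mmap_sem when
->page_mkwrite() runs, while a path that holds the iolock may itself need
mmap_sem, so blocking on the iolock inside the fault could deadlock. The
function names page_mkwrite() and handle_fault(), the retry count, and the
use of Python locks are all illustrative assumptions. The thread itself
says the real fault code could not yet retry like this.

```python
import errno
import threading

# Userspace stand-ins for the two kernel locks named in the thread.
mmap_sem = threading.Lock()   # already held when the fault path calls page_mkwrite
iolock = threading.Lock()     # held by xfs_fsr (or a write) around its flush


def page_mkwrite():
    """Hypothetical callback: try-lock the iolock instead of blocking on it."""
    if not iolock.acquire(blocking=False):
        return -errno.EAGAIN      # someone holds iolock: ask for a retry
    try:
        return 0                  # the page would be made writable here
    finally:
        iolock.release()


def handle_fault(max_retries=3):
    """Hypothetical fault handler; returns (result, attempts used)."""
    result = -errno.EAGAIN
    for attempt in range(1, max_retries + 1):
        with mmap_sem:            # lock order on this path: mmap_sem, then iolock
            result = page_mkwrite()
        # mmap_sem is dropped here, so an iolock holder that needs it can proceed
        if result != -errno.EAGAIN:
            return result, attempt
    return result, max_retries
```

If the caller holds iolock (standing in for xfs_fsr), handle_fault() gives
up with -EAGAIN instead of sleeping while it holds mmap_sem. Once iolock is
released, the same call succeeds on its first attempt.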