From mboxrd@z Thu Jan 1 00:00:00 1970 From: David Chinner Subject: Re: [RFC] The many faces of the I_LOCK Date: Thu, 22 Feb 2007 05:29:04 +1100 Message-ID: <20070221182904.GL6095633@melbourne.sgi.com> References: <20070221130956.GB464@lazybastard.org> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: linux-fsdevel@vger.kernel.org, Anton Altaparmakov , David Chinner , Dave Kleikamp , Al Viro , Christoph Hellwig To: =?iso-8859-1?B?SsO2cm4=?= Engel Return-path: Received: from netops-testserver-4-out.sgi.com ([192.48.171.29]:46150 "EHLO netops-testserver-4.corp.sgi.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1751325AbXBUTBE (ORCPT ); Wed, 21 Feb 2007 14:01:04 -0500 Content-Disposition: inline In-Reply-To: <20070221130956.GB464@lazybastard.org> Sender: linux-fsdevel-owner@vger.kernel.org List-Id: linux-fsdevel.vger.kernel.org On Wed, Feb 21, 2007 at 01:09:56PM +0000, J=C3=B6rn Engel wrote: > 1. Introduction >=20 > This lengthy investigation was caused by a deadlock problem in LogFS, > but uncovered a more general problem. It affects, at the least, all > filesystems that need to read inodes in their write path. To my > knowledge, that includes LogFS and NTFS, possibly also JFS and XFS. I don't think XFS has any problems here - we're pretty careful about reading inodes from disk before we lock other potentially dependent objects in the filesystem.... > Deadlock happens when two processes A and B (that may be identical) h= ave > these two call chains: >=20 > Process A: Process B: > inode_wait [filesystem locking write path] > __wait_on_bit __writeback_single_inode > out_of_line_wait_on_bit > ifind_fast > [filesystem calling iget()] > [filesystem locking write path] >=20 > Process A will wait_on_inode() and block until process B proceeds > through __sync_single_inode(), which is called from > __writeback_single_inode() in Process B. Process B will block on the > lock of the filesystem write path, held by Process A. This is caused by your cleaner thread racing with writeback doing inode lookup, right? You need a non-blocking inode lookup to prevent the dead= lock, I guess.... > 2. The usage of inode_lock and I_LOCK =2E.... > Three rows have not exposed their meaning to me yet, so I'd > gladly receive some insight here. IIRC, the checks in xfs_ichgtime[_fast] are simply an optimisation - if= the inode is currently I_LOCKed then we can't mark it dirty anyway so we do= n't even bother trying. We do mark the XFS inode structure (*) dirty, thoug= h, so the modification will make it to disk at some time in the future. (*) XFS does double inode caching because the XFS transaction subsystem requires inodes to have a different lifecycle to the linux struct inode lifecycle. > 3. Seperating out sync notification >=20 > One of the results from the investigations in 2 appears to be that on= e > class of users in fs/fs-writeback.c is completely unrelated to anothe= r > class of users in fs/inode.c. >=20 > In particular, __sync_single_inode(), __writeback_single_inode(), > write_inode_now(), clear_inode(), __mark_inode_dirty() and (possibly?= ) > generic_osync_inode() seem to only need a completion event to > synchronize with. There is no reason why this group should share a l= ock > with iget() and any of its many permutations. Seems reasonable, but I don't know all the little details in these paths.... > Now, if the group in fs/fs-writeback.c had a completion event that is > independent of anything in fs/inode.c, the deadlock scenario describe= d > in section 1 goes away. As a further result, ilookup5_nowait() can g= et > removed as its only user, NTFS, only introduced it as a workaround fo= r > the deadlock scenario. Could you use ilookup5_nowait() in LogFS and avoid the cleaner deadlock that way? Cheers, Dave. --=20 Dave Chinner Principal Engineer SGI Australian Software Group - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel= " in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html