From mboxrd@z Thu Jan 1 00:00:00 1970 From: =?utf-8?B?SsO2cm4=?= Engel Subject: Re: [RFC] The many faces of the I_LOCK Date: Wed, 21 Feb 2007 18:47:29 +0000 Message-ID: <20070221184729.GB3219@lazybastard.org> References: <20070221130956.GB464@lazybastard.org> <20070221182904.GL6095633@melbourne.sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: linux-fsdevel@vger.kernel.org, Anton Altaparmakov , Dave Kleikamp , Al Viro , Christoph Hellwig To: David Chinner Return-path: Received: from lazybastard.de ([212.112.238.170]:36088 "EHLO longford.lazybastard.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1422798AbXBUSvD (ORCPT ); Wed, 21 Feb 2007 13:51:03 -0500 Content-Disposition: inline In-Reply-To: <20070221182904.GL6095633@melbourne.sgi.com> Sender: linux-fsdevel-owner@vger.kernel.org List-Id: linux-fsdevel.vger.kernel.org On Thu, 22 February 2007 05:29:04 +1100, David Chinner wrote: > On Wed, Feb 21, 2007 at 01:09:56PM +0000, J=C3=83=C2=B6rn Engel wrote= : > >=20 > > Process A: Process B: > > inode_wait [filesystem locking write path] > > __wait_on_bit __writeback_single_inode > > out_of_line_wait_on_bit > > ifind_fast > > [filesystem calling iget()] > > [filesystem locking write path] >=20 > This is caused by your cleaner thread racing with writeback doing ino= de > lookup, right? You need a non-blocking inode lookup to prevent the de= adlock, > I guess.... Almost correct. I need an inode lookup that is not blocking on writes. Blocking on reads is fine - it will simply delay the cleaner and cost some performance. > IIRC, the checks in xfs_ichgtime[_fast] are simply an optimisation - = if the > inode is currently I_LOCKed then we can't mark it dirty anyway so we = don't > even bother trying. We do mark the XFS inode structure (*) dirty, tho= ugh, so > the modification will make it to disk at some time in the future. >=20 > (*) XFS does double inode caching because the XFS transaction subsyst= em > requires inodes to have a different lifecycle to the linux struct ino= de > lifecycle. Ok. Then my current patches would break XFS, I believe. I'll just add a seperate fix for now and will merge that in later. > > Now, if the group in fs/fs-writeback.c had a completion event that = is > > independent of anything in fs/inode.c, the deadlock scenario descri= bed > > in section 1 goes away. As a further result, ilookup5_nowait() can= get > > removed as its only user, NTFS, only introduced it as a workaround = for > > the deadlock scenario. >=20 > Could you use ilookup5_nowait() in LogFS and avoid the cleaner deadlo= ck > that way? I could, but I personally consider ilookup5_nowait() to be a nasty hack= =2E It admittedly does what it was supposed to do, but it merely hacks around the issue of using the same lock for two completely independent purposes. Also, I _want_ to block on I_LOCK in ilookup5() and friends. If the inode has been allocated, but not read from disk yet, the wait_on_inode(inode) in ifind() is the only thing protecting me from accessing garbage data. Thinking about it, ignoring this lock can cause data loss on the filesystem. When validating blocks, they are compared to an unread inode, which just looks like an empty file. So all blocks look invalid and will get deleted. That race is _really_ hard to lose, but also _really_ nasty. J=C3=B6rn --=20 "[One] doesn't need to know [...] how to cause a headache in order to take an aspirin." -- Scott Culp, Manager of the Microsoft Security Response Center, 2001 - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel= " in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html