From: David Chinner <dgc@sgi.com>
To: "Jörn Engel" <joern@lazybastard.org>
Cc: linux-fsdevel@vger.kernel.org,
Anton Altaparmakov <aia21@cam.ac.uk>, David Chinner <dgc@sgi.com>,
Dave Kleikamp <shaggy@linux.vnet.ibm.com>,
Al Viro <viro@ftp.linux.org.uk>,
Christoph Hellwig <hch@infradead.org>
Subject: Re: [RFC] The many faces of the I_LOCK
Date: Thu, 22 Feb 2007 05:29:04 +1100
Message-ID: <20070221182904.GL6095633@melbourne.sgi.com>
In-Reply-To: <20070221130956.GB464@lazybastard.org>
On Wed, Feb 21, 2007 at 01:09:56PM +0000, Jörn Engel wrote:
> 1. Introduction
>
> This lengthy investigation was caused by a deadlock problem in LogFS,
> but uncovered a more general problem. It affects, at the least, all
> filesystems that need to read inodes in their write path. To my
> knowledge, that includes LogFS and NTFS, possibly also JFS and XFS.
I don't think XFS has any problems here - we're pretty careful about
reading inodes from disk before we lock other potentially dependent
objects in the filesystem....
> Deadlock happens when two processes A and B (that may be identical) have
> these two call chains:
>
> Process A:                        Process B:
> inode_wait                        [filesystem locking write path]
> __wait_on_bit                     __writeback_single_inode
> out_of_line_wait_on_bit
> ifind_fast
> [filesystem calling iget()]
> [filesystem locking write path]
>
> Process A will wait_on_inode() and block until process B proceeds
> through __sync_single_inode(), which is called from
> __writeback_single_inode() in Process B. Process B will block on the
> lock of the filesystem write path, held by Process A.
This is caused by your cleaner thread racing with writeback doing inode
lookup, right? You need a non-blocking inode lookup to prevent the deadlock,
I guess....
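
To make the non-blocking lookup idea concrete, here is a rough userspace
sketch. It is a toy model, not kernel code: every toy_* name and TOY_I_LOCK
is made up for illustration. The point is only that the lookup hands back
the cached inode and reports whether it is locked, instead of sleeping on
it while the caller already holds the write path lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; none of these names exist in the kernel. */
#define TOY_I_LOCK 0x1u            /* stands in for the I_LOCK state bit */

struct toy_inode {
    unsigned long ino;             /* inode number */
    unsigned int  state;           /* state bits, e.g. TOY_I_LOCK */
};

/* Linear search standing in for the inode hash lookup. */
static struct toy_inode *toy_find(struct toy_inode *cache, size_t n,
                                  unsigned long ino)
{
    for (size_t i = 0; i < n; i++)
        if (cache[i].ino == ino)
            return &cache[i];
    return NULL;
}

/*
 * Non-blocking lookup: return the cached inode (or NULL) and report via
 * *busy whether it is currently locked. This never waits, so a caller
 * that already holds the write path lock can back off and retry later
 * rather than sleeping until writeback finishes.
 */
static struct toy_inode *toy_lookup_nowait(struct toy_inode *cache, size_t n,
                                           unsigned long ino, bool *busy)
{
    assert(busy != NULL);

    struct toy_inode *inode = toy_find(cache, n, ino);

    *busy = inode != NULL && (inode->state & TOY_I_LOCK) != 0;
    return inode;
}
```

A caller that sees busy set would drop its own locks (or skip that inode)
and try again later, which is what breaks the wait cycle in this model.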
> 2. The usage of inode_lock and I_LOCK
.....
> Three rows have not exposed their meaning to me yet, so I'd
> gladly receive some insight here.
IIRC, the checks in xfs_ichgtime[_fast] are simply an optimisation - if the
inode is currently I_LOCKed then we can't mark it dirty anyway so we don't
even bother trying. We do mark the XFS inode structure (*) dirty, though, so
the modification will make it to disk at some time in the future.
(*) XFS does double inode caching because the XFS transaction subsystem
requires inodes to have a different lifecycle from the Linux struct inode
lifecycle.
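
As a rough illustration of that optimisation, here is a userspace toy
model. It is not the XFS code: the toy_* structures, the flag values and
the function name are all assumptions made up for the example. It only
shows the shape of the logic: always record the change in the
filesystem-private inode, and skip dirtying the VFS inode while it is
locked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy flag values; the real kernel bits are not reproduced here. */
#define TOY_I_LOCK  0x1u           /* stands in for I_LOCK */
#define TOY_I_DIRTY 0x2u           /* stands in for a dirty state bit */

/* Stand-in for the Linux struct inode. */
struct toy_vfs_inode {
    unsigned int state;
};

/* Stand-in for the filesystem-private inode (the second cache). */
struct toy_fs_inode {
    struct toy_vfs_inode *vfs;     /* associated VFS inode */
    bool fs_dirty;                 /* private dirty flag */
};

/*
 * Model of a timestamp update. The private inode is always marked
 * dirty so the change reaches disk eventually. The VFS inode is only
 * marked dirty when it is not locked. Returns true if the VFS inode
 * was marked dirty.
 */
static bool toy_touch_times(struct toy_fs_inode *ip)
{
    assert(ip != NULL && ip->vfs != NULL);

    ip->fs_dirty = true;

    if (ip->vfs->state & TOY_I_LOCK)
        return false;              /* locked: do not bother trying */

    ip->vfs->state |= TOY_I_DIRTY;
    return true;
}
```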
> 3. Separating out sync notification
>
> One of the results from the investigations in 2 appears to be that one
> class of users in fs/fs-writeback.c is completely unrelated to another
> class of users in fs/inode.c.
>
> In particular, __sync_single_inode(), __writeback_single_inode(),
> write_inode_now(), clear_inode(), __mark_inode_dirty() and (possibly?)
> generic_osync_inode() seem to only need a completion event to
> synchronize with. There is no reason why this group should share a lock
> with iget() and any of its many permutations.
Seems reasonable, but I don't know all the little details in these
paths....
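
For what it's worth, the separation proposed in the quoted text could be
modelled in userspace roughly as below. This is only a sketch of the
idea, not a patch: TOY_I_SYNC and all the toy_* helpers are hypothetical
names, and the real paths certainly have more details than this. The
writeback side signals through its own bit, so a lookup that waits only
on the lock bit is never held up by writeback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy flag values; hypothetical, not the kernel's definitions. */
#define TOY_I_LOCK 0x1u            /* the bit lookups wait on */
#define TOY_I_SYNC 0x4u            /* hypothetical writeback-in-progress bit */

/* Writeback marks the inode as being synced, using its own bit only. */
static void toy_writeback_begin(unsigned int *state)
{
    assert(state != NULL);
    *state |= TOY_I_SYNC;
}

/* Writeback finished: clear the sync bit (waiters would be woken here). */
static void toy_writeback_end(unsigned int *state)
{
    assert(state != NULL);
    *state &= ~TOY_I_SYNC;
}

/* A lookup only cares about the lock bit. */
static bool toy_lookup_must_wait(unsigned int state)
{
    return (state & TOY_I_LOCK) != 0;
}

/* Sync-side callers only care about the sync bit. */
static bool toy_sync_must_wait(unsigned int state)
{
    return (state & TOY_I_SYNC) != 0;
}
```

In this model an inode under writeback has only the sync bit set, so the
lookup in Process A would not block on it and the cycle from section 1
does not form.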
> Now, if the group in fs/fs-writeback.c had a completion event that is
> independent of anything in fs/inode.c, the deadlock scenario described
> in section 1 goes away. As a further result, ilookup5_nowait() can get
> removed as its only user, NTFS, only introduced it as a workaround for
> the deadlock scenario.
Could you use ilookup5_nowait() in LogFS and avoid the cleaner deadlock
that way?
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group