From: David Chinner <dgc@sgi.com>
To: "Jörn Engel" <joern@lazybastard.org>
Cc: linux-fsdevel@vger.kernel.org,
	Anton Altaparmakov <aia21@cam.ac.uk>, David Chinner <dgc@sgi.com>,
	Dave Kleikamp <shaggy@linux.vnet.ibm.com>,
	Al Viro <viro@ftp.linux.org.uk>,
	Christoph Hellwig <hch@infradead.org>
Subject: Re: [RFC] The many faces of the I_LOCK
Date: Thu, 22 Feb 2007 05:29:04 +1100
Message-ID: <20070221182904.GL6095633@melbourne.sgi.com>
In-Reply-To: <20070221130956.GB464@lazybastard.org>

On Wed, Feb 21, 2007 at 01:09:56PM +0000, Jörn Engel wrote:
> 1. Introduction
> 
> This lengthy investigation was caused by a deadlock problem in LogFS,
> but uncovered a more general problem.  It affects, at the least, all
> filesystems that need to read inodes in their write path.  To my
> knowledge, that includes LogFS and NTFS, possibly also JFS and XFS.

I don't think XFS has any problems here - we're pretty careful about
reading inodes from disk before we lock other potentially dependent
objects in the filesystem....

> Deadlock happens when two processes A and B (that may be identical) have
> these two call chains:
> 
> Process A:				Process B:
> inode_wait				[filesystem locking write path]
> __wait_on_bit				__writeback_single_inode
> out_of_line_wait_on_bit
> ifind_fast
> [filesystem calling iget()]
> [filesystem locking write path]
> 
> Process A will wait_on_inode() and block until process B proceeds
> through __sync_single_inode(), which is called from
> __writeback_single_inode() in Process B.  Process B will block on the
> lock of the filesystem write path, held by Process A.

This is caused by your cleaner thread racing with writeback doing inode
lookup, right? You need a non-blocking inode lookup to prevent the deadlock,
I guess....
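
As a rough illustration of the non-blocking lookup idea, here is a userspace
sketch in C. It is not kernel code and does not reproduce the semantics of any
existing iget()/ilookup variant: toy_inode, toy_lookup_nowait(), toy_release()
and the TOY_* return codes are all hypothetical names, and the "busy" flag only
stands in for I_LOCK. The point is that the caller gets an error back instead
of sleeping, so it can drop its own write-path lock and retry later.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical return codes for this sketch only. */
enum toy_status { TOY_OK = 0, TOY_BUSY = 1, TOY_NOENT = 2 };

/* Hypothetical stand-in for an inode; "busy" plays the role of I_LOCK. */
struct toy_inode {
	unsigned long ino;
	atomic_flag busy;
};

/*
 * Look up inode number "ino" in a small table without ever sleeping.
 * On success the inode is claimed (busy is set) and returned via *out.
 * If another user already holds it, return TOY_BUSY so the caller can
 * back off rather than wait while holding its own locks.
 */
enum toy_status toy_lookup_nowait(struct toy_inode *table, size_t n,
				  unsigned long ino, struct toy_inode **out)
{
	for (size_t i = 0; i < n; i++) {
		if (table[i].ino != ino)
			continue;
		/* test_and_set returns the previous value: true means held. */
		if (atomic_flag_test_and_set(&table[i].busy))
			return TOY_BUSY;
		*out = &table[i];
		return TOY_OK;
	}
	return TOY_NOENT;
}

/* Drop the claim taken by a successful toy_lookup_nowait(). */
void toy_release(struct toy_inode *inode)
{
	atomic_flag_clear(&inode->busy);
}
```

A cleaner-style caller would treat TOY_BUSY as "skip this inode for now"
rather than blocking on it.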

> 2. The usage of inode_lock and I_LOCK
.....
> Three rows have not exposed their meaning to me yet, so I'd
> gladly receive some insight here.

IIRC, the checks in xfs_ichgtime[_fast] are simply an optimisation - if the
inode is currently I_LOCKed then we can't mark it dirty anyway so we don't
even bother trying. We do mark the XFS inode structure (*) dirty, though, so
the modification will make it to disk at some time in the future.

(*) XFS does double inode caching because the XFS transaction subsystem
requires inodes to have a different lifecycle to the linux struct inode
lifecycle.
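
The optimisation described above can be sketched roughly as follows. This is a
userspace model, not the XFS source: the struct layouts, the TOY_* flag values,
the fs_dirty field and toy_chgtime() are all assumptions made for illustration.
It only shows the shape of the check: the filesystem-private inode is always
marked dirty, while the VFS-level dirtying is skipped when the inode is locked.

```c
#include <stdbool.h>

/* Flag values are arbitrary and chosen for this sketch only. */
#define TOY_I_LOCK  0x1u
#define TOY_I_DIRTY 0x2u

/* Hypothetical stand-in for the generic (VFS) inode. */
struct toy_vfs_inode {
	unsigned int state;
};

/* Hypothetical stand-in for the filesystem's own cached inode. */
struct toy_fs_inode {
	struct toy_vfs_inode vfs;
	bool fs_dirty;	/* change recorded in the fs-private inode */
};

/*
 * Record a timestamp change. Returns true if the VFS inode was also
 * marked dirty, false if that step was skipped because it is locked.
 */
bool toy_chgtime(struct toy_fs_inode *ip)
{
	/* Always remember the change so it reaches disk eventually. */
	ip->fs_dirty = true;

	/* Locked inode: do not bother trying to dirty the VFS inode. */
	if (ip->vfs.state & TOY_I_LOCK)
		return false;

	ip->vfs.state |= TOY_I_DIRTY;
	return true;
}
```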

> 3. Separating out sync notification
> 
> One of the results from the investigations in 2 appears to be that one
> class of users in fs/fs-writeback.c is completely unrelated to another
> class of users in fs/inode.c.
> 
> In particular, __sync_single_inode(), __writeback_single_inode(),
> write_inode_now(), clear_inode(), __mark_inode_dirty() and (possibly?)
> generic_osync_inode() seem to only need a completion event to
> synchronize with.  There is no reason why this group should share a lock
> with iget() and any of its many permutations.

Seems reasonable, but I don't know all the little details in these
paths....

> Now, if the group in fs/fs-writeback.c had a completion event that is
> independent of anything in fs/inode.c, the deadlock scenario described
> in section 1 goes away.  As a further result, ilookup5_nowait() can get
> removed as its only user, NTFS, only introduced it as a workaround for
> the deadlock scenario.

Could you use ilookup5_nowait() in LogFS and avoid the cleaner deadlock
that way?
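
For reference, the separation proposed in section 3 can be modelled in a few
lines of userspace C. The I_SYNC name comes from the follow-up patch titles in
this thread; everything else here (the TOY_* values, toy_inode and the helper
functions) is hypothetical and only compares the two schemes. With one shared
bit, a lookup has to wait whenever writeback is running; with a separate sync
bit, it only waits while the inode is actually being set up.

```c
#include <stdbool.h>

/* Flag values are arbitrary and chosen for this sketch only. */
#define TOY_I_LOCK 0x1u	/* inode is being set up / read in */
#define TOY_I_SYNC 0x2u	/* writeback of the inode in progress */

struct toy_inode {
	unsigned int state;
};

/* Writeback marks its own progress with the sync bit only. */
void toy_writeback_begin(struct toy_inode *inode)
{
	inode->state |= TOY_I_SYNC;
}

void toy_writeback_end(struct toy_inode *inode)
{
	inode->state &= ~TOY_I_SYNC;
}

/* Shared-bit scheme: any of the two conditions stalls a lookup. */
bool toy_lookup_blocks_shared(const struct toy_inode *inode)
{
	return (inode->state & (TOY_I_LOCK | TOY_I_SYNC)) != 0;
}

/* Split scheme: a lookup ignores writeback and checks the lock bit only. */
bool toy_lookup_blocks_split(const struct toy_inode *inode)
{
	return (inode->state & TOY_I_LOCK) != 0;
}
```

In the split scheme, the lookup in Process A no longer depends on Process B
finishing writeback, which is the dependency that closes the cycle in
section 1.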

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

Thread overview: 8+ messages
2007-02-21 13:09 [RFC] The many faces of the I_LOCK Jörn Engel
2007-02-21 18:29 ` David Chinner [this message]
2007-02-21 18:47   ` Jörn Engel
2007-02-21 18:57     ` Jörn Engel
2007-02-21 18:54 ` [PATCH 1/3] Replace I_LOCK with I_SYNC for fs/fs-writeback.c uses Jörn Engel
2007-02-21 18:55 ` [PATCH] [NTFS] Remove ilookup5_nowait and convert users to ilookup5 Jörn Engel
2007-02-21 18:56 ` [PATCH 3/3] Replace I_LOCK with I_SYNC in XFS Jörn Engel
2007-02-22 11:40 ` [RFC] The many faces of the I_LOCK Jörn Engel
