public inbox for linux-mtd@lists.infradead.org
 help / color / mirror / Atom feed
From: Neil Brown <neilb@suse.de>
To: David Woodhouse <dwmw2@infradead.org>
Cc: Christoph Hellwig <hch@infradead.org>,
	linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org,
	linux-mtd@lists.infradead.org
Subject: Re: [RFC] Reinstate NFS exportability for JFFS2.
Date: Fri, 1 Aug 2008 12:14:17 +1000	[thread overview]
Message-ID: <18578.29049.38904.746701@notabene.brown> (raw)
In-Reply-To: message from David Woodhouse on Friday August 1

On Friday August 1, dwmw2@infradead.org wrote:
> On Fri, 2008-08-01 at 10:16 +1000, Neil Brown wrote:
> > On Thursday July 31, dwmw2@infradead.org wrote:
> > > On Fri, 2008-05-02 at 11:38 +1000, Neil Brown wrote:
> > > > Why is there a deadlock here?
> > 
> > I was really hoping you would answer this question.
> 
> It's because the nfsd readdirplus code recurses into the file system.
> >From the file system's ->readdir() function we call back into nfsd's
> encode_entry(), which then calls back into the file system's ->lookup()
> function so that it can generate a filehandle.
> 
> For any file system which has its own internal locking -- which includes
> jffs2, btrfs, xfs, jfs, gfs*, ocfs* and probably others -- that
> recursive call back into the file system will deadlock.
> 
> In the specific case of JFFS2, we need that internal locking because of
> lock ordering constraints with the garbage collection -- we need to take
> the allocator lock _before_ the per-inode lock, which means we can't use
> the generic inode->i_mutex for that purpose. That's documented in
> fs/jffs2/README.Locking. I know fewer details about the other affected
> file systems.

It sounds to me like the core problem here is that the locking regime
imposed by the VFS simply isn't suitable for JFFS2 .. and a bunch of
other filesystems.

This seems to suggest that the VFS should be changed.
Superficially, it seems that changing the locking rules so that the
callee takes i_mutex rather than the caller taking it would help and,
in the case of JFFS2, make f->sem redundant.  Does that match your
understanding?

That is obviously a big change and one that should not be made
lightly, but if it was to benefit a number of the newer filesystems, then
it would seem like the appropriate way to go.

Clearly we need a short term solution too as we don't want to wait for
VFS locking rules to be renegotiated.  The idea of a "lock is owned by
me" check is appealing to me as it is a small, localised change that
would easily be removed if/when the locking we "fixed" properly.

In the JFFS2 case I imagine this taking the following form:

 - new field in jffs2_inode_info "struct task_struct *sem_owner",
   initialised to NULL
 - in jffs2_readdir after locking ->sem, set ->sem_owner to current.
 - before unlocking ->sem, set ->sem_owner to NULL
 - in jffs2_lookup, check if "dir_f->sem_owner == current"
   If it does, set a flag reminding us not to drop the lock,
   else mutex_lock(&dir_f->sem);

This should fix the problem with very little cost either in
performance or elegance.  And if/when the locking rules were changed,
the accompanying code review would immediately notice this and remove
it.

NeilBrown

  parent reply	other threads:[~2008-08-01  2:14 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-05-01 19:42 [RFC] Reinstate NFS exportability for JFFS2 David Woodhouse
2008-05-01 20:48 ` Christoph Hellwig
2008-05-01 22:44   ` David Woodhouse
2008-05-02  1:38     ` Neil Brown
2008-05-02 11:37       ` David Woodhouse
2008-05-02 14:08         ` J. Bruce Fields
2008-07-31 21:54       ` David Woodhouse
2008-08-01  0:16         ` Neil Brown
2008-08-01  0:40           ` David Woodhouse
2008-08-01  0:52             ` David Woodhouse
2008-08-01  0:53             ` Chuck Lever
2008-08-01  1:00               ` David Woodhouse
2008-08-01  1:31                 ` Chuck Lever
2008-08-01  8:13                   ` David Woodhouse
2008-08-01 13:35                   ` David Woodhouse
2008-08-01 13:56                     ` David Woodhouse
2008-08-01 16:05                       ` Chuck Lever
2008-08-01 16:19                         ` David Woodhouse
2008-08-01 17:47                           ` Chuck Lever
2008-08-02 18:26                             ` J. Bruce Fields
2008-08-02 20:42                               ` David Woodhouse
2008-08-02 21:33                                 ` J. Bruce Fields
2008-08-03  8:39                                   ` David Woodhouse
2008-08-03 11:56                               ` Neil Brown
2008-08-03 17:15                                 ` Chuck Lever
2008-08-04  1:03                                   ` Neil Brown
2008-08-04  6:19                                     ` Chuck Lever
2008-08-05  8:51                                       ` Dave Chinner
2008-08-05  8:59                                         ` David Woodhouse
2008-08-05  9:47                                           ` Dave Chinner
2008-08-05 23:06                                         ` Neil Brown
2008-08-06  0:08                                           ` Dave Chinner
2008-08-06 19:56                                             ` J. Bruce Fields
2008-08-06 20:10                                               ` David Woodhouse
2008-08-09 16:47                                                 ` David Woodhouse
2008-08-09 19:55                                                   ` David Woodhouse
2008-08-09 20:01                                                     ` [PATCH 1/4] Factor out nfsd_do_readdir() into its own function David Woodhouse
2008-08-09 20:07                                                       ` Christoph Hellwig
2008-08-09 20:02                                                     ` [PATCH 2/4] Copy XFS readdir hack into nfsd code David Woodhouse
2008-08-09 20:08                                                       ` Christoph Hellwig
2008-08-09 20:03                                                     ` [PATCH 3/4] Remove XFS buffered readdir hack David Woodhouse
2008-08-09 20:09                                                       ` Christoph Hellwig
2008-08-09 20:03                                                     ` [PATCH 4/4] Reinstate NFS exportability David Woodhouse
2008-08-09 20:10                                                       ` Christoph Hellwig
2008-08-04 18:41                                     ` [RFC] Reinstate NFS exportability for JFFS2 J. Bruce Fields
2008-08-04 22:37                                       ` Neil Brown
2008-08-17 18:22                                     ` Andreas Dilger
2008-08-01  2:14             ` Neil Brown [this message]
2008-08-01  8:50               ` David Woodhouse
2008-08-01 10:03               ` Al Viro
2008-08-01 23:11                 ` Neil Brown
2008-07-31 21:54       ` [PATCH 1/4] Factor out nfsd_do_readdir() into its own function David Woodhouse
2008-07-31 21:54       ` [PATCH 2/4] Copy XFS readdir hack into nfsd code, introduce FS_NO_LOOKUP_IN_READDIR flag David Woodhouse
2008-07-31 21:55       ` [PATCH 3/4] Switch XFS to using FS_NO_LOOKUP_IN_READDIR, remove local readdir hack David Woodhouse
2008-07-31 21:55       ` [PATCH 4/4] [JFFS2] Reinstate NFS exportability David Woodhouse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=18578.29049.38904.746701@notabene.brown \
    --to=neilb@suse.de \
    --cc=dwmw2@infradead.org \
    --cc=hch@infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mtd@lists.infradead.org \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox