From: David Woodhouse <dwmw2@infradead.org>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: hooanon05@yahoo.co.jp, hch@infradead.org,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: Q: NFSD readdir in linux-2.6.28
Date: Fri, 17 Apr 2009 23:17:00 +0100 [thread overview]
Message-ID: <1240006620.19059.41.camel@macbook.infradead.org> (raw)
In-Reply-To: <20090417193233.GM26366@ZenIV.linux.org.uk>
On Fri, 2009-04-17 at 20:32 +0100, Al Viro wrote:
> Ow... Sorry, I've missed that one. I really don't like flags-based
> solution ;-/ It's not just filesystem method that want i_mutex there -
> we have fs/namei.c code suddenly called in different locking conditions
> now.
Well, we have that already.
When nfsd readdirplus code ends up calling back into lookup_one_len(),
it does so without i_mutex held. XFS had been doing it that way for
years, so that detail escaped our notice when we shifted the XFS 'hack'
into nfsd code.
> What were the details of xfs and jffs2 locking problems, just in case
> if something can be done in that direction short-term?
JFFS2 has its own internal mutex protecting the directory. It needs an
internal mutex instead of i_mutex because of ordering issues with
garbage collection. We are _in_ jffs2_readdir(), with the lock held,
when we call back into nfsd's filldir function... which calls right back
into jffs2_lookup() and deadlocks when it tries to take the same lock
again.
The situation in btrfs and xfs is broadly similar. They have their own
locks, and we end up deadlocking. GFS2 has some lovely "is this process
the owner of the lock" magic to avoid a similar deadlock.
It sounds like the better answer is to just make sure i_mutex is held
when nfsd_buffered_readdir() calls back into the provided filldir
function (we could do it in the various filldir functions themselves,
_if_ they call lookup_one_len(), but I think I prefer it this way --
it's simpler). Patch below for comment.
(While I'm staring at it, it looks like nfsd_buffered_readdir() should
be returning a __be32 not an int, and its 'return -ENOMEM' should be
'return nfserrno(-ENOMEM)'. The first bug I inherited from the existing
nfsd_do_readdir() when I replaced it, but the second is all my own. I'll
send a patch to fix those shortly.)
diff --git a/fs/namei.c b/fs/namei.c
index b8433eb..78f253c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1248,6 +1248,8 @@ struct dentry *lookup_one_len(const char *name, struct dentry *base, int len)
int err;
struct qstr this;
+ WARN_ON_ONCE(!mutex_is_locked(&base->d_inode->i_mutex));
+
err = __lookup_one_len(name, &this, base, len);
if (err)
return ERR_PTR(err);
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index ab93fcf..fb2965d 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1920,13 +1920,27 @@ static int nfsd_buffered_readdir(struct file *file, filldir_t func,
break;
de = (struct buffered_dirent *)buf.dirent;
+
while (size > 0) {
+ struct inode *dir_inode = file->f_path.dentry->d_inode;
+ int finished;
+
offset = de->offset;
- if (func(cdp, de->name, de->namlen, de->offset,
- de->ino, de->d_type))
+ /*
+ * Various filldir functions may end up calling back
+ * into lookup_one_len() and the file system's
+ * ->lookup() method. These expect i_mutex to be held.
+ */
+ host_err = mutex_lock_killable(&dir_inode->i_mutex);
+ if (host_err)
goto done;
+ finished = func(cdp, de->name, de->namlen, de->offset,
+ de->ino, de->d_type);
+
+ mutex_unlock(&dir_inode->i_mutex);
+
if (cdp->err != nfs_ok)
goto done;
--
David Woodhouse Open Source Technology Centre
David.Woodhouse@intel.com Intel Corporation
next prev parent reply other threads:[~2009-04-17 22:17 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-03-19 14:54 Q: NFSD readdir in linux-2.6.28 hooanon05
2009-03-19 15:17 ` David Woodhouse
2009-03-19 15:34 ` hooanon05
2009-03-19 15:51 ` David Woodhouse
2009-04-17 9:32 ` David Woodhouse
2009-04-17 19:32 ` Al Viro
2009-04-17 22:17 ` David Woodhouse [this message]
2009-04-17 22:43 ` Al Viro
2009-04-17 22:51 ` David Woodhouse
2009-04-17 22:53 ` Al Viro
2009-04-17 22:55 ` David Woodhouse
2009-04-17 23:23 ` David Woodhouse
2009-04-17 23:37 ` Al Viro
2009-04-17 23:39 ` Al Viro
2009-04-18 0:15 ` [PATCH] Fix i_mutex handling in nfsd readdir David Woodhouse
2009-04-18 3:11 ` hooanon05
2009-04-18 14:25 ` Al Viro
2009-04-19 7:18 ` David Woodhouse
2009-04-19 12:27 ` [PATCH v2] " David Woodhouse
2009-04-19 20:51 ` J. Bruce Fields
2009-04-20 19:50 ` J. Bruce Fields
2009-04-21 0:29 ` Al Viro
2009-04-21 21:15 ` J. Bruce Fields
2009-04-21 21:54 ` Al Viro
2009-05-11 23:16 ` J. Bruce Fields
2009-04-22 4:41 ` hooanon05
2009-04-22 19:12 ` J. Bruce Fields
2009-04-23 6:40 ` hooanon05
2009-04-23 20:27 ` J. Bruce Fields
2009-05-05 23:35 ` J. Bruce Fields
2009-05-06 5:09 ` hooanon05
2009-05-06 20:20 ` J. Bruce Fields
2009-05-07 4:38 ` hooanon05
2009-05-08 18:47 ` J. Bruce Fields
2009-03-19 15:37 ` Q: NFSD readdir in linux-2.6.28 Matthew Wilcox
2009-03-19 15:45 ` hooanon05
2009-03-19 16:01 ` David Woodhouse
2009-04-16 21:45 ` J. Bruce Fields
2009-04-17 4:13 ` hooanon05
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1240006620.19059.41.camel@macbook.infradead.org \
--to=dwmw2@infradead.org \
--cc=hch@infradead.org \
--cc=hooanon05@yahoo.co.jp \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=viro@ZenIV.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).