From: Sergey Vlasov <vsu@altlinux.ru>
To: Andrew Morton <akpm@osdl.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>,
joe.korty@ccur.com, linux-kernel@vger.kernel.org,
drepper@redhat.com, mingo@elte.hu, stable@kernel.org
Subject: Re: lock_kernel called under spinlock in NFS
Date: Sat, 3 Jun 2006 22:30:03 +0400 [thread overview]
Message-ID: <20060603223003.5665a426.vsu@altlinux.ru> (raw)
In-Reply-To: <20060602134346.73019624.akpm@osdl.org>
[-- Attachment #1: Type: text/plain, Size: 5261 bytes --]
On Fri, 2 Jun 2006 13:43:46 -0700 Andrew Morton wrote:
> Trond Myklebust <Trond.Myklebust@netapp.com> wrote:
> >
> > On Fri, 2006-06-02 at 16:24 -0400, Joe Korty wrote:
> > > On Thu, Jun 01, 2006 at 04:13:39PM -0400, Trond Myklebust wrote:
> > > > On Thu, 2006-06-01 at 15:55 -0400, Joe Korty wrote:
> > > >> Tree 5fdccf2354269702f71beb8e0a2942e4167fd992
> > > >>
> > > >> [PATCH] vfs: *at functions: core
> > > >>
> > > >> introduced a bug where lock_kernel() can be called from
> > > >> under a spinlock. To trigger the bug one must have
> > > >> CONFIG_PREEMPT_BKL=y and be using NFS heavily. It is
> > > >> somewhat rare and, so far, haven't traced down the userland
> > > >> sequence that causes the fatal path to be taken.
> > > >>
> > > >> The bug was caused by the insertion into do_path_lookup()
> > > >> of a call to file_permission(). do_path_lookup()
> > > >> read-locks current->fs->lock for most of its operation.
> > > >> file_permission() calls permission() which calls
> > > >> nfs_permission(), which has one path through it
> > > >> that uses lock_kernel().
> > >
> > > > Nowhere should anyone be calling file_permission() under a spinlock.
> > > >
> > > > Why would you need to read-protect current->fs in the case where you are
> > > > starting from a file? The correct thing to do there would appear to be
> > > > to read_protect only the cases where (*name=='/') and (dfd == AT_FDCWD).
> > > >
> > > > Something like the attached patch...
> > >
> > >
> > > Hi Trond,
> > > I've been running with the patch for the last few hours, on an nfs-rooted
> > > system, and it has been working fine. Any plans to submit this for 2.6.17?
> >
> > It probably ought to be, given the nature of the sin. Andrew?
> >
>
> OK.
>
> Just to confirm, this is final?
>
>
> From: Trond Myklebust <Trond.Myklebust@netapp.com>
>
> We're presently running lock_kernel() under fs_lock via nfs's ->permission
> handler. That's a ranking bug and sometimes a sleep-in-spinlock bug. This
> problem was introduced in the openat() patchset.
>
> We should not need to hold the current->fs->lock for a codepath that doesn't
> use current->fs.
>
> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> Cc: Al Viro <viro@ftp.linux.org.uk>
> Signed-off-by: Andrew Morton <akpm@osdl.org>
> ---
>
> fs/namei.c | 6 ++++--
> 1 files changed, 4 insertions(+), 2 deletions(-)
>
> diff -puN fs/namei.c~fs-nameic-call-to-file_permission-under-a-spinlock-in-do_lookup_path fs/namei.c
> --- 25/fs/namei.c~fs-nameic-call-to-file_permission-under-a-spinlock-in-do_lookup_path Fri Jun 2 13:39:52 2006
> +++ 25-akpm/fs/namei.c Fri Jun 2 13:39:52 2006
> @@ -1080,8 +1080,8 @@ static int fastcall do_path_lookup(int d
> nd->flags = flags;
> nd->depth = 0;
>
> - read_lock(¤t->fs->lock);
> if (*name=='/') {
> + read_lock(¤t->fs->lock);
> if (current->fs->altroot && !(nd->flags & LOOKUP_NOALT)) {
> nd->mnt = mntget(current->fs->altrootmnt);
> nd->dentry = dget(current->fs->altroot);
> @@ -1092,9 +1092,12 @@ static int fastcall do_path_lookup(int d
> }
> nd->mnt = mntget(current->fs->rootmnt);
> nd->dentry = dget(current->fs->root);
> + read_unlock(¤t->fs->lock);
> } else if (dfd == AT_FDCWD) {
> + read_lock(¤t->fs->lock);
> nd->mnt = mntget(current->fs->pwdmnt);
> nd->dentry = dget(current->fs->pwd);
> + read_unlock(¤t->fs->lock);
> } else {
> struct dentry *dentry;
>
> @@ -1118,7 +1121,6 @@ static int fastcall do_path_lookup(int d
>
> fput_light(file, fput_needed);
> }
> - read_unlock(¤t->fs->lock);
> current->total_link_count = 0;
> retval = link_path_walk(name, nd);
> out:
1) This bug is also present in the 2.6.16 tree.
2) The patch above is broken - it needs the fix below (or the fix should
be folded into the patch directly):
--------------------------------------------------------------------
Fix do_path_lookup() failure path after locking changes
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
---
fs/namei.c | 13 ++++++-------
1 files changed, 6 insertions(+), 7 deletions(-)
diff --git a/fs/namei.c b/fs/namei.c
index a2f79d2..d6e2ee2 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1104,17 +1104,17 @@ static int fastcall do_path_lookup(int d
file = fget_light(dfd, &fput_needed);
retval = -EBADF;
if (!file)
- goto unlock_fail;
+ goto out_fail;
dentry = file->f_dentry;
retval = -ENOTDIR;
if (!S_ISDIR(dentry->d_inode->i_mode))
- goto fput_unlock_fail;
+ goto fput_fail;
retval = file_permission(file, MAY_EXEC);
if (retval)
- goto fput_unlock_fail;
+ goto fput_fail;
nd->mnt = mntget(file->f_vfsmnt);
nd->dentry = dget(dentry);
@@ -1129,13 +1129,12 @@ out:
nd->dentry->d_inode))
audit_inode(name, nd->dentry->d_inode, flags);
}
+out_fail:
return retval;
-fput_unlock_fail:
+fput_fail:
fput_light(file, fput_needed);
-unlock_fail:
- read_unlock(¤t->fs->lock);
- return retval;
+ goto out_fail;
}
int fastcall path_lookup(const char *name, unsigned int flags,
--
1.3.3.g3d95c
[-- Attachment #2: Type: application/pgp-signature, Size: 190 bytes --]
next prev parent reply other threads:[~2006-06-03 18:30 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-06-01 19:55 lock_kernel called under spinlock in NFS Joe Korty
2006-06-01 20:13 ` Trond Myklebust
2006-06-02 20:24 ` Joe Korty
2006-06-02 20:27 ` Trond Myklebust
2006-06-02 20:43 ` Andrew Morton
2006-06-02 21:26 ` Trond Myklebust
2006-06-03 18:30 ` Sergey Vlasov [this message]
2006-06-03 22:10 ` Trond Myklebust
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20060603223003.5665a426.vsu@altlinux.ru \
--to=vsu@altlinux.ru \
--cc=Trond.Myklebust@netapp.com \
--cc=akpm@osdl.org \
--cc=drepper@redhat.com \
--cc=joe.korty@ccur.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=stable@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.