public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Sergey Vlasov <vsu@altlinux.ru>
To: Andrew Morton <akpm@osdl.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>,
	joe.korty@ccur.com, linux-kernel@vger.kernel.org,
	drepper@redhat.com, mingo@elte.hu, stable@kernel.org
Subject: Re: lock_kernel called under spinlock in NFS
Date: Sat, 3 Jun 2006 22:30:03 +0400	[thread overview]
Message-ID: <20060603223003.5665a426.vsu@altlinux.ru> (raw)
In-Reply-To: <20060602134346.73019624.akpm@osdl.org>

[-- Attachment #1: Type: text/plain, Size: 5261 bytes --]

On Fri, 2 Jun 2006 13:43:46 -0700 Andrew Morton wrote:

> Trond Myklebust <Trond.Myklebust@netapp.com> wrote:
> >
> > On Fri, 2006-06-02 at 16:24 -0400, Joe Korty wrote:
> > > On Thu, Jun 01, 2006 at 04:13:39PM -0400, Trond Myklebust wrote:
> > > > On Thu, 2006-06-01 at 15:55 -0400, Joe Korty wrote:
> > > >> Tree 5fdccf2354269702f71beb8e0a2942e4167fd992
> > > >> 
> > > >>         [PATCH] vfs: *at functions: core
> > > >> 
> > > >> introduced a bug where lock_kernel() can be called from
> > > >> under a spinlock.  To trigger the bug one must have
> > > >> CONFIG_PREEMPT_BKL=y and be using NFS heavily.  It is
> > > >> somewhat rare and, so far, haven't traced down the userland
> > > >> sequence that causes the fatal path to be taken.
> > > >> 
> > > >> The bug was caused by the insertion into do_path_lookup()
> > > >> of a call to file_permission().  do_path_lookup()
> > > >> read-locks current->fs->lock for most of its operation.
> > > >> file_permission() calls permission() which calls
> > > >> nfs_permission(), which has one path through it
> > > >> that uses lock_kernel().
> > > 
> > > > Nowhere should anyone be calling file_permission() under a spinlock.
> > > > 
> > > > Why would you need to read-protect current->fs in the case where you are
> > > > starting from a file? The correct thing to do there would appear to be
> > > > to read_protect only the cases where (*name=='/') and (dfd == AT_FDCWD).
> > > > 
> > > > Something like the attached patch...
> > > 
> > > 
> > > Hi Trond,
> > > I've been running with the patch for the last few hours, on an nfs-rooted
> > > system, and it has been working fine.  Any plans to submit this for 2.6.17?
> > 
> > It probably ought to be, given the nature of the sin. Andrew?
> > 
> 
> OK.
> 
> Just to confirm, this is final?
> 
> 
> From: Trond Myklebust <Trond.Myklebust@netapp.com>
> 
> We're presently running lock_kernel() under fs_lock via nfs's ->permission
> handler.  That's a ranking bug and sometimes a sleep-in-spinlock bug.  This
> problem was introduced in the openat() patchset.
> 
> We should not need to hold the current->fs->lock for a codepath that doesn't
> use current->fs.
> 
> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> Cc: Al Viro <viro@ftp.linux.org.uk>
> Signed-off-by: Andrew Morton <akpm@osdl.org>
> ---
> 
>  fs/namei.c |    6 ++++--
>  1 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff -puN fs/namei.c~fs-nameic-call-to-file_permission-under-a-spinlock-in-do_lookup_path fs/namei.c
> --- 25/fs/namei.c~fs-nameic-call-to-file_permission-under-a-spinlock-in-do_lookup_path	Fri Jun  2 13:39:52 2006
> +++ 25-akpm/fs/namei.c	Fri Jun  2 13:39:52 2006
> @@ -1080,8 +1080,8 @@ static int fastcall do_path_lookup(int d
>  	nd->flags = flags;
>  	nd->depth = 0;
>  
> -	read_lock(&current->fs->lock);
>  	if (*name=='/') {
> +		read_lock(&current->fs->lock);
>  		if (current->fs->altroot && !(nd->flags & LOOKUP_NOALT)) {
>  			nd->mnt = mntget(current->fs->altrootmnt);
>  			nd->dentry = dget(current->fs->altroot);
> @@ -1092,9 +1092,12 @@ static int fastcall do_path_lookup(int d
>  		}
>  		nd->mnt = mntget(current->fs->rootmnt);
>  		nd->dentry = dget(current->fs->root);
> +		read_unlock(&current->fs->lock);
>  	} else if (dfd == AT_FDCWD) {
> +		read_lock(&current->fs->lock);
>  		nd->mnt = mntget(current->fs->pwdmnt);
>  		nd->dentry = dget(current->fs->pwd);
> +		read_unlock(&current->fs->lock);
>  	} else {
>  		struct dentry *dentry;
>  
> @@ -1118,7 +1121,6 @@ static int fastcall do_path_lookup(int d
>  
>  		fput_light(file, fput_needed);
>  	}
> -	read_unlock(&current->fs->lock);
>  	current->total_link_count = 0;
>  	retval = link_path_walk(name, nd);
>  out:

1) This bug is also present in the 2.6.16 tree.

2) The patch above is broken - it needs the fix below (or the fix should
be folded into the patch directly):

--------------------------------------------------------------------

Fix do_path_lookup() failure path after locking changes

Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
---
 fs/namei.c |   13 ++++++-------
 1 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index a2f79d2..d6e2ee2 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1104,17 +1104,17 @@ static int fastcall do_path_lookup(int d
 		file = fget_light(dfd, &fput_needed);
 		retval = -EBADF;
 		if (!file)
-			goto unlock_fail;
+			goto out_fail;
 
 		dentry = file->f_dentry;
 
 		retval = -ENOTDIR;
 		if (!S_ISDIR(dentry->d_inode->i_mode))
-			goto fput_unlock_fail;
+			goto fput_fail;
 
 		retval = file_permission(file, MAY_EXEC);
 		if (retval)
-			goto fput_unlock_fail;
+			goto fput_fail;
 
 		nd->mnt = mntget(file->f_vfsmnt);
 		nd->dentry = dget(dentry);
@@ -1129,13 +1129,12 @@ out:
 				nd->dentry->d_inode))
 		audit_inode(name, nd->dentry->d_inode, flags);
 	}
+out_fail:
 	return retval;
 
-fput_unlock_fail:
+fput_fail:
 	fput_light(file, fput_needed);
-unlock_fail:
-	read_unlock(&current->fs->lock);
-	return retval;
+	goto out_fail;
 }
 
 int fastcall path_lookup(const char *name, unsigned int flags,
-- 
1.3.3.g3d95c



[-- Attachment #2: Type: application/pgp-signature, Size: 190 bytes --]

  parent reply	other threads:[~2006-06-03 18:30 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-06-01 19:55 lock_kernel called under spinlock in NFS Joe Korty
2006-06-01 20:13 ` Trond Myklebust
2006-06-02 20:24   ` Joe Korty
2006-06-02 20:27     ` Trond Myklebust
2006-06-02 20:43       ` Andrew Morton
2006-06-02 21:26         ` Trond Myklebust
2006-06-03 18:30         ` Sergey Vlasov [this message]
2006-06-03 22:10           ` Trond Myklebust

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20060603223003.5665a426.vsu@altlinux.ru \
    --to=vsu@altlinux.ru \
    --cc=Trond.Myklebust@netapp.com \
    --cc=akpm@osdl.org \
    --cc=drepper@redhat.com \
    --cc=joe.korty@ccur.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=stable@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox