[PATCH][RFC] use after free in NLM subsystem -- how best to fix it?

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH][RFC] use after free in NLM subsystem -- how best to fix it?
@ 2007-09-24 20:12 Jeff Layton
  2007-09-24 22:13 ` Trond Myklebust
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff Layton @ 2007-09-24 20:12 UTC (permalink / raw)
  To: nfs

I've noticed a problem with the way we're doing grant callbacks. I've
seen at least one oops from it in the field. I can also reproduce it
pretty much at will on older kernels with slab poisoning enabled. On
newer kernels (2.6.23-rc-ish), I've gotten it to reproduce once on a
kernel with SLUB poisoning, but can't quite do it at will.

The problem is that I'm not sure of the best way to fix it. Hopefully
someone can give me an idea...

When a lock that a client is blocking on comes free, lockd does this in
nlmsvc_grant_blocked():

    nlm_async_call(block->b_call, NLMPROC_GRANTED_MSG, &nlmsvc_grant_ops);

the callback from this call is nlmsvc_grant_callback(). That function
does this at the end to wake up lockd:

    svc_wake_up(block->b_daemon);

...the problem is, that there is no guarantee that lockd will be up
when this happens. If someone shuts down or restarts lockd before the
async call completes, then the b_daemon pointer will point to freed
memory.

The best I've been able to come up with is to replace the svc_wake_up
calls with a call to a function that wakes up whatever lockd happens
to be up at the time (if any). My thinking is that we don't really
care too much about waking up the lockd thread that happened to spawn
the block. Rather, we're just interested in making sure that an active
lockd wakes up.

Rough, untested patch follows as an RFC. A similar patch fixes the
easily reproducible panics on older kernels.

Details on that patch and how to reproduce this are in this RHBZ:

https://bugzilla.redhat.com/show_bug.cgi?id=253754

Thoughts?

Signed-off-by: Jeff Layton <jlayton@redhat.com>

diff --git a/fs/lockd/svc.c b/fs/lockd/svc.c
index 82e2192..a3791cc 100644
--- a/fs/lockd/svc.c
+++ b/fs/lockd/svc.c
@@ -50,6 +50,7 @@ static DEFINE_MUTEX(nlmsvc_mutex);
 static unsigned int		nlmsvc_users;
 static pid_t			nlmsvc_pid;
 static struct svc_serv		*nlmsvc_serv;
+static DEFINE_RWLOCK(nlmsvc_serv_lock);
 int				nlmsvc_grace_period;
 unsigned long			nlmsvc_timeout;
 
@@ -129,7 +130,9 @@ lockd(struct svc_rqst *rqstp)
 	 * Let our maker know we're running.
 	 */
 	nlmsvc_pid = current->pid;
+	write_lock(&nlmsvc_serv_lock);
 	nlmsvc_serv = rqstp->rq_server;
+	write_unlock(&nlmsvc_serv_lock);
 	complete(&lockd_start_done);
 
 	daemonize("lockd");
@@ -205,7 +208,9 @@ lockd(struct svc_rqst *rqstp)
 			nlmsvc_invalidate_all();
 		nlm_shutdown_hosts();
 		nlmsvc_pid = 0;
+		write_lock(&nlmsvc_serv_lock);
 		nlmsvc_serv = NULL;
+		write_unlock(&nlmsvc_serv_lock);
 	} else
 		printk(KERN_DEBUG
 			"lockd: new process, skipping host shutdown\n");
@@ -273,11 +278,14 @@ lockd_up(int proto) /* Maybe add a 'family' option when IPv6 is supported ?? */
 	/*
 	 * Check whether we're already up and running.
 	 */
-	if (nlmsvc_pid) {
+	read_lock(&nlmsvc_serv_lock);
+	if (nlmsvc_serv) {
 		if (proto)
 			error = make_socks(nlmsvc_serv, proto);
+		read_unlock(&nlmsvc_serv_lock);
 		goto out;
 	}
+	read_unlock(&nlmsvc_serv_lock);
 
 	/*
 	 * Sanity check: if there's no pid,
@@ -355,6 +363,9 @@ lockd_down(void)
 		printk(KERN_WARNING 
 			"lockd_down: lockd failed to exit, clearing pid\n");
 		nlmsvc_pid = 0;
+		write_lock(&nlmsvc_serv_lock);
+		nlmsvc_serv = NULL;
+		write_unlock(&nlmsvc_serv_lock);
 	}
 	spin_lock_irq(&current->sighand->siglock);
 	recalc_sigpending();
@@ -364,6 +375,16 @@ out:
 }
 EXPORT_SYMBOL(lockd_down);
 
+/* wake up current lockd thread */
+void
+lockd_wake_up(void)
+{
+	read_lock(&nlmsvc_serv_lock);
+	if (nlmsvc_serv)
+		svc_wake_up(nlmsvc_serv);
+	read_unlock(&nlmsvc_serv_lock);
+}
+
 /*
  * Sysctl parameters (same as module parameters, different interface).
  */
diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index a21e4bc..1dc16b8 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -644,7 +644,7 @@ static int nlmsvc_grant_deferred(struct file_lock *fl, struct file_lock *conf,
 				block->b_granted = 1;
 
 			nlmsvc_insert_block(block, 0);
-			svc_wake_up(block->b_daemon);
+			lockd_wake_up();
 			rc = 0;
 			break;
 		}
@@ -671,7 +671,7 @@ nlmsvc_notify_blocked(struct file_lock *fl)
 	list_for_each_entry(block, &nlm_blocked, b_list) {
 		if (nlm_compare_locks(&block->b_call->a_args.lock.fl, fl)) {
 			nlmsvc_insert_block(block, 0);
-			svc_wake_up(block->b_daemon);
+			lockd_wake_up();
 			return;
 		}
 	}
@@ -784,7 +784,7 @@ static void nlmsvc_grant_callback(struct rpc_task *task, void *data)
 		timeout = 60 * HZ;
 	}
 	nlmsvc_insert_block(block, timeout);
-	svc_wake_up(block->b_daemon);
+	lockd_wake_up();
 }
 
 static void nlmsvc_grant_release(void *data)
diff --git a/include/linux/lockd/bind.h b/include/linux/lockd/bind.h
index 6f1637c..29642e3 100644
--- a/include/linux/lockd/bind.h
+++ b/include/linux/lockd/bind.h
@@ -38,6 +38,7 @@ extern struct nlmsvc_binding *	nlmsvc_ops;
 extern int	nlmclnt_proc(struct inode *, int, struct file_lock *);
 extern int	lockd_up(int proto);
 extern void	lockd_down(void);
+extern void	lockd_wake_up(void);
 
 unsigned long get_nfs_grace_period(void);
 

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH][RFC] use after free in NLM subsystem -- how best to fix it?
  2007-09-24 20:12 [PATCH][RFC] use after free in NLM subsystem -- how best to fix it? Jeff Layton
@ 2007-09-24 22:13 ` Trond Myklebust
  2007-09-25 14:25   ` Jeff Layton
  0 siblings, 1 reply; 11+ messages in thread
From: Trond Myklebust @ 2007-09-24 22:13 UTC (permalink / raw)
  To: Jeff Layton; +Cc: nfs

On Mon, 2007-09-24 at 16:12 -0400, Jeff Layton wrote:

> When a lock that a client is blocking on comes free, lockd does this in
> nlmsvc_grant_blocked():
> 
>     nlm_async_call(block->b_call, NLMPROC_GRANTED_MSG, &nlmsvc_grant_ops);
> 
> the callback from this call is nlmsvc_grant_callback(). That function
> does this at the end to wake up lockd:
> 
>     svc_wake_up(block->b_daemon);
> 
> ...the problem is, that there is no guarantee that lockd will be up
> when this happens. If someone shuts down or restarts lockd before the
> async call completes, then the b_daemon pointer will point to freed
> memory.

Why shouldn't we be guaranteeing that lockd is up here?

Cheers
  Trond


-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH][RFC] use after free in NLM subsystem -- how best to fix it?
  2007-09-24 22:13 ` Trond Myklebust
@ 2007-09-25 14:25   ` Jeff Layton
  2007-09-25 15:26     ` J. Bruce Fields
  2007-09-25 17:05     ` Trond Myklebust
  0 siblings, 2 replies; 11+ messages in thread
From: Jeff Layton @ 2007-09-25 14:25 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: nfs

On Mon, 24 Sep 2007 18:13:23 -0400
Trond Myklebust <trond.myklebust@fys.uio.no> wrote:

> On Mon, 2007-09-24 at 16:12 -0400, Jeff Layton wrote:
> 
> > When a lock that a client is blocking on comes free, lockd does this in
> > nlmsvc_grant_blocked():
> > 
> >     nlm_async_call(block->b_call, NLMPROC_GRANTED_MSG, &nlmsvc_grant_ops);
> > 
> > the callback from this call is nlmsvc_grant_callback(). That function
> > does this at the end to wake up lockd:
> > 
> >     svc_wake_up(block->b_daemon);
> > 
> > ...the problem is, that there is no guarantee that lockd will be up
> > when this happens. If someone shuts down or restarts lockd before the
> > async call completes, then the b_daemon pointer will point to freed
> > memory.
> 
> Why shouldn't we be guaranteeing that lockd is up here?
> 
> Cheers
>   Trond
> 

That's certainly one approach. I avoided that since it seems like it
would add extra complexity -- it would mean reference counting of some
sort. I suppose we could keep a list of async calls that are in
progress and cancel those when lockd begins shutting down. Or is there
an easier way to make sure they are cancelled before lockd goes down?

As a side note, I think we need to consider some locking around
nlmsvc_pid/nlmsvc_serv. This part of lockd seems racy:

        if (!nlmsvc_pid || current->pid == nlmsvc_pid) {
                if (nlmsvc_ops)
                        nlmsvc_invalidate_all();
                nlm_shutdown_hosts();
                nlmsvc_pid = 0;
                nlmsvc_serv = NULL;

if either nlm_invalidate_all or nlm_shutdown_hosts takes a while, and
lockd is being restarted:

1) lockd_down is called and starts shutting down lockd
2) lockd takes a while to come down, lockd_down gives up, and sets
   nlmsvc_pid=0
3) lockd_up is called and fires up a new lockd thread, it sets
   nlmsvc_pid to the new thread's pid
4) first lockd finishes and sets nlmsvc_pid=0

Perhaps we need to have only lockd_down set those vars so that it's
done under the nlmsvc_mutex?

Thoughts?
-- 
Jeff Layton <jlayton@redhat.com>

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH][RFC] use after free in NLM subsystem -- how best to fix it?
  2007-09-25 14:25   ` Jeff Layton
@ 2007-09-25 15:26     ` J. Bruce Fields
  2007-09-25 17:05     ` Trond Myklebust
  1 sibling, 0 replies; 11+ messages in thread
From: J. Bruce Fields @ 2007-09-25 15:26 UTC (permalink / raw)
  To: Jeff Layton; +Cc: nfs, Trond Myklebust

On Tue, Sep 25, 2007 at 10:25:01AM -0400, Jeff Layton wrote:
> As a side note, I think we need to consider some locking around
> nlmsvc_pid/nlmsvc_serv. This part of lockd seems racy:
> 
>         if (!nlmsvc_pid || current->pid == nlmsvc_pid) {
>                 if (nlmsvc_ops)
>                         nlmsvc_invalidate_all();
>                 nlm_shutdown_hosts();
>                 nlmsvc_pid = 0;
>                 nlmsvc_serv = NULL;
> 
> if either nlm_invalidate_all or nlm_shutdown_hosts takes a while, and
> lockd is being restarted:
> 
> 1) lockd_down is called and starts shutting down lockd
> 2) lockd takes a while to come down, lockd_down gives up, and sets
>    nlmsvc_pid=0
> 3) lockd_up is called and fires up a new lockd thread, it sets
>    nlmsvc_pid to the new thread's pid
> 4) first lockd finishes and sets nlmsvc_pid=0
> 
> Perhaps we need to have only lockd_down set those vars so that it's
> done under the nlmsvc_mutex?

Could be.  I believe nfsd startup/shutdown is similarly racy; see for
example http://lkml.org/lkml/2007/8/2/462, which I think Neil determined
was a similar race.

--b.

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH][RFC] use after free in NLM subsystem -- how best to fix it?
  2007-09-25 14:25   ` Jeff Layton
  2007-09-25 15:26     ` J. Bruce Fields
@ 2007-09-25 17:05     ` Trond Myklebust
  2007-09-27 17:59       ` Jeff Layton
  1 sibling, 1 reply; 11+ messages in thread
From: Trond Myklebust @ 2007-09-25 17:05 UTC (permalink / raw)
  To: Jeff Layton; +Cc: nfs

On Tue, 2007-09-25 at 10:25 -0400, Jeff Layton wrote:

> That's certainly one approach. I avoided that since it seems like it
> would add extra complexity -- it would mean reference counting of some
> sort. I suppose we could keep a list of async calls that are in
> progress and cancel those when lockd begins shutting down. Or is there
> an easier way to make sure they are cancelled before lockd goes down?

You already have such a list: nlm_blocked.


> As a side note, I think we need to consider some locking around
> nlmsvc_pid/nlmsvc_serv. This part of lockd seems racy:
> 
>         if (!nlmsvc_pid || current->pid == nlmsvc_pid) {
>                 if (nlmsvc_ops)
>                         nlmsvc_invalidate_all();
>                 nlm_shutdown_hosts();
>                 nlmsvc_pid = 0;
>                 nlmsvc_serv = NULL;
> 
> if either nlm_invalidate_all or nlm_shutdown_hosts takes a while, and
> lockd is being restarted:
> 
> 1) lockd_down is called and starts shutting down lockd
> 2) lockd takes a while to come down, lockd_down gives up, and sets
>    nlmsvc_pid=0
> 3) lockd_up is called and fires up a new lockd thread, it sets
>    nlmsvc_pid to the new thread's pid
> 4) first lockd finishes and sets nlmsvc_pid=0
> 
> Perhaps we need to have only lockd_down set those vars so that it's
> done under the nlmsvc_mutex?
> 
> Thoughts?

For cleanliness, I suggest adding a new refcount to lockd: nlmsvc_ref.
Replace the test for (nlmsvc_users || !signalled()) in lockd() with a
test for nlmsvc_ref != 0, and then have lockd_up() take one reference to
it whenever it bumps nlmsvc_users == 0  / lockd_down() release a
reference whenever it decrements nlmsvc_users == 1.

Have nlmsvc_insert_block() take a reference whenever adding something to
an empty nlm_blocked / nlmsvc_unlink_block release the reference
whenever nlm_blocked is emptied.

Trond


-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH][RFC] use after free in NLM subsystem -- how best to fix it?
  2007-09-25 17:05     ` Trond Myklebust
@ 2007-09-27 17:59       ` Jeff Layton
  2007-09-27 18:38         ` J. Bruce Fields
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff Layton @ 2007-09-27 17:59 UTC (permalink / raw)
  To: nfs

On Tue, 25 Sep 2007 13:05:48 -0400
Trond Myklebust <trond.myklebust@fys.uio.no> wrote:

> On Tue, 2007-09-25 at 10:25 -0400, Jeff Layton wrote:
> 
> > That's certainly one approach. I avoided that since it seems like it
> > would add extra complexity -- it would mean reference counting of some
> > sort. I suppose we could keep a list of async calls that are in
> > progress and cancel those when lockd begins shutting down. Or is there
> > an easier way to make sure they are cancelled before lockd goes down?
> 
> You already have such a list: nlm_blocked.
> 
> 
> > As a side note, I think we need to consider some locking around
> > nlmsvc_pid/nlmsvc_serv. This part of lockd seems racy:
> > 
> >         if (!nlmsvc_pid || current->pid == nlmsvc_pid) {
> >                 if (nlmsvc_ops)
> >                         nlmsvc_invalidate_all();
> >                 nlm_shutdown_hosts();
> >                 nlmsvc_pid = 0;
> >                 nlmsvc_serv = NULL;
> > 
> > if either nlm_invalidate_all or nlm_shutdown_hosts takes a while, and
> > lockd is being restarted:
> > 
> > 1) lockd_down is called and starts shutting down lockd
> > 2) lockd takes a while to come down, lockd_down gives up, and sets
> >    nlmsvc_pid=0
> > 3) lockd_up is called and fires up a new lockd thread, it sets
> >    nlmsvc_pid to the new thread's pid
> > 4) first lockd finishes and sets nlmsvc_pid=0
> > 
> > Perhaps we need to have only lockd_down set those vars so that it's
> > done under the nlmsvc_mutex?
> > 
> > Thoughts?
> 
> For cleanliness, I suggest adding a new refcount to lockd: nlmsvc_ref.
> Replace the test for (nlmsvc_users || !signalled()) in lockd() with a
> test for nlmsvc_ref != 0, and then have lockd_up() take one reference to
> it whenever it bumps nlmsvc_users == 0  / lockd_down() release a
> reference whenever it decrements nlmsvc_users == 1.
> 
> Have nlmsvc_insert_block() take a reference whenever adding something to
> an empty nlm_blocked / nlmsvc_unlink_block release the reference
> whenever nlm_blocked is emptied.
> 
> Trond
> 

Now that I've started really digging into this, I'm thinking that I may
be wrong about the race that exists in current mainline. There was a
change done ~June 2007:

commit 34f52e3591f241b825353ba27def956d8487c400
Author: Trond Myklebust <Trond.Myklebust@netapp.com>
Date:   Thu Jun 14 16:40:31 2007 -0400

    SUNRPC: Convert rpc_clnt->cl_users to a kref
    
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

...this changed nlm_destroy_host from just setting cl_dead to instead
use rpc_shutdown_client. So this code now actually kills active RPC
tasks for the RPC client and waits for them to come down instead of
just marking the client dead. This should mitigate the race that
definitely exists in earlier kernels.

I have seen a crash on recent kernels that looks very similar to the
problem that I originally described. Somehow a nlmsvc_grant_callback
outlived its lockd. I've only seen it once though, so I think I have to
conclude that the race there is something different and much more
subtle.

Adding reference counting might still be the way to go, but it's
not clear to me that basing it on adding and removing from the
nlm_blocked list will eliminate whatever this race is.

-- 
Jeff Layton <jlayton@redhat.com>

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH][RFC] use after free in NLM subsystem -- how best to fix it?
  2007-09-27 17:59       ` Jeff Layton
@ 2007-09-27 18:38         ` J. Bruce Fields
  2007-09-27 19:09           ` Jeff Layton
  0 siblings, 1 reply; 11+ messages in thread
From: J. Bruce Fields @ 2007-09-27 18:38 UTC (permalink / raw)
  To: Jeff Layton; +Cc: nfs

On Thu, Sep 27, 2007 at 01:59:38PM -0400, Jeff Layton wrote:
> Now that I've started really digging into this, I'm thinking that I may
> be wrong about the race that exists in current mainline. There was a
> change done ~June 2007:
> 
> commit 34f52e3591f241b825353ba27def956d8487c400
> Author: Trond Myklebust <Trond.Myklebust@netapp.com>
> Date:   Thu Jun 14 16:40:31 2007 -0400
> 
>     SUNRPC: Convert rpc_clnt->cl_users to a kref
>     
>     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> 
> ...this changed nlm_destroy_host from just setting cl_dead to instead
> use rpc_shutdown_client. So this code now actually kills active RPC
> tasks for the RPC client and waits for them to come down instead of
> just marking the client dead. This should mitigate the race that
> definitely exists in earlier kernels.

Is there still a window where lockd could be killed just as someone is
starting a new rpc (but the task isn't yet visible to
rpc_shutdown_client)?

--b.

> I have seen a crash on recent kernels that looks very similar to the
> problem that I originally described. Somehow a nlmsvc_grant_callback
> outlived its lockd. I've only seen it once though, so I think I have to
> conclude that the race there is something different and much more
> subtle.
> 
> Adding reference counting might still be the way to go, but it's
> not clear to me that basing it on adding and removing from the
> nlm_blocked list will eliminate whatever this race is.
> 
> -- 
> Jeff Layton <jlayton@redhat.com>
> 
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Microsoft
> Defy all challenges. Microsoft(R) Visual Studio 2005.
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> _______________________________________________
> NFS maillist  -  NFS@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH][RFC] use after free in NLM subsystem -- how best to fix it?
  2007-09-27 18:38         ` J. Bruce Fields
@ 2007-09-27 19:09           ` Jeff Layton
  2007-09-27 20:55             ` J. Bruce Fields
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff Layton @ 2007-09-27 19:09 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: nfs

On Thu, 27 Sep 2007 14:38:03 -0400
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Thu, Sep 27, 2007 at 01:59:38PM -0400, Jeff Layton wrote:
> > Now that I've started really digging into this, I'm thinking that I may
> > be wrong about the race that exists in current mainline. There was a
> > change done ~June 2007:
> > 
> > commit 34f52e3591f241b825353ba27def956d8487c400
> > Author: Trond Myklebust <Trond.Myklebust@netapp.com>
> > Date:   Thu Jun 14 16:40:31 2007 -0400
> > 
> >     SUNRPC: Convert rpc_clnt->cl_users to a kref
> >     
> >     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> > 
> > ...this changed nlm_destroy_host from just setting cl_dead to instead
> > use rpc_shutdown_client. So this code now actually kills active RPC
> > tasks for the RPC client and waits for them to come down instead of
> > just marking the client dead. This should mitigate the race that
> > definitely exists in earlier kernels.
> 
> Is there still a window where lockd could be killed just as someone is
> starting a new rpc (but the task isn't yet visible to
> rpc_shutdown_client)?
> 

Perhaps, but I don't think that's the case here. Here is the oops
message:

https://bugzilla.redhat.com/show_bug.cgi?id=253754#c7

It crashed in rpciod while doing svc_wake_up from an async call. Unless
I'm missing something, the only way that could happen is from
nlmsvc_grant_callback. That's the rpc callback from
nlmsvc_grant_blocked, and that function is only ever called from lockd
itself.

So that question becomes:

  Is there still a window where lockd could be killed just as lockd is
  starting a new rpc (but the task isn't yet visible to
  rpc_shutdown_client)?

I'm thinking the answer here is no, since the call would happen near the
top of the event loop, and nlm_shutdown_hosts occurs well after that.

-- 
Jeff Layton <jlayton@redhat.com>

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH][RFC] use after free in NLM subsystem -- how best to fix it?
  2007-09-27 19:09           ` Jeff Layton
@ 2007-09-27 20:55             ` J. Bruce Fields
  2007-09-28  1:13               ` Jeff Layton
  0 siblings, 1 reply; 11+ messages in thread
From: J. Bruce Fields @ 2007-09-27 20:55 UTC (permalink / raw)
  To: Jeff Layton; +Cc: nfs

On Thu, Sep 27, 2007 at 03:09:27PM -0400, Jeff Layton wrote:
> On Thu, 27 Sep 2007 14:38:03 -0400
> "J. Bruce Fields" <bfields@fieldses.org> wrote:
> 
> > On Thu, Sep 27, 2007 at 01:59:38PM -0400, Jeff Layton wrote:
> > > Now that I've started really digging into this, I'm thinking that I may
> > > be wrong about the race that exists in current mainline. There was a
> > > change done ~June 2007:
> > > 
> > > commit 34f52e3591f241b825353ba27def956d8487c400
> > > Author: Trond Myklebust <Trond.Myklebust@netapp.com>
> > > Date:   Thu Jun 14 16:40:31 2007 -0400
> > > 
> > >     SUNRPC: Convert rpc_clnt->cl_users to a kref
> > >     
> > >     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> > > 
> > > ...this changed nlm_destroy_host from just setting cl_dead to instead
> > > use rpc_shutdown_client. So this code now actually kills active RPC
> > > tasks for the RPC client and waits for them to come down instead of
> > > just marking the client dead. This should mitigate the race that
> > > definitely exists in earlier kernels.
> > 
> > Is there still a window where lockd could be killed just as someone is
> > starting a new rpc (but the task isn't yet visible to
> > rpc_shutdown_client)?
> > 
> 
> Perhaps, but I don't think that's the case here. Here is the oops
> message:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=253754#c7
> 
> It crashed in rpciod while doing svc_wake_up from an async call. Unless
> I'm missing something, the only way that could happen is from
> nlmsvc_grant_callback. That's the rpc callback from
> nlmsvc_grant_blocked, and that function is only ever called from lockd
> itself.
> 
> So that question becomes:
> 
>   Is there still a window where lockd could be killed just as lockd is
>   starting a new rpc (but the task isn't yet visible to
>   rpc_shutdown_client)?
> 
> I'm thinking the answer here is no, since the call would happen near the
> top of the event loop, and nlm_shutdown_hosts occurs well after that.

Without actually looking at the code (but going on the memory of a
similar-looking bug in the delegation callback code): is there a reason
the crash would have to occur right after the rpc_shutdown_client()
call?  If the problem occurs because, say, task->tk_client points to
freed memory, it may take a while for that memory to actually be
overwritten, so it may look just OK enough for the rpc code to still
limp on a little while longer before crashing.

--b.

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH][RFC] use after free in NLM subsystem -- how best to fix it?
  2007-09-27 20:55             ` J. Bruce Fields
@ 2007-09-28  1:13               ` Jeff Layton
  2007-09-28 12:37                 ` Jeff Layton
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff Layton @ 2007-09-28  1:13 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: nfs

On Thu, 27 Sep 2007 16:55:33 -0400
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Thu, Sep 27, 2007 at 03:09:27PM -0400, Jeff Layton wrote:
> > On Thu, 27 Sep 2007 14:38:03 -0400
> > "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > 
> > > On Thu, Sep 27, 2007 at 01:59:38PM -0400, Jeff Layton wrote:
> > > > Now that I've started really digging into this, I'm thinking that I may
> > > > be wrong about the race that exists in current mainline. There was a
> > > > change done ~June 2007:
> > > > 
> > > > commit 34f52e3591f241b825353ba27def956d8487c400
> > > > Author: Trond Myklebust <Trond.Myklebust@netapp.com>
> > > > Date:   Thu Jun 14 16:40:31 2007 -0400
> > > > 
> > > >     SUNRPC: Convert rpc_clnt->cl_users to a kref
> > > >     
> > > >     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> > > > 
> > > > ...this changed nlm_destroy_host from just setting cl_dead to instead
> > > > use rpc_shutdown_client. So this code now actually kills active RPC
> > > > tasks for the RPC client and waits for them to come down instead of
> > > > just marking the client dead. This should mitigate the race that
> > > > definitely exists in earlier kernels.
> > > 
> > > Is there still a window where lockd could be killed just as someone is
> > > starting a new rpc (but the task isn't yet visible to
> > > rpc_shutdown_client)?
> > > 
> > 
> > Perhaps, but I don't think that's the case here. Here is the oops
> > message:
> > 
> > https://bugzilla.redhat.com/show_bug.cgi?id=253754#c7
> > 
> > It crashed in rpciod while doing svc_wake_up from an async call. Unless
> > I'm missing something, the only way that could happen is from
> > nlmsvc_grant_callback. That's the rpc callback from
> > nlmsvc_grant_blocked, and that function is only ever called from lockd
> > itself.
> > 
> > So that question becomes:
> > 
> >   Is there still a window where lockd could be killed just as lockd is
> >   starting a new rpc (but the task isn't yet visible to
> >   rpc_shutdown_client)?
> > 
> > I'm thinking the answer here is no, since the call would happen near the
> > top of the event loop, and nlm_shutdown_hosts occurs well after that.
> 
> Without actually looking at the code (but going on the memory of a
> similar-looking bug in the delegation callback code): is there a reason
> the crash would have to occur right after the rpc_shutdown_client()
> call? If the problem occurs because, say, task->tk_client points to
> freed memory, it may take a while for that memory to actually be
> overwritten, so it may look just OK enough for the rpc code to still
> limp on a little while longer before crashing.
> 

The kernel where I saw this had CONFIG_SLUB_DEBUG_ON, so the memory
should have been poisoned after free (you can see some of the poisoning
patterns in the registers in the oops).

One possibility is that lockd took a long time to come down and a
new lockd was started before it did. That might prevent
nlmsvc_invalidate_all from being called at all on the old lockd. The
new lockd then goes to process the block, async call goes out and the
callback is called, but by that time b_daemon points to freed memory.

Alas, I didn't get a coredump from that crash, so I can't go back and
look to see if the KERN_DEBUG message got printed. IIRC though, I was
thrashing lockd up and down pretty quickly, so the above race might
not be too unlikely.

I think for now, I'm going to focus on defining what the behavior
should be as far as allowing a new lockd to start while the old one
is still running. And then once we do that, to try to clean up the
locking around some of the vars that define that behavior.

That seems to be an obvious problem anyway and I have a suspicion it's
at least related to this crash...

-- 
Jeff Layton <jlayton@redhat.com>

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH][RFC] use after free in NLM subsystem -- how best to fix it?
  2007-09-28  1:13               ` Jeff Layton
@ 2007-09-28 12:37                 ` Jeff Layton
  0 siblings, 0 replies; 11+ messages in thread
From: Jeff Layton @ 2007-09-28 12:37 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: nfs

On Thu, 27 Sep 2007 21:13:07 -0400
Jeff Layton <jlayton@redhat.com> wrote:
> I think for now, I'm going to focus on defining what the behavior
> should be as far as allowing a new lockd to start while the old one
> is still running. And then once we do that, to try to clean up the
> locking around some of the vars that define that behavior.

Now that I've looked over this, there are quite a few things that
don't seem to be adequately protected against concurrent accesses by
multiple lockds. The simplest thing to do here would seem to be to
prevent multiple lockd's from ever being started.

Some questions:

1) the code seems to go out of its way to allow a new lockd to start
before the old one comes all the way down. Is there really a situation
where we need this? It seems like this could be papering over bugs
that prevent lockd from coming down properly. If it is needed then
I suppose we'll have to focus on properly preventing races when
accessing global vars in this situation.

2) lockd seems to take the BKL early and drop it just before exit. Does
anyone know what that's intended to protect? This might be a good time
to take the BKL out of lockd altogether...

-- 
Jeff Layton <jlayton@redhat.com>

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2007-09-28 12:37 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-09-24 20:12 [PATCH][RFC] use after free in NLM subsystem -- how best to fix it? Jeff Layton
2007-09-24 22:13 ` Trond Myklebust
2007-09-25 14:25   ` Jeff Layton
2007-09-25 15:26     ` J. Bruce Fields
2007-09-25 17:05     ` Trond Myklebust
2007-09-27 17:59       ` Jeff Layton
2007-09-27 18:38         ` J. Bruce Fields
2007-09-27 19:09           ` Jeff Layton
2007-09-27 20:55             ` J. Bruce Fields
2007-09-28  1:13               ` Jeff Layton
2007-09-28 12:37                 ` Jeff Layton

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.