public inbox for linux-nfs@vger.kernel.org
 help / color / mirror / Atom feed
* Another possible race in put_rpccred / rpcauth_unhash_cred
@ 2009-11-18  0:56 Neil Brown
       [not found] ` <19203.18009.945354.446817-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
  0 siblings, 1 reply; 3+ messages in thread
From: Neil Brown @ 2009-11-18  0:56 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs


Hi again Trond,

 I've been looking at a customer report of an oops in
 rpcauth_lookup_credcache.
   https://bugzilla.novell.com/show_bug.cgi?id=547137
 
 It appears that one of the hash chains in the cache has become
 corrupted.

 In trying to guess how this might happen, the one possibility I have
 found is that it might be possible for hlist_del_rcu to be called on
 an rpc_cred that is not hashed.  This certainly could cause list
 corruption.

 put_rpccred checks if the cred is hashed (RPCAUTH_CRED_HASHED) and if
 not, it takes a lock and then - if the cred it not uptodate -
 calls rpcauth_unhash_cred which will call hlist_del_rcu.
 If some other thread unhashed the cred while this thread was waiting
 for rpc_credcache_lock, we could get list corruption.

 The kernel in question did not have commit 5f707eb429 which claims to
 fix a race in much the same place, however I don't think that patch
 would fix this problem.
 
 I think the simplest fix would be to use hlist_del_init_rcu in place
 of hlist_del_rcu as shown below.  That version protects again
 deleting the same element multiple times.

 Do you agree with this fix?

Thanks,
NeilBrown



diff --git a/net/sunrpc/auth.c b/net/sunrpc/auth.c
index 54a4e04..323eff6 100644
--- a/net/sunrpc/auth.c
+++ b/net/sunrpc/auth.c
@@ -118,7 +118,7 @@ static DEFINE_SPINLOCK(rpc_credcache_lock);
 static void
 rpcauth_unhash_cred_locked(struct rpc_cred *cred)
 {
-	hlist_del_rcu(&cred->cr_hash);
+	hlist_del_init_rcu(&cred->cr_hash);
 	smp_mb__before_clear_bit();
 	clear_bit(RPCAUTH_CRED_HASHED, &cred->cr_flags);
 }

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: Another possible race in put_rpccred / rpcauth_unhash_cred
       [not found] ` <19203.18009.945354.446817-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
@ 2009-11-19 15:33   ` Trond Myklebust
  2009-11-20 16:48   ` Trond Myklebust
  1 sibling, 0 replies; 3+ messages in thread
From: Trond Myklebust @ 2009-11-19 15:33 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-nfs

On Wed, 2009-11-18 at 11:56 +1100, Neil Brown wrote: 
> Hi again Trond,
> 
>  I've been looking at a customer report of an oops in
>  rpcauth_lookup_credcache.
>    https://bugzilla.novell.com/show_bug.cgi?id=547137
>  
>  It appears that one of the hash chains in the cache has become
>  corrupted.
> 
>  In trying to guess how this might happen, the one possibility I have
>  found is that it might be possible for hlist_del_rcu to be called on
>  an rpc_cred that is not hashed.  This certainly could cause list
>  corruption.
> 
>  put_rpccred checks if the cred is hashed (RPCAUTH_CRED_HASHED) and if
>  not, it takes a lock and then - if the cred it not uptodate -
>  calls rpcauth_unhash_cred which will call hlist_del_rcu.
>  If some other thread unhashed the cred while this thread was waiting
>  for rpc_credcache_lock, we could get list corruption.
> 
>  The kernel in question did not have commit 5f707eb429 which claims to
>  fix a race in much the same place, however I don't think that patch
>  would fix this problem.
>  
>  I think the simplest fix would be to use hlist_del_init_rcu in place
>  of hlist_del_rcu as shown below.  That version protects again
>  deleting the same element multiple times.
> 
>  Do you agree with this fix?
> 
> Thanks,
> NeilBrown
> 
> 
> 
> diff --git a/net/sunrpc/auth.c b/net/sunrpc/auth.c
> index 54a4e04..323eff6 100644
> --- a/net/sunrpc/auth.c
> +++ b/net/sunrpc/auth.c
> @@ -118,7 +118,7 @@ static DEFINE_SPINLOCK(rpc_credcache_lock);
>  static void
>  rpcauth_unhash_cred_locked(struct rpc_cred *cred)
>  {
> -	hlist_del_rcu(&cred->cr_hash);
> +	hlist_del_init_rcu(&cred->cr_hash);
>  	smp_mb__before_clear_bit();
>  	clear_bit(RPCAUTH_CRED_HASHED, &cred->cr_flags);
>  }

Hmm... Why not simply convert that 'clear_bit()' into a
'test_and_clear_bit()' and have it protect the hlist_del_rcu?

I can't see anything that would break if we do this. There doesn't seem
to be anything that explicitly depends on the ordering of those two
operations...

Cheers
  Trond


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Another possible race in put_rpccred / rpcauth_unhash_cred
       [not found] ` <19203.18009.945354.446817-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
  2009-11-19 15:33   ` Trond Myklebust
@ 2009-11-20 16:48   ` Trond Myklebust
  1 sibling, 0 replies; 3+ messages in thread
From: Trond Myklebust @ 2009-11-20 16:48 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-nfs

On Wed, 2009-11-18 at 11:56 +1100, Neil Brown wrote: 
> Hi again Trond,
> 
>  I've been looking at a customer report of an oops in
>  rpcauth_lookup_credcache.
>    https://bugzilla.novell.com/show_bug.cgi?id=547137
>  
>  It appears that one of the hash chains in the cache has become
>  corrupted.

How about the following patch?

Cheers
  Trond
-------------------------------------------------------------------------- 
RPC: Fix two potential races in put_rpccred

From: Trond Myklebust <Trond.Myklebust@netapp.com>

It is possible for rpcauth_destroy_credcache() to cause the rpc credentials
to be unhashed while put_rpccred is waiting for the rpc_credcache_lock on
another cpu.
Should this happen, then we can end up calling hlist_del_rcu(&cred->cr_hash)
a second time, thus causing list corruption.


Should the credential actually be hashed, it is possible for
rpcauth_lookup_credcache to find and reference it before we get round to
unhashing it. We should therefore check for whether the credential is
referenced after we've unhashed it.
See https://bugzilla.novell.com/show_bug.cgi?id=547137

Reported-by: Neil Brown <neilb@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 net/sunrpc/auth.c |   39 +++++++++++++++++++++++----------------
 1 files changed, 23 insertions(+), 16 deletions(-)


diff --git a/net/sunrpc/auth.c b/net/sunrpc/auth.c
index 54a4e04..7ee6f7e 100644
--- a/net/sunrpc/auth.c
+++ b/net/sunrpc/auth.c
@@ -123,16 +123,19 @@ rpcauth_unhash_cred_locked(struct rpc_cred *cred)
 	clear_bit(RPCAUTH_CRED_HASHED, &cred->cr_flags);
 }
 
-static void
+static int
 rpcauth_unhash_cred(struct rpc_cred *cred)
 {
 	spinlock_t *cache_lock;
+	int ret;
 
 	cache_lock = &cred->cr_auth->au_credcache->lock;
 	spin_lock(cache_lock);
-	if (atomic_read(&cred->cr_count) == 0)
+	ret = atomic_read(&cred->cr_count) == 0;
+	if (ret)
 		rpcauth_unhash_cred_locked(cred);
 	spin_unlock(cache_lock);
+	return ret;
 }
 
 /*
@@ -446,31 +449,35 @@ void
 put_rpccred(struct rpc_cred *cred)
 {
 	/* Fast path for unhashed credentials */
-	if (test_bit(RPCAUTH_CRED_HASHED, &cred->cr_flags) != 0)
-		goto need_lock;
-
-	if (!atomic_dec_and_test(&cred->cr_count))
+	if (test_bit(RPCAUTH_CRED_HASHED, &cred->cr_flags) == 0) {
+		if (atomic_dec_and_test(&cred->cr_count))
+			cred->cr_ops->crdestroy(cred);
 		return;
-	goto out_destroy;
-need_lock:
+	}
+
 	if (!atomic_dec_and_lock(&cred->cr_count, &rpc_credcache_lock))
 		return;
 	if (!list_empty(&cred->cr_lru)) {
 		number_cred_unused--;
 		list_del_init(&cred->cr_lru);
 	}
-	if (test_bit(RPCAUTH_CRED_UPTODATE, &cred->cr_flags) == 0)
-		rpcauth_unhash_cred(cred);
 	if (test_bit(RPCAUTH_CRED_HASHED, &cred->cr_flags) != 0) {
-		cred->cr_expire = jiffies;
-		list_add_tail(&cred->cr_lru, &cred_unused);
-		number_cred_unused++;
-		spin_unlock(&rpc_credcache_lock);
-		return;
+		if (test_bit(RPCAUTH_CRED_UPTODATE, &cred->cr_flags) != 0) {
+			cred->cr_expire = jiffies;
+			list_add_tail(&cred->cr_lru, &cred_unused);
+			number_cred_unused++;
+			goto out_nodestroy;
+		}
+		if (!rpcauth_unhash_cred(cred)) {
+			/* We were hashed and someone looked us up... */
+			goto out_nodestroy;
+		}
 	}
 	spin_unlock(&rpc_credcache_lock);
-out_destroy:
 	cred->cr_ops->crdestroy(cred);
+	return;
+out_nodestroy:
+	spin_unlock(&rpc_credcache_lock);
 }
 EXPORT_SYMBOL_GPL(put_rpccred);
 



^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2009-11-20 16:48 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-11-18  0:56 Another possible race in put_rpccred / rpcauth_unhash_cred Neil Brown
     [not found] ` <19203.18009.945354.446817-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2009-11-19 15:33   ` Trond Myklebust
2009-11-20 16:48   ` Trond Myklebust

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox