* Another possible race in put_rpccred / rpcauth_unhash_cred
@ 2009-11-18 0:56 Neil Brown
[not found] ` <19203.18009.945354.446817-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
0 siblings, 1 reply; 3+ messages in thread
From: Neil Brown @ 2009-11-18 0:56 UTC (permalink / raw)
To: Trond Myklebust; +Cc: linux-nfs
Hi again Trond,
I've been looking at a customer report of an oops in
rpcauth_lookup_credcache.
https://bugzilla.novell.com/show_bug.cgi?id=547137
It appears that one of the hash chains in the cache has become
corrupted.
In trying to guess how this might happen, the one possibility I have
found is that it might be possible for hlist_del_rcu to be called on
an rpc_cred that is not hashed. This certainly could cause list
corruption.
put_rpccred checks if the cred is hashed (RPCAUTH_CRED_HASHED) and if
not, it takes a lock and then - if the cred it not uptodate -
calls rpcauth_unhash_cred which will call hlist_del_rcu.
If some other thread unhashed the cred while this thread was waiting
for rpc_credcache_lock, we could get list corruption.
The kernel in question did not have commit 5f707eb429 which claims to
fix a race in much the same place, however I don't think that patch
would fix this problem.
I think the simplest fix would be to use hlist_del_init_rcu in place
of hlist_del_rcu as shown below. That version protects again
deleting the same element multiple times.
Do you agree with this fix?
Thanks,
NeilBrown
diff --git a/net/sunrpc/auth.c b/net/sunrpc/auth.c
index 54a4e04..323eff6 100644
--- a/net/sunrpc/auth.c
+++ b/net/sunrpc/auth.c
@@ -118,7 +118,7 @@ static DEFINE_SPINLOCK(rpc_credcache_lock);
static void
rpcauth_unhash_cred_locked(struct rpc_cred *cred)
{
- hlist_del_rcu(&cred->cr_hash);
+ hlist_del_init_rcu(&cred->cr_hash);
smp_mb__before_clear_bit();
clear_bit(RPCAUTH_CRED_HASHED, &cred->cr_flags);
}
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: Another possible race in put_rpccred / rpcauth_unhash_cred
[not found] ` <19203.18009.945354.446817-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
@ 2009-11-19 15:33 ` Trond Myklebust
2009-11-20 16:48 ` Trond Myklebust
1 sibling, 0 replies; 3+ messages in thread
From: Trond Myklebust @ 2009-11-19 15:33 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-nfs
On Wed, 2009-11-18 at 11:56 +1100, Neil Brown wrote:
> Hi again Trond,
>
> I've been looking at a customer report of an oops in
> rpcauth_lookup_credcache.
> https://bugzilla.novell.com/show_bug.cgi?id=547137
>
> It appears that one of the hash chains in the cache has become
> corrupted.
>
> In trying to guess how this might happen, the one possibility I have
> found is that it might be possible for hlist_del_rcu to be called on
> an rpc_cred that is not hashed. This certainly could cause list
> corruption.
>
> put_rpccred checks if the cred is hashed (RPCAUTH_CRED_HASHED) and if
> not, it takes a lock and then - if the cred it not uptodate -
> calls rpcauth_unhash_cred which will call hlist_del_rcu.
> If some other thread unhashed the cred while this thread was waiting
> for rpc_credcache_lock, we could get list corruption.
>
> The kernel in question did not have commit 5f707eb429 which claims to
> fix a race in much the same place, however I don't think that patch
> would fix this problem.
>
> I think the simplest fix would be to use hlist_del_init_rcu in place
> of hlist_del_rcu as shown below. That version protects again
> deleting the same element multiple times.
>
> Do you agree with this fix?
>
> Thanks,
> NeilBrown
>
>
>
> diff --git a/net/sunrpc/auth.c b/net/sunrpc/auth.c
> index 54a4e04..323eff6 100644
> --- a/net/sunrpc/auth.c
> +++ b/net/sunrpc/auth.c
> @@ -118,7 +118,7 @@ static DEFINE_SPINLOCK(rpc_credcache_lock);
> static void
> rpcauth_unhash_cred_locked(struct rpc_cred *cred)
> {
> - hlist_del_rcu(&cred->cr_hash);
> + hlist_del_init_rcu(&cred->cr_hash);
> smp_mb__before_clear_bit();
> clear_bit(RPCAUTH_CRED_HASHED, &cred->cr_flags);
> }
Hmm... Why not simply convert that 'clear_bit()' into a
'test_and_clear_bit()' and have it protect the hlist_del_rcu?
I can't see anything that would break if we do this. There doesn't seem
to be anything that explicitly depends on the ordering of those two
operations...
Cheers
Trond
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: Another possible race in put_rpccred / rpcauth_unhash_cred
[not found] ` <19203.18009.945354.446817-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2009-11-19 15:33 ` Trond Myklebust
@ 2009-11-20 16:48 ` Trond Myklebust
1 sibling, 0 replies; 3+ messages in thread
From: Trond Myklebust @ 2009-11-20 16:48 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-nfs
On Wed, 2009-11-18 at 11:56 +1100, Neil Brown wrote:
> Hi again Trond,
>
> I've been looking at a customer report of an oops in
> rpcauth_lookup_credcache.
> https://bugzilla.novell.com/show_bug.cgi?id=547137
>
> It appears that one of the hash chains in the cache has become
> corrupted.
How about the following patch?
Cheers
Trond
--------------------------------------------------------------------------
RPC: Fix two potential races in put_rpccred
From: Trond Myklebust <Trond.Myklebust@netapp.com>
It is possible for rpcauth_destroy_credcache() to cause the rpc credentials
to be unhashed while put_rpccred is waiting for the rpc_credcache_lock on
another cpu.
Should this happen, then we can end up calling hlist_del_rcu(&cred->cr_hash)
a second time, thus causing list corruption.
Should the credential actually be hashed, it is possible for
rpcauth_lookup_credcache to find and reference it before we get round to
unhashing it. We should therefore check for whether the credential is
referenced after we've unhashed it.
See https://bugzilla.novell.com/show_bug.cgi?id=547137
Reported-by: Neil Brown <neilb@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---
net/sunrpc/auth.c | 39 +++++++++++++++++++++++----------------
1 files changed, 23 insertions(+), 16 deletions(-)
diff --git a/net/sunrpc/auth.c b/net/sunrpc/auth.c
index 54a4e04..7ee6f7e 100644
--- a/net/sunrpc/auth.c
+++ b/net/sunrpc/auth.c
@@ -123,16 +123,19 @@ rpcauth_unhash_cred_locked(struct rpc_cred *cred)
clear_bit(RPCAUTH_CRED_HASHED, &cred->cr_flags);
}
-static void
+static int
rpcauth_unhash_cred(struct rpc_cred *cred)
{
spinlock_t *cache_lock;
+ int ret;
cache_lock = &cred->cr_auth->au_credcache->lock;
spin_lock(cache_lock);
- if (atomic_read(&cred->cr_count) == 0)
+ ret = atomic_read(&cred->cr_count) == 0;
+ if (ret)
rpcauth_unhash_cred_locked(cred);
spin_unlock(cache_lock);
+ return ret;
}
/*
@@ -446,31 +449,35 @@ void
put_rpccred(struct rpc_cred *cred)
{
/* Fast path for unhashed credentials */
- if (test_bit(RPCAUTH_CRED_HASHED, &cred->cr_flags) != 0)
- goto need_lock;
-
- if (!atomic_dec_and_test(&cred->cr_count))
+ if (test_bit(RPCAUTH_CRED_HASHED, &cred->cr_flags) == 0) {
+ if (atomic_dec_and_test(&cred->cr_count))
+ cred->cr_ops->crdestroy(cred);
return;
- goto out_destroy;
-need_lock:
+ }
+
if (!atomic_dec_and_lock(&cred->cr_count, &rpc_credcache_lock))
return;
if (!list_empty(&cred->cr_lru)) {
number_cred_unused--;
list_del_init(&cred->cr_lru);
}
- if (test_bit(RPCAUTH_CRED_UPTODATE, &cred->cr_flags) == 0)
- rpcauth_unhash_cred(cred);
if (test_bit(RPCAUTH_CRED_HASHED, &cred->cr_flags) != 0) {
- cred->cr_expire = jiffies;
- list_add_tail(&cred->cr_lru, &cred_unused);
- number_cred_unused++;
- spin_unlock(&rpc_credcache_lock);
- return;
+ if (test_bit(RPCAUTH_CRED_UPTODATE, &cred->cr_flags) != 0) {
+ cred->cr_expire = jiffies;
+ list_add_tail(&cred->cr_lru, &cred_unused);
+ number_cred_unused++;
+ goto out_nodestroy;
+ }
+ if (!rpcauth_unhash_cred(cred)) {
+ /* We were hashed and someone looked us up... */
+ goto out_nodestroy;
+ }
}
spin_unlock(&rpc_credcache_lock);
-out_destroy:
cred->cr_ops->crdestroy(cred);
+ return;
+out_nodestroy:
+ spin_unlock(&rpc_credcache_lock);
}
EXPORT_SYMBOL_GPL(put_rpccred);
^ permalink raw reply related [flat|nested] 3+ messages in thread
end of thread, other threads:[~2009-11-20 16:48 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-11-18 0:56 Another possible race in put_rpccred / rpcauth_unhash_cred Neil Brown
[not found] ` <19203.18009.945354.446817-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2009-11-19 15:33 ` Trond Myklebust
2009-11-20 16:48 ` Trond Myklebust
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox