* [PATCH] svcrpc: modifying positive sunrpc cache entries is racy @ 2010-12-29 20:47 J. Bruce Fields 2010-12-29 20:59 ` J. Bruce Fields 2011-01-03 22:26 ` J. Bruce Fields 0 siblings, 2 replies; 20+ messages in thread From: J. Bruce Fields @ 2010-12-29 20:47 UTC (permalink / raw) To: linux-nfs; +Cc: Neil Brown From: J. Bruce Fields <bfields@redhat.com> Once a sunrpc cache entry is non-NEGATIVE, we should be replacing it (and allowing any concurrent users to destroy it on last put) instead of trying to update it in place. Otherwise someone referencing the ip_map we're modifying here could try to use the m_client just as we're putting the last reference. The bug should only be seen by users of the legacy nfsd interfaces. Signed-off-by: J. Bruce Fields <bfields@redhat.com> --- net/sunrpc/svcauth_unix.c | 18 ++++++++++++++++-- 1 files changed, 16 insertions(+), 2 deletions(-) Intended to apply for 2.6.38 if this looks right.... --b. diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c index a04ac91..70586ad 100644 --- a/net/sunrpc/svcauth_unix.c +++ b/net/sunrpc/svcauth_unix.c @@ -386,6 +386,21 @@ int auth_unix_forget_old(struct auth_domain *dom) } EXPORT_SYMBOL_GPL(auth_unix_forget_old); +static void auth_unix_invalidate_ip_map(struct cache_detail *cd, struct ip_map *ipm) +{ + struct cache_head *ch; + struct ip_map ip; + + ip.m_client = ipm->m_client; + ip.h.flags = CACHE_NEGATIVE; + ip.h.expiry_time = ipm->h.expiry_time; + ch = sunrpc_cache_update(cd, &ip.h, &ipm->h, + hash_str(ipm->m_class, IP_HASHBITS) ^ + hash_ip6(ipm->m_addr)); + if (ch) + cache_put(ch, cd); +} + struct auth_domain *auth_unix_lookup(struct net *net, struct in6_addr *addr) { struct ip_map *ipm; @@ -401,8 +416,7 @@ struct auth_domain *auth_unix_lookup(struct net *net, struct in6_addr *addr) return NULL; if ((ipm->m_client->addr_changes - ipm->m_add_change) >0) { - if (test_and_set_bit(CACHE_NEGATIVE, &ipm->h.flags) == 0) - auth_domain_put(&ipm->m_client->h); + auth_unix_invalidate_ip_map(sn->ip_map_cache, ipm); rv = NULL; } else { rv = &ipm->m_client->h; -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy 2010-12-29 20:47 [PATCH] svcrpc: modifying positive sunrpc cache entries is racy J. Bruce Fields @ 2010-12-29 20:59 ` J. Bruce Fields 2010-12-30 1:19 ` Neil Brown 2011-01-03 22:26 ` J. Bruce Fields 1 sibling, 1 reply; 20+ messages in thread From: J. Bruce Fields @ 2010-12-29 20:59 UTC (permalink / raw) To: linux-nfs; +Cc: Neil Brown On Wed, Dec 29, 2010 at 03:47:52PM -0500, bfields wrote: > From: J. Bruce Fields <bfields@redhat.com> > > Once a sunrpc cache entry is non-NEGATIVE, we should be replacing it > (and allowing any concurrent users to destroy it on last put) instead of > trying to update it in place. > > Otherwise someone referencing the ip_map we're modifying here could try > to use the m_client just as we're putting the last reference. > > The bug should only be seen by users of the legacy nfsd interfaces. > > Signed-off-by: J. Bruce Fields <bfields@redhat.com> > --- > net/sunrpc/svcauth_unix.c | 18 ++++++++++++++++-- > 1 files changed, 16 insertions(+), 2 deletions(-) > > Intended to apply for 2.6.38 if this looks right.... Also noticed while trying to track down an rhel5 oops in svcauth_unix_set_client(): - cache_check() can set an entry negative in place, which if nothing else must cause a leak in some cases. (Because when the entry is eventually destroyed, it will be assumed to not have any contents.) I suppose the fix is again to try to adding a new negative entry instead. - since cache_check() doesn't use any locking, I can't see what guarantees that when it sees the CACHE_VALID bit set and CACHE_NEGATIVE cleared, it must necessarily see the new contents. I think that'd be fixed by a wmb() before setting those bits and a rmb() after checking them. I don't know if it's actually possible to hit that bug.... --b. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy 2010-12-29 20:59 ` J. Bruce Fields @ 2010-12-30 1:19 ` Neil Brown 2010-12-30 1:57 ` J. Bruce Fields 0 siblings, 1 reply; 20+ messages in thread From: Neil Brown @ 2010-12-30 1:19 UTC (permalink / raw) To: J. Bruce Fields; +Cc: linux-nfs On Wed, 29 Dec 2010 15:59:42 -0500 "J. Bruce Fields" <bfields@fieldses.org> wrote: > On Wed, Dec 29, 2010 at 03:47:52PM -0500, bfields wrote: > > From: J. Bruce Fields <bfields@redhat.com> > > > > Once a sunrpc cache entry is non-NEGATIVE, we should be replacing it > > (and allowing any concurrent users to destroy it on last put) instead of > > trying to update it in place. > > > > Otherwise someone referencing the ip_map we're modifying here could try > > to use the m_client just as we're putting the last reference. > > > > The bug should only be seen by users of the legacy nfsd interfaces. > > > > Signed-off-by: J. Bruce Fields <bfields@redhat.com> > > --- > > net/sunrpc/svcauth_unix.c | 18 ++++++++++++++++-- > > 1 files changed, 16 insertions(+), 2 deletions(-) > > > > Intended to apply for 2.6.38 if this looks right.... > > Also noticed while trying to track down an rhel5 oops in > svcauth_unix_set_client(): > > - cache_check() can set an entry negative in place, which if > nothing else must cause a leak in some cases. (Because when > the entry is eventually destroyed, it will be assumed to not > have any contents.) I suppose the fix is again to try to > adding a new negative entry instead. cache_check should only set an entry 'negative' if it is not already valid (rv == -EAGAIN) and there is no up-call pending. Maybe we should check CACHE_VALID again after the test_and_set of CACHE_PENDING, but is a very unlikely race (if it is actually a race at all) > > - since cache_check() doesn't use any locking, I can't see what > guarantees that when it sees the CACHE_VALID bit set and > CACHE_NEGATIVE cleared, it must necessarily see the new > contents. I think that'd be fixed by a wmb() before setting > those bits and a rmb() after checking them. I don't know if > it's actually possible to hit that bug.... Yes, we probably want a set_bit_lock in cache_fresh_locked() though I don't think that exists, so we could use test_and_set_bit_locked() instead. But it does feel like maybe we should add some locking to cache_check. Take the lock at the the start, and release it after the test_and_set_bit(CACHE_PENDING) or once we have decided not to do that ??? I think when I wrote this I might have thought that bit ops implied memory ordering ... or maybe I just didn't think through the issues properly at all. Thanks, NeilBrown > > --b. > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy 2010-12-30 1:19 ` Neil Brown @ 2010-12-30 1:57 ` J. Bruce Fields 2011-01-03 20:55 ` J. Bruce Fields 0 siblings, 1 reply; 20+ messages in thread From: J. Bruce Fields @ 2010-12-30 1:57 UTC (permalink / raw) To: Neil Brown; +Cc: linux-nfs On Thu, Dec 30, 2010 at 12:19:40PM +1100, Neil Brown wrote: > On Wed, 29 Dec 2010 15:59:42 -0500 "J. Bruce Fields" <bfields@fieldses.org> > wrote: > > > On Wed, Dec 29, 2010 at 03:47:52PM -0500, bfields wrote: > > > From: J. Bruce Fields <bfields@redhat.com> > > > > > > Once a sunrpc cache entry is non-NEGATIVE, we should be replacing it > > > (and allowing any concurrent users to destroy it on last put) instead of > > > trying to update it in place. > > > > > > Otherwise someone referencing the ip_map we're modifying here could try > > > to use the m_client just as we're putting the last reference. > > > > > > The bug should only be seen by users of the legacy nfsd interfaces. > > > > > > Signed-off-by: J. Bruce Fields <bfields@redhat.com> > > > --- > > > net/sunrpc/svcauth_unix.c | 18 ++++++++++++++++-- > > > 1 files changed, 16 insertions(+), 2 deletions(-) > > > > > > Intended to apply for 2.6.38 if this looks right.... > > > > Also noticed while trying to track down an rhel5 oops in > > svcauth_unix_set_client(): > > > > - cache_check() can set an entry negative in place, which if > > nothing else must cause a leak in some cases. (Because when > > the entry is eventually destroyed, it will be assumed to not > > have any contents.) I suppose the fix is again to try to > > adding a new negative entry instead. > > cache_check should only set an entry 'negative' if it is not already valid > (rv == -EAGAIN) and there is no up-call pending. I don't think anything keeps VALID from being set after the cache_is_valid check but before the code that does the set_bit(CACHE_NEGATIVE). > Maybe we should check CACHE_VALID again after the test_and_set of > CACHE_PENDING, but is a very unlikely race (if it is actually a race at all) > > > > > - since cache_check() doesn't use any locking, I can't see what > > guarantees that when it sees the CACHE_VALID bit set and > > CACHE_NEGATIVE cleared, it must necessarily see the new > > contents. I think that'd be fixed by a wmb() before setting > > those bits and a rmb() after checking them. I don't know if > > it's actually possible to hit that bug.... > > Yes, we probably want a set_bit_lock in cache_fresh_locked() though I don't > think that exists, so we could use test_and_set_bit_locked() instead. > > But it does feel like maybe we should add some locking to cache_check. > Take the lock at the the start, and release it after the > test_and_set_bit(CACHE_PENDING) or once we have decided not to do that ??? Maybe so. --b. > > I think when I wrote this I might have thought that bit ops implied memory > ordering ... or maybe I just didn't think through the issues properly at all. > > Thanks, > NeilBrown > > > > > > --b. > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy 2010-12-30 1:57 ` J. Bruce Fields @ 2011-01-03 20:55 ` J. Bruce Fields 2011-01-04 5:01 ` NeilBrown 0 siblings, 1 reply; 20+ messages in thread From: J. Bruce Fields @ 2011-01-03 20:55 UTC (permalink / raw) To: Neil Brown; +Cc: linux-nfs On Wed, Dec 29, 2010 at 08:57:19PM -0500, J. Bruce Fields wrote: > On Thu, Dec 30, 2010 at 12:19:40PM +1100, Neil Brown wrote: > > On Wed, 29 Dec 2010 15:59:42 -0500 "J. Bruce Fields" <bfields@fieldses.org> > > wrote: > > > Also noticed while trying to track down an rhel5 oops in > > > svcauth_unix_set_client(): > > > > > > - cache_check() can set an entry negative in place, which if > > > nothing else must cause a leak in some cases. (Because when > > > the entry is eventually destroyed, it will be assumed to not > > > have any contents.) I suppose the fix is again to try to > > > adding a new negative entry instead. > > > > cache_check should only set an entry 'negative' if it is not already valid > > (rv == -EAGAIN) and there is no up-call pending. > > I don't think anything keeps VALID from being set after the > cache_is_valid check but before the code that does the > set_bit(CACHE_NEGATIVE). > > > Maybe we should check CACHE_VALID again after the test_and_set of > > CACHE_PENDING, but is a very unlikely race (if it is actually a race at all) > > > > > > > > - since cache_check() doesn't use any locking, I can't see what > > > guarantees that when it sees the CACHE_VALID bit set and > > > CACHE_NEGATIVE cleared, it must necessarily see the new > > > contents. I think that'd be fixed by a wmb() before setting > > > those bits and a rmb() after checking them. I don't know if > > > it's actually possible to hit that bug.... > > > > Yes, we probably want a set_bit_lock in cache_fresh_locked() though I don't > > think that exists, so we could use test_and_set_bit_locked() instead. > > > > But it does feel like maybe we should add some locking to cache_check. > > Take the lock at the the start, and release it after the > > test_and_set_bit(CACHE_PENDING) or once we have decided not to do that ??? > > Maybe so. Here's one attempt. --b. commit 55563023f85d01698ccf72325c87e3a7039a189b Author: J. Bruce Fields <bfields@redhat.com> Date: Mon Jan 3 15:10:27 2011 -0500 svcrpc: take locks to fix cache_check races There are at least a couple races in cache_check: - We attempt to turn a cache entry negative in place. But that entry may already have been filled in by some other task since we last checked whether it was valid, so we could be modifying an already-valid entry. If nothing else there's a likely leak in such a case when the entry is eventually put() and contents are not freed because it has CACHE_NEGATIVE set. - If cache_check races with an update that is turning the entry CACHE_VALID, then it's possible that the CACHE_VALID bit could become visible on this CPU before the actual contents do, so we could tell the caller this entry is ready to use when in fact the caller could still get invalid contents. Some memory barriers might be sufficient to fix at least the latter; but for now let's keep things simple and take the hash_lock when we turn an entry negative or check the CACHE_VALID bit. Signed-off-by: J. Bruce Fields <bfields@redhat.com> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c index 0d6002f..2105b40 100644 --- a/net/sunrpc/cache.c +++ b/net/sunrpc/cache.c @@ -200,8 +200,9 @@ static int cache_make_upcall(struct cache_detail *cd, struct cache_head *h) return cd->cache_upcall(cd, h); } -static inline int cache_is_valid(struct cache_detail *detail, struct cache_head *h) +static int __cache_is_valid(struct cache_detail *detail, struct cache_head *h) { + if (!test_bit(CACHE_VALID, &h->flags)) return -EAGAIN; else { @@ -213,6 +214,33 @@ static inline int cache_is_valid(struct cache_detail *detail, struct cache_head } } +static int cache_is_valid(struct cache_detail *detail, struct cache_head *h) +{ + int rv; + + read_lock(&detail->hash_lock); + rv = __cache_is_valid(detail, h); + read_unlock(&detail->hash_lock); + return rv; +} + +static int try_to_negate_entry(struct cache_detail *detail, struct cache_head *h) +{ + int rv; + + write_lock(&detail->hash_lock); + rv = __cache_is_valid(detail, h); + if (rv != -EAGAIN) { + write_unlock(&detail->hash_lock); + return rv; + } + set_bit(CACHE_NEGATIVE, &h->flags); + cache_fresh_locked(h, seconds_since_boot()+CACHE_NEW_EXPIRY); + write_unlock(&detail->hash_lock); + cache_fresh_unlocked(h, detail); + return -ENOENT; +} + /* * This is the generic cache management routine for all * the authentication caches. @@ -251,14 +279,8 @@ int cache_check(struct cache_detail *detail, case -EINVAL: clear_bit(CACHE_PENDING, &h->flags); cache_revisit_request(h); - if (rv == -EAGAIN) { - set_bit(CACHE_NEGATIVE, &h->flags); - cache_fresh_locked(h, seconds_since_boot()+CACHE_NEW_EXPIRY); - cache_fresh_unlocked(h, detail); - rv = -ENOENT; - } + rv = try_to_negate_entry(detail, h); break; - case -EAGAIN: clear_bit(CACHE_PENDING, &h->flags); cache_revisit_request(h); ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy 2011-01-03 20:55 ` J. Bruce Fields @ 2011-01-04 5:01 ` NeilBrown 2011-01-04 15:22 ` J. Bruce Fields 0 siblings, 1 reply; 20+ messages in thread From: NeilBrown @ 2011-01-04 5:01 UTC (permalink / raw) To: J. Bruce Fields; +Cc: linux-nfs On Mon, 3 Jan 2011 15:55:14 -0500 "J. Bruce Fields" <bfields@fieldses.org> wrote: > On Wed, Dec 29, 2010 at 08:57:19PM -0500, J. Bruce Fields wrote: > > On Thu, Dec 30, 2010 at 12:19:40PM +1100, Neil Brown wrote: > > > On Wed, 29 Dec 2010 15:59:42 -0500 "J. Bruce Fields" <bfields@fieldses.org> > > > wrote: > > > > Also noticed while trying to track down an rhel5 oops in > > > > svcauth_unix_set_client(): > > > > > > > > - cache_check() can set an entry negative in place, which if > > > > nothing else must cause a leak in some cases. (Because when > > > > the entry is eventually destroyed, it will be assumed to not > > > > have any contents.) I suppose the fix is again to try to > > > > adding a new negative entry instead. > > > > > > cache_check should only set an entry 'negative' if it is not already valid > > > (rv == -EAGAIN) and there is no up-call pending. > > > > I don't think anything keeps VALID from being set after the > > cache_is_valid check but before the code that does the > > set_bit(CACHE_NEGATIVE). > > > > > Maybe we should check CACHE_VALID again after the test_and_set of > > > CACHE_PENDING, but is a very unlikely race (if it is actually a race at all) > > > > > > > > > > > - since cache_check() doesn't use any locking, I can't see what > > > > guarantees that when it sees the CACHE_VALID bit set and > > > > CACHE_NEGATIVE cleared, it must necessarily see the new > > > > contents. I think that'd be fixed by a wmb() before setting > > > > those bits and a rmb() after checking them. I don't know if > > > > it's actually possible to hit that bug.... > > > > > > Yes, we probably want a set_bit_lock in cache_fresh_locked() though I don't > > > think that exists, so we could use test_and_set_bit_locked() instead. > > > > > > But it does feel like maybe we should add some locking to cache_check. > > > Take the lock at the the start, and release it after the > > > test_and_set_bit(CACHE_PENDING) or once we have decided not to do that ??? > > > > Maybe so. > > Here's one attempt. > > --b. > > commit 55563023f85d01698ccf72325c87e3a7039a189b > Author: J. Bruce Fields <bfields@redhat.com> > Date: Mon Jan 3 15:10:27 2011 -0500 > > svcrpc: take locks to fix cache_check races > > There are at least a couple races in cache_check: > > - We attempt to turn a cache entry negative in place. But that > entry may already have been filled in by some other task since > we last checked whether it was valid, so we could be modifying > an already-valid entry. If nothing else there's a likely leak > in such a case when the entry is eventually put() and contents > are not freed because it has CACHE_NEGATIVE set. > - If cache_check races with an update that is turning the entry > CACHE_VALID, then it's possible that the CACHE_VALID bit could > become visible on this CPU before the actual contents do, so > we could tell the caller this entry is ready to use when in > fact the caller could still get invalid contents. > > Some memory barriers might be sufficient to fix at least the latter; but > for now let's keep things simple and take the hash_lock when we turn an > entry negative or check the CACHE_VALID bit. > > Signed-off-by: J. Bruce Fields <bfields@redhat.com> > > diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c > index 0d6002f..2105b40 100644 > --- a/net/sunrpc/cache.c > +++ b/net/sunrpc/cache.c > @@ -200,8 +200,9 @@ static int cache_make_upcall(struct cache_detail *cd, struct cache_head *h) > return cd->cache_upcall(cd, h); > } > > -static inline int cache_is_valid(struct cache_detail *detail, struct cache_head *h) > +static int __cache_is_valid(struct cache_detail *detail, struct cache_head *h) > { > + > if (!test_bit(CACHE_VALID, &h->flags)) > return -EAGAIN; > else { > @@ -213,6 +214,33 @@ static inline int cache_is_valid(struct cache_detail *detail, struct cache_head > } > } > > +static int cache_is_valid(struct cache_detail *detail, struct cache_head *h) > +{ > + int rv; > + > + read_lock(&detail->hash_lock); > + rv = __cache_is_valid(detail, h); > + read_unlock(&detail->hash_lock); > + return rv; > +} I don't think there is anything in __cache_is_valid that needs to be protected. The compiler will almost certainly produce code which loads f->flags once and then performs 1 or 2 bit tests against the value in the register and produces one of 3 possible return values based on the result. There is absolutely no value in putting locking around that, especially as CACHE_VALID is never cleared. Maybe you imagine a re-ordering of setting CACHE_NEGATIVE and CACHE_VALID, but as they are in the same cache line (and in fact in the same byte) they cannot be re-ordered. We always set CACHE_NEGATIVE before CACHE_VALID and there is no way those two could get to memory in the wrong order. > + > +static int try_to_negate_entry(struct cache_detail *detail, struct cache_head *h) > +{ > + int rv; > + > + write_lock(&detail->hash_lock); > + rv = __cache_is_valid(detail, h); > + if (rv != -EAGAIN) { > + write_unlock(&detail->hash_lock); > + return rv; > + } > + set_bit(CACHE_NEGATIVE, &h->flags); > + cache_fresh_locked(h, seconds_since_boot()+CACHE_NEW_EXPIRY); > + write_unlock(&detail->hash_lock); > + cache_fresh_unlocked(h, detail); > + return -ENOENT; > +} > + > /* > * This is the generic cache management routine for all > * the authentication caches. > @@ -251,14 +279,8 @@ int cache_check(struct cache_detail *detail, > case -EINVAL: > clear_bit(CACHE_PENDING, &h->flags); > cache_revisit_request(h); > - if (rv == -EAGAIN) { > - set_bit(CACHE_NEGATIVE, &h->flags); > - cache_fresh_locked(h, seconds_since_boot()+CACHE_NEW_EXPIRY); > - cache_fresh_unlocked(h, detail); > - rv = -ENOENT; > - } > + rv = try_to_negate_entry(detail, h); > break; This bit looks good those. It feels much better having an 'unlock' between 'cache_fresh_locked' and 'cache_fresh_unlocked' !! Thanks, NeilBrown > - > case -EAGAIN: > clear_bit(CACHE_PENDING, &h->flags); > cache_revisit_request(h); > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy 2011-01-04 5:01 ` NeilBrown @ 2011-01-04 15:22 ` J. Bruce Fields 2011-01-04 19:23 ` J. Bruce Fields 0 siblings, 1 reply; 20+ messages in thread From: J. Bruce Fields @ 2011-01-04 15:22 UTC (permalink / raw) To: NeilBrown; +Cc: linux-nfs On Tue, Jan 04, 2011 at 04:01:52PM +1100, NeilBrown wrote: > On Mon, 3 Jan 2011 15:55:14 -0500 "J. Bruce Fields" <bfields@fieldses.org> > wrote: > > @@ -213,6 +214,33 @@ static inline int cache_is_valid(struct cache_detail *detail, struct cache_head > > } > > } > > > > +static int cache_is_valid(struct cache_detail *detail, struct cache_head *h) > > +{ > > + int rv; > > + > > + read_lock(&detail->hash_lock); > > + rv = __cache_is_valid(detail, h); > > + read_unlock(&detail->hash_lock); > > + return rv; > > +} > > I don't think there is anything in __cache_is_valid that needs to be > protected. > The compiler will almost certainly produce code which loads f->flags once and > then performs 1 or 2 bit tests against the value in the register and produces > one of 3 possible return values based on the result. > There is absolutely no value in putting locking around that, especially as > CACHE_VALID is never cleared. > > Maybe you imagine a re-ordering of setting CACHE_NEGATIVE and CACHE_VALID, > but as they are in the same cache line (and in fact in the same byte) they > cannot be re-ordered. We always set CACHE_NEGATIVE before CACHE_VALID and > there is no way those two could get to memory in the wrong order. The risk would be a reordering of CACHE_VALID with setting of the actual contents in the !NEGATIVE case, so: task doing lookup task doing update ^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^ 1. test for CACHE_VALID 3. set item->contents & !CACHE_NEGATIVE 2. dereference 4. set CACHE_VALID, clear item->contents->... CACHE_NEGATIVE. As I understand it, if we want to gaurantee that item->contents is good at step 2, then we need a write barrier between 3 and 4, together with a read barrier between 1 and 2. Taking the spinlock is overkill, but should accomplish the same thing, as it forces 1 to occur before 3 or after 4, and adds any necessary memory barriers. --b. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy 2011-01-04 15:22 ` J. Bruce Fields @ 2011-01-04 19:23 ` J. Bruce Fields 2011-01-04 19:31 ` [PATCH 1/2] svcrpc: take lock on turning entry NEGATIVE in cache_check J. Bruce Fields ` (2 more replies) 0 siblings, 3 replies; 20+ messages in thread From: J. Bruce Fields @ 2011-01-04 19:23 UTC (permalink / raw) To: NeilBrown; +Cc: linux-nfs On Tue, Jan 04, 2011 at 10:22:31AM -0500, J. Bruce Fields wrote: > On Tue, Jan 04, 2011 at 04:01:52PM +1100, NeilBrown wrote: > > On Mon, 3 Jan 2011 15:55:14 -0500 "J. Bruce Fields" <bfields@fieldses.org> > > wrote: > > > @@ -213,6 +214,33 @@ static inline int cache_is_valid(struct cache_detail *detail, struct cache_head > > > } > > > } > > > > > > +static int cache_is_valid(struct cache_detail *detail, struct cache_head *h) > > > +{ > > > + int rv; > > > + > > > + read_lock(&detail->hash_lock); > > > + rv = __cache_is_valid(detail, h); > > > + read_unlock(&detail->hash_lock); > > > + return rv; > > > +} > > > > I don't think there is anything in __cache_is_valid that needs to be > > protected. > > The compiler will almost certainly produce code which loads f->flags once and > > then performs 1 or 2 bit tests against the value in the register and produces > > one of 3 possible return values based on the result. > > There is absolutely no value in putting locking around that, especially as > > CACHE_VALID is never cleared. > > > > Maybe you imagine a re-ordering of setting CACHE_NEGATIVE and CACHE_VALID, > > but as they are in the same cache line (and in fact in the same byte) they > > cannot be re-ordered. We always set CACHE_NEGATIVE before CACHE_VALID and > > there is no way those two could get to memory in the wrong order. > > The risk would be a reordering of CACHE_VALID with setting of the actual > contents in the !NEGATIVE case, so: > > task doing lookup task doing update > ^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^ > > 1. test for CACHE_VALID 3. set item->contents > & !CACHE_NEGATIVE > > 2. dereference 4. set CACHE_VALID, clear > item->contents->... CACHE_NEGATIVE. > > As I understand it, if we want to gaurantee that item->contents is good > at step 2, then we need a write barrier between 3 and 4, together with a > read barrier between 1 and 2. Taking the spinlock is overkill, but > should accomplish the same thing, as it forces 1 to occur before 3 or > after 4, and adds any necessary memory barriers. So, that being the real problem, perhaps it's clearer to use explicit memory barriers instead of more locking, and add some comments. Also split this into a separate patch, as in the following. --b. ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 1/2] svcrpc: take lock on turning entry NEGATIVE in cache_check 2011-01-04 19:23 ` J. Bruce Fields @ 2011-01-04 19:31 ` J. Bruce Fields 2011-01-04 19:31 ` [PATCH 2/2] svcrpc: ensure cache_check caller sees updated entry J. Bruce Fields 2011-01-04 21:10 ` [PATCH] svcrpc: modifying positive sunrpc cache entries is racy NeilBrown 2 siblings, 0 replies; 20+ messages in thread From: J. Bruce Fields @ 2011-01-04 19:31 UTC (permalink / raw) To: NeilBrown; +Cc: linux-nfs, J. Bruce Fields We attempt to turn a cache entry negative in place. But that entry may already have been filled in by some other task since we last checked whether it was valid, so we could be modifying an already-valid entry. If nothing else there's a likely leak in such a case when the entry is eventually put() and contents are not freed because it has CACHE_NEGATIVE set. So, take the cache_lock just as sunrpc_cache_update() does. Signed-off-by: J. Bruce Fields <bfields@redhat.com> --- net/sunrpc/cache.c | 25 ++++++++++++++++++------- 1 files changed, 18 insertions(+), 7 deletions(-) diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c index 0d6002f..a6c5733 100644 --- a/net/sunrpc/cache.c +++ b/net/sunrpc/cache.c @@ -213,6 +213,23 @@ static inline int cache_is_valid(struct cache_detail *detail, struct cache_head } } +static int try_to_negate_entry(struct cache_detail *detail, struct cache_head *h) +{ + int rv; + + write_lock(&detail->hash_lock); + rv = cache_is_valid(detail, h); + if (rv != -EAGAIN) { + write_unlock(&detail->hash_lock); + return rv; + } + set_bit(CACHE_NEGATIVE, &h->flags); + cache_fresh_locked(h, seconds_since_boot()+CACHE_NEW_EXPIRY); + write_unlock(&detail->hash_lock); + cache_fresh_unlocked(h, detail); + return -ENOENT; +} + /* * This is the generic cache management routine for all * the authentication caches. @@ -251,14 +268,8 @@ int cache_check(struct cache_detail *detail, case -EINVAL: clear_bit(CACHE_PENDING, &h->flags); cache_revisit_request(h); - if (rv == -EAGAIN) { - set_bit(CACHE_NEGATIVE, &h->flags); - cache_fresh_locked(h, seconds_since_boot()+CACHE_NEW_EXPIRY); - cache_fresh_unlocked(h, detail); - rv = -ENOENT; - } + rv = try_to_negate_entry(detail, h); break; - case -EAGAIN: clear_bit(CACHE_PENDING, &h->flags); cache_revisit_request(h); -- 1.7.1 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 2/2] svcrpc: ensure cache_check caller sees updated entry 2011-01-04 19:23 ` J. Bruce Fields 2011-01-04 19:31 ` [PATCH 1/2] svcrpc: take lock on turning entry NEGATIVE in cache_check J. Bruce Fields @ 2011-01-04 19:31 ` J. Bruce Fields 2011-01-04 21:10 ` [PATCH] svcrpc: modifying positive sunrpc cache entries is racy NeilBrown 2 siblings, 0 replies; 20+ messages in thread From: J. Bruce Fields @ 2011-01-04 19:31 UTC (permalink / raw) To: NeilBrown; +Cc: linux-nfs, J. Bruce Fields Supposes cache_check runs simultaneously with an update on a different CPU: cache_check task doing update ^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^ 1. test for CACHE_VALID 1'. set entry->data & !CACHE_NEGATIVE 2. use entry->data 2'. set CACHE_VALID If the two memory writes performed in step 1' and 2' appear misordered with respect to the reads in step 1 and 2, then the caller could get stale data at step 2 even though it saw CACHE_VALID set on the cache entry. Add memory barriers to prevent this. Signed-off-by: J. Bruce Fields <bfields@redhat.com> --- net/sunrpc/cache.c | 11 ++++++++++- 1 files changed, 10 insertions(+), 1 deletions(-) diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c index a6c5733..72ad836 100644 --- a/net/sunrpc/cache.c +++ b/net/sunrpc/cache.c @@ -128,6 +128,7 @@ static void cache_fresh_locked(struct cache_head *head, time_t expiry) { head->expiry_time = expiry; head->last_refresh = seconds_since_boot(); + smp_wmb(); /* paired with smp_rmb() in cache_is_valid() */ set_bit(CACHE_VALID, &head->flags); } @@ -208,8 +209,16 @@ static inline int cache_is_valid(struct cache_detail *detail, struct cache_head /* entry is valid */ if (test_bit(CACHE_NEGATIVE, &h->flags)) return -ENOENT; - else + else { + /* + * In combination with write barrier in + * sunrpc_cache_update, ensures that anyone + * using the cache entry after this sees the + * updated contents: + */ + smp_rmb(); return 0; + } } } -- 1.7.1 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy 2011-01-04 19:23 ` J. Bruce Fields 2011-01-04 19:31 ` [PATCH 1/2] svcrpc: take lock on turning entry NEGATIVE in cache_check J. Bruce Fields 2011-01-04 19:31 ` [PATCH 2/2] svcrpc: ensure cache_check caller sees updated entry J. Bruce Fields @ 2011-01-04 21:10 ` NeilBrown [not found] ` <20110105081031.220bfbc9-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org> 2 siblings, 1 reply; 20+ messages in thread From: NeilBrown @ 2011-01-04 21:10 UTC (permalink / raw) To: J. Bruce Fields; +Cc: linux-nfs On Tue, 4 Jan 2011 14:23:51 -0500 "J. Bruce Fields" <bfields@fieldses.org> wrote: > So, that being the real problem, perhaps it's clearer to use explicit > memory barriers instead of more locking, and add some comments. Also > split this into a separate patch, as in the following. Me likie. Two separate patches is good, and nice comments next to the memory barriers is good. Reviewed-by: NeilBrown <neilb@suse.de> Thanks! NeilBrown ^ permalink raw reply [flat|nested] 20+ messages in thread
[parent not found: <20110105081031.220bfbc9-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>]
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy [not found] ` <20110105081031.220bfbc9-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org> @ 2011-01-04 21:15 ` J. Bruce Fields 0 siblings, 0 replies; 20+ messages in thread From: J. Bruce Fields @ 2011-01-04 21:15 UTC (permalink / raw) To: NeilBrown; +Cc: linux-nfs On Wed, Jan 05, 2011 at 08:10:31AM +1100, NeilBrown wrote: > On Tue, 4 Jan 2011 14:23:51 -0500 "J. Bruce Fields" <bfields@fieldses.org> > wrote: > > > So, that being the real problem, perhaps it's clearer to use explicit > > memory barriers instead of more locking, and add some comments. Also > > split this into a separate patch, as in the following. > > Me likie. > > Two separate patches is good, and nice comments next to the memory barriers > is good. > Reviewed-by: NeilBrown <neilb@suse.de> Good, thanks--queueing up for 2.6.38.--b. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy 2010-12-29 20:47 [PATCH] svcrpc: modifying positive sunrpc cache entries is racy J. Bruce Fields 2010-12-29 20:59 ` J. Bruce Fields @ 2011-01-03 22:26 ` J. Bruce Fields 2011-01-04 3:08 ` J. Bruce Fields 1 sibling, 1 reply; 20+ messages in thread From: J. Bruce Fields @ 2011-01-03 22:26 UTC (permalink / raw) To: linux-nfs; +Cc: Neil Brown On Wed, Dec 29, 2010 at 03:47:52PM -0500, bfields wrote: > From: J. Bruce Fields <bfields@redhat.com> > > Once a sunrpc cache entry is non-NEGATIVE, we should be replacing it > (and allowing any concurrent users to destroy it on last put) instead of > trying to update it in place. Or the following seems simpler. (And I was thinking it was necessary to ensure that the right thing happened to the cached xprt->xpt_auth_cache entry--though on a second look I see that sunrpc_cache_update also expires the replaced entry immediately. Still, this seems simpler if it also works.) --b. commit 20771de3185bf0031aa767d3b19f3e744a465a0c Author: J. Bruce Fields <bfields@redhat.com> Date: Fri Dec 24 14:03:38 2010 -0500 svcrpc: modifying valid sunrpc cache entries is racy Once a sunrpc cache entry is non-NEGATIVE, we shouldn't attempt to modify it in place. Otherwise someone referencing the ip_map we're modifying here could try to use the m_client just as we're putting the last reference. Instead, just set the expiry time so it expires immediately. The bug should only be seen by users of the legacy nfsd interfaces. Signed-off-by: J. Bruce Fields <bfields@redhat.com> diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c index a04ac91..5edc147 100644 --- a/net/sunrpc/svcauth_unix.c +++ b/net/sunrpc/svcauth_unix.c @@ -401,8 +401,7 @@ struct auth_domain *auth_unix_lookup(struct net *net, struct in6_addr *addr) return NULL; if ((ipm->m_client->addr_changes - ipm->m_add_change) >0) { - if (test_and_set_bit(CACHE_NEGATIVE, &ipm->h.flags) == 0) - auth_domain_put(&ipm->m_client->h); + ipm->h.expiry_time = 0; rv = NULL; } else { rv = &ipm->m_client->h; ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy 2011-01-03 22:26 ` J. Bruce Fields @ 2011-01-04 3:08 ` J. Bruce Fields 2011-01-04 4:51 ` NeilBrown 0 siblings, 1 reply; 20+ messages in thread From: J. Bruce Fields @ 2011-01-04 3:08 UTC (permalink / raw) To: linux-nfs; +Cc: Neil Brown On Mon, Jan 03, 2011 at 05:26:05PM -0500, J. Bruce Fields wrote: > On Wed, Dec 29, 2010 at 03:47:52PM -0500, bfields wrote: > > From: J. Bruce Fields <bfields@redhat.com> > > > > Once a sunrpc cache entry is non-NEGATIVE, we should be replacing it > > (and allowing any concurrent users to destroy it on last put) instead of > > trying to update it in place. > > Or the following seems simpler. > > (And I was thinking it was necessary to ensure that the right thing > happened to the cached xprt->xpt_auth_cache entry--though on a second > look I see that sunrpc_cache_update also expires the replaced entry > immediately. Still, this seems simpler if it also works.) Eh, on third thoughts: we probably do want a real negative entry created in the cache, so I think the original patch was correct! --b. > > --b. > > commit 20771de3185bf0031aa767d3b19f3e744a465a0c > Author: J. Bruce Fields <bfields@redhat.com> > Date: Fri Dec 24 14:03:38 2010 -0500 > > svcrpc: modifying valid sunrpc cache entries is racy > > Once a sunrpc cache entry is non-NEGATIVE, we shouldn't attempt to > modify it in place. Otherwise someone referencing the ip_map we're > modifying here could try to use the m_client just as we're putting the > last reference. > > Instead, just set the expiry time so it expires immediately. > > The bug should only be seen by users of the legacy nfsd interfaces. > > Signed-off-by: J. Bruce Fields <bfields@redhat.com> > > diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c > index a04ac91..5edc147 100644 > --- a/net/sunrpc/svcauth_unix.c > +++ b/net/sunrpc/svcauth_unix.c > @@ -401,8 +401,7 @@ struct auth_domain *auth_unix_lookup(struct net *net, struct in6_addr *addr) > return NULL; > > if ((ipm->m_client->addr_changes - ipm->m_add_change) >0) { > - if (test_and_set_bit(CACHE_NEGATIVE, &ipm->h.flags) == 0) > - auth_domain_put(&ipm->m_client->h); > + ipm->h.expiry_time = 0; > rv = NULL; > } else { > rv = &ipm->m_client->h; > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy 2011-01-04 3:08 ` J. Bruce Fields @ 2011-01-04 4:51 ` NeilBrown 2011-01-04 18:43 ` J. Bruce Fields 2011-01-04 21:46 ` J. Bruce Fields 0 siblings, 2 replies; 20+ messages in thread From: NeilBrown @ 2011-01-04 4:51 UTC (permalink / raw) To: J. Bruce Fields; +Cc: linux-nfs On Mon, 3 Jan 2011 22:08:05 -0500 "J. Bruce Fields" <bfields@fieldses.org> wrote: > On Mon, Jan 03, 2011 at 05:26:05PM -0500, J. Bruce Fields wrote: > > On Wed, Dec 29, 2010 at 03:47:52PM -0500, bfields wrote: > > > From: J. Bruce Fields <bfields@redhat.com> > > > > > > Once a sunrpc cache entry is non-NEGATIVE, we should be replacing it > > > (and allowing any concurrent users to destroy it on last put) instead of > > > trying to update it in place. > > > > Or the following seems simpler. > > > > (And I was thinking it was necessary to ensure that the right thing > > happened to the cached xprt->xpt_auth_cache entry--though on a second > > look I see that sunrpc_cache_update also expires the replaced entry > > immediately. Still, this seems simpler if it also works.) > > Eh, on third thoughts: we probably do want a real negative entry created > in the cache, so I think the original patch was correct! I like your second thought better than your third. I don't see any reason to push this item out of the cache extra quickly. In fact I think it would still be correct to just remove those two lines and not set expiry_time to 0. auth_unix_lookup will never find that IP address, and so it doesn't matter if it remains in the cache or not. I guess there could result in the cache appearing to contain different data depending on whether you look at it the 'old' way or the 'new' way, but I don't that is a real problem, and setting expiry_time to 0 overcomes that. What was the substance of your third thought? BTW, you could use sunrpc_invalidate rather than just setting expiry_time to zero, which would hurry it out of the cache a bit faster. And this all made me realise that there is more code that can be placed under CONFIG_NFSD_DEPRECTATED. SUNRPC: Remove more code when NFSD_DEPRECATED is not configured Signed-off-by: NeilBrown <neilb@suse.de> diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h index 6950c98..c2aebe8 100644 --- a/include/linux/sunrpc/cache.h +++ b/include/linux/sunrpc/cache.h @@ -255,10 +255,13 @@ static inline time_t get_expiry(char **bpp) return rv - boot.tv_sec; } +#ifdef CONFIG_NFSD_DEPRECATED static inline void sunrpc_invalidate(struct cache_head *h, struct cache_detail *detail) { h->expiry_time = seconds_since_boot() - 1; detail->nextcheck = seconds_since_boot(); } +#endif /* CONFIG_NFSD_DEPRECATED */ + #endif /* _LINUX_SUNRPC_CACHE_H_ */ diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c index 560677d..92d47ad 100644 --- a/net/sunrpc/svcauth_unix.c +++ b/net/sunrpc/svcauth_unix.c @@ -30,7 +30,9 @@ struct unix_domain { struct auth_domain h; +#ifdef CONFIG_NFSD_DEPRECATED int addr_changes; +#endif /* CONFIG_NFSD_DEPRECATED */ /* other stuff later */ }; @@ -64,7 +66,9 @@ struct auth_domain *unix_domain_find(char *name) return NULL; } new->h.flavour = &svcauth_unix; +#ifdef CONFIG_NFSD_DEPRECATED new->addr_changes = 0; +#endif /* CONFIG_NFSD_DEPRECATED */ rv = auth_domain_lookup(name, &new->h); } } @@ -92,7 +96,9 @@ struct ip_map { char m_class[8]; /* e.g. "nfsd" */ struct in6_addr m_addr; struct unix_domain *m_client; +#ifdef CONFIG_NFSD_DEPRECATED int m_add_change; +#endif /* CONFIG_NFSD_DEPRECATED */ }; static void ip_map_put(struct kref *kref) @@ -146,7 +152,9 @@ static void update(struct cache_head *cnew, struct cache_head *citem) kref_get(&item->m_client->h.ref); new->m_client = item->m_client; +#ifdef CONFIG_NFSD_DEPRECATED new->m_add_change = item->m_add_change; +#endif /* CONFIG_NFSD_DEPRECATED */ } static struct cache_head *ip_map_alloc(void) { @@ -331,6 +339,7 @@ static int __ip_map_update(struct cache_detail *cd, struct ip_map *ipm, ip.h.flags = 0; if (!udom) set_bit(CACHE_NEGATIVE, &ip.h.flags); +#ifdef CONFIG_NFSD_DEPRECATED else { ip.m_add_change = udom->addr_changes; /* if this is from the legacy set_client system call, @@ -339,6 +348,7 @@ static int __ip_map_update(struct cache_detail *cd, struct ip_map *ipm, if (expiry == NEVER) ip.m_add_change++; } +#endif /* CONFIG_NFSD_DEPRECATED */ ip.h.expiry_time = expiry; ch = sunrpc_cache_update(cd, &ip.h, &ipm->h, hash_str(ipm->m_class, IP_HASHBITS) ^ @@ -358,6 +368,7 @@ static inline int ip_map_update(struct net *net, struct ip_map *ipm, return __ip_map_update(sn->ip_map_cache, ipm, udom, expiry); } +#ifdef CONFIG_NFSD_DEPRECATED int auth_unix_add_addr(struct net *net, struct in6_addr *addr, struct auth_domain *dom) { struct unix_domain *udom; @@ -426,6 +437,7 @@ void svcauth_unix_purge(void) } } EXPORT_SYMBOL_GPL(svcauth_unix_purge); +#endif /* CONFIG_NFSD_DEPRECATED */ static inline struct ip_map * ip_map_cached_get(struct svc_xprt *xprt) ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy 2011-01-04 4:51 ` NeilBrown @ 2011-01-04 18:43 ` J. Bruce Fields 2011-01-04 21:15 ` NeilBrown 2011-01-04 21:46 ` J. Bruce Fields 1 sibling, 1 reply; 20+ messages in thread From: J. Bruce Fields @ 2011-01-04 18:43 UTC (permalink / raw) To: NeilBrown; +Cc: linux-nfs On Tue, Jan 04, 2011 at 03:51:07PM +1100, NeilBrown wrote: > On Mon, 3 Jan 2011 22:08:05 -0500 "J. Bruce Fields" <bfields@fieldses.org> > wrote: > > > On Mon, Jan 03, 2011 at 05:26:05PM -0500, J. Bruce Fields wrote: > > > On Wed, Dec 29, 2010 at 03:47:52PM -0500, bfields wrote: > > > > From: J. Bruce Fields <bfields@redhat.com> > > > > > > > > Once a sunrpc cache entry is non-NEGATIVE, we should be replacing it > > > > (and allowing any concurrent users to destroy it on last put) instead of > > > > trying to update it in place. > > > > > > Or the following seems simpler. > > > > > > (And I was thinking it was necessary to ensure that the right thing > > > happened to the cached xprt->xpt_auth_cache entry--though on a second > > > look I see that sunrpc_cache_update also expires the replaced entry > > > immediately. Still, this seems simpler if it also works.) > > > > Eh, on third thoughts: we probably do want a real negative entry created > > in the cache, so I think the original patch was correct! > > I like your second thought better than your third. > I don't see any reason to push this item out of the cache extra quickly. > In fact I think it would still be correct to just remove those two lines and > not set expiry_time to 0. auth_unix_lookup will never find that IP address, > and so it doesn't matter if it remains in the cache or not. > I guess there could result in the cache appearing to contain different data > depending on whether you look at it the 'old' way or the 'new' way, but I > don't that is a real problem, and setting expiry_time to 0 overcomes that. > > What was the substance of your third thought? I had some idea that we could end up with a cache entry stuck in the cache forever. OK, actually I think late last night I reverted into some sort of "maybe if I type the first thing that comes to my mind, somebody else will think this through for me" state. Apologies. Um, did I win? > BTW, you could use sunrpc_invalidate rather than just setting expiry_time to > zero, which would hurry it out of the cache a bit faster. Yep, I like that, and I'd forgotten about sunrpc_invalidate, thanks! Done.... (As below). > And this all made me realise that there is more code that can be placed under > CONFIG_NFSD_DEPRECTATED. Yep, applied. (When do we get to remove all this? Taking a stab at the 2.6.40 merge window.... OK, party at my place in May!) --b. commit ab5c05c579b0b37b9bf2c79c9c8f0ef6045ee41d Author: J. Bruce Fields <bfields@redhat.com> Date: Fri Dec 24 14:03:38 2010 -0500 svcrpc: modifying valid sunrpc cache entries is racy Once a sunrpc cache entry is VALID, we should be replacing it (and allowing any concurrent users to destroy it on last put) instead of trying to update it in place. Otherwise someone referencing the ip_map we're modifying here could try to use the m_client just as we're putting the last reference. The bug should only be seen by users of the legacy nfsd interfaces. Signed-off-by: J. Bruce Fields <bfields@redhat.com> diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c index a04ac91..59a7c52 100644 --- a/net/sunrpc/svcauth_unix.c +++ b/net/sunrpc/svcauth_unix.c @@ -401,8 +401,7 @@ struct auth_domain *auth_unix_lookup(struct net *net, struct in6_addr *addr) return NULL; if ((ipm->m_client->addr_changes - ipm->m_add_change) >0) { - if (test_and_set_bit(CACHE_NEGATIVE, &ipm->h.flags) == 0) - auth_domain_put(&ipm->m_client->h); + sunrpc_invalidate(&ipm->h, sn->ip_map_cache); rv = NULL; } else { rv = &ipm->m_client->h; ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy 2011-01-04 18:43 ` J. Bruce Fields @ 2011-01-04 21:15 ` NeilBrown 2011-01-04 21:21 ` J. Bruce Fields 0 siblings, 1 reply; 20+ messages in thread From: NeilBrown @ 2011-01-04 21:15 UTC (permalink / raw) To: J. Bruce Fields; +Cc: linux-nfs On Tue, 4 Jan 2011 13:43:14 -0500 "J. Bruce Fields" <bfields@fieldses.org> wrote: > On Tue, Jan 04, 2011 at 03:51:07PM +1100, NeilBrown wrote: > > On Mon, 3 Jan 2011 22:08:05 -0500 "J. Bruce Fields" <bfields@fieldses.org> > > wrote: > > > > > On Mon, Jan 03, 2011 at 05:26:05PM -0500, J. Bruce Fields wrote: > > > > On Wed, Dec 29, 2010 at 03:47:52PM -0500, bfields wrote: > > > > > From: J. Bruce Fields <bfields@redhat.com> > > > > > > > > > > Once a sunrpc cache entry is non-NEGATIVE, we should be replacing it > > > > > (and allowing any concurrent users to destroy it on last put) instead of > > > > > trying to update it in place. > > > > > > > > Or the following seems simpler. > > > > > > > > (And I was thinking it was necessary to ensure that the right thing > > > > happened to the cached xprt->xpt_auth_cache entry--though on a second > > > > look I see that sunrpc_cache_update also expires the replaced entry > > > > immediately. Still, this seems simpler if it also works.) > > > > > > Eh, on third thoughts: we probably do want a real negative entry created > > > in the cache, so I think the original patch was correct! > > > > I like your second thought better than your third. > > I don't see any reason to push this item out of the cache extra quickly. > > In fact I think it would still be correct to just remove those two lines and > > not set expiry_time to 0. auth_unix_lookup will never find that IP address, > > and so it doesn't matter if it remains in the cache or not. > > I guess there could result in the cache appearing to contain different data > > depending on whether you look at it the 'old' way or the 'new' way, but I > > don't that is a real problem, and setting expiry_time to 0 overcomes that. > > > > What was the substance of your third thought? > > I had some idea that we could end up with a cache entry stuck in the > cache forever. > > OK, actually I think late last night I reverted into some sort of "maybe > if I type the first thing that comes to my mind, somebody else will > think this through for me" state. Apologies. Um, did I win? This would be the Andrew Morton method of community involvement: say something obviously wrong as it produces more responses than saying something right - and some of them might be useful.... I fall for it all the time! > > > BTW, you could use sunrpc_invalidate rather than just setting expiry_time to > > zero, which would hurry it out of the cache a bit faster. > > Yep, I like that, and I'd forgotten about sunrpc_invalidate, thanks! > Done.... (As below). > > > And this all made me realise that there is more code that can be placed under > > CONFIG_NFSD_DEPRECTATED. > > Yep, applied. > > (When do we get to remove all this? Taking a stab at the 2.6.40 merge > window.... That is what is says in feature-removal-schedule.txt (which no-one reads). > OK, party at my place in May!) Will there still be snow? I'm not coming if there is no snow! (patch looks good) NeilBrown > > --b. > > commit ab5c05c579b0b37b9bf2c79c9c8f0ef6045ee41d > Author: J. Bruce Fields <bfields@redhat.com> > Date: Fri Dec 24 14:03:38 2010 -0500 > > svcrpc: modifying valid sunrpc cache entries is racy > > Once a sunrpc cache entry is VALID, we should be replacing it (and > allowing any concurrent users to destroy it on last put) instead of > trying to update it in place. > > Otherwise someone referencing the ip_map we're modifying here could try > to use the m_client just as we're putting the last reference. > > The bug should only be seen by users of the legacy nfsd interfaces. > > Signed-off-by: J. Bruce Fields <bfields@redhat.com> > > diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c > index a04ac91..59a7c52 100644 > --- a/net/sunrpc/svcauth_unix.c > +++ b/net/sunrpc/svcauth_unix.c > @@ -401,8 +401,7 @@ struct auth_domain *auth_unix_lookup(struct net *net, struct in6_addr *addr) > return NULL; > > if ((ipm->m_client->addr_changes - ipm->m_add_change) >0) { > - if (test_and_set_bit(CACHE_NEGATIVE, &ipm->h.flags) == 0) > - auth_domain_put(&ipm->m_client->h); > + sunrpc_invalidate(&ipm->h, sn->ip_map_cache); > rv = NULL; > } else { > rv = &ipm->m_client->h; ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy 2011-01-04 21:15 ` NeilBrown @ 2011-01-04 21:21 ` J. Bruce Fields 0 siblings, 0 replies; 20+ messages in thread From: J. Bruce Fields @ 2011-01-04 21:21 UTC (permalink / raw) To: NeilBrown; +Cc: linux-nfs On Wed, Jan 05, 2011 at 08:15:38AM +1100, NeilBrown wrote: > On Tue, 4 Jan 2011 13:43:14 -0500 "J. Bruce Fields" <bfields@fieldses.org> > wrote: > > Yep, applied. > > > > (When do we get to remove all this? Taking a stab at the 2.6.40 merge > > window.... > > That is what is says in feature-removal-schedule.txt (which no-one reads). > > > > OK, party at my place in May!) > > Will there still be snow? I'm not coming if there is no snow! If anyone attends from Australia, I'll find the snow. It may melt rather quickly, though.... > (patch looks good) Applied--thanks for the help!--b. > > NeilBrown > > > > > > > --b. > > > > commit ab5c05c579b0b37b9bf2c79c9c8f0ef6045ee41d > > Author: J. Bruce Fields <bfields@redhat.com> > > Date: Fri Dec 24 14:03:38 2010 -0500 > > > > svcrpc: modifying valid sunrpc cache entries is racy > > > > Once a sunrpc cache entry is VALID, we should be replacing it (and > > allowing any concurrent users to destroy it on last put) instead of > > trying to update it in place. > > > > Otherwise someone referencing the ip_map we're modifying here could try > > to use the m_client just as we're putting the last reference. > > > > The bug should only be seen by users of the legacy nfsd interfaces. > > > > Signed-off-by: J. Bruce Fields <bfields@redhat.com> > > > > diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c > > index a04ac91..59a7c52 100644 > > --- a/net/sunrpc/svcauth_unix.c > > +++ b/net/sunrpc/svcauth_unix.c > > @@ -401,8 +401,7 @@ struct auth_domain *auth_unix_lookup(struct net *net, struct in6_addr *addr) > > return NULL; > > > > if ((ipm->m_client->addr_changes - ipm->m_add_change) >0) { > > - if (test_and_set_bit(CACHE_NEGATIVE, &ipm->h.flags) == 0) > > - auth_domain_put(&ipm->m_client->h); > > + sunrpc_invalidate(&ipm->h, sn->ip_map_cache); > > rv = NULL; > > } else { > > rv = &ipm->m_client->h; > > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy 2011-01-04 4:51 ` NeilBrown 2011-01-04 18:43 ` J. Bruce Fields @ 2011-01-04 21:46 ` J. Bruce Fields 2011-01-04 23:05 ` NeilBrown 1 sibling, 1 reply; 20+ messages in thread From: J. Bruce Fields @ 2011-01-04 21:46 UTC (permalink / raw) To: NeilBrown; +Cc: linux-nfs On Tue, Jan 04, 2011 at 03:51:07PM +1100, NeilBrown wrote: > And this all made me realise that there is more code that can be placed under > CONFIG_NFSD_DEPRECTATED. > > > SUNRPC: Remove more code when NFSD_DEPRECATED is not configured > > Signed-off-by: NeilBrown <neilb@suse.de> I did need one change to get a succesful !CONFIG_NFSD_DEPRECATED compile. --b. diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c index 46e1cbc..30916b0 100644 --- a/net/sunrpc/svcauth_unix.c +++ b/net/sunrpc/svcauth_unix.c @@ -422,6 +422,7 @@ struct auth_domain *auth_unix_lookup(struct net *net, struct in6_addr *addr) return rv; } EXPORT_SYMBOL_GPL(auth_unix_lookup); +#endif /* CONFIG_NFSD_DEPRECATED */ void svcauth_unix_purge(void) { @@ -435,7 +436,6 @@ void svcauth_unix_purge(void) } } EXPORT_SYMBOL_GPL(svcauth_unix_purge); -#endif /* CONFIG_NFSD_DEPRECATED */ static inline struct ip_map * ip_map_cached_get(struct svc_xprt *xprt) ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy 2011-01-04 21:46 ` J. Bruce Fields @ 2011-01-04 23:05 ` NeilBrown 0 siblings, 0 replies; 20+ messages in thread From: NeilBrown @ 2011-01-04 23:05 UTC (permalink / raw) To: J. Bruce Fields; +Cc: linux-nfs On Tue, 4 Jan 2011 16:46:39 -0500 "J. Bruce Fields" <bfields@fieldses.org> wrote: > On Tue, Jan 04, 2011 at 03:51:07PM +1100, NeilBrown wrote: > > And this all made me realise that there is more code that can be placed under > > CONFIG_NFSD_DEPRECTATED. > > > > > > SUNRPC: Remove more code when NFSD_DEPRECATED is not configured > > > > Signed-off-by: NeilBrown <neilb@suse.de> > > I did need one change to get a succesful !CONFIG_NFSD_DEPRECATED > compile. yep.... I had done a test compile, but I managed to get the sense of CONFIG_NFSD_DEPRECATED inverted :-( NeilBrown > > --b. > > diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c > index 46e1cbc..30916b0 100644 > --- a/net/sunrpc/svcauth_unix.c > +++ b/net/sunrpc/svcauth_unix.c > @@ -422,6 +422,7 @@ struct auth_domain *auth_unix_lookup(struct net *net, struct in6_addr *addr) > return rv; > } > EXPORT_SYMBOL_GPL(auth_unix_lookup); > +#endif /* CONFIG_NFSD_DEPRECATED */ > > void svcauth_unix_purge(void) > { > @@ -435,7 +436,6 @@ void svcauth_unix_purge(void) > } > } > EXPORT_SYMBOL_GPL(svcauth_unix_purge); > -#endif /* CONFIG_NFSD_DEPRECATED */ > > static inline struct ip_map * > ip_map_cached_get(struct svc_xprt *xprt) ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2011-01-04 23:05 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-12-29 20:47 [PATCH] svcrpc: modifying positive sunrpc cache entries is racy J. Bruce Fields
2010-12-29 20:59 ` J. Bruce Fields
2010-12-30 1:19 ` Neil Brown
2010-12-30 1:57 ` J. Bruce Fields
2011-01-03 20:55 ` J. Bruce Fields
2011-01-04 5:01 ` NeilBrown
2011-01-04 15:22 ` J. Bruce Fields
2011-01-04 19:23 ` J. Bruce Fields
2011-01-04 19:31 ` [PATCH 1/2] svcrpc: take lock on turning entry NEGATIVE in cache_check J. Bruce Fields
2011-01-04 19:31 ` [PATCH 2/2] svcrpc: ensure cache_check caller sees updated entry J. Bruce Fields
2011-01-04 21:10 ` [PATCH] svcrpc: modifying positive sunrpc cache entries is racy NeilBrown
[not found] ` <20110105081031.220bfbc9-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2011-01-04 21:15 ` J. Bruce Fields
2011-01-03 22:26 ` J. Bruce Fields
2011-01-04 3:08 ` J. Bruce Fields
2011-01-04 4:51 ` NeilBrown
2011-01-04 18:43 ` J. Bruce Fields
2011-01-04 21:15 ` NeilBrown
2011-01-04 21:21 ` J. Bruce Fields
2011-01-04 21:46 ` J. Bruce Fields
2011-01-04 23:05 ` NeilBrown
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).