* [PATCH Version 4] SVCAUTH reap the rsc cache entry on RPC_SS_PROC_DESTROY
@ 2016-12-22 17:38 andros
2016-12-22 17:38 ` andros
2016-12-23 22:55 ` NeilBrown
0 siblings, 2 replies; 9+ messages in thread
From: andros @ 2016-12-22 17:38 UTC (permalink / raw)
To: bfields; +Cc: neilb, linux-nfs, Andy Adamson
From: Andy Adamson <andros@netapp.com>
Version 4 of this patch set uses the patch provided by Neil Brown
that directly unhashes and put's the to be deleted cache entry.
I took the liberty of providing the patch comments.
Neil - of course, do what you want with the patch and comments.
Thanks!
-->Andy
Neil Brown (1):
SVCAUTH reap the rsc cache entry on RPC_SS_PROC_DESTROY
include/linux/sunrpc/cache.h | 1 +
net/sunrpc/auth_gss/svcauth_gss.c | 4 ++--
net/sunrpc/cache.c | 13 +++++++++++++
3 files changed, 16 insertions(+), 2 deletions(-)
--
1.8.3.1
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH Version 4] SVCAUTH reap the rsc cache entry on RPC_SS_PROC_DESTROY
2016-12-22 17:38 [PATCH Version 4] SVCAUTH reap the rsc cache entry on RPC_SS_PROC_DESTROY andros
@ 2016-12-22 17:38 ` andros
2017-01-04 20:26 ` J. Bruce Fields
2016-12-23 22:55 ` NeilBrown
1 sibling, 1 reply; 9+ messages in thread
From: andros @ 2016-12-22 17:38 UTC (permalink / raw)
To: bfields; +Cc: neilb, linux-nfs, Andy Adamson
From: Neil Brown <neilb@suse.com>
The rsc cache code operates in a read_lock/write_lock environment.
Changes to a cache entry should use the provided rsc_update
routine which takes the write_lock.
The current code sets the expiry_time and the CACHE_NEGATIVE flag
without taking the write_lock as it does not call rsc_update.
Without this patch, while cache_clean sees the entries to be
removed, it does not remove the rsc_entries. This is because
rsc_update updates other fields such as flush_time and last_refresh
in the entry to trigger cache_clean to reap the entry.
Add a new sunrpc_cache_unhash() function (by Neil Brown) designed
to directly unhash and reap the to be destroyed cache entry.
Signed-off-by: Neil Brown <neilb@suse.com>
Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
---
include/linux/sunrpc/cache.h | 1 +
net/sunrpc/auth_gss/svcauth_gss.c | 4 ++--
net/sunrpc/cache.c | 13 +++++++++++++
3 files changed, 16 insertions(+), 2 deletions(-)
diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index 62a60ee..9dcf2c8 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -227,6 +227,7 @@ extern int cache_check(struct cache_detail *detail,
extern int sunrpc_cache_register_pipefs(struct dentry *parent, const char *,
umode_t, struct cache_detail *);
extern void sunrpc_cache_unregister_pipefs(struct cache_detail *);
+extern void sunrpc_cache_unhash(struct cache_detail *, struct cache_head *);
/* Must store cache_detail in seq_file->private if using next three functions */
extern void *cache_seq_start(struct seq_file *file, loff_t *pos);
diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
index 45662d7..78f8a9c 100644
--- a/net/sunrpc/auth_gss/svcauth_gss.c
+++ b/net/sunrpc/auth_gss/svcauth_gss.c
@@ -1489,8 +1489,8 @@ static void destroy_use_gss_proxy_proc_entry(struct net *net) {}
case RPC_GSS_PROC_DESTROY:
if (gss_write_verf(rqstp, rsci->mechctx, gc->gc_seq))
goto auth_err;
- rsci->h.expiry_time = get_seconds();
- set_bit(CACHE_NEGATIVE, &rsci->h.flags);
+ /* Delete the entry from the cache_list and call cache_put */
+ sunrpc_cache_unhash(sn->rsc_cache, &rsci->h);
if (resv->iov_len + 4 > PAGE_SIZE)
goto drop;
svc_putnl(resv, RPC_SUCCESS);
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 8aabe12..153e254 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -1855,3 +1855,16 @@ void sunrpc_cache_unregister_pipefs(struct cache_detail *cd)
}
EXPORT_SYMBOL_GPL(sunrpc_cache_unregister_pipefs);
+void sunrpc_cache_unhash(struct cache_detail *cd, struct cache_head *h)
+{
+ write_lock(&cd->hash_lock);
+ if (!hlist_unhashed(&h->cache_list)){
+ hlist_del_init(&h->cache_list);
+ cd->entries--;
+ write_unlock(&cd->hash_lock);
+ cache_put(h, cd);
+ } else
+ write_unlock(&cd->hash_lock);
+}
+EXPORT_SYMBOL_GPL(sunrpc_cache_unhash);
+
--
1.8.3.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH Version 4] SVCAUTH reap the rsc cache entry on RPC_SS_PROC_DESTROY
2016-12-22 17:38 [PATCH Version 4] SVCAUTH reap the rsc cache entry on RPC_SS_PROC_DESTROY andros
2016-12-22 17:38 ` andros
@ 2016-12-23 22:55 ` NeilBrown
1 sibling, 0 replies; 9+ messages in thread
From: NeilBrown @ 2016-12-23 22:55 UTC (permalink / raw)
To: andros, bfields; +Cc: linux-nfs, Andy Adamson
[-- Attachment #1: Type: text/plain, Size: 449 bytes --]
On Fri, Dec 23 2016, andros@netapp.com wrote:
> From: Andy Adamson <andros@netapp.com>
>
> Version 4 of this patch set uses the patch provided by Neil Brown
> that directly unhashes and put's the to be deleted cache entry.
>
> I took the liberty of providing the patch comments.
>
> Neil - of course, do what you want with the patch and comments.
>
I'm happy with the patch and comments as they stand.
Thanks for pursuing this!
Thanks,
NeilBrown
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH Version 4] SVCAUTH reap the rsc cache entry on RPC_SS_PROC_DESTROY
2016-12-22 17:38 ` andros
@ 2017-01-04 20:26 ` J. Bruce Fields
2017-01-04 21:14 ` NeilBrown
0 siblings, 1 reply; 9+ messages in thread
From: J. Bruce Fields @ 2017-01-04 20:26 UTC (permalink / raw)
To: andros; +Cc: neilb, linux-nfs
I'm not against the patch, but I'm still not convinced by the
explanation:
On Thu, Dec 22, 2016 at 12:38:06PM -0500, andros@netapp.com wrote:
> From: Neil Brown <neilb@suse.com>
>
> The rsc cache code operates in a read_lock/write_lock environment.
> Changes to a cache entry should use the provided rsc_update
> routine which takes the write_lock.
It looks pretty suspicious to be setting CACHE_NEGATIVE without the
cache_lock for write, but I'm not actually convinced there's a bug there
either. In any case not one that you'd be hitting reliably.
> The current code sets the expiry_time and the CACHE_NEGATIVE flag
> without taking the write_lock as it does not call rsc_update.
> Without this patch, while cache_clean sees the entries to be
> removed, it does not remove the rsc_entries. This is because
> rsc_update updates other fields such as flush_time and last_refresh
> in the entry to trigger cache_clean to reap the entry.
I think the root cause of the particular behavior you were seeing was
actually an oversight from Neil's c5b29f885afe "sunrpc: use seconds
since boot in expiry cache", which missed this one occurrence of
get_seconds(). So it's setting the item's entry to something decades in
the future.
And that's probably not been a huge deal since these entries aren't so
big, and they will eventually get cleaned up by cache_purge when the
cache is destroyed. Still, I can imagine it slowing down cache lookups
on a long-lived server.
The one-liner:
- rsci->h.expiry_time = get_seconds();
+ rsci->h.expiry_time = seconds_since_boot();
would probably also do the job. Am I missing something?
But, OK, I think Neil's patch will ensure entries get cleaned up more
quickly than that would, and might also fix a rare race.
--b.
>
> Add a new sunrpc_cache_unhash() function (by Neil Brown) designed
> to directly unhash and reap the to be destroyed cache entry.
>
> Signed-off-by: Neil Brown <neilb@suse.com>
> Reported-by: Andy Adamson <andros@netapp.com>
> Signed-off-by: Andy Adamson <andros@netapp.com>
> ---
> include/linux/sunrpc/cache.h | 1 +
> net/sunrpc/auth_gss/svcauth_gss.c | 4 ++--
> net/sunrpc/cache.c | 13 +++++++++++++
> 3 files changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
> index 62a60ee..9dcf2c8 100644
> --- a/include/linux/sunrpc/cache.h
> +++ b/include/linux/sunrpc/cache.h
> @@ -227,6 +227,7 @@ extern int cache_check(struct cache_detail *detail,
> extern int sunrpc_cache_register_pipefs(struct dentry *parent, const char *,
> umode_t, struct cache_detail *);
> extern void sunrpc_cache_unregister_pipefs(struct cache_detail *);
> +extern void sunrpc_cache_unhash(struct cache_detail *, struct cache_head *);
>
> /* Must store cache_detail in seq_file->private if using next three functions */
> extern void *cache_seq_start(struct seq_file *file, loff_t *pos);
> diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
> index 45662d7..78f8a9c 100644
> --- a/net/sunrpc/auth_gss/svcauth_gss.c
> +++ b/net/sunrpc/auth_gss/svcauth_gss.c
> @@ -1489,8 +1489,8 @@ static void destroy_use_gss_proxy_proc_entry(struct net *net) {}
> case RPC_GSS_PROC_DESTROY:
> if (gss_write_verf(rqstp, rsci->mechctx, gc->gc_seq))
> goto auth_err;
> - rsci->h.expiry_time = get_seconds();
> - set_bit(CACHE_NEGATIVE, &rsci->h.flags);
> + /* Delete the entry from the cache_list and call cache_put */
> + sunrpc_cache_unhash(sn->rsc_cache, &rsci->h);
> if (resv->iov_len + 4 > PAGE_SIZE)
> goto drop;
> svc_putnl(resv, RPC_SUCCESS);
> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
> index 8aabe12..153e254 100644
> --- a/net/sunrpc/cache.c
> +++ b/net/sunrpc/cache.c
> @@ -1855,3 +1855,16 @@ void sunrpc_cache_unregister_pipefs(struct cache_detail *cd)
> }
> EXPORT_SYMBOL_GPL(sunrpc_cache_unregister_pipefs);
>
> +void sunrpc_cache_unhash(struct cache_detail *cd, struct cache_head *h)
> +{
> + write_lock(&cd->hash_lock);
> + if (!hlist_unhashed(&h->cache_list)){
> + hlist_del_init(&h->cache_list);
> + cd->entries--;
> + write_unlock(&cd->hash_lock);
> + cache_put(h, cd);
> + } else
> + write_unlock(&cd->hash_lock);
> +}
> +EXPORT_SYMBOL_GPL(sunrpc_cache_unhash);
> +
> --
> 1.8.3.1
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH Version 4] SVCAUTH reap the rsc cache entry on RPC_SS_PROC_DESTROY
2017-01-04 20:26 ` J. Bruce Fields
@ 2017-01-04 21:14 ` NeilBrown
2017-01-06 21:01 ` J. Bruce Fields
0 siblings, 1 reply; 9+ messages in thread
From: NeilBrown @ 2017-01-04 21:14 UTC (permalink / raw)
To: J. Bruce Fields, andros; +Cc: linux-nfs
[-- Attachment #1: Type: text/plain, Size: 2376 bytes --]
On Thu, Jan 05 2017, J. Bruce Fields wrote:
> I'm not against the patch, but I'm still not convinced by the
> explanation:
>
> On Thu, Dec 22, 2016 at 12:38:06PM -0500, andros@netapp.com wrote:
>> From: Neil Brown <neilb@suse.com>
>>
>> The rsc cache code operates in a read_lock/write_lock environment.
>> Changes to a cache entry should use the provided rsc_update
>> routine which takes the write_lock.
>
> It looks pretty suspicious to be setting CACHE_NEGATIVE without the
> cache_lock for write, but I'm not actually convinced there's a bug there
> either. In any case not one that you'd be hitting reliably.
>
>> The current code sets the expiry_time and the CACHE_NEGATIVE flag
>> without taking the write_lock as it does not call rsc_update.
>> Without this patch, while cache_clean sees the entries to be
>> removed, it does not remove the rsc_entries. This is because
>> rsc_update updates other fields such as flush_time and last_refresh
>> in the entry to trigger cache_clean to reap the entry.
>
> I think the root cause of the particular behavior you were seeing was
> actually an oversight from Neil's c5b29f885afe "sunrpc: use seconds
> since boot in expiry cache", which missed this one occurrence of
> get_seconds(). So it's setting the item's entry to something decades in
> the future.
>
> And that's probably not been a huge deal since these entries aren't so
> big, and they will eventually get cleaned up by cache_purge when the
> cache is destroyed. Still, I can imagine it slowing down cache lookups
> on a long-lived server.
>
> The one-liner:
>
> - rsci->h.expiry_time = get_seconds();
> + rsci->h.expiry_time = seconds_since_boot();
>
> would probably also do the job. Am I missing something?
I was missing that get_seconds() bug - thanks.
The other real bug is that setting h.expiry_time backwards should
really set cd->nextcheck backwards too. I thought I had found code
which did that, but I think I was confusing ->nextcheck with ->flush_time.
>
> But, OK, I think Neil's patch will ensure entries get cleaned up more
> quickly than that would, and might also fix a rare race.
Yes. The patch doesn't just fix the bug, whatever it is. It provides a
proper interface for functionality that wasn't previously supported, and
so had been hacked into place.
Thanks,
NeilBrown
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH Version 4] SVCAUTH reap the rsc cache entry on RPC_SS_PROC_DESTROY
2017-01-04 21:14 ` NeilBrown
@ 2017-01-06 21:01 ` J. Bruce Fields
2017-01-06 23:59 ` NeilBrown
0 siblings, 1 reply; 9+ messages in thread
From: J. Bruce Fields @ 2017-01-06 21:01 UTC (permalink / raw)
To: NeilBrown; +Cc: andros, linux-nfs
On Thu, Jan 05, 2017 at 08:14:48AM +1100, NeilBrown wrote:
> On Thu, Jan 05 2017, J. Bruce Fields wrote:
>
> > I'm not against the patch, but I'm still not convinced by the
> > explanation:
> >
> > On Thu, Dec 22, 2016 at 12:38:06PM -0500, andros@netapp.com wrote:
> >> From: Neil Brown <neilb@suse.com>
> >>
> >> The rsc cache code operates in a read_lock/write_lock environment.
> >> Changes to a cache entry should use the provided rsc_update
> >> routine which takes the write_lock.
> >
> > It looks pretty suspicious to be setting CACHE_NEGATIVE without the
> > cache_lock for write, but I'm not actually convinced there's a bug there
> > either. In any case not one that you'd be hitting reliably.
> >
> >> The current code sets the expiry_time and the CACHE_NEGATIVE flag
> >> without taking the write_lock as it does not call rsc_update.
> >> Without this patch, while cache_clean sees the entries to be
> >> removed, it does not remove the rsc_entries. This is because
> >> rsc_update updates other fields such as flush_time and last_refresh
> >> in the entry to trigger cache_clean to reap the entry.
> >
> > I think the root cause of the particular behavior you were seeing was
> > actually an oversight from Neil's c5b29f885afe "sunrpc: use seconds
> > since boot in expiry cache", which missed this one occurrence of
> > get_seconds(). So it's setting the item's entry to something decades in
> > the future.
> >
> > And that's probably not been a huge deal since these entries aren't so
> > big, and they will eventually get cleaned up by cache_purge when the
> > cache is destroyed. Still, I can imagine it slowing down cache lookups
> > on a long-lived server.
> >
> > The one-liner:
> >
> > - rsci->h.expiry_time = get_seconds();
> > + rsci->h.expiry_time = seconds_since_boot();
> >
> > would probably also do the job. Am I missing something?
>
> I was missing that get_seconds() bug - thanks.
> The other real bug is that setting h.expiry_time backwards should
> really set cd->nextcheck backwards too. I thought I had found code
> which did that, but I think I was confusing ->nextcheck with ->flush_time.
>
> >
> > But, OK, I think Neil's patch will ensure entries get cleaned up more
> > quickly than that would, and might also fix a rare race.
>
> Yes. The patch doesn't just fix the bug, whatever it is. It provides a
> proper interface for functionality that wasn't previously supported, and
> so had been hacked into place.
Got it. So, with another changelog rewrite, I'm applying it like the
below.
Another question is whether it's worth a stable cc.... I think so, as
all it would take is a case where a server gets a lot of PROC_DESTROYs
over its lifetime. That's not hard to imagine. And the symptoms are
probably non-obvious and cured by a reboot, which might explain it not
having seen bug reports.
--b.
commit 27297f41527d
Author: Neil Brown <neilb@suse.com>
Date: Thu Dec 22 12:38:06 2016 -0500
svcauth_gss: free context promptly on PROC_DESTROY
Since c5b29f885afe "sunrpc: use seconds since boot in expiry cache", the
expiry_time field has been in units of seconds since boot, but that
patch overlooked the server code that handles RPC_GSS_PROC_DESTROY
requests. The result is that when we get a request from a client to
destroy a context, we set that context's expiry time to a time decades
in the future.
The context will still be cleaned up by cache_purge when the module is
unloaded or the container shut down, but a lot of contexts could pile up
before then.
The simple fix would be just to set expiry_time to seconds_since_boot(),
but modifying the entry outside the cache_lock is also looks dubious
(though I'm not completely sure what the race would be). So let's
provide a helper that does this properly, taking the lock and removing
the entry immediately.
Signed-off-by: Neil Brown <neilb@suse.com>
Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index 62a60eeacb0a..9dcf2c8fe1c5 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -227,6 +227,7 @@ extern void sunrpc_destroy_cache_detail(struct cache_detail *cd);
extern int sunrpc_cache_register_pipefs(struct dentry *parent, const char *,
umode_t, struct cache_detail *);
extern void sunrpc_cache_unregister_pipefs(struct cache_detail *);
+extern void sunrpc_cache_unhash(struct cache_detail *, struct cache_head *);
/* Must store cache_detail in seq_file->private if using next three functions */
extern void *cache_seq_start(struct seq_file *file, loff_t *pos);
diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
index 886e9d381771..a54a7a3d28f5 100644
--- a/net/sunrpc/auth_gss/svcauth_gss.c
+++ b/net/sunrpc/auth_gss/svcauth_gss.c
@@ -1489,8 +1489,8 @@ svcauth_gss_accept(struct svc_rqst *rqstp, __be32 *authp)
case RPC_GSS_PROC_DESTROY:
if (gss_write_verf(rqstp, rsci->mechctx, gc->gc_seq))
goto auth_err;
- rsci->h.expiry_time = get_seconds();
- set_bit(CACHE_NEGATIVE, &rsci->h.flags);
+ /* Delete the entry from the cache_list and call cache_put */
+ sunrpc_cache_unhash(sn->rsc_cache, &rsci->h);
if (resv->iov_len + 4 > PAGE_SIZE)
goto drop;
svc_putnl(resv, RPC_SUCCESS);
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 8147e8d56eb2..502b9fe9817b 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -1855,3 +1855,15 @@ void sunrpc_cache_unregister_pipefs(struct cache_detail *cd)
}
EXPORT_SYMBOL_GPL(sunrpc_cache_unregister_pipefs);
+void sunrpc_cache_unhash(struct cache_detail *cd, struct cache_head *h)
+{
+ write_lock(&cd->hash_lock);
+ if (!hlist_unhashed(&h->cache_list)){
+ hlist_del_init(&h->cache_list);
+ cd->entries--;
+ write_unlock(&cd->hash_lock);
+ cache_put(h, cd);
+ } else
+ write_unlock(&cd->hash_lock);
+}
+EXPORT_SYMBOL_GPL(sunrpc_cache_unhash);
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH Version 4] SVCAUTH reap the rsc cache entry on RPC_SS_PROC_DESTROY
2017-01-06 21:01 ` J. Bruce Fields
@ 2017-01-06 23:59 ` NeilBrown
2017-01-12 21:08 ` J. Bruce Fields
0 siblings, 1 reply; 9+ messages in thread
From: NeilBrown @ 2017-01-06 23:59 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: andros, linux-nfs
[-- Attachment #1: Type: text/plain, Size: 7995 bytes --]
On Sat, Jan 07 2017, J. Bruce Fields wrote:
> On Thu, Jan 05, 2017 at 08:14:48AM +1100, NeilBrown wrote:
>> On Thu, Jan 05 2017, J. Bruce Fields wrote:
>>
>> > I'm not against the patch, but I'm still not convinced by the
>> > explanation:
>> >
>> > On Thu, Dec 22, 2016 at 12:38:06PM -0500, andros@netapp.com wrote:
>> >> From: Neil Brown <neilb@suse.com>
>> >>
>> >> The rsc cache code operates in a read_lock/write_lock environment.
>> >> Changes to a cache entry should use the provided rsc_update
>> >> routine which takes the write_lock.
>> >
>> > It looks pretty suspicious to be setting CACHE_NEGATIVE without the
>> > cache_lock for write, but I'm not actually convinced there's a bug there
>> > either. In any case not one that you'd be hitting reliably.
>> >
>> >> The current code sets the expiry_time and the CACHE_NEGATIVE flag
>> >> without taking the write_lock as it does not call rsc_update.
>> >> Without this patch, while cache_clean sees the entries to be
>> >> removed, it does not remove the rsc_entries. This is because
>> >> rsc_update updates other fields such as flush_time and last_refresh
>> >> in the entry to trigger cache_clean to reap the entry.
>> >
>> > I think the root cause of the particular behavior you were seeing was
>> > actually an oversight from Neil's c5b29f885afe "sunrpc: use seconds
>> > since boot in expiry cache", which missed this one occurrence of
>> > get_seconds(). So it's setting the item's entry to something decades in
>> > the future.
>> >
>> > And that's probably not been a huge deal since these entries aren't so
>> > big, and they will eventually get cleaned up by cache_purge when the
>> > cache is destroyed. Still, I can imagine it slowing down cache lookups
>> > on a long-lived server.
>> >
>> > The one-liner:
>> >
>> > - rsci->h.expiry_time = get_seconds();
>> > + rsci->h.expiry_time = seconds_since_boot();
>> >
>> > would probably also do the job. Am I missing something?
>>
>> I was missing that get_seconds() bug - thanks.
>> The other real bug is that setting h.expiry_time backwards should
>> really set cd->nextcheck backwards too. I thought I had found code
>> which did that, but I think I was confusing ->nextcheck with ->flush_time.
>>
>> >
>> > But, OK, I think Neil's patch will ensure entries get cleaned up more
>> > quickly than that would, and might also fix a rare race.
>>
>> Yes. The patch doesn't just fix the bug, whatever it is. It provides a
>> proper interface for functionality that wasn't previously supported, and
>> so had been hacked into place.
>
> Got it. So, with another changelog rewrite, I'm applying it like the
> below.
>
> Another question is whether it's worth a stable cc.... I think so, as
> all it would take is a case where a server gets a lot of PROC_DESTROYs
> over its lifetime. That's not hard to imagine. And the symptoms are
> probably non-obvious and cured by a reboot, which might explain it not
> having seen bug reports.
How about applying
>> > - rsci->h.expiry_time = get_seconds();
>> > + rsci->h.expiry_time = seconds_since_boot();
first with a Cc: to stable (and a Fixes:), then this patch on top of
that without the Cc.
With the code as it is, destroyed entries won't expire for years.
With the above change, they will expire within half an hour, as
cache_clean() looks at every cache at least every 30 minutes.
If we also added code to reduce ->nextcheck (which would require a write
lock), then they would expired within 30 seconds, as do_cache_clean()
runs cache_clean() at least ever 30 seconds (I wonder if we should make
it more demand-driven?).
With the full patch, it is removed instantly.
I think a 30 minutes guarantee is enough for -stable, especially as it
is achieved with such a tiny, obviously correct, patch.
>
> --b.
>
> commit 27297f41527d
> Author: Neil Brown <neilb@suse.com>
> Date: Thu Dec 22 12:38:06 2016 -0500
>
> svcauth_gss: free context promptly on PROC_DESTROY
>
> Since c5b29f885afe "sunrpc: use seconds since boot in expiry cache", the
> expiry_time field has been in units of seconds since boot, but that
> patch overlooked the server code that handles RPC_GSS_PROC_DESTROY
> requests. The result is that when we get a request from a client to
> destroy a context, we set that context's expiry time to a time decades
> in the future.
>
> The context will still be cleaned up by cache_purge when the module is
> unloaded or the container shut down, but a lot of contexts could pile up
> before then.
>
> The simple fix would be just to set expiry_time to seconds_since_boot(),
> but modifying the entry outside the cache_lock is also looks dubious
> (though I'm not completely sure what the race would be). So let's
> provide a helper that does this properly, taking the lock and removing
> the entry immediately.
I don't think the locking is an issue at all.
Setting h.expiry_time backwards without also setting cd->next_check
backwards results in internal inconsistencies, so is wrong for that
reason.
Setting CACHE_NEGATIVE is not "obviously correct" as, in general, value
fields are only allocateed when CACHE_NEGATIVE is not set, so they might
not be freed when it is. The doesn't actually apply to this cache, so
it would work but is not consistent with the intended (undocumented)
interface usage.
Up to you how much of this you spell out. I don't object to your
changelog above if you want to stay with it.
Thanks,
NeilBrown
>
> Signed-off-by: Neil Brown <neilb@suse.com>
> Reported-by: Andy Adamson <andros@netapp.com>
> Signed-off-by: Andy Adamson <andros@netapp.com>
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
>
> diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
> index 62a60eeacb0a..9dcf2c8fe1c5 100644
> --- a/include/linux/sunrpc/cache.h
> +++ b/include/linux/sunrpc/cache.h
> @@ -227,6 +227,7 @@ extern void sunrpc_destroy_cache_detail(struct cache_detail *cd);
> extern int sunrpc_cache_register_pipefs(struct dentry *parent, const char *,
> umode_t, struct cache_detail *);
> extern void sunrpc_cache_unregister_pipefs(struct cache_detail *);
> +extern void sunrpc_cache_unhash(struct cache_detail *, struct cache_head *);
>
> /* Must store cache_detail in seq_file->private if using next three functions */
> extern void *cache_seq_start(struct seq_file *file, loff_t *pos);
> diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
> index 886e9d381771..a54a7a3d28f5 100644
> --- a/net/sunrpc/auth_gss/svcauth_gss.c
> +++ b/net/sunrpc/auth_gss/svcauth_gss.c
> @@ -1489,8 +1489,8 @@ svcauth_gss_accept(struct svc_rqst *rqstp, __be32 *authp)
> case RPC_GSS_PROC_DESTROY:
> if (gss_write_verf(rqstp, rsci->mechctx, gc->gc_seq))
> goto auth_err;
> - rsci->h.expiry_time = get_seconds();
> - set_bit(CACHE_NEGATIVE, &rsci->h.flags);
> + /* Delete the entry from the cache_list and call cache_put */
> + sunrpc_cache_unhash(sn->rsc_cache, &rsci->h);
> if (resv->iov_len + 4 > PAGE_SIZE)
> goto drop;
> svc_putnl(resv, RPC_SUCCESS);
> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
> index 8147e8d56eb2..502b9fe9817b 100644
> --- a/net/sunrpc/cache.c
> +++ b/net/sunrpc/cache.c
> @@ -1855,3 +1855,15 @@ void sunrpc_cache_unregister_pipefs(struct cache_detail *cd)
> }
> EXPORT_SYMBOL_GPL(sunrpc_cache_unregister_pipefs);
>
> +void sunrpc_cache_unhash(struct cache_detail *cd, struct cache_head *h)
> +{
> + write_lock(&cd->hash_lock);
> + if (!hlist_unhashed(&h->cache_list)){
> + hlist_del_init(&h->cache_list);
> + cd->entries--;
> + write_unlock(&cd->hash_lock);
> + cache_put(h, cd);
> + } else
> + write_unlock(&cd->hash_lock);
> +}
> +EXPORT_SYMBOL_GPL(sunrpc_cache_unhash);
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH Version 4] SVCAUTH reap the rsc cache entry on RPC_SS_PROC_DESTROY
2017-01-06 23:59 ` NeilBrown
@ 2017-01-12 21:08 ` J. Bruce Fields
2017-01-12 21:29 ` NeilBrown
0 siblings, 1 reply; 9+ messages in thread
From: J. Bruce Fields @ 2017-01-12 21:08 UTC (permalink / raw)
To: NeilBrown; +Cc: andros, linux-nfs
On Sat, Jan 07, 2017 at 10:59:26AM +1100, NeilBrown wrote:
> How about applying
> >> > - rsci->h.expiry_time = get_seconds();
> >> > + rsci->h.expiry_time = seconds_since_boot();
>
> first with a Cc: to stable (and a Fixes:), then this patch on top of
> that without the Cc.
Oh, good idea, thanks. Results below.
commit 78794d189070
Author: J. Bruce Fields <bfields@redhat.com>
Date: Mon Jan 9 17:15:18 2017 -0500
svcrpc: don't leak contexts on PROC_DESTROY
Context expiry times are in units of seconds since boot, not unix time.
The use of get_seconds() here therefore sets the expiry time decades in
the future. This prevents timely freeing of contexts destroyed by
client RPC_GSS_PROC_DESTROY requests. We'd still free them eventually
(when the module is unloaded or the container shut down), but a lot of
contexts could pile up before then.
Cc: stable@vger.kernel.org
Fixes: c5b29f885afe "sunrpc: use seconds since boot in expiry cache"
Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
index 886e9d381771..153082598522 100644
--- a/net/sunrpc/auth_gss/svcauth_gss.c
+++ b/net/sunrpc/auth_gss/svcauth_gss.c
@@ -1489,7 +1489,7 @@ svcauth_gss_accept(struct svc_rqst *rqstp, __be32 *authp)
case RPC_GSS_PROC_DESTROY:
if (gss_write_verf(rqstp, rsci->mechctx, gc->gc_seq))
goto auth_err;
- rsci->h.expiry_time = get_seconds();
+ rsci->h.expiry_time = seconds_since_boot();
set_bit(CACHE_NEGATIVE, &rsci->h.flags);
if (resv->iov_len + 4 > PAGE_SIZE)
goto drop;
commit 987a7601aa80
Author: Neil Brown <neilb@suse.com>
Date: Thu Dec 22 12:38:06 2016 -0500
svcrpc: free contexts immediately on PROC_DESTROY
We currently handle a client PROC_DESTROY request by turning it
CACHE_NEGATIVE, setting the expired time to now, and then waiting for
cache_clean to clean it up later. Since we forgot to set the cache's
nextcheck value, that could take up to 30 minutes. Also, though there's
probably no real bug in this case, setting CACHE_NEGATIVE directly like
this probably isn't a great idea in general.
So let's just remove the entry from the cache directly, and move this
bit of cache manipulation to a helper function.
Signed-off-by: Neil Brown <neilb@suse.com>
Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index 62a60eeacb0a..9dcf2c8fe1c5 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -227,6 +227,7 @@ extern void sunrpc_destroy_cache_detail(struct cache_detail *cd);
extern int sunrpc_cache_register_pipefs(struct dentry *parent, const char *,
umode_t, struct cache_detail *);
extern void sunrpc_cache_unregister_pipefs(struct cache_detail *);
+extern void sunrpc_cache_unhash(struct cache_detail *, struct cache_head *);
/* Must store cache_detail in seq_file->private if using next three functions */
extern void *cache_seq_start(struct seq_file *file, loff_t *pos);
diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
index 153082598522..a54a7a3d28f5 100644
--- a/net/sunrpc/auth_gss/svcauth_gss.c
+++ b/net/sunrpc/auth_gss/svcauth_gss.c
@@ -1489,8 +1489,8 @@ svcauth_gss_accept(struct svc_rqst *rqstp, __be32 *authp)
case RPC_GSS_PROC_DESTROY:
if (gss_write_verf(rqstp, rsci->mechctx, gc->gc_seq))
goto auth_err;
- rsci->h.expiry_time = seconds_since_boot();
- set_bit(CACHE_NEGATIVE, &rsci->h.flags);
+ /* Delete the entry from the cache_list and call cache_put */
+ sunrpc_cache_unhash(sn->rsc_cache, &rsci->h);
if (resv->iov_len + 4 > PAGE_SIZE)
goto drop;
svc_putnl(resv, RPC_SUCCESS);
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 8147e8d56eb2..502b9fe9817b 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -1855,3 +1855,15 @@ void sunrpc_cache_unregister_pipefs(struct cache_detail *cd)
}
EXPORT_SYMBOL_GPL(sunrpc_cache_unregister_pipefs);
+void sunrpc_cache_unhash(struct cache_detail *cd, struct cache_head *h)
+{
+ write_lock(&cd->hash_lock);
+ if (!hlist_unhashed(&h->cache_list)){
+ hlist_del_init(&h->cache_list);
+ cd->entries--;
+ write_unlock(&cd->hash_lock);
+ cache_put(h, cd);
+ } else
+ write_unlock(&cd->hash_lock);
+}
+EXPORT_SYMBOL_GPL(sunrpc_cache_unhash);
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH Version 4] SVCAUTH reap the rsc cache entry on RPC_SS_PROC_DESTROY
2017-01-12 21:08 ` J. Bruce Fields
@ 2017-01-12 21:29 ` NeilBrown
0 siblings, 0 replies; 9+ messages in thread
From: NeilBrown @ 2017-01-12 21:29 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: andros, linux-nfs
[-- Attachment #1: Type: text/plain, Size: 423 bytes --]
On Fri, Jan 13 2017, J. Bruce Fields wrote:
> On Sat, Jan 07, 2017 at 10:59:26AM +1100, NeilBrown wrote:
>> How about applying
>> >> > - rsci->h.expiry_time = get_seconds();
>> >> > + rsci->h.expiry_time = seconds_since_boot();
>>
>> first with a Cc: to stable (and a Fixes:), then this patch on top of
>> that without the Cc.
>
> Oh, good idea, thanks. Results below.
Ferpect. Thanks!
NeilBrown
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2017-01-12 21:29 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-12-22 17:38 [PATCH Version 4] SVCAUTH reap the rsc cache entry on RPC_SS_PROC_DESTROY andros
2016-12-22 17:38 ` andros
2017-01-04 20:26 ` J. Bruce Fields
2017-01-04 21:14 ` NeilBrown
2017-01-06 21:01 ` J. Bruce Fields
2017-01-06 23:59 ` NeilBrown
2017-01-12 21:08 ` J. Bruce Fields
2017-01-12 21:29 ` NeilBrown
2016-12-23 22:55 ` NeilBrown
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).