* [PATCH RFC 0/3] nfsd: convert nfsd DRC code to use list_lru infrastructure @ 2013-12-05 11:00 Jeff Layton 2013-12-05 11:00 ` [PATCH RFC 1/3] nfsd: don't try to reuse an expired DRC entry off the list Jeff Layton ` (3 more replies) 0 siblings, 4 replies; 16+ messages in thread From: Jeff Layton @ 2013-12-05 11:00 UTC (permalink / raw) To: linux-nfs; +Cc: Christoph Hellwig, J. Bruce Fields This patchset converts the LRU list in the nfsd duplicate reply cache to use the new list_lru infrastructure. Note that this is based on top of the patch that I sent to Bruce earlier this week that fixes the svc_cacherep direct reclaim bug. I'm sending this as an RFC since I'm not 100% convinced it's an improvement. The majorly unintuitive thing about list_lru that I've found is that you can't call list_lru_del from list_lru_walk. So, you need different routines to free an object depending on how it's being freed. Using list_lru also means extra spinlocking since we'd be moving to a per-node LRU, but that's probably not a big deal since all of this is done under the cache_lock anyway. In any case, here's what a conversion to list_lru would look like... Discuss! Jeff Layton (3): nfsd: don't try to reuse an expired DRC entry off the list list_lru: add a new LRU_SKIP_REST lru_status value and handling nfsd: convert DRC code to use list_lru fs/nfsd/nfscache.c | 130 +++++++++++++++++++++++++---------------------- include/linux/list_lru.h | 2 + mm/list_lru.c | 4 +- 3 files changed, 75 insertions(+), 61 deletions(-) -- 1.8.4.2 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH RFC 1/3] nfsd: don't try to reuse an expired DRC entry off the list 2013-12-05 11:00 [PATCH RFC 0/3] nfsd: convert nfsd DRC code to use list_lru infrastructure Jeff Layton @ 2013-12-05 11:00 ` Jeff Layton 2013-12-05 13:29 ` Christoph Hellwig 2013-12-05 11:00 ` [PATCH RFC 2/3] list_lru: add a new LRU_SKIP_REST lru_status value and handling Jeff Layton ` (2 subsequent siblings) 3 siblings, 1 reply; 16+ messages in thread From: Jeff Layton @ 2013-12-05 11:00 UTC (permalink / raw) To: linux-nfs; +Cc: Christoph Hellwig, J. Bruce Fields Currently when we are processing a request, we try to scrape an expired or over-limit entry off the list in preference to allocating a new one from the slab. This is unnecessarily complicated. Just use the slab layer. Signed-off-by: Jeff Layton <jlayton@redhat.com> --- fs/nfsd/nfscache.c | 36 ++++-------------------------------- 1 file changed, 4 insertions(+), 32 deletions(-) diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c index b6af150..f8f060f 100644 --- a/fs/nfsd/nfscache.c +++ b/fs/nfsd/nfscache.c @@ -132,13 +132,6 @@ nfsd_reply_cache_alloc(void) } static void -nfsd_reply_cache_unhash(struct svc_cacherep *rp) -{ - hlist_del_init(&rp->c_hash); - list_del_init(&rp->c_lru); -} - -static void nfsd_reply_cache_free_locked(struct svc_cacherep *rp) { if (rp->c_type == RC_REPLBUFF && rp->c_replvec.iov_base) { @@ -416,22 +409,8 @@ nfsd_cache_lookup(struct svc_rqst *rqstp) /* * Since the common case is a cache miss followed by an insert, - * preallocate an entry. First, try to reuse the first entry on the LRU - * if it works, then go ahead and prune the LRU list. + * preallocate an entry. */ - spin_lock(&cache_lock); - if (!list_empty(&lru_head)) { - rp = list_first_entry(&lru_head, struct svc_cacherep, c_lru); - if (nfsd_cache_entry_expired(rp) || - num_drc_entries >= max_drc_entries) { - nfsd_reply_cache_unhash(rp); - prune_cache_entries(); - goto search_cache; - } - } - - /* No expired ones available, allocate a new one. 
*/ - spin_unlock(&cache_lock); rp = nfsd_reply_cache_alloc(); spin_lock(&cache_lock); if (likely(rp)) { @@ -439,7 +418,9 @@ nfsd_cache_lookup(struct svc_rqst *rqstp) drc_mem_usage += sizeof(*rp); } -search_cache: + /* go ahead and prune the cache */ + prune_cache_entries(); + found = nfsd_cache_search(rqstp, csum); if (found) { if (likely(rp)) @@ -453,15 +434,6 @@ search_cache: goto out; } - /* - * We're keeping the one we just allocated. Are we now over the - * limit? Prune one off the tip of the LRU in trade for the one we - * just allocated if so. - */ - if (num_drc_entries >= max_drc_entries) - nfsd_reply_cache_free_locked(list_first_entry(&lru_head, - struct svc_cacherep, c_lru)); - nfsdstats.rcmisses++; rqstp->rq_cacherep = rp; rp->c_state = RC_INPROG; -- 1.8.4.2 ^ permalink raw reply related [flat|nested] 16+ messages in thread
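The control flow this patch leaves behind (always preallocate from the slab, prune, then search, and drop the preallocated entry on a hit) can be sketched as a toy userspace model. All names below (`cache_lookup`, `cache_search`, `crep`) are hypothetical stand-ins, not the kernel functions:

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model of the simplified nfsd_cache_lookup() flow after the patch:
 * always preallocate, prune, then search; on a hit the preallocated
 * entry is simply freed back to the allocator. */

#define NBUCKETS 16

struct crep {
	unsigned int csum;
	struct crep *hash_next;
};

static struct crep *hash[NBUCKETS];
static long num_entries;

static struct crep *cache_search(unsigned int csum)
{
	struct crep *rp;

	for (rp = hash[csum % NBUCKETS]; rp; rp = rp->hash_next)
		if (rp->csum == csum)
			return rp;
	return NULL;
}

/* returns 1 on a cache hit (request already seen), 0 on a miss */
static int cache_lookup(unsigned int csum)
{
	struct crep *found, *rp = calloc(1, sizeof(*rp)); /* preallocate */

	assert(rp);
	/* prune_cache_entries() would run here, before the search */
	found = cache_search(csum);
	if (found) {
		free(rp);		/* hit: drop the preallocated entry */
		return 1;
	}
	rp->csum = csum;		/* miss: keep the one we allocated */
	rp->hash_next = hash[csum % NBUCKETS];
	hash[csum % NBUCKETS] = rp;
	num_entries++;
	return 0;
}
```

The point of the patch is visible in the model: there is exactly one allocation path (the slab) and no special-case scavenging of expired entries before the search.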
* Re: [PATCH RFC 1/3] nfsd: don't try to reuse an expired DRC entry off the list 2013-12-05 11:00 ` [PATCH RFC 1/3] nfsd: don't try to reuse an expired DRC entry off the list Jeff Layton @ 2013-12-05 13:29 ` Christoph Hellwig 2013-12-05 13:41 ` Jeff Layton 0 siblings, 1 reply; 16+ messages in thread From: Christoph Hellwig @ 2013-12-05 13:29 UTC (permalink / raw) To: Jeff Layton; +Cc: linux-nfs, Christoph Hellwig, J. Bruce Fields On Thu, Dec 05, 2013 at 06:00:51AM -0500, Jeff Layton wrote: > Currently when we are processing a request, we try to scrape an expired > or over-limit entry off the list in preference to allocating a new one > from the slab. > > This is unnecessarily complicated. Just use the slab layer. I really don't think pruning caches from lookup functions is a good idea. Too many chances for locking inversions, unbounded runtime, and other issues. So I'd prefer my earlier version of the patch to remove it entirely if we want to change anything beyond your minimal fix. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH RFC 1/3] nfsd: don't try to reuse an expired DRC entry off the list 2013-12-05 13:29 ` Christoph Hellwig @ 2013-12-05 13:41 ` Jeff Layton 2013-12-05 15:50 ` J. Bruce Fields 0 siblings, 1 reply; 16+ messages in thread From: Jeff Layton @ 2013-12-05 13:41 UTC (permalink / raw) To: Christoph Hellwig; +Cc: linux-nfs, J. Bruce Fields On Thu, 5 Dec 2013 05:29:19 -0800 Christoph Hellwig <hch@infradead.org> wrote: > On Thu, Dec 05, 2013 at 06:00:51AM -0500, Jeff Layton wrote: > > Currently when we are processing a request, we try to scrape an expired > > or over-limit entry off the list in preference to allocating a new one > > from the slab. > > > > This is unnecessarily complicated. Just use the slab layer. > > I really don't think pruning caches from lookup functions is a good > idea. To many chances for locking inversions, unbounded runtime and > other issues. So I'd prefer my earlier version of the patch to > remove it entirely if we want to change anything beyond your minimal > fix. > It's not likely to hit a locking inversion here since we don't have a lot of different locks, but point taken... If we take that out, then that does mean that the cache may grow larger than the max...possibly much larger since the pruner workqueue job only runs every 120s and you can put a lot more entries into the cache in that period. That said, I don't feel too strongly about it, so if you think leaving it to the workqueue job and the shrinker is the right thing to do, then so be it. -- Jeff Layton <jlayton@redhat.com> ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH RFC 1/3] nfsd: don't try to reuse an expired DRC entry off the list 2013-12-05 13:41 ` Jeff Layton @ 2013-12-05 15:50 ` J. Bruce Fields 2013-12-05 16:22 ` Jeff Layton 0 siblings, 1 reply; 16+ messages in thread From: J. Bruce Fields @ 2013-12-05 15:50 UTC (permalink / raw) To: Jeff Layton; +Cc: Christoph Hellwig, linux-nfs On Thu, Dec 05, 2013 at 08:41:56AM -0500, Jeff Layton wrote: > On Thu, 5 Dec 2013 05:29:19 -0800 > Christoph Hellwig <hch@infradead.org> wrote: > > > On Thu, Dec 05, 2013 at 06:00:51AM -0500, Jeff Layton wrote: > > > Currently when we are processing a request, we try to scrape an expired > > > or over-limit entry off the list in preference to allocating a new one > > > from the slab. > > > > > > This is unnecessarily complicated. Just use the slab layer. > > > > I really don't think pruning caches from lookup functions is a good > > idea. To many chances for locking inversions, unbounded runtime and > > other issues. So I'd prefer my earlier version of the patch to > > remove it entirely if we want to change anything beyond your minimal > > fix. > > > > It's not likely to hit a locking inversion here since we don't have a > lot of different locks, but point taken... > > If we take that out, then that does mean that the cache may grow larger > than the max...possibly much larger since the pruner workqueue job > only runs every 120s and you can put a lot more entries into the cache > in that period. Yeah, this scares me. I looked through my old mail but couldn't find your previous results: how long did you find the hash buckets needed to be before the difference was noticeable on your setup? We typically do a failed-lookup-and-insert on every cached operation so I wouldn't be surprised if a workload of small random writes, for example, could produce an unreasonably large cache in that time. --b. 
> > That said, I don't feel too strongly about it, so if you think leaving > it to the workqueue job and the shrinker is the right thing to do, then > so be it. > > -- > Jeff Layton <jlayton@redhat.com> ^ permalink raw reply [flat|nested] 16+ messages in thread
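The growth concern above can be put into rough numbers. Everything in this sketch is hypothetical back-of-envelope arithmetic (the real cap comes from nfsd_cache_size_limit() and the op rate from the workload), but it shows why a 120-second pruning interval with no insert-time limit is worrying:

```c
#include <assert.h>

/* Hypothetical back-of-envelope numbers -- the real cap is derived from
 * available RAM by nfsd_cache_size_limit(), and the op rate depends
 * entirely on the workload. */
enum {
	MAX_DRC_ENTRIES = 65536,	/* hypothetical cache cap */
	PRUNE_INTERVAL  = 120,		/* pruner job period, seconds */
};

/* Every cached op that misses inserts one entry, so between pruner runs
 * the cache can grow by roughly rate * interval entries. */
static long window_growth(long ops_per_sec)
{
	return ops_per_sec * PRUNE_INTERVAL;
}

/* How far past the cap a sustained miss-heavy workload can push the
 * cache before the workqueue job next runs. */
static double overshoot(long ops_per_sec)
{
	return (double)window_growth(ops_per_sec) / MAX_DRC_ENTRIES;
}
```

With these illustrative numbers, 20k small random writes per second would accumulate 2.4 million entries in one window, dozens of times the cap, which is the scenario Bruce describes.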
* Re: [PATCH RFC 1/3] nfsd: don't try to reuse an expired DRC entry off the list 2013-12-05 15:50 ` J. Bruce Fields @ 2013-12-05 16:22 ` Jeff Layton 0 siblings, 0 replies; 16+ messages in thread From: Jeff Layton @ 2013-12-05 16:22 UTC (permalink / raw) To: J. Bruce Fields; +Cc: Christoph Hellwig, linux-nfs On Thu, 5 Dec 2013 10:50:24 -0500 "J. Bruce Fields" <bfields@fieldses.org> wrote: > On Thu, Dec 05, 2013 at 08:41:56AM -0500, Jeff Layton wrote: > > On Thu, 5 Dec 2013 05:29:19 -0800 > > Christoph Hellwig <hch@infradead.org> wrote: > > > > > On Thu, Dec 05, 2013 at 06:00:51AM -0500, Jeff Layton wrote: > > > > Currently when we are processing a request, we try to scrape an expired > > > > or over-limit entry off the list in preference to allocating a new one > > > > from the slab. > > > > > > > > This is unnecessarily complicated. Just use the slab layer. > > > > > > I really don't think pruning caches from lookup functions is a good > > > idea. To many chances for locking inversions, unbounded runtime and > > > other issues. So I'd prefer my earlier version of the patch to > > > remove it entirely if we want to change anything beyond your minimal > > > fix. > > > > > > > It's not likely to hit a locking inversion here since we don't have a > > lot of different locks, but point taken... > > > > If we take that out, then that does mean that the cache may grow larger > > than the max...possibly much larger since the pruner workqueue job > > only runs every 120s and you can put a lot more entries into the cache > > in that period. > > Yeah, this scares me. > > I looked through my old mail but couldn't find your previous results: > how long did you find the hash buckets needed to be before the > difference was noticeable on your setup? > > We typically do a failed-lookup-and-insert on every cached operation so > I wouldn't be surprised if a workload of small random writes, for > example, could produce an unreasonably large cache in that time. 
> I don't think I ever measured that. I did some measurement of how long it took to generate checksums, but that's a different phase of this stuff. I just sort of went with "too long a chain == bad, so keep them as short as possible". > > > > > That said, I don't feel too strongly about it, so if you think leaving > it to the workqueue job and the shrinker is the right thing to do, then > so be it. > > -- > Jeff Layton <jlayton@redhat.com> -- Jeff Layton <jlayton@redhat.com> ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH RFC 2/3] list_lru: add a new LRU_SKIP_REST lru_status value and handling 2013-12-05 11:00 [PATCH RFC 0/3] nfsd: convert nfsd DRC code to use list_lru infrastructure Jeff Layton 2013-12-05 11:00 ` [PATCH RFC 1/3] nfsd: don't try to reuse an expired DRC entry off the list Jeff Layton @ 2013-12-05 11:00 ` Jeff Layton 2013-12-05 13:30 ` Christoph Hellwig 2013-12-05 11:00 ` [PATCH RFC 3/3] nfsd: convert DRC code to use list_lru Jeff Layton 2013-12-05 13:27 ` [PATCH RFC 0/3] nfsd: convert nfsd DRC code to use list_lru infrastructure Christoph Hellwig 3 siblings, 1 reply; 16+ messages in thread From: Jeff Layton @ 2013-12-05 11:00 UTC (permalink / raw) To: linux-nfs; +Cc: Christoph Hellwig, J. Bruce Fields The current list_lru_walk_node implementation always walks the entire list. In some cases, we can be sure that when an entry isn't freeable, none of the rest of the entries on the list will be either. Create a new LRU_SKIP_REST return value that not only indicates that the current entry was skipped, but also that the caller should stop scanning the current node. Signed-off-by: Jeff Layton <jlayton@redhat.com> --- include/linux/list_lru.h | 2 ++ mm/list_lru.c | 4 +++- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h index 3ce5417..6140c3a 100644 --- a/include/linux/list_lru.h +++ b/include/linux/list_lru.h @@ -15,6 +15,8 @@ enum lru_status { LRU_REMOVED, /* item removed from list */ LRU_ROTATE, /* item referenced, give another pass */ LRU_SKIP, /* item cannot be locked, skip */ + LRU_SKIP_REST, /* item isn't freeable, and none of the rest + on the list will be either */ LRU_RETRY, /* item not freeable. May drop the lock internally, but has to return locked. 
*/ }; diff --git a/mm/list_lru.c b/mm/list_lru.c index 72f9dec..abec7fb 100644 --- a/mm/list_lru.c +++ b/mm/list_lru.c @@ -98,6 +98,8 @@ restart: break; case LRU_SKIP: break; + case LRU_SKIP_REST: + goto done; case LRU_RETRY: /* * The lru lock has been dropped, our list traversal is @@ -108,7 +110,7 @@ restart: BUG(); } } - +done: spin_unlock(&nlru->lock); return isolated; } -- 1.8.4.2 ^ permalink raw reply related [flat|nested] 16+ messages in thread
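The short-circuit this patch enables can be modeled in userspace. The walker loop and names below are simplified stand-ins for list_lru_walk_node and the DRC's expiry test, not the kernel code; the key property is that a time-ordered list lets the first unexpired entry terminate the scan:

```c
#include <assert.h>
#include <stddef.h>

enum lru_status { LRU_REMOVED, LRU_SKIP, LRU_SKIP_REST };

struct entry { long timestamp; };

/* Walker model: LRU_SKIP_REST aborts the scan of this node's list,
 * mirroring the "goto done" the patch adds to list_lru_walk_node(). */
static unsigned long walk(struct entry *e, size_t n,
			  enum lru_status (*cb)(struct entry *, void *),
			  void *arg, size_t *visited)
{
	unsigned long freed = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		(*visited)++;
		switch (cb(&e[i], arg)) {
		case LRU_REMOVED:
			freed++;
			break;
		case LRU_SKIP:
			break;
		case LRU_SKIP_REST:
			return freed;	/* nothing younger can qualify */
		}
	}
	return freed;
}

/* Prune callback: the LRU is ordered oldest-first, so the first
 * unexpired entry means every later entry is unexpired too. */
static enum lru_status prune_cb(struct entry *e, void *arg)
{
	long now = *(long *)arg;
	const long expire = 120;	/* stand-in for RC_EXPIRE */

	if (now - e->timestamp <= expire)
		return LRU_SKIP_REST;
	return LRU_REMOVED;
}
```

Without LRU_SKIP_REST the callback could only return LRU_SKIP entry by entry, so a mostly fresh list would still be walked end to end on every prune.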
* Re: [PATCH RFC 2/3] list_lru: add a new LRU_SKIP_REST lru_status value and handling 2013-12-05 11:00 ` [PATCH RFC 2/3] list_lru: add a new LRU_SKIP_REST lru_status value and handling Jeff Layton @ 2013-12-05 13:30 ` Christoph Hellwig 2013-12-05 13:36 ` Jeff Layton 0 siblings, 1 reply; 16+ messages in thread From: Christoph Hellwig @ 2013-12-05 13:30 UTC (permalink / raw) To: Jeff Layton; +Cc: linux-nfs, Christoph Hellwig, J. Bruce Fields, Dave Chinner On Thu, Dec 05, 2013 at 06:00:52AM -0500, Jeff Layton wrote: > The current list_lru_walk_node implementation always walks the entire > list. In some cases, we can be sure that when an entry isn't freeable > that none of the rest of the entries on the list will be either. > > Create a new LRU_SKIP_REST return value that not only indicates that > the current entry was skipped, but that caller should stop scanning > the current node. > > Signed-off-by: Jeff Layton <jlayton@redhat.com> I'd call it LRU_DONE, but otherwise this looks fine to me. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH RFC 2/3] list_lru: add a new LRU_SKIP_REST lru_status value and handling 2013-12-05 13:30 ` Christoph Hellwig @ 2013-12-05 13:36 ` Jeff Layton 0 siblings, 0 replies; 16+ messages in thread From: Jeff Layton @ 2013-12-05 13:36 UTC (permalink / raw) To: Christoph Hellwig; +Cc: linux-nfs, J. Bruce Fields, Dave Chinner On Thu, 5 Dec 2013 05:30:08 -0800 Christoph Hellwig <hch@infradead.org> wrote: > On Thu, Dec 05, 2013 at 06:00:52AM -0500, Jeff Layton wrote: > > The current list_lru_walk_node implementation always walks the entire > > list. In some cases, we can be sure that when an entry isn't freeable > > that none of the rest of the entries on the list will be either. > > > > Create a new LRU_SKIP_REST return value that not only indicates that > > the current entry was skipped, but that caller should stop scanning > > the current node. > > > > Signed-off-by: Jeff Layton <jlayton@redhat.com> > > I'd call it LRU_DONE, but otherwise this looks fine to me. > Ok, if we end up taking this in I'll change the name... Thanks, -- Jeff Layton <jlayton@redhat.com> ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH RFC 3/3] nfsd: convert DRC code to use list_lru 2013-12-05 11:00 [PATCH RFC 0/3] nfsd: convert nfsd DRC code to use list_lru infrastructure Jeff Layton 2013-12-05 11:00 ` [PATCH RFC 1/3] nfsd: don't try to reuse an expired DRC entry off the list Jeff Layton 2013-12-05 11:00 ` [PATCH RFC 2/3] list_lru: add a new LRU_SKIP_REST lru_status value and handling Jeff Layton @ 2013-12-05 11:00 ` Jeff Layton 2013-12-05 13:41 ` Christoph Hellwig 2013-12-05 13:27 ` [PATCH RFC 0/3] nfsd: convert nfsd DRC code to use list_lru infrastructure Christoph Hellwig 3 siblings, 1 reply; 16+ messages in thread From: Jeff Layton @ 2013-12-05 11:00 UTC (permalink / raw) To: linux-nfs; +Cc: Christoph Hellwig, J. Bruce Fields Rather than keeping our own LRU list, convert the nfsd DRC code to use the new list_lru routines. Signed-off-by: Jeff Layton <jlayton@redhat.com> --- fs/nfsd/nfscache.c | 94 ++++++++++++++++++++++++++++++++++++++---------------- 1 file changed, 66 insertions(+), 28 deletions(-) diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c index f8f060f..7f5480f 100644 --- a/fs/nfsd/nfscache.c +++ b/fs/nfsd/nfscache.c @@ -13,6 +13,7 @@ #include <linux/highmem.h> #include <linux/log2.h> #include <linux/hash.h> +#include <linux/list_lru.h> #include <net/checksum.h> #include "nfsd.h" @@ -28,7 +29,7 @@ #define TARGET_BUCKET_SIZE 64 static struct hlist_head * cache_hash; -static struct list_head lru_head; +static struct list_lru lru_head; static struct kmem_cache *drc_slab; /* max number of entries allowed in the cache */ @@ -132,7 +133,7 @@ nfsd_reply_cache_alloc(void) } static void -nfsd_reply_cache_free_locked(struct svc_cacherep *rp) +__nfsd_reply_cache_free_locked(struct svc_cacherep *rp) { if (rp->c_type == RC_REPLBUFF && rp->c_replvec.iov_base) { drc_mem_usage -= rp->c_replvec.iov_len; @@ -140,13 +141,26 @@ nfsd_reply_cache_free_locked(struct svc_cacherep *rp) } if (!hlist_unhashed(&rp->c_hash)) hlist_del(&rp->c_hash); - list_del(&rp->c_lru); --num_drc_entries; 
drc_mem_usage -= sizeof(*rp); kmem_cache_free(drc_slab, rp); } static void +nfsd_reply_cache_free_locked(struct svc_cacherep *rp) +{ + list_lru_del(&lru_head, &rp->c_lru); + __nfsd_reply_cache_free_locked(rp); +} + +static void +nfsd_reply_cache_free_isolate(struct svc_cacherep *rp) +{ + list_del(&rp->c_lru); + __nfsd_reply_cache_free_locked(rp); +} + +static void nfsd_reply_cache_free(struct svc_cacherep *rp) { spin_lock(&cache_lock); @@ -156,50 +170,66 @@ nfsd_reply_cache_free(struct svc_cacherep *rp) int nfsd_reply_cache_init(void) { + int ret; unsigned int hashsize; - INIT_LIST_HEAD(&lru_head); max_drc_entries = nfsd_cache_size_limit(); num_drc_entries = 0; hashsize = nfsd_hashsize(max_drc_entries); maskbits = ilog2(hashsize); register_shrinker(&nfsd_reply_cache_shrinker); + + ret = list_lru_init(&lru_head); + if (ret) + goto out_error; + + ret = -ENOMEM; drc_slab = kmem_cache_create("nfsd_drc", sizeof(struct svc_cacherep), 0, 0, NULL); if (!drc_slab) - goto out_nomem; + goto out_error; cache_hash = kcalloc(hashsize, sizeof(struct hlist_head), GFP_KERNEL); if (!cache_hash) - goto out_nomem; + goto out_error; return 0; -out_nomem: - printk(KERN_ERR "nfsd: failed to allocate reply cache\n"); +out_error: + printk(KERN_ERR "nfsd: failed to setup reply cache: %d\n", ret); nfsd_reply_cache_shutdown(); - return -ENOMEM; + return ret; } -void nfsd_reply_cache_shutdown(void) +static enum lru_status +nfsd_purge_lru_entry(struct list_head *item, spinlock_t *lock, void *cb_arg) { - struct svc_cacherep *rp; + struct svc_cacherep *rp = list_entry(item, struct svc_cacherep, c_lru); + nfsd_reply_cache_free_locked(rp); + return LRU_REMOVED; +} + +void nfsd_reply_cache_shutdown(void) +{ unregister_shrinker(&nfsd_reply_cache_shrinker); cancel_delayed_work_sync(&cache_cleaner); - while (!list_empty(&lru_head)) { - rp = list_entry(lru_head.next, struct svc_cacherep, c_lru); - nfsd_reply_cache_free_locked(rp); - } + /* In principle, nothing should be altering the list now, but... 
*/ + spin_lock(&cache_lock); + list_lru_walk(&lru_head, nfsd_purge_lru_entry, NULL, ULONG_MAX); + spin_unlock(&cache_lock); - kfree (cache_hash); + kfree(cache_hash); cache_hash = NULL; if (drc_slab) { kmem_cache_destroy(drc_slab); drc_slab = NULL; } + + list_lru_destroy(&lru_head); + memset(&lru_head, 0, sizeof(lru_head)); } /* @@ -210,7 +240,8 @@ static void lru_put_end(struct svc_cacherep *rp) { rp->c_timestamp = jiffies; - list_move_tail(&rp->c_lru, &lru_head); + list_lru_del(&lru_head, &rp->c_lru); + list_lru_add(&lru_head, &rp->c_lru); schedule_delayed_work(&cache_cleaner, RC_EXPIRE); } @@ -231,23 +262,30 @@ nfsd_cache_entry_expired(struct svc_cacherep *rp) time_after(jiffies, rp->c_timestamp + RC_EXPIRE); } +static enum lru_status +nfsd_purge_expired_entry(struct list_head *item, spinlock_t *lock, void *cb_arg) +{ + struct svc_cacherep *rp = list_entry(item, struct svc_cacherep, c_lru); + + if (!nfsd_cache_entry_expired(rp) && + num_drc_entries <= max_drc_entries) + return LRU_SKIP_REST; + + nfsd_reply_cache_free_isolate(rp); + return LRU_REMOVED; +} + /* * Walk the LRU list and prune off entries that are older than RC_EXPIRE. * Also prune the oldest ones when the total exceeds the max number of entries. */ -static long +static unsigned long prune_cache_entries(void) { - struct svc_cacherep *rp, *tmp; - long freed = 0; + unsigned long freed; - list_for_each_entry_safe(rp, tmp, &lru_head, c_lru) { - if (!nfsd_cache_entry_expired(rp) && - num_drc_entries <= max_drc_entries) - break; - nfsd_reply_cache_free_locked(rp); - freed++; - } + freed = list_lru_walk(&lru_head, nfsd_purge_expired_entry, NULL, + ULONG_MAX); /* * Conditionally rearm the job. If we cleaned out the list, then @@ -255,7 +293,7 @@ prune_cache_entries(void) * Otherwise, we rearm the job or modify the existing one to run in * RC_EXPIRE since we just ran the pruner. 
*/ - if (list_empty(&lru_head)) + if (num_drc_entries == 0) cancel_delayed_work(&cache_cleaner); else mod_delayed_work(system_wq, &cache_cleaner, RC_EXPIRE); -- 1.8.4.2 ^ permalink raw reply related [flat|nested] 16+ messages in thread
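One consequence of the conversion is visible in lru_put_end(): list_lru has no move-to-tail primitive, so the patch replaces the old list_move_tail() with a list_lru_del()/list_lru_add() pair. A userspace sketch (simplified model, not the kernel API) of why that pair preserves the same ordering, at the cost of the real API taking the per-node lock twice:

```c
#include <assert.h>
#include <stddef.h>

struct item { struct item *prev, *next; };
struct lru { struct item head; long nr_items; };

static void lru_init(struct lru *l)
{
	l->head.prev = l->head.next = &l->head;
	l->nr_items = 0;
}

/* list_lru_add() adds at the tail (the MRU end) and bumps the count */
static void lru_add(struct lru *l, struct item *i)
{
	i->prev = l->head.prev;
	i->next = &l->head;
	l->head.prev->next = i;
	l->head.prev = i;
	l->nr_items++;
}

static void lru_del(struct lru *l, struct item *i)
{
	i->prev->next = i->next;
	i->next->prev = i->prev;
	i->prev = i->next = i;
	l->nr_items--;
}

/* lru_put_end() as in the patch: a del/add pair instead of the old
 * single list_move_tail().  The net ordering is identical. */
static void lru_put_end(struct lru *l, struct item *i)
{
	lru_del(l, i);
	lru_add(l, i);
}
```

As the cover letter notes, the extra locking this implies is probably tolerable here only because the DRC already serializes everything under cache_lock.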
* Re: [PATCH RFC 3/3] nfsd: convert DRC code to use list_lru 2013-12-05 11:00 ` [PATCH RFC 3/3] nfsd: convert DRC code to use list_lru Jeff Layton @ 2013-12-05 13:41 ` Christoph Hellwig 2013-12-05 13:48 ` Jeff Layton 0 siblings, 1 reply; 16+ messages in thread From: Christoph Hellwig @ 2013-12-05 13:41 UTC (permalink / raw) To: Jeff Layton; +Cc: linux-nfs, Christoph Hellwig, J. Bruce Fields > static void > -nfsd_reply_cache_free_locked(struct svc_cacherep *rp) > +__nfsd_reply_cache_free_locked(struct svc_cacherep *rp) > { > if (rp->c_type == RC_REPLBUFF && rp->c_replvec.iov_base) { > drc_mem_usage -= rp->c_replvec.iov_len; > @@ -140,13 +141,26 @@ nfsd_reply_cache_free_locked(struct svc_cacherep *rp) > } > if (!hlist_unhashed(&rp->c_hash)) > hlist_del(&rp->c_hash); > - list_del(&rp->c_lru); > --num_drc_entries; > drc_mem_usage -= sizeof(*rp); > kmem_cache_free(drc_slab, rp); It would be better to move the hash list deletion out of this and keep this as a plain nfsd_reply_cache_free(). E.g. in the lookup cache hit case we've never linked any item and would want this low-level function. > } > > static void > +nfsd_reply_cache_free_locked(struct svc_cacherep *rp) > +{ > + list_lru_del(&lru_head, &rp->c_lru); > + __nfsd_reply_cache_free_locked(rp); > +} > + > +static void > +nfsd_reply_cache_free_isolate(struct svc_cacherep *rp) > +{ > + list_del(&rp->c_lru); > + __nfsd_reply_cache_free_locked(rp); > +} Should be merged into the only caller. 
> + > +static void > nfsd_reply_cache_free(struct svc_cacherep *rp) > { > spin_lock(&cache_lock); > @@ -156,50 +170,66 @@ nfsd_reply_cache_free(struct svc_cacherep *rp) > > +static enum lru_status > +nfsd_purge_lru_entry(struct list_head *item, spinlock_t *lock, void *cb_arg) > { > - struct svc_cacherep *rp; > + struct svc_cacherep *rp = list_entry(item, struct svc_cacherep, c_lru); > > + nfsd_reply_cache_free_locked(rp); > + return LRU_REMOVED; > +} > + > +void nfsd_reply_cache_shutdown(void) > +{ > unregister_shrinker(&nfsd_reply_cache_shrinker); > cancel_delayed_work_sync(&cache_cleaner); > > - while (!list_empty(&lru_head)) { > - rp = list_entry(lru_head.next, struct svc_cacherep, c_lru); > - nfsd_reply_cache_free_locked(rp); > - } > + /* In principle, nothing should be altering the list now, but... */ > + spin_lock(&cache_lock); > + list_lru_walk(&lru_head, nfsd_purge_lru_entry, NULL, ULONG_MAX); > + spin_unlock(&cache_lock); This needs the version that does the list_del internally too, doesn't it? I'd suggest to just use nfsd_purge_expired_entry with a forced flag in the argument. > + memset(&lru_head, 0, sizeof(lru_head)); don't think there's much of a point. All together doesn't seem too helpful as long as the DRC keeps its own timing based purge and similar bits unfortunately. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH RFC 3/3] nfsd: convert DRC code to use list_lru 2013-12-05 13:41 ` Christoph Hellwig @ 2013-12-05 13:48 ` Jeff Layton 2013-12-05 15:58 ` J. Bruce Fields 0 siblings, 1 reply; 16+ messages in thread From: Jeff Layton @ 2013-12-05 13:48 UTC (permalink / raw) To: Christoph Hellwig; +Cc: linux-nfs, J. Bruce Fields On Thu, 5 Dec 2013 05:41:29 -0800 Christoph Hellwig <hch@infradead.org> wrote: > > static void > > -nfsd_reply_cache_free_locked(struct svc_cacherep *rp) > > +__nfsd_reply_cache_free_locked(struct svc_cacherep *rp) > > { > > if (rp->c_type == RC_REPLBUFF && rp->c_replvec.iov_base) { > > drc_mem_usage -= rp->c_replvec.iov_len; > > @@ -140,13 +141,26 @@ nfsd_reply_cache_free_locked(struct svc_cacherep *rp) > > } > > if (!hlist_unhashed(&rp->c_hash)) > > hlist_del(&rp->c_hash); > > - list_del(&rp->c_lru); > > --num_drc_entries; > > drc_mem_usage -= sizeof(*rp); > > kmem_cache_free(drc_slab, rp); > > I would be better to move the hash list deletion out of this and > keep this as a plain nfsd_reply_cache_free(). E.g. in the lookup > cache hit case we've never linked any item and would want this low-level > function. > Ok. > > } > > > > static void > > +nfsd_reply_cache_free_locked(struct svc_cacherep *rp) > > +{ > > + list_lru_del(&lru_head, &rp->c_lru); > > + __nfsd_reply_cache_free_locked(rp); > > +} > > + > > +static void > > +nfsd_reply_cache_free_isolate(struct svc_cacherep *rp) > > +{ > > + list_del(&rp->c_lru); > > + __nfsd_reply_cache_free_locked(rp); > > +} > > Should be merged into the only caller. > Ok. 
> > + > > +static void > > nfsd_reply_cache_free(struct svc_cacherep *rp) > > { > > spin_lock(&cache_lock); > > @@ -156,50 +170,66 @@ nfsd_reply_cache_free(struct svc_cacherep *rp) > > > > > +static enum lru_status > > +nfsd_purge_lru_entry(struct list_head *item, spinlock_t *lock, void *cb_arg) > > { > > - struct svc_cacherep *rp; > > + struct svc_cacherep *rp = list_entry(item, struct svc_cacherep, c_lru); > > > > + nfsd_reply_cache_free_locked(rp); > > + return LRU_REMOVED; > > +} > > + > > +void nfsd_reply_cache_shutdown(void) > > +{ > > unregister_shrinker(&nfsd_reply_cache_shrinker); > > cancel_delayed_work_sync(&cache_cleaner); > > > > - while (!list_empty(&lru_head)) { > > - rp = list_entry(lru_head.next, struct svc_cacherep, c_lru); > > - nfsd_reply_cache_free_locked(rp); > > - } > > + /* In principle, nothing should be altering the list now, but... */ > > + spin_lock(&cache_lock); > > + list_lru_walk(&lru_head, nfsd_purge_lru_entry, NULL, ULONG_MAX); > > + spin_unlock(&cache_lock); > > This needs the version that does the list_del internally to, doesn't > it? I'd suggest to just use sd_purge_expired_entry with a forced > flag in the argument. > Yes it does. Good catch. > > + memset(&lru_head, 0, sizeof(lru_head)); > > don't think there's much of a point. > Yeah, I got spooked by the fact that lru->node isn't zeroed out in list_lru_destroy, but on list_lru_init it should get clobbered anyway so the memset is pointless. > > All together doesn't seem to helpful as long as the DRC keeps > it's own timing based purge and similar bits unfortunatel. Agreed. It might make sense to do this in the context of a larger redesign but otherwise I can't get too excited about this set, TBH... -- Jeff Layton <jlayton@redhat.com> ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH RFC 3/3] nfsd: convert DRC code to use list_lru 2013-12-05 13:48 ` Jeff Layton @ 2013-12-05 15:58 ` J. Bruce Fields 2013-12-11 16:31 ` J. Bruce Fields 0 siblings, 1 reply; 16+ messages in thread From: J. Bruce Fields @ 2013-12-05 15:58 UTC (permalink / raw) To: Jeff Layton; +Cc: Christoph Hellwig, linux-nfs On Thu, Dec 05, 2013 at 08:48:25AM -0500, Jeff Layton wrote: > On Thu, 5 Dec 2013 05:41:29 -0800 > Christoph Hellwig <hch@infradead.org> wrote: > > All together doesn't seem to helpful as long as the DRC keeps > > it's own timing based purge and similar bits unfortunatel. > > Agreed. It might make sense to do this in the context of a larger > redesign but otherwise I can't get too excited about this set, TBH... Well, patch 1/3 still deletes some code and doesn't have any obvious downsides. --b. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH RFC 3/3] nfsd: convert DRC code to use list_lru 2013-12-05 15:58 ` J. Bruce Fields @ 2013-12-11 16:31 ` J. Bruce Fields 0 siblings, 0 replies; 16+ messages in thread From: J. Bruce Fields @ 2013-12-11 16:31 UTC (permalink / raw) To: Jeff Layton; +Cc: Christoph Hellwig, linux-nfs On Thu, Dec 05, 2013 at 10:58:23AM -0500, J. Bruce Fields wrote: > On Thu, Dec 05, 2013 at 08:48:25AM -0500, Jeff Layton wrote: > > On Thu, 5 Dec 2013 05:41:29 -0800 > > Christoph Hellwig <hch@infradead.org> wrote: > > > All together doesn't seem to helpful as long as the DRC keeps > > > it's own timing based purge and similar bits unfortunatel. > > > > Agreed. It might make sense to do this in the context of a larger > > redesign but otherwise I can't get too excited about this set, TBH... > > Well, patch 1/3 still deletes some code and doesn't have any obvious > downsides. I'm applying that one patch and dropping the rest for now. I understand Christoph's request to remove the one-off cache pruning but first want to be convinced that's safe. --b. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH RFC 0/3] nfsd: convert nfsd DRC code to use list_lru infrastructure 2013-12-05 11:00 [PATCH RFC 0/3] nfsd: convert nfsd DRC code to use list_lru infrastructure Jeff Layton ` (2 preceding siblings ...) 2013-12-05 11:00 ` [PATCH RFC 3/3] nfsd: convert DRC code to use list_lru Jeff Layton @ 2013-12-05 13:27 ` Christoph Hellwig 2013-12-05 13:37 ` Jeff Layton 3 siblings, 1 reply; 16+ messages in thread From: Christoph Hellwig @ 2013-12-05 13:27 UTC (permalink / raw) To: Jeff Layton; +Cc: linux-nfs, Christoph Hellwig, J. Bruce Fields, Dave Chinner On Thu, Dec 05, 2013 at 06:00:50AM -0500, Jeff Layton wrote: > This patchset converts the LRU list in the nfsd duplicate reply cache to > use the new list_lru infrastrucure. Note that this is based on top of > the patch that I sent to Bruce earlier this week that fixes the > svc_cacherep direct reclaim bug. I'm sending this as an RFC since I'm > not 100% convinced it's an improvement. > > The majorly unintuitive thing about list_lru that I've found is that > you can't call call list_lru_del from list_lru_walk. So, you need > different routines to free an object depending on how it's being freed. That's because list_lru_walk already does all the accounting. You can simply do a list_del_init and then return LRU_MOVED and you're done. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH RFC 0/3] nfsd: convert nfsd DRC code to use list_lru infrastructure 2013-12-05 13:27 ` [PATCH RFC 0/3] nfsd: convert nfsd DRC code to use list_lru infrastructure Christoph Hellwig @ 2013-12-05 13:37 ` Jeff Layton 0 siblings, 0 replies; 16+ messages in thread From: Jeff Layton @ 2013-12-05 13:37 UTC (permalink / raw) To: Christoph Hellwig; +Cc: linux-nfs, J. Bruce Fields, Dave Chinner On Thu, 5 Dec 2013 05:27:17 -0800 Christoph Hellwig <hch@infradead.org> wrote: > On Thu, Dec 05, 2013 at 06:00:50AM -0500, Jeff Layton wrote: > > This patchset converts the LRU list in the nfsd duplicate reply cache to > > use the new list_lru infrastrucure. Note that this is based on top of > > the patch that I sent to Bruce earlier this week that fixes the > > svc_cacherep direct reclaim bug. I'm sending this as an RFC since I'm > > not 100% convinced it's an improvement. > > > > The majorly unintuitive thing about list_lru that I've found is that > > you can't call call list_lru_del from list_lru_walk. So, you need > > different routines to free an object depending on how it's being freed. > > That's because list_lru_walk already does all the accounting. You can > simply do an list_del_init and then return LRU_MOVED and you're done. > Well no...we have to take it off the hlist too, kmem_cache_free the object and fix up the accounting for the stats. list_lru handles the accounting for the lru list itself, but not for the rest of the housekeeping we have to do. -- Jeff Layton <jlayton@redhat.com> ^ permalink raw reply [flat|nested] 16+ messages in thread
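The two sides of this exchange fit in one small userspace model: an isolate callback in the style suggested above unlinks with a plain list_del_init() and returns the "removed" status (the in-tree lru_status value for this is LRU_REMOVED), letting the walker own the LRU count, while the hash unlink and the DRC stats remain the callback's own job. All names and the open-coded list operations below are illustrative stand-ins, not kernel code:

```c
#include <assert.h>
#include <stddef.h>

enum lru_status { LRU_REMOVED, LRU_SKIP };

/* Userspace stand-in for svc_cacherep: linked on an LRU list *and* a
 * hash chain, with global stats the LRU walker knows nothing about. */
struct rep {
	struct rep *lru_prev, *lru_next;	/* like c_lru  */
	struct rep *hash_next;			/* like c_hash */
};

static long num_drc_entries;	/* models num_drc_entries */
static long drc_mem_usage;	/* models drc_mem_usage   */

static enum lru_status purge_cb(struct rep *r, struct rep **hash_bucket)
{
	struct rep **pp;

	/* list_del_init() equivalent: the walker does the LRU count */
	r->lru_prev->lru_next = r->lru_next;
	r->lru_next->lru_prev = r->lru_prev;
	r->lru_prev = r->lru_next = r;

	/* hlist_del() equivalent -- the "rest of the housekeeping" */
	for (pp = hash_bucket; *pp; pp = &(*pp)->hash_next) {
		if (*pp == r) {
			*pp = r->hash_next;
			break;
		}
	}

	num_drc_entries--;
	drc_mem_usage -= (long)sizeof(*r); /* kmem_cache_free() would follow */
	return LRU_REMOVED;
}
```

So the list_lru accounting does come for free, as noted above, but a callback that only did list_del_init would leak hash linkage and stats, which is the objection in the reply.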