* [PATCH v2] nfsd: serialize filecache garbage collector
@ 2022-05-31 10:34 Wang Yugui
2022-05-31 14:12 ` Chuck Lever III
0 siblings, 1 reply; 3+ messages in thread
From: Wang Yugui @ 2022-05-31 10:34 UTC
To: linux-nfs; +Cc: Wang Yugui
When many (>NFSD_FILE_LRU_THRESHOLD) files are kept open, as in
xfstests generic/531, the nfsd processes are under high CPU load,
and nfsd_file_gc() (the nfsd filecache garbage collector) wastes a lot of CPU time.
Running nfsd_file_gc() concurrently is almost meaningless, so serialize it.
Signed-off-by: Wang Yugui <wangyugui@e16-tech.com>
---
Changes since v1:
- add static to 'atomic_t nfsd_file_gc_running'.
  Thanks to the kernel test robot <lkp@intel.com>.
fs/nfsd/filecache.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index f172412447f5..28a8f8d6d235 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -471,10 +471,15 @@ nfsd_file_lru_walk_list(struct shrink_control *sc)
 	return ret;
 }
 
+/* Running nfsd_file_gc() concurrently is almost meaningless, so serialize it. */
+static atomic_t nfsd_file_gc_running = ATOMIC_INIT(0);
 static void
 nfsd_file_gc(void)
 {
-	nfsd_file_lru_walk_list(NULL);
+	if (atomic_cmpxchg(&nfsd_file_gc_running, 0, 1) == 0) {
+		nfsd_file_lru_walk_list(NULL);
+		atomic_set(&nfsd_file_gc_running, 0);
+	}
 }
 
 static void
--
2.36.1
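
The guard in the patch above behaves like a try-lock: a caller that finds a walk already in progress skips the garbage collection instead of waiting for it. A minimal userspace sketch of that pattern follows, using C11 atomics in place of the kernel's atomic_t. The names gc_running, walks_done, lru_walk() and gc_try_run() are hypothetical stand-ins for illustration, not nfsd code. Note one difference: the kernel's atomic_cmpxchg() returns the old value, while C11's atomic_compare_exchange_strong() returns a success flag.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* 0 = no collector running, 1 = a collector is running. */
static atomic_int gc_running = 0;

/* Counts how many walks actually ran (for illustration only). */
static int walks_done;

/* Hypothetical stand-in for nfsd_file_lru_walk_list(NULL). */
static void lru_walk(void)
{
	/* The walk must only ever run while the flag is held. */
	assert(atomic_load(&gc_running) == 1);
	walks_done++;
}

/*
 * Try-lock style guard: returns true if this caller ran the walk,
 * false if another walk was already in progress and this one was skipped.
 */
static bool gc_try_run(void)
{
	int expected = 0;

	/* Claim the flag only if it is currently 0. */
	if (!atomic_compare_exchange_strong(&gc_running, &expected, 1))
		return false;

	lru_walk();
	atomic_store(&gc_running, 0);	/* release the flag */
	return true;
}
```

Calling gc_try_run() while gc_running is already 1 returns false without walking the list. This bounds the number of concurrent walkers to one, but it does not reduce the cost of a single walk.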
* Re: [PATCH v2] nfsd: serialize filecache garbage collector
2022-05-31 10:34 [PATCH v2] nfsd: serialize filecache garbage collector Wang Yugui
@ 2022-05-31 14:12 ` Chuck Lever III
2022-05-31 14:44 ` Wang Yugui
0 siblings, 1 reply; 3+ messages in thread
From: Chuck Lever III @ 2022-05-31 14:12 UTC
To: Wang Yugui; +Cc: Linux NFS Mailing List
> On May 31, 2022, at 6:34 AM, Wang Yugui <wangyugui@e16-tech.com> wrote:
>
> When many (>NFSD_FILE_LRU_THRESHOLD) files are kept open, as in
> xfstests generic/531, the nfsd processes are under high CPU load,
> and nfsd_file_gc() (the nfsd filecache garbage collector) wastes a lot of CPU time.
Over the past few days, I've been able to reproduce a lot of bad
behavior with generic/531. My test client has 12 physical CPU
cores, and my lab network is 56Gb InfiniBand.
Unfortunately this patch doesn't really begin to address it. For
example, with this patch applied, CPU idle is in single digits
on the NFS server that exports the test's scratch device, and
that server can still get into a soft lock-up. IMO that is
because this change works around the underlying problem but
makes no attempt to root-cause or address that issue.
I agree that the NFS server's behavior needs attention, but I'm
not inclined to apply this particular patch as it is.
> Running nfsd_file_gc() concurrently is almost meaningless, so serialize it.
>
> Signed-off-by: Wang Yugui <wangyugui@e16-tech.com>
> ---
> Changes since v1:
> - add static to 'atomic_t nfsd_file_gc_running'.
>   Thanks to the kernel test robot <lkp@intel.com>.
>
> fs/nfsd/filecache.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> index f172412447f5..28a8f8d6d235 100644
> --- a/fs/nfsd/filecache.c
> +++ b/fs/nfsd/filecache.c
> @@ -471,10 +471,15 @@ nfsd_file_lru_walk_list(struct shrink_control *sc)
> 	return ret;
> }
>
> +/* Running nfsd_file_gc() concurrently is almost meaningless, so serialize it. */
> +static atomic_t nfsd_file_gc_running = ATOMIC_INIT(0);
> static void
> nfsd_file_gc(void)
> {
> -	nfsd_file_lru_walk_list(NULL);
> +	if (atomic_cmpxchg(&nfsd_file_gc_running, 0, 1) == 0) {
> +		nfsd_file_lru_walk_list(NULL);
> +		atomic_set(&nfsd_file_gc_running, 0);
> +	}
> }
>
> static void
> --
> 2.36.1
>
--
Chuck Lever
* Re: [PATCH v2] nfsd: serialize filecache garbage collector
2022-05-31 14:12 ` Chuck Lever III
@ 2022-05-31 14:44 ` Wang Yugui
0 siblings, 0 replies; 3+ messages in thread
From: Wang Yugui @ 2022-05-31 14:44 UTC
To: Chuck Lever III; +Cc: Linux NFS Mailing List
Hi,
> > On May 31, 2022, at 6:34 AM, Wang Yugui <wangyugui@e16-tech.com> wrote:
> >
> > When many (>NFSD_FILE_LRU_THRESHOLD) files are kept open, as in
> > xfstests generic/531, the nfsd processes are under high CPU load,
> > and nfsd_file_gc() (the nfsd filecache garbage collector) wastes a lot of CPU time.
>
> Over the past few days, I've been able to reproduce a lot of bad
> behavior with generic/531. My test client has 12 physical CPU
> cores, and my lab network is 56Gb InfiniBand.
>
> Unfortunately this patch doesn't really begin to address it. For
> example, with this patch applied, CPU idle is in single digits
> on the NFS server that exports the test's scratch device, and
> that server can still get into a soft lock-up. IMO that is
> because this change works around the underlying problem but
> makes no attempt to root-cause or address that issue.
>
> I agree that the NFS server's behavior needs attention, but I'm
> not inclined to apply this particular patch as it is.
Yes, this patch is specific to xfstests generic/531.
In xfstests generic/531, when many (>500K) files are kept open, a
file delete also triggers an LRU walk (and a CPU soft lockup).
Is a large LRU still fast to add to, but very slow when removing an
arbitrary entry?
Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2022/05/31
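
One way to reason about the question above is a toy model of an LRU kept as a doubly linked list. This is only a sketch under stated assumptions and is not the nfsd filecache code: struct entry, struct lru, lru_add(), lru_del() and lru_walk() are hypothetical names. In this model, adding an entry and unlinking an already known entry are both O(1), but a walk that has to step past entries that are still in use visits every entry and may reclaim nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Toy LRU entry; in_use models a file that is still held open. */
struct entry {
	struct entry *prev, *next;
	int in_use;
};

/* Circular doubly linked list with a sentinel head. */
struct lru {
	struct entry head;
	size_t count;
};

static void lru_init(struct lru *l)
{
	l->head.prev = l->head.next = &l->head;
	l->count = 0;
}

/* O(1): append at the tail (most recently used end). */
static void lru_add(struct lru *l, struct entry *e)
{
	e->prev = l->head.prev;
	e->next = &l->head;
	l->head.prev->next = e;
	l->head.prev = e;
	l->count++;
}

/* O(1): unlink an entry whose address is already known. */
static void lru_del(struct lru *l, struct entry *e)
{
	assert(l->count > 0);
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->prev = e->next = NULL;
	l->count--;
}

/*
 * O(n): visit every entry and reclaim only the idle ones.
 * Returns the number reclaimed; *visited receives the number scanned.
 */
static size_t lru_walk(struct lru *l, size_t *visited)
{
	struct entry *e = l->head.next;
	size_t freed = 0;

	*visited = 0;
	while (e != &l->head) {
		struct entry *next = e->next;	/* save before unlinking */

		(*visited)++;
		if (!e->in_use) {
			lru_del(l, e);
			freed++;
		}
		e = next;
	}
	return freed;
}
```

With every entry marked in_use, lru_walk() scans the whole list and returns 0. If each delete or GC trigger starts such a walk, the total cost grows with the number of open files, whether or not the walks are serialized.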