From: dai.ngo@oracle.com
To: Jeff Layton <jlayton@kernel.org>,
Chuck Lever III <chuck.lever@oracle.com>
Cc: Mike Galbraith <efault@gmx.de>,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work
Date: Tue, 10 Jan 2023 11:58:40 -0800 [thread overview]
Message-ID: <8e0cb925-9f73-720d-b402-a7204659ff7f@oracle.com> (raw)
In-Reply-To: <f0f56b451287d17426defe77aee1b1240d2a1b31.camel@kernel.org>
On 1/10/23 11:30 AM, Jeff Layton wrote:
> On Tue, 2023-01-10 at 11:17 -0800, dai.ngo@oracle.com wrote:
>> On 1/10/23 10:34 AM, Jeff Layton wrote:
>>> On Tue, 2023-01-10 at 18:17 +0000, Chuck Lever III wrote:
>>>>> On Jan 10, 2023, at 12:33 PM, Dai Ngo <dai.ngo@oracle.com> wrote:
>>>>>
>>>>>
>>>>> On 1/10/23 2:30 AM, Jeff Layton wrote:
>>>>>> On Mon, 2023-01-09 at 22:48 -0800, Dai Ngo wrote:
>>>>>>> Currently nfsd4_state_shrinker_worker can be scheduled multiple times
>>>>>>> from nfsd4_state_shrinker_count when memory is low. This causes
>>>>>>> the WARN_ON_ONCE in __queue_delayed_work to trigger.
>>>>>>>
>>>>>>> This patch allows only one instance of nfsd4_state_shrinker_worker
>>>>>>> at a time using the nfsd_shrinker_active flag, protected by the
>>>>>>> client_lock.
>>>>>>>
>>>>>>> Replace mod_delayed_work with queue_delayed_work since we
>>>>>>> don't expect to modify the delay of any pending work.
>>>>>>>
>>>>>>> Fixes: 44df6f439a17 ("NFSD: add delegation reaper to react to low memory condition")
>>>>>>> Reported-by: Mike Galbraith <efault@gmx.de>
>>>>>>> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
>>>>>>> ---
>>>>>>> fs/nfsd/netns.h | 1 +
>>>>>>> fs/nfsd/nfs4state.c | 16 ++++++++++++++--
>>>>>>> 2 files changed, 15 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
>>>>>>> index 8c854ba3285b..801d70926442 100644
>>>>>>> --- a/fs/nfsd/netns.h
>>>>>>> +++ b/fs/nfsd/netns.h
>>>>>>> @@ -196,6 +196,7 @@ struct nfsd_net {
>>>>>>> atomic_t nfsd_courtesy_clients;
>>>>>>> struct shrinker nfsd_client_shrinker;
>>>>>>> struct delayed_work nfsd_shrinker_work;
>>>>>>> + bool nfsd_shrinker_active;
>>>>>>> };
>>>>>>> /* Simple check to find out if a given net was properly initialized */
>>>>>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>>>>>>> index ee56c9466304..e00551af6a11 100644
>>>>>>> --- a/fs/nfsd/nfs4state.c
>>>>>>> +++ b/fs/nfsd/nfs4state.c
>>>>>>> @@ -4407,11 +4407,20 @@ nfsd4_state_shrinker_count(struct shrinker *shrink, struct shrink_control *sc)
>>>>>>> struct nfsd_net *nn = container_of(shrink,
>>>>>>> struct nfsd_net, nfsd_client_shrinker);
>>>>>>> + spin_lock(&nn->client_lock);
>>>>>>> + if (nn->nfsd_shrinker_active) {
>>>>>>> + spin_unlock(&nn->client_lock);
>>>>>>> + return 0;
>>>>>>> + }
>>>>>> Is this extra machinery really necessary? The bool and spinlock don't
>>>>>> seem to be needed. Typically there is no issue with calling
>>>>>> queue_delayed_work when the work is already queued. It just returns
>>>>>> false in that case without doing anything.
>>>>> When there are multiple calls to mod_delayed_work/queue_delayed_work
>>>>> we hit the WARN_ON_ONCE's in __queue_delayed_work and __queue_work if
>>>>> the work is queued but not executed yet.
>>>> The delay argument of zero is interesting. If it's set to a value
>>>> greater than zero, do you still see a problem?
>>>>
>>> It should be safe to call it with a delay of 0. If it's always queued
>>> with a delay of 0 though (and it seems to be), you could save a little
>>> space by using a struct work_struct instead.
>> Can I defer this optimization for now? I need some time to look into it.
>>
> I'd like to see it as part of the eventual patch that's merged. There's
> no reason to use a delayed_work struct here at all. You always want the
> shrinker work to run ASAP. It should be a simple conversion.
ok, I'll make the change in v2.
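
A rough sketch of what I have in mind for v2 (untested; names are taken
from the current patch, and the worker's container_of would need the
matching change):

```
/* fs/nfsd/netns.h: a plain work_struct, since the delay is always 0 */
struct work_struct nfsd_shrinker_work;

/* init path: INIT_WORK() instead of INIT_DELAYED_WORK() */
INIT_WORK(&nn->nfsd_shrinker_work, nfsd4_state_shrinker_worker);

/* nfsd4_state_shrinker_count(): queue_work() takes no delay argument */
queue_work(laundry_wq, &nn->nfsd_shrinker_work);
```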
>>> Also, I'm not sure if this is related, but where do you cancel the
>>> nfsd_shrinker_work before tearing down the struct nfsd_net? I'm not sure
>>> that would explain the problem Mike reported, but I think that needs to
>>> be addressed.
>> Yes, good catch. I will add the cancelling in v2 patch.
>>
>>
> Looking over the traces that Mike posted, I suspect this is the real
> bug, particularly if the server is being restarted during this test.
Yes, I noticed the WARN_ON_ONCE(timer->function != delayed_work_timer_fn)
too, and it seems to indicate some kind of corruption. However, I'm not
sure whether Mike's test restarts the nfs-server service. This could also
be a bug in the workqueue code when it's under stress.
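
If the work (or its timer) is still pending when the nfsd_net is torn
down, the timer memory can be reused while the workqueue code still holds
a pointer to it, which would produce exactly that warning. The missing
cancel Jeff mentioned would look something like this (just a sketch; I
still need to confirm that nfs4_state_shutdown_net() is the right place):

```
/* per-net shutdown: unregister the shrinker first so it cannot requeue
 * the work, then wait for any queued or running instance to finish
 * before the nfsd_net is freed */
unregister_shrinker(&nn->nfsd_client_shrinker);
cancel_delayed_work_sync(&nn->nfsd_shrinker_work);
```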
-Dai
>
>>>>> This problem was reported by Mike. I initially tried with only the
>>>>> bool, but that was not enough; that was why the spinlock was added.
>>>>> Mike verified that the patch fixed the problem.
>>>>>
>>>>> -Dai
>>>>>
>>>>>>> count = atomic_read(&nn->nfsd_courtesy_clients);
>>>>>>> if (!count)
>>>>>>> count = atomic_long_read(&num_delegations);
>>>>>>> - if (count)
>>>>>>> - mod_delayed_work(laundry_wq, &nn->nfsd_shrinker_work, 0);
>>>>>>> + if (count) {
>>>>>>> + nn->nfsd_shrinker_active = true;
>>>>>>> + spin_unlock(&nn->client_lock);
>>>>>>> + queue_delayed_work(laundry_wq, &nn->nfsd_shrinker_work, 0);
>>>>>>> + } else
>>>>>>> + spin_unlock(&nn->client_lock);
>>>>>>> return (unsigned long)count;
>>>>>>> }
>>>>>>> @@ -6239,6 +6248,9 @@ nfsd4_state_shrinker_worker(struct work_struct *work)
>>>>>>> courtesy_client_reaper(nn);
>>>>>>> deleg_reaper(nn);
>>>>>>> + spin_lock(&nn->client_lock);
>>>>>>> + nn->nfsd_shrinker_active = 0;
>>>>>>> + spin_unlock(&nn->client_lock);
>>>>>>> }
>>>>>>> static inline __be32 nfs4_check_fh(struct svc_fh *fhp, struct nfs4_stid *stp)
>>>> --
>>>> Chuck Lever
>>>>
>>>>
>>>>
Thread overview: 27+ messages
2023-01-10 6:48 [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work Dai Ngo
2023-01-10 10:30 ` Jeff Layton
2023-01-10 17:33 ` dai.ngo
2023-01-10 18:17 ` Chuck Lever III
2023-01-10 18:34 ` Jeff Layton
2023-01-10 19:17 ` dai.ngo
2023-01-10 19:30 ` Jeff Layton
2023-01-10 19:58 ` dai.ngo [this message]
2023-01-11 2:34 ` Mike Galbraith
2023-01-11 10:15 ` Jeff Layton
2023-01-11 10:55 ` Jeff Layton
2023-01-11 11:19 ` Mike Galbraith
2023-01-11 11:31 ` dai.ngo
2023-01-11 12:26 ` Mike Galbraith
2023-01-11 12:44 ` Jeff Layton
2023-01-11 12:00 ` Jeff Layton
2023-01-11 12:15 ` Mike Galbraith
2023-01-11 12:33 ` Jeff Layton
2023-01-11 13:48 ` Mike Galbraith
2023-01-11 14:01 ` Jeff Layton
2023-01-11 14:16 ` Jeff Layton
2023-01-10 18:46 ` dai.ngo
2023-01-10 18:53 ` Chuck Lever III
2023-01-10 19:07 ` dai.ngo
2023-01-10 19:27 ` Jeff Layton
2023-01-10 19:16 ` Jeff Layton
2023-01-10 14:26 ` Chuck Lever III