linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Chuck Lever <chuck.lever@oracle.com>
To: Olga Kornievskaia <aglo@umich.edu>
Cc: NeilBrown <neil@brown.name>,
	Olga Kornievskaia <okorniev@redhat.com>,
	jlayton@kernel.org, linux-nfs@vger.kernel.org,
	Dai.Ngo@oracle.com, tom@talpey.com
Subject: Re: [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
Date: Thu, 21 Aug 2025 14:39:00 -0400	[thread overview]
Message-ID: <fc97f4d3-6f0d-4ead-adb5-68cea81bd38f@oracle.com> (raw)
In-Reply-To: <CAN-5tyF7wqG6_n_x4rNCKeLmkChGDA74TWduun8HahVuHfHbZw@mail.gmail.com>

On 8/21/25 2:33 PM, Olga Kornievskaia wrote:
> On Thu, Aug 21, 2025 at 2:24 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>>
>> On 8/21/25 2:20 PM, Olga Kornievskaia wrote:
>>> On Thu, Aug 21, 2025 at 11:09 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>>>>
>>>> On 8/21/25 9:56 AM, Olga Kornievskaia wrote:
>>>>> On Wed, Aug 20, 2025 at 7:15 PM NeilBrown <neil@brown.name> wrote:
>>>>>>
>>>>>> On Thu, 14 Aug 2025, Olga Kornievskaia wrote:
>>>>>>> On Tue, Aug 12, 2025 at 8:05 PM NeilBrown <neil@brown.name> wrote:
>>>>>>>>
>>>>>>>> On Wed, 13 Aug 2025, Olga Kornievskaia wrote:
>>>>>>>>> When nfsd is in grace and receives an NLM LOCK request which turns
>>>>>>>>> out to have a conflicting delegation, return that the server is in
>>>>>>>>> grace.
>>>>>>>>>
>>>>>>>>> Reviewed-by: Jeff Layton <jlayton@redhat.com>
>>>>>>>>> Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
>>>>>>>>> ---
>>>>>>>>>  fs/lockd/svc4proc.c | 15 +++++++++++++--
>>>>>>>>>  1 file changed, 13 insertions(+), 2 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
>>>>>>>>> index 109e5caae8c7..7ac4af5c9875 100644
>>>>>>>>> --- a/fs/lockd/svc4proc.c
>>>>>>>>> +++ b/fs/lockd/svc4proc.c
>>>>>>>>> @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp)
>>>>>>>>>       resp->cookie = argp->cookie;
>>>>>>>>>
>>>>>>>>>       /* Obtain client and file */
>>>>>>>>> -     if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file)))
>>>>>>>>> -             return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;
>>>>>>>>> +     resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file);
>>>>>>>>> +     switch (resp->status) {
>>>>>>>>> +     case 0:
>>>>>>>>> +             break;
>>>>>>>>> +     case nlm_drop_reply:
>>>>>>>>> +             if (locks_in_grace(SVC_NET(rqstp))) {
>>>>>>>>> +                     resp->status = nlm_lck_denied_grace_period;
>>>>>>>>
>>>>>>>> I think this is wrong.  If the lock request has the "reclaim" flag set,
>>>>>>>> then nlm_lck_denied_grace_period is not a meaningful error.
>>>>>>>> nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay
>>>>>>>> getting a response to an upcall to mountd.  For NLM the request really
>>>>>>>> must be dropped.
>>>>>>>
>>>>>>> Thank you for pointing out this case so we are suggesting to.
>>>>>>>
>>>>>>> if (locks_in_grace(SVC_NET(rqstp)) && !argp->reclaim)
>>>>>>>
>>>>>>> However, I've been looking and looking but I cannot figure out how
>>>>>>> nlm4svc_retrieve_args() would ever get nlm_drop_reply. You say it can
>>>>>>> happen during the upcall to mountd. So that happens within nfsd_open()
>>>>>>> call and a part of fh_verify().
>>>>>>> To return nlm_drop_reply, nlm_fopen() must have gotten nfserr_dropit
>>>>>>> from the nfsd_open().  I have searched and searched but I don't see
>>>>>>> anything that ever sets nfserr_dropit (NFSERR_DROPIT).
>>>>>>>
>>>>>>> I searched the logs and nfserr_dropit was an error that was EAGAIN
>>>>>>> translated into but then removed by the following patch.
>>>>>>
>>>>>> Oh.  I didn't know that.
>>>>>> We now use RQ_DROPME instead.
>>>>>> I guess we should remove NFSERR_DROPIT completely as it not used at all
>>>>>> any more.
>>>>>>
>>>>>> Though returning nfserr_jukebox to an v2 client isn't a good idea....
>>>>>
>>>>> I'll take your word for you.
>>>>>
>>>>>> So I guess my main complaint isn't valid, but I still don't like this
>>>>>> patch.  It seems an untidy place to put the locks_in_grace test.
>>>>>> Other callers of nlm4svc_retrieve_args() check locks_in_grace() before
>>>>>> making that call.  __nlm4svc_proc_lock probably should too.  Or is there
>>>>>> a reason that it is delayed until the middle of nlmsvc_lock()..
>>>>>
>>>>> I thought the same about why not adding the in_grace check and decided
>>>>> that it was probably because you dont want to deny a lock if there are
>>>>> no conflicts. If we add it, somebody might notice and will complain
>>>>> that they can't get their lock when there are no conflicts.
>>>>>
>>>>>> The patch is not needed and isn't clearly an improvement, so I would
>>>>>> rather it were dropped.
>>>>>
>>>>> I'm not against dropping this patch if the conclusion is that dropping
>>>>> the packet is no worse than returning in_grace error.
>>>>
>>>> I dropped both of these from nfsd-testing. If a fix is still needed,
>>>> let's start again with fresh patches.
>>>
>>> Can you clarify when you said "both"?
>>>
>>> What objections are there for the 1st patch in the series. It solves a
>>> problem and a fix is needed.
>>
>> There are two reasons to drop the first patch.
>>
>> 1. Neil's "remove nfserr_dropit" patch doesn't apply unless both patches
>> are reverted.
>>
>> 2. As I said above, NFSv2 does not have a mechanism like NFS3ERR_JUKEBOX
>> to request that the client wait a bit and resend.
> 
> ERR_JUKEBOX is not returned to another v2 or v3.
> 
> Patch1 (nfsd: nfserr_jukebox in nlm_fopen should lead to a retry)
> translates err_jukebox into the nlm_drop_reply and returns to lockd.
> As the result, no error is returned to the client but the reply is
> dropped all together.

If you want NLM to drop the response, then set RQ_DROPME. Using
nfserr_jukebox here is confusing -- it means "return a response to the
client that requests a resend". You want NLM to /not send a response/,
and we have a specific mechanism for that.


>> So, if 1/2 has been tested with NFSv2 and does not cause NFSD to leak
>> nfserr_jukebox to NFSv2 clients, then please rebase that one on the
>> current nfsd-testing branch and post it again.
>>
>>
>>> This one I agree is not needed but I
>>> thought was an improvement.
>>>
>>>>
>>>>
>>>>>> Thanks,
>>>>>> NeilBrown
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> commit 062304a815fe10068c478a4a3f28cf091c55cb82
>>>>>>> Author: J. Bruce Fields <bfields@fieldses.org>
>>>>>>> Date:   Sun Jan 2 22:05:33 2011 -0500
>>>>>>>
>>>>>>>     nfsd: stop translating EAGAIN to nfserr_dropit
>>>>>>>
>>>>>>> diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
>>>>>>> index dc9c2e3fd1b8..fd608a27a8d5 100644
>>>>>>> --- a/fs/nfsd/nfsproc.c
>>>>>>> +++ b/fs/nfsd/nfsproc.c
>>>>>>> @@ -735,7 +735,8 @@ nfserrno (int errno)
>>>>>>>                 { nfserr_stale, -ESTALE },
>>>>>>>                 { nfserr_jukebox, -ETIMEDOUT },
>>>>>>>                 { nfserr_jukebox, -ERESTARTSYS },
>>>>>>> -               { nfserr_dropit, -EAGAIN },
>>>>>>> +               { nfserr_jukebox, -EAGAIN },
>>>>>>> +               { nfserr_jukebox, -EWOULDBLOCK },
>>>>>>>                 { nfserr_jukebox, -ENOMEM },
>>>>>>>                 { nfserr_badname, -ESRCH },
>>>>>>>                 { nfserr_io, -ETXTBSY },
>>>>>>>
>>>>>>> so if fh_verify is failing whatever it is returning, it is not
>>>>>>> nfserr_dropit nor is it nfserr_jukebox which means nlm_fopen() would
>>>>>>> translate it to nlm_failed which with my patch would not trigger
>>>>>>> nlm_lck_denied_grace_period error but resp->status would be set to
>>>>>>> nlm_failed.
>>>>>>>
>>>>>>> So I circle back to I hope that convinces you that we don't need a
>>>>>>> check for the reclaim lock.
>>>>>>>
>>>>>>> I believe nlm_drop_reply is nfsd_open's jukebox error, one of which is
>>>>>>> delegation recall. it can be a memory failure. But I'm sure when
>>>>>>> EWOULDBLOCK occurs.
>>>>>>>
>>>>>>>> At the very least we need to guard against the reclaim flag being set in
>>>>>>>> the above test.  But I would much rather a more clear distinction were
>>>>>>>> made between "drop because of a conflicting delegation" and "drop
>>>>>>>> because of a delay getting upcall response".
>>>>>>>> Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4
>>>>>>>> (and ideally nlm2) handles appropriately.
>>>>>>>
>>>>>>>
>>>>>>>> NeilBrown
>>>>>>>>
>>>>>>>>
>>>>>>>>> +                     return rpc_success;
>>>>>>>>> +             }
>>>>>>>>> +             return nlm_drop_reply;
>>>>>>>>> +     default:
>>>>>>>>> +             return rpc_success;
>>>>>>>>> +     }
>>>>>>>>>
>>>>>>>>>       /* Now try to lock the file */
>>>>>>>>>       resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
>>>>>>>>> --
>>>>>>>>> 2.47.1
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>>
>>>> --
>>>> Chuck Lever
>>>
>>
>>
>> --
>> Chuck Lever


-- 
Chuck Lever

  reply	other threads:[~2025-08-21 18:39 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-12 16:03 [PATCH v2 1/2] nfsd: nfserr_jukebox in nlm_fopen should lead to a retry Olga Kornievskaia
2025-08-12 16:03 ` [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period Olga Kornievskaia
2025-08-12 23:49   ` NeilBrown
2025-08-13 15:32     ` Olga Kornievskaia
2025-08-20 23:15       ` NeilBrown
2025-08-21 13:56         ` Olga Kornievskaia
2025-08-21 14:01           ` Chuck Lever
2025-08-21 15:09           ` Chuck Lever
2025-08-21 18:20             ` Olga Kornievskaia
2025-08-21 18:24               ` Chuck Lever
2025-08-21 18:33                 ` Olga Kornievskaia
2025-08-21 18:39                   ` Chuck Lever [this message]
2025-08-21 18:51                     ` Olga Kornievskaia
2025-08-21 19:15                     ` Olga Kornievskaia
2025-08-25 15:23           ` Olga Kornievskaia
2025-08-12 17:23 ` [PATCH v2 1/2] nfsd: nfserr_jukebox in nlm_fopen should lead to a retry Chuck Lever
2025-08-21 19:18 ` Chuck Lever
2025-08-21 19:28   ` Olga Kornievskaia
2025-08-21 19:44     ` Olga Kornievskaia
2025-08-21 19:44     ` Chuck Lever
2025-08-21 19:57       ` Olga Kornievskaia

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fc97f4d3-6f0d-4ead-adb5-68cea81bd38f@oracle.com \
    --to=chuck.lever@oracle.com \
    --cc=Dai.Ngo@oracle.com \
    --cc=aglo@umich.edu \
    --cc=jlayton@kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=neil@brown.name \
    --cc=okorniev@redhat.com \
    --cc=tom@talpey.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).