* Re: [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
2025-08-21 13:56 ` Olga Kornievskaia
@ 2025-08-21 14:01 ` Chuck Lever
2025-08-21 15:09 ` Chuck Lever
2025-08-25 15:23 ` Olga Kornievskaia
2 siblings, 0 replies; 21+ messages in thread
From: Chuck Lever @ 2025-08-21 14:01 UTC (permalink / raw)
To: Olga Kornievskaia, NeilBrown
Cc: Olga Kornievskaia, jlayton, linux-nfs, Dai.Ngo, tom
On 8/21/25 9:56 AM, Olga Kornievskaia wrote:
> On Wed, Aug 20, 2025 at 7:15 PM NeilBrown <neil@brown.name> wrote:
>>
>> On Thu, 14 Aug 2025, Olga Kornievskaia wrote:
>>> On Tue, Aug 12, 2025 at 8:05 PM NeilBrown <neil@brown.name> wrote:
>>>>
>>>> On Wed, 13 Aug 2025, Olga Kornievskaia wrote:
>>>>> When nfsd is in grace and receives an NLM LOCK request which turns
>>>>> out to have a conflicting delegation, return that the server is in
>>>>> grace.
>>>>>
>>>>> Reviewed-by: Jeff Layton <jlayton@redhat.com>
>>>>> Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
>>>>> ---
>>>>> fs/lockd/svc4proc.c | 15 +++++++++++++--
>>>>> 1 file changed, 13 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
>>>>> index 109e5caae8c7..7ac4af5c9875 100644
>>>>> --- a/fs/lockd/svc4proc.c
>>>>> +++ b/fs/lockd/svc4proc.c
>>>>> @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp)
>>>>> resp->cookie = argp->cookie;
>>>>>
>>>>> /* Obtain client and file */
>>>>> - if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file)))
>>>>> - return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;
>>>>> + resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file);
>>>>> + switch (resp->status) {
>>>>> + case 0:
>>>>> + break;
>>>>> + case nlm_drop_reply:
>>>>> + if (locks_in_grace(SVC_NET(rqstp))) {
>>>>> + resp->status = nlm_lck_denied_grace_period;
>>>>
>>>> I think this is wrong. If the lock request has the "reclaim" flag set,
>>>> then nlm_lck_denied_grace_period is not a meaningful error.
>>>> nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay
>>>> getting a response to an upcall to mountd. For NLM the request really
>>>> must be dropped.
>>>
>>> Thank you for pointing out this case so we are suggesting to.
>>>
>>> if (locks_in_grace(SVC_NET(rqstp)) && !argp->reclaim)
>>>
>>> However, I've been looking and looking but I cannot figure out how
>>> nlm4svc_retrieve_args() would ever get nlm_drop_reply. You say it can
>>> happen during the upcall to mountd. So that happens within nfsd_open()
>>> call and a part of fh_verify().
>>> To return nlm_drop_reply, nlm_fopen() must have gotten nfserr_dropit
>>> from the nfsd_open(). I have searched and searched but I don't see
>>> anything that ever sets nfserr_dropit (NFSERR_DROPIT).
>>>
>>> I searched the logs and nfserr_dropit was an error that was EAGAIN
>>> translated into but then removed by the following patch.
>>
>> Oh. I didn't know that.
>> We now use RQ_DROPME instead.
>> I guess we should remove NFSERR_DROPIT completely as it not used at all
>> any more.
>>
>> Though returning nfserr_jukebox to an v2 client isn't a good idea....
>
> I'll take your word for you.
NFSv2 doesn't have a JUKEBOX or DELAY status code, so NFSD can't return
that kind of error to NFSv2 clients.
>> So I guess my main complaint isn't valid, but I still don't like this
>> patch. It seems an untidy place to put the locks_in_grace test.
>> Other callers of nlm4svc_retrieve_args() check locks_in_grace() before
>> making that call. __nlm4svc_proc_lock probably should too. Or is there
>> a reason that it is delayed until the middle of nlmsvc_lock()..
>
> I thought the same about why not adding the in_grace check and decided
> that it was probably because you dont want to deny a lock if there are
> no conflicts. If we add it, somebody might notice and will complain
> that they can't get their lock when there are no conflicts.
>
>> The patch is not needed and isn't clearly an improvement, so I would
>> rather it were dropped.
>
> I'm not against dropping this patch if the conclusion is that dropping
> the packet is no worse than returning in_grace error.
>
>
>>
>> Thanks,
>> NeilBrown
>>
>>
>>>
>>> commit 062304a815fe10068c478a4a3f28cf091c55cb82
>>> Author: J. Bruce Fields <bfields@fieldses.org>
>>> Date: Sun Jan 2 22:05:33 2011 -0500
>>>
>>> nfsd: stop translating EAGAIN to nfserr_dropit
>>>
>>> diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
>>> index dc9c2e3fd1b8..fd608a27a8d5 100644
>>> --- a/fs/nfsd/nfsproc.c
>>> +++ b/fs/nfsd/nfsproc.c
>>> @@ -735,7 +735,8 @@ nfserrno (int errno)
>>> { nfserr_stale, -ESTALE },
>>> { nfserr_jukebox, -ETIMEDOUT },
>>> { nfserr_jukebox, -ERESTARTSYS },
>>> - { nfserr_dropit, -EAGAIN },
>>> + { nfserr_jukebox, -EAGAIN },
>>> + { nfserr_jukebox, -EWOULDBLOCK },
>>> { nfserr_jukebox, -ENOMEM },
>>> { nfserr_badname, -ESRCH },
>>> { nfserr_io, -ETXTBSY },
>>>
>>> so if fh_verify is failing whatever it is returning, it is not
>>> nfserr_dropit nor is it nfserr_jukebox which means nlm_fopen() would
>>> translate it to nlm_failed which with my patch would not trigger
>>> nlm_lck_denied_grace_period error but resp->status would be set to
>>> nlm_failed.
>>>
>>> So I circle back to I hope that convinces you that we don't need a
>>> check for the reclaim lock.
>>>
>>> I believe nlm_drop_reply is nfsd_open's jukebox error, one of which is
>>> delegation recall. it can be a memory failure. But I'm sure when
>>> EWOULDBLOCK occurs.
>>>
>>>> At the very least we need to guard against the reclaim flag being set in
>>>> the above test. But I would much rather a more clear distinction were
>>>> made between "drop because of a conflicting delegation" and "drop
>>>> because of a delay getting upcall response".
>>>> Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4
>>>> (and ideally nlm2) handles appropriately.
>>>
>>>
>>>> NeilBrown
>>>>
>>>>
>>>>> + return rpc_success;
>>>>> + }
>>>>> + return nlm_drop_reply;
>>>>> + default:
>>>>> + return rpc_success;
>>>>> + }
>>>>>
>>>>> /* Now try to lock the file */
>>>>> resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
>>>>> --
>>>>> 2.47.1
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
--
Chuck Lever
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
2025-08-21 13:56 ` Olga Kornievskaia
2025-08-21 14:01 ` Chuck Lever
@ 2025-08-21 15:09 ` Chuck Lever
2025-08-21 18:20 ` Olga Kornievskaia
2025-08-25 15:23 ` Olga Kornievskaia
2 siblings, 1 reply; 21+ messages in thread
From: Chuck Lever @ 2025-08-21 15:09 UTC (permalink / raw)
To: Olga Kornievskaia, NeilBrown
Cc: Olga Kornievskaia, jlayton, linux-nfs, Dai.Ngo, tom
On 8/21/25 9:56 AM, Olga Kornievskaia wrote:
> On Wed, Aug 20, 2025 at 7:15 PM NeilBrown <neil@brown.name> wrote:
>>
>> On Thu, 14 Aug 2025, Olga Kornievskaia wrote:
>>> On Tue, Aug 12, 2025 at 8:05 PM NeilBrown <neil@brown.name> wrote:
>>>>
>>>> On Wed, 13 Aug 2025, Olga Kornievskaia wrote:
>>>>> When nfsd is in grace and receives an NLM LOCK request which turns
>>>>> out to have a conflicting delegation, return that the server is in
>>>>> grace.
>>>>>
>>>>> Reviewed-by: Jeff Layton <jlayton@redhat.com>
>>>>> Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
>>>>> ---
>>>>> fs/lockd/svc4proc.c | 15 +++++++++++++--
>>>>> 1 file changed, 13 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
>>>>> index 109e5caae8c7..7ac4af5c9875 100644
>>>>> --- a/fs/lockd/svc4proc.c
>>>>> +++ b/fs/lockd/svc4proc.c
>>>>> @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp)
>>>>> resp->cookie = argp->cookie;
>>>>>
>>>>> /* Obtain client and file */
>>>>> - if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file)))
>>>>> - return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;
>>>>> + resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file);
>>>>> + switch (resp->status) {
>>>>> + case 0:
>>>>> + break;
>>>>> + case nlm_drop_reply:
>>>>> + if (locks_in_grace(SVC_NET(rqstp))) {
>>>>> + resp->status = nlm_lck_denied_grace_period;
>>>>
>>>> I think this is wrong. If the lock request has the "reclaim" flag set,
>>>> then nlm_lck_denied_grace_period is not a meaningful error.
>>>> nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay
>>>> getting a response to an upcall to mountd. For NLM the request really
>>>> must be dropped.
>>>
>>> Thank you for pointing out this case so we are suggesting to.
>>>
>>> if (locks_in_grace(SVC_NET(rqstp)) && !argp->reclaim)
>>>
>>> However, I've been looking and looking but I cannot figure out how
>>> nlm4svc_retrieve_args() would ever get nlm_drop_reply. You say it can
>>> happen during the upcall to mountd. So that happens within nfsd_open()
>>> call and a part of fh_verify().
>>> To return nlm_drop_reply, nlm_fopen() must have gotten nfserr_dropit
>>> from the nfsd_open(). I have searched and searched but I don't see
>>> anything that ever sets nfserr_dropit (NFSERR_DROPIT).
>>>
>>> I searched the logs and nfserr_dropit was an error that was EAGAIN
>>> translated into but then removed by the following patch.
>>
>> Oh. I didn't know that.
>> We now use RQ_DROPME instead.
>> I guess we should remove NFSERR_DROPIT completely as it not used at all
>> any more.
>>
>> Though returning nfserr_jukebox to an v2 client isn't a good idea....
>
> I'll take your word for you.
>
>> So I guess my main complaint isn't valid, but I still don't like this
>> patch. It seems an untidy place to put the locks_in_grace test.
>> Other callers of nlm4svc_retrieve_args() check locks_in_grace() before
>> making that call. __nlm4svc_proc_lock probably should too. Or is there
>> a reason that it is delayed until the middle of nlmsvc_lock()..
>
> I thought the same about why not adding the in_grace check and decided
> that it was probably because you dont want to deny a lock if there are
> no conflicts. If we add it, somebody might notice and will complain
> that they can't get their lock when there are no conflicts.
>
>> The patch is not needed and isn't clearly an improvement, so I would
>> rather it were dropped.
>
> I'm not against dropping this patch if the conclusion is that dropping
> the packet is no worse than returning in_grace error.
I dropped both of these from nfsd-testing. If a fix is still needed,
let's start again with fresh patches.
>> Thanks,
>> NeilBrown
>>
>>
>>>
>>> commit 062304a815fe10068c478a4a3f28cf091c55cb82
>>> Author: J. Bruce Fields <bfields@fieldses.org>
>>> Date: Sun Jan 2 22:05:33 2011 -0500
>>>
>>> nfsd: stop translating EAGAIN to nfserr_dropit
>>>
>>> diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
>>> index dc9c2e3fd1b8..fd608a27a8d5 100644
>>> --- a/fs/nfsd/nfsproc.c
>>> +++ b/fs/nfsd/nfsproc.c
>>> @@ -735,7 +735,8 @@ nfserrno (int errno)
>>> { nfserr_stale, -ESTALE },
>>> { nfserr_jukebox, -ETIMEDOUT },
>>> { nfserr_jukebox, -ERESTARTSYS },
>>> - { nfserr_dropit, -EAGAIN },
>>> + { nfserr_jukebox, -EAGAIN },
>>> + { nfserr_jukebox, -EWOULDBLOCK },
>>> { nfserr_jukebox, -ENOMEM },
>>> { nfserr_badname, -ESRCH },
>>> { nfserr_io, -ETXTBSY },
>>>
>>> so if fh_verify is failing whatever it is returning, it is not
>>> nfserr_dropit nor is it nfserr_jukebox which means nlm_fopen() would
>>> translate it to nlm_failed which with my patch would not trigger
>>> nlm_lck_denied_grace_period error but resp->status would be set to
>>> nlm_failed.
>>>
>>> So I circle back to I hope that convinces you that we don't need a
>>> check for the reclaim lock.
>>>
>>> I believe nlm_drop_reply is nfsd_open's jukebox error, one of which is
>>> delegation recall. it can be a memory failure. But I'm sure when
>>> EWOULDBLOCK occurs.
>>>
>>>> At the very least we need to guard against the reclaim flag being set in
>>>> the above test. But I would much rather a more clear distinction were
>>>> made between "drop because of a conflicting delegation" and "drop
>>>> because of a delay getting upcall response".
>>>> Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4
>>>> (and ideally nlm2) handles appropriately.
>>>
>>>
>>>> NeilBrown
>>>>
>>>>
>>>>> + return rpc_success;
>>>>> + }
>>>>> + return nlm_drop_reply;
>>>>> + default:
>>>>> + return rpc_success;
>>>>> + }
>>>>>
>>>>> /* Now try to lock the file */
>>>>> resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
>>>>> --
>>>>> 2.47.1
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
--
Chuck Lever
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
2025-08-21 15:09 ` Chuck Lever
@ 2025-08-21 18:20 ` Olga Kornievskaia
2025-08-21 18:24 ` Chuck Lever
0 siblings, 1 reply; 21+ messages in thread
From: Olga Kornievskaia @ 2025-08-21 18:20 UTC (permalink / raw)
To: Chuck Lever
Cc: NeilBrown, Olga Kornievskaia, jlayton, linux-nfs, Dai.Ngo, tom
On Thu, Aug 21, 2025 at 11:09 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> On 8/21/25 9:56 AM, Olga Kornievskaia wrote:
> > On Wed, Aug 20, 2025 at 7:15 PM NeilBrown <neil@brown.name> wrote:
> >>
> >> On Thu, 14 Aug 2025, Olga Kornievskaia wrote:
> >>> On Tue, Aug 12, 2025 at 8:05 PM NeilBrown <neil@brown.name> wrote:
> >>>>
> >>>> On Wed, 13 Aug 2025, Olga Kornievskaia wrote:
> >>>>> When nfsd is in grace and receives an NLM LOCK request which turns
> >>>>> out to have a conflicting delegation, return that the server is in
> >>>>> grace.
> >>>>>
> >>>>> Reviewed-by: Jeff Layton <jlayton@redhat.com>
> >>>>> Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
> >>>>> ---
> >>>>> fs/lockd/svc4proc.c | 15 +++++++++++++--
> >>>>> 1 file changed, 13 insertions(+), 2 deletions(-)
> >>>>>
> >>>>> diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
> >>>>> index 109e5caae8c7..7ac4af5c9875 100644
> >>>>> --- a/fs/lockd/svc4proc.c
> >>>>> +++ b/fs/lockd/svc4proc.c
> >>>>> @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp)
> >>>>> resp->cookie = argp->cookie;
> >>>>>
> >>>>> /* Obtain client and file */
> >>>>> - if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file)))
> >>>>> - return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;
> >>>>> + resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file);
> >>>>> + switch (resp->status) {
> >>>>> + case 0:
> >>>>> + break;
> >>>>> + case nlm_drop_reply:
> >>>>> + if (locks_in_grace(SVC_NET(rqstp))) {
> >>>>> + resp->status = nlm_lck_denied_grace_period;
> >>>>
> >>>> I think this is wrong. If the lock request has the "reclaim" flag set,
> >>>> then nlm_lck_denied_grace_period is not a meaningful error.
> >>>> nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay
> >>>> getting a response to an upcall to mountd. For NLM the request really
> >>>> must be dropped.
> >>>
> >>> Thank you for pointing out this case so we are suggesting to.
> >>>
> >>> if (locks_in_grace(SVC_NET(rqstp)) && !argp->reclaim)
> >>>
> >>> However, I've been looking and looking but I cannot figure out how
> >>> nlm4svc_retrieve_args() would ever get nlm_drop_reply. You say it can
> >>> happen during the upcall to mountd. So that happens within nfsd_open()
> >>> call and a part of fh_verify().
> >>> To return nlm_drop_reply, nlm_fopen() must have gotten nfserr_dropit
> >>> from the nfsd_open(). I have searched and searched but I don't see
> >>> anything that ever sets nfserr_dropit (NFSERR_DROPIT).
> >>>
> >>> I searched the logs and nfserr_dropit was an error that was EAGAIN
> >>> translated into but then removed by the following patch.
> >>
> >> Oh. I didn't know that.
> >> We now use RQ_DROPME instead.
> >> I guess we should remove NFSERR_DROPIT completely as it not used at all
> >> any more.
> >>
> >> Though returning nfserr_jukebox to an v2 client isn't a good idea....
> >
> > I'll take your word for you.
> >
> >> So I guess my main complaint isn't valid, but I still don't like this
> >> patch. It seems an untidy place to put the locks_in_grace test.
> >> Other callers of nlm4svc_retrieve_args() check locks_in_grace() before
> >> making that call. __nlm4svc_proc_lock probably should too. Or is there
> >> a reason that it is delayed until the middle of nlmsvc_lock()..
> >
> > I thought the same about why not adding the in_grace check and decided
> > that it was probably because you dont want to deny a lock if there are
> > no conflicts. If we add it, somebody might notice and will complain
> > that they can't get their lock when there are no conflicts.
> >
> >> The patch is not needed and isn't clearly an improvement, so I would
> >> rather it were dropped.
> >
> > I'm not against dropping this patch if the conclusion is that dropping
> > the packet is no worse than returning in_grace error.
>
> I dropped both of these from nfsd-testing. If a fix is still needed,
> let's start again with fresh patches.
Can you clarify when you said "both"?
What objections are there for the 1st patch in the series. It solves a
problem and a fix is needed. This one I agree is not needed but I
thought was an improvement.
>
>
> >> Thanks,
> >> NeilBrown
> >>
> >>
> >>>
> >>> commit 062304a815fe10068c478a4a3f28cf091c55cb82
> >>> Author: J. Bruce Fields <bfields@fieldses.org>
> >>> Date: Sun Jan 2 22:05:33 2011 -0500
> >>>
> >>> nfsd: stop translating EAGAIN to nfserr_dropit
> >>>
> >>> diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
> >>> index dc9c2e3fd1b8..fd608a27a8d5 100644
> >>> --- a/fs/nfsd/nfsproc.c
> >>> +++ b/fs/nfsd/nfsproc.c
> >>> @@ -735,7 +735,8 @@ nfserrno (int errno)
> >>> { nfserr_stale, -ESTALE },
> >>> { nfserr_jukebox, -ETIMEDOUT },
> >>> { nfserr_jukebox, -ERESTARTSYS },
> >>> - { nfserr_dropit, -EAGAIN },
> >>> + { nfserr_jukebox, -EAGAIN },
> >>> + { nfserr_jukebox, -EWOULDBLOCK },
> >>> { nfserr_jukebox, -ENOMEM },
> >>> { nfserr_badname, -ESRCH },
> >>> { nfserr_io, -ETXTBSY },
> >>>
> >>> so if fh_verify is failing whatever it is returning, it is not
> >>> nfserr_dropit nor is it nfserr_jukebox which means nlm_fopen() would
> >>> translate it to nlm_failed which with my patch would not trigger
> >>> nlm_lck_denied_grace_period error but resp->status would be set to
> >>> nlm_failed.
> >>>
> >>> So I circle back to I hope that convinces you that we don't need a
> >>> check for the reclaim lock.
> >>>
> >>> I believe nlm_drop_reply is nfsd_open's jukebox error, one of which is
> >>> delegation recall. it can be a memory failure. But I'm sure when
> >>> EWOULDBLOCK occurs.
> >>>
> >>>> At the very least we need to guard against the reclaim flag being set in
> >>>> the above test. But I would much rather a more clear distinction were
> >>>> made between "drop because of a conflicting delegation" and "drop
> >>>> because of a delay getting upcall response".
> >>>> Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4
> >>>> (and ideally nlm2) handles appropriately.
> >>>
> >>>
> >>>> NeilBrown
> >>>>
> >>>>
> >>>>> + return rpc_success;
> >>>>> + }
> >>>>> + return nlm_drop_reply;
> >>>>> + default:
> >>>>> + return rpc_success;
> >>>>> + }
> >>>>>
> >>>>> /* Now try to lock the file */
> >>>>> resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
> >>>>> --
> >>>>> 2.47.1
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
>
>
> --
> Chuck Lever
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
2025-08-21 18:20 ` Olga Kornievskaia
@ 2025-08-21 18:24 ` Chuck Lever
2025-08-21 18:33 ` Olga Kornievskaia
0 siblings, 1 reply; 21+ messages in thread
From: Chuck Lever @ 2025-08-21 18:24 UTC (permalink / raw)
To: Olga Kornievskaia
Cc: NeilBrown, Olga Kornievskaia, jlayton, linux-nfs, Dai.Ngo, tom
On 8/21/25 2:20 PM, Olga Kornievskaia wrote:
> On Thu, Aug 21, 2025 at 11:09 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>>
>> On 8/21/25 9:56 AM, Olga Kornievskaia wrote:
>>> On Wed, Aug 20, 2025 at 7:15 PM NeilBrown <neil@brown.name> wrote:
>>>>
>>>> On Thu, 14 Aug 2025, Olga Kornievskaia wrote:
>>>>> On Tue, Aug 12, 2025 at 8:05 PM NeilBrown <neil@brown.name> wrote:
>>>>>>
>>>>>> On Wed, 13 Aug 2025, Olga Kornievskaia wrote:
>>>>>>> When nfsd is in grace and receives an NLM LOCK request which turns
>>>>>>> out to have a conflicting delegation, return that the server is in
>>>>>>> grace.
>>>>>>>
>>>>>>> Reviewed-by: Jeff Layton <jlayton@redhat.com>
>>>>>>> Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
>>>>>>> ---
>>>>>>> fs/lockd/svc4proc.c | 15 +++++++++++++--
>>>>>>> 1 file changed, 13 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
>>>>>>> index 109e5caae8c7..7ac4af5c9875 100644
>>>>>>> --- a/fs/lockd/svc4proc.c
>>>>>>> +++ b/fs/lockd/svc4proc.c
>>>>>>> @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp)
>>>>>>> resp->cookie = argp->cookie;
>>>>>>>
>>>>>>> /* Obtain client and file */
>>>>>>> - if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file)))
>>>>>>> - return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;
>>>>>>> + resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file);
>>>>>>> + switch (resp->status) {
>>>>>>> + case 0:
>>>>>>> + break;
>>>>>>> + case nlm_drop_reply:
>>>>>>> + if (locks_in_grace(SVC_NET(rqstp))) {
>>>>>>> + resp->status = nlm_lck_denied_grace_period;
>>>>>>
>>>>>> I think this is wrong. If the lock request has the "reclaim" flag set,
>>>>>> then nlm_lck_denied_grace_period is not a meaningful error.
>>>>>> nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay
>>>>>> getting a response to an upcall to mountd. For NLM the request really
>>>>>> must be dropped.
>>>>>
>>>>> Thank you for pointing out this case so we are suggesting to.
>>>>>
>>>>> if (locks_in_grace(SVC_NET(rqstp)) && !argp->reclaim)
>>>>>
>>>>> However, I've been looking and looking but I cannot figure out how
>>>>> nlm4svc_retrieve_args() would ever get nlm_drop_reply. You say it can
>>>>> happen during the upcall to mountd. So that happens within nfsd_open()
>>>>> call and a part of fh_verify().
>>>>> To return nlm_drop_reply, nlm_fopen() must have gotten nfserr_dropit
>>>>> from the nfsd_open(). I have searched and searched but I don't see
>>>>> anything that ever sets nfserr_dropit (NFSERR_DROPIT).
>>>>>
>>>>> I searched the logs and nfserr_dropit was an error that was EAGAIN
>>>>> translated into but then removed by the following patch.
>>>>
>>>> Oh. I didn't know that.
>>>> We now use RQ_DROPME instead.
>>>> I guess we should remove NFSERR_DROPIT completely as it not used at all
>>>> any more.
>>>>
>>>> Though returning nfserr_jukebox to an v2 client isn't a good idea....
>>>
>>> I'll take your word for you.
>>>
>>>> So I guess my main complaint isn't valid, but I still don't like this
>>>> patch. It seems an untidy place to put the locks_in_grace test.
>>>> Other callers of nlm4svc_retrieve_args() check locks_in_grace() before
>>>> making that call. __nlm4svc_proc_lock probably should too. Or is there
>>>> a reason that it is delayed until the middle of nlmsvc_lock()..
>>>
>>> I thought the same about why not adding the in_grace check and decided
>>> that it was probably because you dont want to deny a lock if there are
>>> no conflicts. If we add it, somebody might notice and will complain
>>> that they can't get their lock when there are no conflicts.
>>>
>>>> The patch is not needed and isn't clearly an improvement, so I would
>>>> rather it were dropped.
>>>
>>> I'm not against dropping this patch if the conclusion is that dropping
>>> the packet is no worse than returning in_grace error.
>>
>> I dropped both of these from nfsd-testing. If a fix is still needed,
>> let's start again with fresh patches.
>
> Can you clarify when you said "both"?
>
> What objections are there for the 1st patch in the series. It solves a
> problem and a fix is needed.
There are two reasons to drop the first patch.
1. Neil's "remove nfserr_dropit" patch doesn't apply unless both patches
are reverted.
2. As I said above, NFSv2 does not have a mechanism like NFS3ERR_JUKEBOX
to request that the client wait a bit and resend.
So, if 1/2 has been tested with NFSv2 and does not cause NFSD to leak
nfserr_jukebox to NFSv2 clients, then please rebase that one on the
current nfsd-testing branch and post it again.
> This one I agree is not needed but I
> thought was an improvement.
>
>>
>>
>>>> Thanks,
>>>> NeilBrown
>>>>
>>>>
>>>>>
>>>>> commit 062304a815fe10068c478a4a3f28cf091c55cb82
>>>>> Author: J. Bruce Fields <bfields@fieldses.org>
>>>>> Date: Sun Jan 2 22:05:33 2011 -0500
>>>>>
>>>>> nfsd: stop translating EAGAIN to nfserr_dropit
>>>>>
>>>>> diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
>>>>> index dc9c2e3fd1b8..fd608a27a8d5 100644
>>>>> --- a/fs/nfsd/nfsproc.c
>>>>> +++ b/fs/nfsd/nfsproc.c
>>>>> @@ -735,7 +735,8 @@ nfserrno (int errno)
>>>>> { nfserr_stale, -ESTALE },
>>>>> { nfserr_jukebox, -ETIMEDOUT },
>>>>> { nfserr_jukebox, -ERESTARTSYS },
>>>>> - { nfserr_dropit, -EAGAIN },
>>>>> + { nfserr_jukebox, -EAGAIN },
>>>>> + { nfserr_jukebox, -EWOULDBLOCK },
>>>>> { nfserr_jukebox, -ENOMEM },
>>>>> { nfserr_badname, -ESRCH },
>>>>> { nfserr_io, -ETXTBSY },
>>>>>
>>>>> so if fh_verify is failing whatever it is returning, it is not
>>>>> nfserr_dropit nor is it nfserr_jukebox which means nlm_fopen() would
>>>>> translate it to nlm_failed which with my patch would not trigger
>>>>> nlm_lck_denied_grace_period error but resp->status would be set to
>>>>> nlm_failed.
>>>>>
>>>>> So I circle back to I hope that convinces you that we don't need a
>>>>> check for the reclaim lock.
>>>>>
>>>>> I believe nlm_drop_reply is nfsd_open's jukebox error, one of which is
>>>>> delegation recall. it can be a memory failure. But I'm sure when
>>>>> EWOULDBLOCK occurs.
>>>>>
>>>>>> At the very least we need to guard against the reclaim flag being set in
>>>>>> the above test. But I would much rather a more clear distinction were
>>>>>> made between "drop because of a conflicting delegation" and "drop
>>>>>> because of a delay getting upcall response".
>>>>>> Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4
>>>>>> (and ideally nlm2) handles appropriately.
>>>>>
>>>>>
>>>>>> NeilBrown
>>>>>>
>>>>>>
>>>>>>> + return rpc_success;
>>>>>>> + }
>>>>>>> + return nlm_drop_reply;
>>>>>>> + default:
>>>>>>> + return rpc_success;
>>>>>>> + }
>>>>>>>
>>>>>>> /* Now try to lock the file */
>>>>>>> resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
>>>>>>> --
>>>>>>> 2.47.1
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>
>>
>> --
>> Chuck Lever
>
--
Chuck Lever
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
2025-08-21 18:24 ` Chuck Lever
@ 2025-08-21 18:33 ` Olga Kornievskaia
2025-08-21 18:39 ` Chuck Lever
0 siblings, 1 reply; 21+ messages in thread
From: Olga Kornievskaia @ 2025-08-21 18:33 UTC (permalink / raw)
To: Chuck Lever
Cc: NeilBrown, Olga Kornievskaia, jlayton, linux-nfs, Dai.Ngo, tom
On Thu, Aug 21, 2025 at 2:24 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> On 8/21/25 2:20 PM, Olga Kornievskaia wrote:
> > On Thu, Aug 21, 2025 at 11:09 AM Chuck Lever <chuck.lever@oracle.com> wrote:
> >>
> >> On 8/21/25 9:56 AM, Olga Kornievskaia wrote:
> >>> On Wed, Aug 20, 2025 at 7:15 PM NeilBrown <neil@brown.name> wrote:
> >>>>
> >>>> On Thu, 14 Aug 2025, Olga Kornievskaia wrote:
> >>>>> On Tue, Aug 12, 2025 at 8:05 PM NeilBrown <neil@brown.name> wrote:
> >>>>>>
> >>>>>> On Wed, 13 Aug 2025, Olga Kornievskaia wrote:
> >>>>>>> When nfsd is in grace and receives an NLM LOCK request which turns
> >>>>>>> out to have a conflicting delegation, return that the server is in
> >>>>>>> grace.
> >>>>>>>
> >>>>>>> Reviewed-by: Jeff Layton <jlayton@redhat.com>
> >>>>>>> Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
> >>>>>>> ---
> >>>>>>> fs/lockd/svc4proc.c | 15 +++++++++++++--
> >>>>>>> 1 file changed, 13 insertions(+), 2 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
> >>>>>>> index 109e5caae8c7..7ac4af5c9875 100644
> >>>>>>> --- a/fs/lockd/svc4proc.c
> >>>>>>> +++ b/fs/lockd/svc4proc.c
> >>>>>>> @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp)
> >>>>>>> resp->cookie = argp->cookie;
> >>>>>>>
> >>>>>>> /* Obtain client and file */
> >>>>>>> - if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file)))
> >>>>>>> - return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;
> >>>>>>> + resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file);
> >>>>>>> + switch (resp->status) {
> >>>>>>> + case 0:
> >>>>>>> + break;
> >>>>>>> + case nlm_drop_reply:
> >>>>>>> + if (locks_in_grace(SVC_NET(rqstp))) {
> >>>>>>> + resp->status = nlm_lck_denied_grace_period;
> >>>>>>
> >>>>>> I think this is wrong. If the lock request has the "reclaim" flag set,
> >>>>>> then nlm_lck_denied_grace_period is not a meaningful error.
> >>>>>> nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay
> >>>>>> getting a response to an upcall to mountd. For NLM the request really
> >>>>>> must be dropped.
> >>>>>
> >>>>> Thank you for pointing out this case so we are suggesting to.
> >>>>>
> >>>>> if (locks_in_grace(SVC_NET(rqstp)) && !argp->reclaim)
> >>>>>
> >>>>> However, I've been looking and looking but I cannot figure out how
> >>>>> nlm4svc_retrieve_args() would ever get nlm_drop_reply. You say it can
> >>>>> happen during the upcall to mountd. So that happens within nfsd_open()
> >>>>> call and a part of fh_verify().
> >>>>> To return nlm_drop_reply, nlm_fopen() must have gotten nfserr_dropit
> >>>>> from the nfsd_open(). I have searched and searched but I don't see
> >>>>> anything that ever sets nfserr_dropit (NFSERR_DROPIT).
> >>>>>
> >>>>> I searched the logs and nfserr_dropit was an error that was EAGAIN
> >>>>> translated into but then removed by the following patch.
> >>>>
> >>>> Oh. I didn't know that.
> >>>> We now use RQ_DROPME instead.
> >>>> I guess we should remove NFSERR_DROPIT completely as it not used at all
> >>>> any more.
> >>>>
> >>>> Though returning nfserr_jukebox to an v2 client isn't a good idea....
> >>>
> >>> I'll take your word for you.
> >>>
> >>>> So I guess my main complaint isn't valid, but I still don't like this
> >>>> patch. It seems an untidy place to put the locks_in_grace test.
> >>>> Other callers of nlm4svc_retrieve_args() check locks_in_grace() before
> >>>> making that call. __nlm4svc_proc_lock probably should too. Or is there
> >>>> a reason that it is delayed until the middle of nlmsvc_lock()..
> >>>
> >>> I thought the same about why not adding the in_grace check and decided
> >>> that it was probably because you dont want to deny a lock if there are
> >>> no conflicts. If we add it, somebody might notice and will complain
> >>> that they can't get their lock when there are no conflicts.
> >>>
> >>>> The patch is not needed and isn't clearly an improvement, so I would
> >>>> rather it were dropped.
> >>>
> >>> I'm not against dropping this patch if the conclusion is that dropping
> >>> the packet is no worse than returning in_grace error.
> >>
> >> I dropped both of these from nfsd-testing. If a fix is still needed,
> >> let's start again with fresh patches.
> >
> > Can you clarify when you said "both"?
> >
> > What objections are there for the 1st patch in the series. It solves a
> > problem and a fix is needed.
>
> There are two reasons to drop the first patch.
>
> 1. Neil's "remove nfserr_dropit" patch doesn't apply unless both patches
> are reverted.
>
> 2. As I said above, NFSv2 does not have a mechanism like NFS3ERR_JUKEBOX
> to request that the client wait a bit and resend.
ERR_JUKEBOX is not returned to another v2 or v3.
Patch1 (nfsd: nfserr_jukebox in nlm_fopen should lead to a retry)
translates err_jukebox into the nlm_drop_reply and returns to lockd.
As the result, no error is returned to the client but the reply is
dropped all together.
> So, if 1/2 has been tested with NFSv2 and does not cause NFSD to leak
> nfserr_jukebox to NFSv2 clients, then please rebase that one on the
> current nfsd-testing branch and post it again.
>
>
> > This one I agree is not needed but I
> > thought was an improvement.
> >
> >>
> >>
> >>>> Thanks,
> >>>> NeilBrown
> >>>>
> >>>>
> >>>>>
> >>>>> commit 062304a815fe10068c478a4a3f28cf091c55cb82
> >>>>> Author: J. Bruce Fields <bfields@fieldses.org>
> >>>>> Date: Sun Jan 2 22:05:33 2011 -0500
> >>>>>
> >>>>> nfsd: stop translating EAGAIN to nfserr_dropit
> >>>>>
> >>>>> diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
> >>>>> index dc9c2e3fd1b8..fd608a27a8d5 100644
> >>>>> --- a/fs/nfsd/nfsproc.c
> >>>>> +++ b/fs/nfsd/nfsproc.c
> >>>>> @@ -735,7 +735,8 @@ nfserrno (int errno)
> >>>>> { nfserr_stale, -ESTALE },
> >>>>> { nfserr_jukebox, -ETIMEDOUT },
> >>>>> { nfserr_jukebox, -ERESTARTSYS },
> >>>>> - { nfserr_dropit, -EAGAIN },
> >>>>> + { nfserr_jukebox, -EAGAIN },
> >>>>> + { nfserr_jukebox, -EWOULDBLOCK },
> >>>>> { nfserr_jukebox, -ENOMEM },
> >>>>> { nfserr_badname, -ESRCH },
> >>>>> { nfserr_io, -ETXTBSY },
> >>>>>
> >>>>> so if fh_verify is failing whatever it is returning, it is not
> >>>>> nfserr_dropit nor is it nfserr_jukebox which means nlm_fopen() would
> >>>>> translate it to nlm_failed which with my patch would not trigger
> >>>>> nlm_lck_denied_grace_period error but resp->status would be set to
> >>>>> nlm_failed.
> >>>>>
> >>>>> So I circle back to I hope that convinces you that we don't need a
> >>>>> check for the reclaim lock.
> >>>>>
> >>>>> I believe nlm_drop_reply is nfsd_open's jukebox error, one of which is
> >>>>> delegation recall. it can be a memory failure. But I'm sure when
> >>>>> EWOULDBLOCK occurs.
> >>>>>
> >>>>>> At the very least we need to guard against the reclaim flag being set in
> >>>>>> the above test. But I would much rather a more clear distinction were
> >>>>>> made between "drop because of a conflicting delegation" and "drop
> >>>>>> because of a delay getting upcall response".
> >>>>>> Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4
> >>>>>> (and ideally nlm2) handles appropriately.
> >>>>>
> >>>>>
> >>>>>> NeilBrown
> >>>>>>
> >>>>>>
> >>>>>>> + return rpc_success;
> >>>>>>> + }
> >>>>>>> + return nlm_drop_reply;
> >>>>>>> + default:
> >>>>>>> + return rpc_success;
> >>>>>>> + }
> >>>>>>>
> >>>>>>> /* Now try to lock the file */
> >>>>>>> resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
> >>>>>>> --
> >>>>>>> 2.47.1
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>
> >>
> >> --
> >> Chuck Lever
> >
>
>
> --
> Chuck Lever
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
2025-08-21 18:33 ` Olga Kornievskaia
@ 2025-08-21 18:39 ` Chuck Lever
2025-08-21 18:51 ` Olga Kornievskaia
2025-08-21 19:15 ` Olga Kornievskaia
0 siblings, 2 replies; 21+ messages in thread
From: Chuck Lever @ 2025-08-21 18:39 UTC (permalink / raw)
To: Olga Kornievskaia
Cc: NeilBrown, Olga Kornievskaia, jlayton, linux-nfs, Dai.Ngo, tom
On 8/21/25 2:33 PM, Olga Kornievskaia wrote:
> On Thu, Aug 21, 2025 at 2:24 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>>
>> On 8/21/25 2:20 PM, Olga Kornievskaia wrote:
>>> On Thu, Aug 21, 2025 at 11:09 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>>>>
>>>> On 8/21/25 9:56 AM, Olga Kornievskaia wrote:
>>>>> On Wed, Aug 20, 2025 at 7:15 PM NeilBrown <neil@brown.name> wrote:
>>>>>>
>>>>>> On Thu, 14 Aug 2025, Olga Kornievskaia wrote:
>>>>>>> On Tue, Aug 12, 2025 at 8:05 PM NeilBrown <neil@brown.name> wrote:
>>>>>>>>
>>>>>>>> On Wed, 13 Aug 2025, Olga Kornievskaia wrote:
>>>>>>>>> When nfsd is in grace and receives an NLM LOCK request which turns
>>>>>>>>> out to have a conflicting delegation, return that the server is in
>>>>>>>>> grace.
>>>>>>>>>
>>>>>>>>> Reviewed-by: Jeff Layton <jlayton@redhat.com>
>>>>>>>>> Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
>>>>>>>>> ---
>>>>>>>>> fs/lockd/svc4proc.c | 15 +++++++++++++--
>>>>>>>>> 1 file changed, 13 insertions(+), 2 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
>>>>>>>>> index 109e5caae8c7..7ac4af5c9875 100644
>>>>>>>>> --- a/fs/lockd/svc4proc.c
>>>>>>>>> +++ b/fs/lockd/svc4proc.c
>>>>>>>>> @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp)
>>>>>>>>> resp->cookie = argp->cookie;
>>>>>>>>>
>>>>>>>>> /* Obtain client and file */
>>>>>>>>> - if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file)))
>>>>>>>>> - return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;
>>>>>>>>> + resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file);
>>>>>>>>> + switch (resp->status) {
>>>>>>>>> + case 0:
>>>>>>>>> + break;
>>>>>>>>> + case nlm_drop_reply:
>>>>>>>>> + if (locks_in_grace(SVC_NET(rqstp))) {
>>>>>>>>> + resp->status = nlm_lck_denied_grace_period;
>>>>>>>>
>>>>>>>> I think this is wrong. If the lock request has the "reclaim" flag set,
>>>>>>>> then nlm_lck_denied_grace_period is not a meaningful error.
>>>>>>>> nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay
>>>>>>>> getting a response to an upcall to mountd. For NLM the request really
>>>>>>>> must be dropped.
>>>>>>>
>>>>>>> Thank you for pointing out this case so we are suggesting to.
>>>>>>>
>>>>>>> if (locks_in_grace(SVC_NET(rqstp)) && !argp->reclaim)
>>>>>>>
>>>>>>> However, I've been looking and looking but I cannot figure out how
>>>>>>> nlm4svc_retrieve_args() would ever get nlm_drop_reply. You say it can
>>>>>>> happen during the upcall to mountd. So that happens within nfsd_open()
>>>>>>> call and a part of fh_verify().
>>>>>>> To return nlm_drop_reply, nlm_fopen() must have gotten nfserr_dropit
>>>>>>> from the nfsd_open(). I have searched and searched but I don't see
>>>>>>> anything that ever sets nfserr_dropit (NFSERR_DROPIT).
>>>>>>>
>>>>>>> I searched the logs and nfserr_dropit was an error that was EAGAIN
>>>>>>> translated into but then removed by the following patch.
>>>>>>
>>>>>> Oh. I didn't know that.
>>>>>> We now use RQ_DROPME instead.
>>>>>> I guess we should remove NFSERR_DROPIT completely as it not used at all
>>>>>> any more.
>>>>>>
>>>>>> Though returning nfserr_jukebox to an v2 client isn't a good idea....
>>>>>
>>>>> I'll take your word for you.
>>>>>
>>>>>> So I guess my main complaint isn't valid, but I still don't like this
>>>>>> patch. It seems an untidy place to put the locks_in_grace test.
>>>>>> Other callers of nlm4svc_retrieve_args() check locks_in_grace() before
>>>>>> making that call. __nlm4svc_proc_lock probably should too. Or is there
>>>>>> a reason that it is delayed until the middle of nlmsvc_lock()..
>>>>>
>>>>> I thought the same about why not adding the in_grace check and decided
>>>>> that it was probably because you dont want to deny a lock if there are
>>>>> no conflicts. If we add it, somebody might notice and will complain
>>>>> that they can't get their lock when there are no conflicts.
>>>>>
>>>>>> The patch is not needed and isn't clearly an improvement, so I would
>>>>>> rather it were dropped.
>>>>>
>>>>> I'm not against dropping this patch if the conclusion is that dropping
>>>>> the packet is no worse than returning in_grace error.
>>>>
>>>> I dropped both of these from nfsd-testing. If a fix is still needed,
>>>> let's start again with fresh patches.
>>>
>>> Can you clarify when you said "both"?
>>>
>>> What objections are there for the 1st patch in the series. It solves a
>>> problem and a fix is needed.
>>
>> There are two reasons to drop the first patch.
>>
>> 1. Neil's "remove nfserr_dropit" patch doesn't apply unless both patches
>> are reverted.
>>
>> 2. As I said above, NFSv2 does not have a mechanism like NFS3ERR_JUKEBOX
>> to request that the client wait a bit and resend.
>
> ERR_JUKEBOX is not returned to another v2 or v3.
>
> Patch1 (nfsd: nfserr_jukebox in nlm_fopen should lead to a retry)
> translates err_jukebox into the nlm_drop_reply and returns to lockd.
> As the result, no error is returned to the client but the reply is
> dropped all together.
If you want NLM to drop the response, then set RQ_DROPME. Using
nfserr_jukebox here is confusing -- it means "return a response to the
client that requests a resend". You want NLM to /not send a response/,
and we have a specific mechanism for that.
>> So, if 1/2 has been tested with NFSv2 and does not cause NFSD to leak
>> nfserr_jukebox to NFSv2 clients, then please rebase that one on the
>> current nfsd-testing branch and post it again.
>>
>>
>>> This one I agree is not needed but I
>>> thought was an improvement.
>>>
>>>>
>>>>
>>>>>> Thanks,
>>>>>> NeilBrown
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> commit 062304a815fe10068c478a4a3f28cf091c55cb82
>>>>>>> Author: J. Bruce Fields <bfields@fieldses.org>
>>>>>>> Date: Sun Jan 2 22:05:33 2011 -0500
>>>>>>>
>>>>>>> nfsd: stop translating EAGAIN to nfserr_dropit
>>>>>>>
>>>>>>> diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
>>>>>>> index dc9c2e3fd1b8..fd608a27a8d5 100644
>>>>>>> --- a/fs/nfsd/nfsproc.c
>>>>>>> +++ b/fs/nfsd/nfsproc.c
>>>>>>> @@ -735,7 +735,8 @@ nfserrno (int errno)
>>>>>>> { nfserr_stale, -ESTALE },
>>>>>>> { nfserr_jukebox, -ETIMEDOUT },
>>>>>>> { nfserr_jukebox, -ERESTARTSYS },
>>>>>>> - { nfserr_dropit, -EAGAIN },
>>>>>>> + { nfserr_jukebox, -EAGAIN },
>>>>>>> + { nfserr_jukebox, -EWOULDBLOCK },
>>>>>>> { nfserr_jukebox, -ENOMEM },
>>>>>>> { nfserr_badname, -ESRCH },
>>>>>>> { nfserr_io, -ETXTBSY },
>>>>>>>
>>>>>>> so if fh_verify is failing whatever it is returning, it is not
>>>>>>> nfserr_dropit nor is it nfserr_jukebox which means nlm_fopen() would
>>>>>>> translate it to nlm_failed which with my patch would not trigger
>>>>>>> nlm_lck_denied_grace_period error but resp->status would be set to
>>>>>>> nlm_failed.
>>>>>>>
>>>>>>> So I circle back to I hope that convinces you that we don't need a
>>>>>>> check for the reclaim lock.
>>>>>>>
>>>>>>> I believe nlm_drop_reply is nfsd_open's jukebox error, one of which is
>>>>>>> delegation recall. it can be a memory failure. But I'm sure when
>>>>>>> EWOULDBLOCK occurs.
>>>>>>>
>>>>>>>> At the very least we need to guard against the reclaim flag being set in
>>>>>>>> the above test. But I would much rather a more clear distinction were
>>>>>>>> made between "drop because of a conflicting delegation" and "drop
>>>>>>>> because of a delay getting upcall response".
>>>>>>>> Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4
>>>>>>>> (and ideally nlm2) handles appropriately.
>>>>>>>
>>>>>>>
>>>>>>>> NeilBrown
>>>>>>>>
>>>>>>>>
>>>>>>>>> + return rpc_success;
>>>>>>>>> + }
>>>>>>>>> + return nlm_drop_reply;
>>>>>>>>> + default:
>>>>>>>>> + return rpc_success;
>>>>>>>>> + }
>>>>>>>>>
>>>>>>>>> /* Now try to lock the file */
>>>>>>>>> resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
>>>>>>>>> --
>>>>>>>>> 2.47.1
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>>
>>>> --
>>>> Chuck Lever
>>>
>>
>>
>> --
>> Chuck Lever
--
Chuck Lever
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
2025-08-21 18:39 ` Chuck Lever
@ 2025-08-21 18:51 ` Olga Kornievskaia
2025-08-21 19:15 ` Olga Kornievskaia
1 sibling, 0 replies; 21+ messages in thread
From: Olga Kornievskaia @ 2025-08-21 18:51 UTC (permalink / raw)
To: Chuck Lever
Cc: NeilBrown, Olga Kornievskaia, jlayton, linux-nfs, Dai.Ngo, tom
On Thu, Aug 21, 2025 at 2:39 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> On 8/21/25 2:33 PM, Olga Kornievskaia wrote:
> > On Thu, Aug 21, 2025 at 2:24 PM Chuck Lever <chuck.lever@oracle.com> wrote:
> >>
> >> On 8/21/25 2:20 PM, Olga Kornievskaia wrote:
> >>> On Thu, Aug 21, 2025 at 11:09 AM Chuck Lever <chuck.lever@oracle.com> wrote:
> >>>>
> >>>> On 8/21/25 9:56 AM, Olga Kornievskaia wrote:
> >>>>> On Wed, Aug 20, 2025 at 7:15 PM NeilBrown <neil@brown.name> wrote:
> >>>>>>
> >>>>>> On Thu, 14 Aug 2025, Olga Kornievskaia wrote:
> >>>>>>> On Tue, Aug 12, 2025 at 8:05 PM NeilBrown <neil@brown.name> wrote:
> >>>>>>>>
> >>>>>>>> On Wed, 13 Aug 2025, Olga Kornievskaia wrote:
> >>>>>>>>> When nfsd is in grace and receives an NLM LOCK request which turns
> >>>>>>>>> out to have a conflicting delegation, return that the server is in
> >>>>>>>>> grace.
> >>>>>>>>>
> >>>>>>>>> Reviewed-by: Jeff Layton <jlayton@redhat.com>
> >>>>>>>>> Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
> >>>>>>>>> ---
> >>>>>>>>> fs/lockd/svc4proc.c | 15 +++++++++++++--
> >>>>>>>>> 1 file changed, 13 insertions(+), 2 deletions(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
> >>>>>>>>> index 109e5caae8c7..7ac4af5c9875 100644
> >>>>>>>>> --- a/fs/lockd/svc4proc.c
> >>>>>>>>> +++ b/fs/lockd/svc4proc.c
> >>>>>>>>> @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp)
> >>>>>>>>> resp->cookie = argp->cookie;
> >>>>>>>>>
> >>>>>>>>> /* Obtain client and file */
> >>>>>>>>> - if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file)))
> >>>>>>>>> - return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;
> >>>>>>>>> + resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file);
> >>>>>>>>> + switch (resp->status) {
> >>>>>>>>> + case 0:
> >>>>>>>>> + break;
> >>>>>>>>> + case nlm_drop_reply:
> >>>>>>>>> + if (locks_in_grace(SVC_NET(rqstp))) {
> >>>>>>>>> + resp->status = nlm_lck_denied_grace_period;
> >>>>>>>>
> >>>>>>>> I think this is wrong. If the lock request has the "reclaim" flag set,
> >>>>>>>> then nlm_lck_denied_grace_period is not a meaningful error.
> >>>>>>>> nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay
> >>>>>>>> getting a response to an upcall to mountd. For NLM the request really
> >>>>>>>> must be dropped.
> >>>>>>>
> >>>>>>> Thank you for pointing out this case so we are suggesting to.
> >>>>>>>
> >>>>>>> if (locks_in_grace(SVC_NET(rqstp)) && !argp->reclaim)
> >>>>>>>
> >>>>>>> However, I've been looking and looking but I cannot figure out how
> >>>>>>> nlm4svc_retrieve_args() would ever get nlm_drop_reply. You say it can
> >>>>>>> happen during the upcall to mountd. So that happens within nfsd_open()
> >>>>>>> call and a part of fh_verify().
> >>>>>>> To return nlm_drop_reply, nlm_fopen() must have gotten nfserr_dropit
> >>>>>>> from the nfsd_open(). I have searched and searched but I don't see
> >>>>>>> anything that ever sets nfserr_dropit (NFSERR_DROPIT).
> >>>>>>>
> >>>>>>> I searched the logs and nfserr_dropit was an error that was EAGAIN
> >>>>>>> translated into but then removed by the following patch.
> >>>>>>
> >>>>>> Oh. I didn't know that.
> >>>>>> We now use RQ_DROPME instead.
> >>>>>> I guess we should remove NFSERR_DROPIT completely as it not used at all
> >>>>>> any more.
> >>>>>>
> >>>>>> Though returning nfserr_jukebox to an v2 client isn't a good idea....
> >>>>>
> >>>>> I'll take your word for you.
> >>>>>
> >>>>>> So I guess my main complaint isn't valid, but I still don't like this
> >>>>>> patch. It seems an untidy place to put the locks_in_grace test.
> >>>>>> Other callers of nlm4svc_retrieve_args() check locks_in_grace() before
> >>>>>> making that call. __nlm4svc_proc_lock probably should too. Or is there
> >>>>>> a reason that it is delayed until the middle of nlmsvc_lock()..
> >>>>>
> >>>>> I thought the same about why not adding the in_grace check and decided
> >>>>> that it was probably because you dont want to deny a lock if there are
> >>>>> no conflicts. If we add it, somebody might notice and will complain
> >>>>> that they can't get their lock when there are no conflicts.
> >>>>>
> >>>>>> The patch is not needed and isn't clearly an improvement, so I would
> >>>>>> rather it were dropped.
> >>>>>
> >>>>> I'm not against dropping this patch if the conclusion is that dropping
> >>>>> the packet is no worse than returning in_grace error.
> >>>>
> >>>> I dropped both of these from nfsd-testing. If a fix is still needed,
> >>>> let's start again with fresh patches.
> >>>
> >>> Can you clarify when you said "both"?
> >>>
> >>> What objections are there for the 1st patch in the series. It solves a
> >>> problem and a fix is needed.
> >>
> >> There are two reasons to drop the first patch.
> >>
> >> 1. Neil's "remove nfserr_dropit" patch doesn't apply unless both patches
> >> are reverted.
> >>
> >> 2. As I said above, NFSv2 does not have a mechanism like NFS3ERR_JUKEBOX
> >> to request that the client wait a bit and resend.
> >
> > ERR_JUKEBOX is not returned to another v2 or v3.
> >
> > Patch1 (nfsd: nfserr_jukebox in nlm_fopen should lead to a retry)
> > translates err_jukebox into the nlm_drop_reply and returns to lockd.
> > As the result, no error is returned to the client but the reply is
> > dropped all together.
>
> If you want NLM to drop the response, then set RQ_DROPME. Using
> nfserr_jukebox here is confusing -- it means "return a response to the
> client that requests a resend". You want NLM to /not send a response/,
> and we have a specific mechanism for that.
To drop a response lockd expect that it gets nlm_drop_response.
> >> So, if 1/2 has been tested with NFSv2 and does not cause NFSD to leak
> >> nfserr_jukebox to NFSv2 clients, then please rebase that one on the
> >> current nfsd-testing branch and post it again.
> >>
> >>
> >>> This one I agree is not needed but I
> >>> thought was an improvement.
> >>>
> >>>>
> >>>>
> >>>>>> Thanks,
> >>>>>> NeilBrown
> >>>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> commit 062304a815fe10068c478a4a3f28cf091c55cb82
> >>>>>>> Author: J. Bruce Fields <bfields@fieldses.org>
> >>>>>>> Date: Sun Jan 2 22:05:33 2011 -0500
> >>>>>>>
> >>>>>>> nfsd: stop translating EAGAIN to nfserr_dropit
> >>>>>>>
> >>>>>>> diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
> >>>>>>> index dc9c2e3fd1b8..fd608a27a8d5 100644
> >>>>>>> --- a/fs/nfsd/nfsproc.c
> >>>>>>> +++ b/fs/nfsd/nfsproc.c
> >>>>>>> @@ -735,7 +735,8 @@ nfserrno (int errno)
> >>>>>>> { nfserr_stale, -ESTALE },
> >>>>>>> { nfserr_jukebox, -ETIMEDOUT },
> >>>>>>> { nfserr_jukebox, -ERESTARTSYS },
> >>>>>>> - { nfserr_dropit, -EAGAIN },
> >>>>>>> + { nfserr_jukebox, -EAGAIN },
> >>>>>>> + { nfserr_jukebox, -EWOULDBLOCK },
> >>>>>>> { nfserr_jukebox, -ENOMEM },
> >>>>>>> { nfserr_badname, -ESRCH },
> >>>>>>> { nfserr_io, -ETXTBSY },
> >>>>>>>
> >>>>>>> so if fh_verify is failing whatever it is returning, it is not
> >>>>>>> nfserr_dropit nor is it nfserr_jukebox which means nlm_fopen() would
> >>>>>>> translate it to nlm_failed which with my patch would not trigger
> >>>>>>> nlm_lck_denied_grace_period error but resp->status would be set to
> >>>>>>> nlm_failed.
> >>>>>>>
> >>>>>>> So I circle back to I hope that convinces you that we don't need a
> >>>>>>> check for the reclaim lock.
> >>>>>>>
> >>>>>>> I believe nlm_drop_reply is nfsd_open's jukebox error, one of which is
> >>>>>>> delegation recall. it can be a memory failure. But I'm sure when
> >>>>>>> EWOULDBLOCK occurs.
> >>>>>>>
> >>>>>>>> At the very least we need to guard against the reclaim flag being set in
> >>>>>>>> the above test. But I would much rather a more clear distinction were
> >>>>>>>> made between "drop because of a conflicting delegation" and "drop
> >>>>>>>> because of a delay getting upcall response".
> >>>>>>>> Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4
> >>>>>>>> (and ideally nlm2) handles appropriately.
> >>>>>>>
> >>>>>>>
> >>>>>>>> NeilBrown
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> + return rpc_success;
> >>>>>>>>> + }
> >>>>>>>>> + return nlm_drop_reply;
> >>>>>>>>> + default:
> >>>>>>>>> + return rpc_success;
> >>>>>>>>> + }
> >>>>>>>>>
> >>>>>>>>> /* Now try to lock the file */
> >>>>>>>>> resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
> >>>>>>>>> --
> >>>>>>>>> 2.47.1
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>>
> >>>> --
> >>>> Chuck Lever
> >>>
> >>
> >>
> >> --
> >> Chuck Lever
>
>
> --
> Chuck Lever
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
2025-08-21 18:39 ` Chuck Lever
2025-08-21 18:51 ` Olga Kornievskaia
@ 2025-08-21 19:15 ` Olga Kornievskaia
1 sibling, 0 replies; 21+ messages in thread
From: Olga Kornievskaia @ 2025-08-21 19:15 UTC (permalink / raw)
To: Chuck Lever
Cc: NeilBrown, Olga Kornievskaia, jlayton, linux-nfs, Dai.Ngo, tom
On Thu, Aug 21, 2025 at 2:39 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> On 8/21/25 2:33 PM, Olga Kornievskaia wrote:
> > On Thu, Aug 21, 2025 at 2:24 PM Chuck Lever <chuck.lever@oracle.com> wrote:
> >>
> >> On 8/21/25 2:20 PM, Olga Kornievskaia wrote:
> >>> On Thu, Aug 21, 2025 at 11:09 AM Chuck Lever <chuck.lever@oracle.com> wrote:
> >>>>
> >>>> On 8/21/25 9:56 AM, Olga Kornievskaia wrote:
> >>>>> On Wed, Aug 20, 2025 at 7:15 PM NeilBrown <neil@brown.name> wrote:
> >>>>>>
> >>>>>> On Thu, 14 Aug 2025, Olga Kornievskaia wrote:
> >>>>>>> On Tue, Aug 12, 2025 at 8:05 PM NeilBrown <neil@brown.name> wrote:
> >>>>>>>>
> >>>>>>>> On Wed, 13 Aug 2025, Olga Kornievskaia wrote:
> >>>>>>>>> When nfsd is in grace and receives an NLM LOCK request which turns
> >>>>>>>>> out to have a conflicting delegation, return that the server is in
> >>>>>>>>> grace.
> >>>>>>>>>
> >>>>>>>>> Reviewed-by: Jeff Layton <jlayton@redhat.com>
> >>>>>>>>> Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
> >>>>>>>>> ---
> >>>>>>>>> fs/lockd/svc4proc.c | 15 +++++++++++++--
> >>>>>>>>> 1 file changed, 13 insertions(+), 2 deletions(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
> >>>>>>>>> index 109e5caae8c7..7ac4af5c9875 100644
> >>>>>>>>> --- a/fs/lockd/svc4proc.c
> >>>>>>>>> +++ b/fs/lockd/svc4proc.c
> >>>>>>>>> @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp)
> >>>>>>>>> resp->cookie = argp->cookie;
> >>>>>>>>>
> >>>>>>>>> /* Obtain client and file */
> >>>>>>>>> - if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file)))
> >>>>>>>>> - return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;
> >>>>>>>>> + resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file);
> >>>>>>>>> + switch (resp->status) {
> >>>>>>>>> + case 0:
> >>>>>>>>> + break;
> >>>>>>>>> + case nlm_drop_reply:
> >>>>>>>>> + if (locks_in_grace(SVC_NET(rqstp))) {
> >>>>>>>>> + resp->status = nlm_lck_denied_grace_period;
> >>>>>>>>
> >>>>>>>> I think this is wrong. If the lock request has the "reclaim" flag set,
> >>>>>>>> then nlm_lck_denied_grace_period is not a meaningful error.
> >>>>>>>> nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay
> >>>>>>>> getting a response to an upcall to mountd. For NLM the request really
> >>>>>>>> must be dropped.
> >>>>>>>
> >>>>>>> Thank you for pointing out this case so we are suggesting to.
> >>>>>>>
> >>>>>>> if (locks_in_grace(SVC_NET(rqstp)) && !argp->reclaim)
> >>>>>>>
> >>>>>>> However, I've been looking and looking but I cannot figure out how
> >>>>>>> nlm4svc_retrieve_args() would ever get nlm_drop_reply. You say it can
> >>>>>>> happen during the upcall to mountd. So that happens within nfsd_open()
> >>>>>>> call and a part of fh_verify().
> >>>>>>> To return nlm_drop_reply, nlm_fopen() must have gotten nfserr_dropit
> >>>>>>> from the nfsd_open(). I have searched and searched but I don't see
> >>>>>>> anything that ever sets nfserr_dropit (NFSERR_DROPIT).
> >>>>>>>
> >>>>>>> I searched the logs and nfserr_dropit was an error that was EAGAIN
> >>>>>>> translated into but then removed by the following patch.
> >>>>>>
> >>>>>> Oh. I didn't know that.
> >>>>>> We now use RQ_DROPME instead.
> >>>>>> I guess we should remove NFSERR_DROPIT completely as it not used at all
> >>>>>> any more.
> >>>>>>
> >>>>>> Though returning nfserr_jukebox to an v2 client isn't a good idea....
> >>>>>
> >>>>> I'll take your word for you.
> >>>>>
> >>>>>> So I guess my main complaint isn't valid, but I still don't like this
> >>>>>> patch. It seems an untidy place to put the locks_in_grace test.
> >>>>>> Other callers of nlm4svc_retrieve_args() check locks_in_grace() before
> >>>>>> making that call. __nlm4svc_proc_lock probably should too. Or is there
> >>>>>> a reason that it is delayed until the middle of nlmsvc_lock()..
> >>>>>
> >>>>> I thought the same about why not adding the in_grace check and decided
> >>>>> that it was probably because you dont want to deny a lock if there are
> >>>>> no conflicts. If we add it, somebody might notice and will complain
> >>>>> that they can't get their lock when there are no conflicts.
> >>>>>
> >>>>>> The patch is not needed and isn't clearly an improvement, so I would
> >>>>>> rather it were dropped.
> >>>>>
> >>>>> I'm not against dropping this patch if the conclusion is that dropping
> >>>>> the packet is no worse than returning in_grace error.
> >>>>
> >>>> I dropped both of these from nfsd-testing. If a fix is still needed,
> >>>> let's start again with fresh patches.
> >>>
> >>> Can you clarify when you said "both"?
> >>>
> >>> What objections are there for the 1st patch in the series. It solves a
> >>> problem and a fix is needed.
> >>
> >> There are two reasons to drop the first patch.
> >>
> >> 1. Neil's "remove nfserr_dropit" patch doesn't apply unless both patches
> >> are reverted.
> >>
> >> 2. As I said above, NFSv2 does not have a mechanism like NFS3ERR_JUKEBOX
> >> to request that the client wait a bit and resend.
> >
> > ERR_JUKEBOX is not returned to another v2 or v3.
> >
> > Patch1 (nfsd: nfserr_jukebox in nlm_fopen should lead to a retry)
> > translates err_jukebox into the nlm_drop_reply and returns to lockd.
> > As the result, no error is returned to the client but the reply is
> > dropped all together.
>
> If you want NLM to drop the response, then set RQ_DROPME. Using
> nfserr_jukebox here is confusing -- it means "return a response to the
> client that requests a resend". You want NLM to /not send a response/,
> and we have a specific mechanism for that.
I don't understand why the suggestion holds value.
nlm_open() is going to call nfsd_open() which will return
nfserr_jukebox (for among other conditions) when there is a
conflicting delegation. Currently, nlm_fopen() would return nlm_failed
(it does not recognize nfserr_jukebox error code). nlm_fopen() has to
identify that nfserr_jukebox error as special to mean that the request
needs to be dropped. There is already a mechanism for it and that is
to return nlm_drop_reply.
What purpose would be to set RQ_DROPME for the nfserr_jukebox in
nlm_fopen() (note that nfserr_jukebox needs to be identified) . What
error should one be returning instead of "nlm_drop_reply"? Why is
returning some other error code + setting RQ_DROPME be more "clear"?
>
>
> >> So, if 1/2 has been tested with NFSv2 and does not cause NFSD to leak
> >> nfserr_jukebox to NFSv2 clients, then please rebase that one on the
> >> current nfsd-testing branch and post it again.
> >>
> >>
> >>> This one I agree is not needed but I
> >>> thought was an improvement.
> >>>
> >>>>
> >>>>
> >>>>>> Thanks,
> >>>>>> NeilBrown
> >>>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> commit 062304a815fe10068c478a4a3f28cf091c55cb82
> >>>>>>> Author: J. Bruce Fields <bfields@fieldses.org>
> >>>>>>> Date: Sun Jan 2 22:05:33 2011 -0500
> >>>>>>>
> >>>>>>> nfsd: stop translating EAGAIN to nfserr_dropit
> >>>>>>>
> >>>>>>> diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
> >>>>>>> index dc9c2e3fd1b8..fd608a27a8d5 100644
> >>>>>>> --- a/fs/nfsd/nfsproc.c
> >>>>>>> +++ b/fs/nfsd/nfsproc.c
> >>>>>>> @@ -735,7 +735,8 @@ nfserrno (int errno)
> >>>>>>> { nfserr_stale, -ESTALE },
> >>>>>>> { nfserr_jukebox, -ETIMEDOUT },
> >>>>>>> { nfserr_jukebox, -ERESTARTSYS },
> >>>>>>> - { nfserr_dropit, -EAGAIN },
> >>>>>>> + { nfserr_jukebox, -EAGAIN },
> >>>>>>> + { nfserr_jukebox, -EWOULDBLOCK },
> >>>>>>> { nfserr_jukebox, -ENOMEM },
> >>>>>>> { nfserr_badname, -ESRCH },
> >>>>>>> { nfserr_io, -ETXTBSY },
> >>>>>>>
> >>>>>>> so if fh_verify is failing whatever it is returning, it is not
> >>>>>>> nfserr_dropit nor is it nfserr_jukebox which means nlm_fopen() would
> >>>>>>> translate it to nlm_failed which with my patch would not trigger
> >>>>>>> nlm_lck_denied_grace_period error but resp->status would be set to
> >>>>>>> nlm_failed.
> >>>>>>>
> >>>>>>> So I circle back to I hope that convinces you that we don't need a
> >>>>>>> check for the reclaim lock.
> >>>>>>>
> >>>>>>> I believe nlm_drop_reply is nfsd_open's jukebox error, one of which is
> >>>>>>> delegation recall. it can be a memory failure. But I'm sure when
> >>>>>>> EWOULDBLOCK occurs.
> >>>>>>>
> >>>>>>>> At the very least we need to guard against the reclaim flag being set in
> >>>>>>>> the above test. But I would much rather a more clear distinction were
> >>>>>>>> made between "drop because of a conflicting delegation" and "drop
> >>>>>>>> because of a delay getting upcall response".
> >>>>>>>> Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4
> >>>>>>>> (and ideally nlm2) handles appropriately.
> >>>>>>>
> >>>>>>>
> >>>>>>>> NeilBrown
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> + return rpc_success;
> >>>>>>>>> + }
> >>>>>>>>> + return nlm_drop_reply;
> >>>>>>>>> + default:
> >>>>>>>>> + return rpc_success;
> >>>>>>>>> + }
> >>>>>>>>>
> >>>>>>>>> /* Now try to lock the file */
> >>>>>>>>> resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
> >>>>>>>>> --
> >>>>>>>>> 2.47.1
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>>
> >>>> --
> >>>> Chuck Lever
> >>>
> >>
> >>
> >> --
> >> Chuck Lever
>
>
> --
> Chuck Lever
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
2025-08-21 13:56 ` Olga Kornievskaia
2025-08-21 14:01 ` Chuck Lever
2025-08-21 15:09 ` Chuck Lever
@ 2025-08-25 15:23 ` Olga Kornievskaia
2 siblings, 0 replies; 21+ messages in thread
From: Olga Kornievskaia @ 2025-08-25 15:23 UTC (permalink / raw)
To: NeilBrown
Cc: Olga Kornievskaia, chuck.lever, jlayton, linux-nfs, Dai.Ngo, tom
On Thu, Aug 21, 2025 at 9:56 AM Olga Kornievskaia <aglo@umich.edu> wrote:
>
> On Wed, Aug 20, 2025 at 7:15 PM NeilBrown <neil@brown.name> wrote:
> >
> > On Thu, 14 Aug 2025, Olga Kornievskaia wrote:
> > > On Tue, Aug 12, 2025 at 8:05 PM NeilBrown <neil@brown.name> wrote:
> > > >
> > > > On Wed, 13 Aug 2025, Olga Kornievskaia wrote:
> > > > > When nfsd is in grace and receives an NLM LOCK request which turns
> > > > > out to have a conflicting delegation, return that the server is in
> > > > > grace.
My last attempt for restoring (prior to write delegations) behaviour
vs what happens now (even with patch#1). I bring it up again because
there has been misunderstanding of what the patch series is doing and
I believe (some) technical concerns (regarding nfsv2 client) have been
discussed and resolved.
In the old code (prior to write delegations), when a v3 client sent a
lock request (during grace period) and when a v4 client reclaimed the
same lock after the server rebooted, the v3 client got the in_grace
error.
With patch #1 (not in question here), the code at least isn't failing
the request with nlm_failed and instead drops the request. No lock
failure but the overall behaviour of the nfs client is different
(details below).
> > > > > Reviewed-by: Jeff Layton <jlayton@redhat.com>
> > > > > Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
> > > > > ---
> > > > > fs/lockd/svc4proc.c | 15 +++++++++++++--
> > > > > 1 file changed, 13 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
> > > > > index 109e5caae8c7..7ac4af5c9875 100644
> > > > > --- a/fs/lockd/svc4proc.c
> > > > > +++ b/fs/lockd/svc4proc.c
> > > > > @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp)
> > > > > resp->cookie = argp->cookie;
> > > > >
> > > > > /* Obtain client and file */
> > > > > - if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file)))
> > > > > - return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;
> > > > > + resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file);
> > > > > + switch (resp->status) {
> > > > > + case 0:
> > > > > + break;
> > > > > + case nlm_drop_reply:
> > > > > + if (locks_in_grace(SVC_NET(rqstp))) {
> > > > > + resp->status = nlm_lck_denied_grace_period;
> > > >
> > > > I think this is wrong. If the lock request has the "reclaim" flag set,
> > > > then nlm_lck_denied_grace_period is not a meaningful error.
> > > > nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay
> > > > getting a response to an upcall to mountd. For NLM the request really
> > > > must be dropped.
> > >
> > > Thank you for pointing out this case so we are suggesting to.
> > >
> > > if (locks_in_grace(SVC_NET(rqstp)) && !argp->reclaim)
> > >
> > > However, I've been looking and looking but I cannot figure out how
> > > nlm4svc_retrieve_args() would ever get nlm_drop_reply. You say it can
> > > happen during the upcall to mountd. So that happens within nfsd_open()
> > > call and a part of fh_verify().
> > > To return nlm_drop_reply, nlm_fopen() must have gotten nfserr_dropit
> > > from the nfsd_open(). I have searched and searched but I don't see
> > > anything that ever sets nfserr_dropit (NFSERR_DROPIT).
> > >
> > > I searched the logs and nfserr_dropit was an error that was EAGAIN
> > > translated into but then removed by the following patch.
> >
> > Oh. I didn't know that.
> > We now use RQ_DROPME instead.
> > I guess we should remove NFSERR_DROPIT completely as it not used at all
> > any more.
> >
> > Though returning nfserr_jukebox to an v2 client isn't a good idea....
>
> I'll take your word for you.
I didn't write the correct words here. I have said the correct words
in the other patch. v2 (or v3) client does not get an nfserr_jukebox.
> > So I guess my main complaint isn't valid, but I still don't like this
> > patch. It seems an untidy place to put the locks_in_grace test.
> > Other callers of nlm4svc_retrieve_args() check locks_in_grace() before
> > making that call. __nlm4svc_proc_lock probably should too. Or is there
> > a reason that it is delayed until the middle of nlmsvc_lock()..
>
> I thought the same about why not adding the in_grace check and decided
> that it was probably because you dont want to deny a lock if there are
> no conflicts. If we add it, somebody might notice and will complain
> that they can't get their lock when there are no conflicts.
The disadvantage of unconditionally adding in_grace check() in the
lock call is that again it would produce a difference in behaviour
where some lock would not be granted as fast as before (ie., some
environments might notice the change?).
> > The patch is not needed and isn't clearly an improvement, so I would
> > rather it were dropped.
>
> I'm not against dropping this patch if the conclusion is that dropping
> the packet is no worse than returning in_grace error.
My argument for inclusion of this patch is that it restores previous
behaviour that the nfs client experienced before (ie., receiving
in_grace error). When we rely on drop-retry behavior, can there be
problems that the client runs out of re-tries and fails the request
(nlm has a hard coded retry limit of 5 and then its own timeout
value)?
I admit my reasons are hypothetical, we can wait until somebody complains.
> > Thanks,
> > NeilBrown
> >
> >
> > >
> > > commit 062304a815fe10068c478a4a3f28cf091c55cb82
> > > Author: J. Bruce Fields <bfields@fieldses.org>
> > > Date: Sun Jan 2 22:05:33 2011 -0500
> > >
> > > nfsd: stop translating EAGAIN to nfserr_dropit
> > >
> > > diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
> > > index dc9c2e3fd1b8..fd608a27a8d5 100644
> > > --- a/fs/nfsd/nfsproc.c
> > > +++ b/fs/nfsd/nfsproc.c
> > > @@ -735,7 +735,8 @@ nfserrno (int errno)
> > > { nfserr_stale, -ESTALE },
> > > { nfserr_jukebox, -ETIMEDOUT },
> > > { nfserr_jukebox, -ERESTARTSYS },
> > > - { nfserr_dropit, -EAGAIN },
> > > + { nfserr_jukebox, -EAGAIN },
> > > + { nfserr_jukebox, -EWOULDBLOCK },
> > > { nfserr_jukebox, -ENOMEM },
> > > { nfserr_badname, -ESRCH },
> > > { nfserr_io, -ETXTBSY },
> > >
> > > so if fh_verify is failing whatever it is returning, it is not
> > > nfserr_dropit nor is it nfserr_jukebox which means nlm_fopen() would
> > > translate it to nlm_failed which with my patch would not trigger
> > > nlm_lck_denied_grace_period error but resp->status would be set to
> > > nlm_failed.
> > >
> > > So I circle back to I hope that convinces you that we don't need a
> > > check for the reclaim lock.
> > >
> > > I believe nlm_drop_reply is nfsd_open's jukebox error, one of which is
> > > delegation recall. it can be a memory failure. But I'm sure when
> > > EWOULDBLOCK occurs.
> > >
> > > > At the very least we need to guard against the reclaim flag being set in
> > > > the above test. But I would much rather a more clear distinction were
> > > > made between "drop because of a conflicting delegation" and "drop
> > > > because of a delay getting upcall response".
> > > > Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4
> > > > (and ideally nlm2) handles appropriately.
> > >
> > >
> > > > NeilBrown
> > > >
> > > >
> > > > > + return rpc_success;
> > > > > + }
> > > > > + return nlm_drop_reply;
> > > > > + default:
> > > > > + return rpc_success;
> > > > > + }
> > > > >
> > > > > /* Now try to lock the file */
> > > > > resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
> > > > > --
> > > > > 2.47.1
> > > > >
> > > > >
> > > >
> > > >
> > >
> >
^ permalink raw reply [flat|nested] 21+ messages in thread