linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 1/2] nfsd: nfserr_jukebox in nlm_fopen should lead to a retry
@ 2025-08-12 16:03 Olga Kornievskaia
  2025-08-12 16:03 ` [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period Olga Kornievskaia
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Olga Kornievskaia @ 2025-08-12 16:03 UTC (permalink / raw)
  To: chuck.lever, jlayton; +Cc: linux-nfs, neil, Dai.Ngo, tom

When v3 NLM request finds a conflicting delegation, it triggers
a delegation recall and nfsd_open fails with EAGAIN. nfsd_open
then translates EAGAIN into nfserr_jukebox. In nlm_fopen, instead
of returning nlm_failed for when there is a conflicting delegation,
drop this NLM request so that the client retries. Once delegation
is recalled and if a local lock is claimed, a retry would lead to
nfsd returning a nlm_lck_blocked error or a successful nlm lock.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
---
 fs/nfsd/lockd.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/fs/nfsd/lockd.c b/fs/nfsd/lockd.c
index edc9f75dc75c..8fdc769d392e 100644
--- a/fs/nfsd/lockd.c
+++ b/fs/nfsd/lockd.c
@@ -57,6 +57,20 @@ nlm_fopen(struct svc_rqst *rqstp, struct nfs_fh *f, struct file **filp,
 	switch (nfserr) {
 	case nfs_ok:
 		return 0;
+	case nfserr_jukebox:
+		/* this error can indicate a presence of a conflicting
+		 * delegation to an NLM lock request. Options are:
+		 * (1) For now, drop this request and make the client
+		 * retry. When delegation is returned, client's retry will
+		 * complete.
+		 * (2) NLM4_DENIED as per "spec" signals to the client
+		 * that the lock is unavaiable now but client can retry.
+		 * Linux client implementation does not. It treats
+		 * NLM4_DENIED same as NLM4_FAILED and errors the request.
+		 * (3) For the future, treat this as blocked lock and try
+		 * to callback when the delegation is returned but might
+		 * not have a proper lock request to block on.
+		 */
 	case nfserr_dropit:
 		return nlm_drop_reply;
 	case nfserr_stale:
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
  2025-08-12 16:03 [PATCH v2 1/2] nfsd: nfserr_jukebox in nlm_fopen should lead to a retry Olga Kornievskaia
@ 2025-08-12 16:03 ` Olga Kornievskaia
  2025-08-12 23:49   ` NeilBrown
  2025-08-12 17:23 ` [PATCH v2 1/2] nfsd: nfserr_jukebox in nlm_fopen should lead to a retry Chuck Lever
  2025-08-21 19:18 ` Chuck Lever
  2 siblings, 1 reply; 21+ messages in thread
From: Olga Kornievskaia @ 2025-08-12 16:03 UTC (permalink / raw)
  To: chuck.lever, jlayton; +Cc: linux-nfs, neil, Dai.Ngo, tom

When nfsd is in grace and receives an NLM LOCK request which turns
out to have a conflicting delegation, return that the server is in
grace.

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
---
 fs/lockd/svc4proc.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
index 109e5caae8c7..7ac4af5c9875 100644
--- a/fs/lockd/svc4proc.c
+++ b/fs/lockd/svc4proc.c
@@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp)
 	resp->cookie = argp->cookie;
 
 	/* Obtain client and file */
-	if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file)))
-		return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;
+	resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file);
+	switch (resp->status) {
+	case 0:
+		break;
+	case nlm_drop_reply:
+		if (locks_in_grace(SVC_NET(rqstp))) {
+			resp->status = nlm_lck_denied_grace_period;
+			return rpc_success;
+		}
+		return nlm_drop_reply;
+	default:
+		return rpc_success;
+	}
 
 	/* Now try to lock the file */
 	resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/2] nfsd: nfserr_jukebox in nlm_fopen should lead to a retry
  2025-08-12 16:03 [PATCH v2 1/2] nfsd: nfserr_jukebox in nlm_fopen should lead to a retry Olga Kornievskaia
  2025-08-12 16:03 ` [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period Olga Kornievskaia
@ 2025-08-12 17:23 ` Chuck Lever
  2025-08-21 19:18 ` Chuck Lever
  2 siblings, 0 replies; 21+ messages in thread
From: Chuck Lever @ 2025-08-12 17:23 UTC (permalink / raw)
  To: jlayton, Olga Kornievskaia; +Cc: Chuck Lever, linux-nfs, neil, Dai.Ngo, tom

From: Chuck Lever <chuck.lever@oracle.com>

On Tue, 12 Aug 2025 12:03:16 -0400, Olga Kornievskaia wrote:
> When v3 NLM request finds a conflicting delegation, it triggers
> a delegation recall and nfsd_open fails with EAGAIN. nfsd_open
> then translates EAGAIN into nfserr_jukebox. In nlm_fopen, instead
> of returning nlm_failed for when there is a conflicting delegation,
> drop this NLM request so that the client retries. Once delegation
> is recalled and if a local lock is claimed, a retry would lead to
> nfsd returning a nlm_lck_blocked error or a successful nlm lock.
> 
> [...]

Applied to nfsd-testing, thanks!

[1/2] nfsd: nfserr_jukebox in nlm_fopen should lead to a retry
      commit: ee0261077b9ba643eec07a4996b291209e3c5c11
[2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
      commit: c351af6c2e509c9cefeabbf521c6893c1f4a9b86

--
Chuck Lever


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
  2025-08-12 16:03 ` [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period Olga Kornievskaia
@ 2025-08-12 23:49   ` NeilBrown
  2025-08-13 15:32     ` Olga Kornievskaia
  0 siblings, 1 reply; 21+ messages in thread
From: NeilBrown @ 2025-08-12 23:49 UTC (permalink / raw)
  To: Olga Kornievskaia; +Cc: chuck.lever, jlayton, linux-nfs, Dai.Ngo, tom

On Wed, 13 Aug 2025, Olga Kornievskaia wrote:
> When nfsd is in grace and receives an NLM LOCK request which turns
> out to have a conflicting delegation, return that the server is in
> grace.
> 
> Reviewed-by: Jeff Layton <jlayton@redhat.com>
> Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
> ---
>  fs/lockd/svc4proc.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
> index 109e5caae8c7..7ac4af5c9875 100644
> --- a/fs/lockd/svc4proc.c
> +++ b/fs/lockd/svc4proc.c
> @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp)
>  	resp->cookie = argp->cookie;
>  
>  	/* Obtain client and file */
> -	if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file)))
> -		return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;
> +	resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file);
> +	switch (resp->status) {
> +	case 0:
> +		break;
> +	case nlm_drop_reply:
> +		if (locks_in_grace(SVC_NET(rqstp))) {
> +			resp->status = nlm_lck_denied_grace_period;

I think this is wrong.  If the lock request has the "reclaim" flag set,
then nlm_lck_denied_grace_period is not a meaningful error.
nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay
getting a response to an upcall to mountd.  For NLM the request really
must be dropped.

At the very least we need to guard against the reclaim flag being set in
the above test.  But I would much rather a more clear distinction were
made between "drop because of a conflicting delegation" and "drop
because of a delay getting upcall response".
Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4
(and ideally nlm2) handles appropriately.

NeilBrown


> +			return rpc_success;
> +		}
> +		return nlm_drop_reply;
> +	default:
> +		return rpc_success;
> +	}
>  
>  	/* Now try to lock the file */
>  	resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
> -- 
> 2.47.1
> 
> 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
  2025-08-12 23:49   ` NeilBrown
@ 2025-08-13 15:32     ` Olga Kornievskaia
  2025-08-20 23:15       ` NeilBrown
  0 siblings, 1 reply; 21+ messages in thread
From: Olga Kornievskaia @ 2025-08-13 15:32 UTC (permalink / raw)
  To: NeilBrown
  Cc: Olga Kornievskaia, chuck.lever, jlayton, linux-nfs, Dai.Ngo, tom

On Tue, Aug 12, 2025 at 8:05 PM NeilBrown <neil@brown.name> wrote:
>
> On Wed, 13 Aug 2025, Olga Kornievskaia wrote:
> > When nfsd is in grace and receives an NLM LOCK request which turns
> > out to have a conflicting delegation, return that the server is in
> > grace.
> >
> > Reviewed-by: Jeff Layton <jlayton@redhat.com>
> > Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
> > ---
> >  fs/lockd/svc4proc.c | 15 +++++++++++++--
> >  1 file changed, 13 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
> > index 109e5caae8c7..7ac4af5c9875 100644
> > --- a/fs/lockd/svc4proc.c
> > +++ b/fs/lockd/svc4proc.c
> > @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp)
> >       resp->cookie = argp->cookie;
> >
> >       /* Obtain client and file */
> > -     if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file)))
> > -             return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;
> > +     resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file);
> > +     switch (resp->status) {
> > +     case 0:
> > +             break;
> > +     case nlm_drop_reply:
> > +             if (locks_in_grace(SVC_NET(rqstp))) {
> > +                     resp->status = nlm_lck_denied_grace_period;
>
> I think this is wrong.  If the lock request has the "reclaim" flag set,
> then nlm_lck_denied_grace_period is not a meaningful error.
> nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay
> getting a response to an upcall to mountd.  For NLM the request really
> must be dropped.

Thank you for pointing out this case so we are suggesting to.

if (locks_in_grace(SVC_NET(rqstp)) && !argp->reclaim)

However, I've been looking and looking but I cannot figure out how
nlm4svc_retrieve_args() would ever get nlm_drop_reply. You say it can
happen during the upcall to mountd. So that happens within nfsd_open()
call and a part of fh_verify().
To return nlm_drop_reply, nlm_fopen() must have gotten nfserr_dropit
from the nfsd_open().  I have searched and searched but I don't see
anything that ever sets nfserr_dropit (NFSERR_DROPIT).

I searched the logs and nfserr_dropit was an error that was EAGAIN
translated into but then removed by the following patch.

commit 062304a815fe10068c478a4a3f28cf091c55cb82
Author: J. Bruce Fields <bfields@fieldses.org>
Date:   Sun Jan 2 22:05:33 2011 -0500

    nfsd: stop translating EAGAIN to nfserr_dropit

diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
index dc9c2e3fd1b8..fd608a27a8d5 100644
--- a/fs/nfsd/nfsproc.c
+++ b/fs/nfsd/nfsproc.c
@@ -735,7 +735,8 @@ nfserrno (int errno)
                { nfserr_stale, -ESTALE },
                { nfserr_jukebox, -ETIMEDOUT },
                { nfserr_jukebox, -ERESTARTSYS },
-               { nfserr_dropit, -EAGAIN },
+               { nfserr_jukebox, -EAGAIN },
+               { nfserr_jukebox, -EWOULDBLOCK },
                { nfserr_jukebox, -ENOMEM },
                { nfserr_badname, -ESRCH },
                { nfserr_io, -ETXTBSY },

so if fh_verify is failing whatever it is returning, it is not
nfserr_dropit nor is it nfserr_jukebox which means nlm_fopen() would
translate it to nlm_failed which with my patch would not trigger
nlm_lck_denied_grace_period error but resp->status would be set to
nlm_failed.

So I circle back to I hope that convinces you that we don't need a
check for the reclaim lock.

I believe nlm_drop_reply is nfsd_open's jukebox error, one of which is
delegation recall. it can be a memory failure. But I'm sure when
EWOULDBLOCK occurs.

> At the very least we need to guard against the reclaim flag being set in
> the above test.  But I would much rather a more clear distinction were
> made between "drop because of a conflicting delegation" and "drop
> because of a delay getting upcall response".
> Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4
> (and ideally nlm2) handles appropriately.


> NeilBrown
>
>
> > +                     return rpc_success;
> > +             }
> > +             return nlm_drop_reply;
> > +     default:
> > +             return rpc_success;
> > +     }
> >
> >       /* Now try to lock the file */
> >       resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
> > --
> > 2.47.1
> >
> >
>
>

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
  2025-08-13 15:32     ` Olga Kornievskaia
@ 2025-08-20 23:15       ` NeilBrown
  2025-08-21 13:56         ` Olga Kornievskaia
  0 siblings, 1 reply; 21+ messages in thread
From: NeilBrown @ 2025-08-20 23:15 UTC (permalink / raw)
  To: Olga Kornievskaia
  Cc: Olga Kornievskaia, chuck.lever, jlayton, linux-nfs, Dai.Ngo, tom

On Thu, 14 Aug 2025, Olga Kornievskaia wrote:
> On Tue, Aug 12, 2025 at 8:05 PM NeilBrown <neil@brown.name> wrote:
> >
> > On Wed, 13 Aug 2025, Olga Kornievskaia wrote:
> > > When nfsd is in grace and receives an NLM LOCK request which turns
> > > out to have a conflicting delegation, return that the server is in
> > > grace.
> > >
> > > Reviewed-by: Jeff Layton <jlayton@redhat.com>
> > > Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
> > > ---
> > >  fs/lockd/svc4proc.c | 15 +++++++++++++--
> > >  1 file changed, 13 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
> > > index 109e5caae8c7..7ac4af5c9875 100644
> > > --- a/fs/lockd/svc4proc.c
> > > +++ b/fs/lockd/svc4proc.c
> > > @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp)
> > >       resp->cookie = argp->cookie;
> > >
> > >       /* Obtain client and file */
> > > -     if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file)))
> > > -             return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;
> > > +     resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file);
> > > +     switch (resp->status) {
> > > +     case 0:
> > > +             break;
> > > +     case nlm_drop_reply:
> > > +             if (locks_in_grace(SVC_NET(rqstp))) {
> > > +                     resp->status = nlm_lck_denied_grace_period;
> >
> > I think this is wrong.  If the lock request has the "reclaim" flag set,
> > then nlm_lck_denied_grace_period is not a meaningful error.
> > nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay
> > getting a response to an upcall to mountd.  For NLM the request really
> > must be dropped.
> 
> Thank you for pointing out this case so we are suggesting to.
> 
> if (locks_in_grace(SVC_NET(rqstp)) && !argp->reclaim)
> 
> However, I've been looking and looking but I cannot figure out how
> nlm4svc_retrieve_args() would ever get nlm_drop_reply. You say it can
> happen during the upcall to mountd. So that happens within nfsd_open()
> call and a part of fh_verify().
> To return nlm_drop_reply, nlm_fopen() must have gotten nfserr_dropit
> from the nfsd_open().  I have searched and searched but I don't see
> anything that ever sets nfserr_dropit (NFSERR_DROPIT).
> 
> I searched the logs and nfserr_dropit was an error that was EAGAIN
> translated into but then removed by the following patch.

Oh.  I didn't know that.
We now use RQ_DROPME instead.
I guess we should remove NFSERR_DROPIT completely as it not used at all
any more.

Though returning nfserr_jukebox to an v2 client isn't a good idea....

So I guess my main complaint isn't valid, but I still don't like this
patch.  It seems an untidy place to put the locks_in_grace test.
Other callers of nlm4svc_retrieve_args() check locks_in_grace() before
making that call.  __nlm4svc_proc_lock probably should too.  Or is there
a reason that it is delayed until the middle of nlmsvc_lock()..

The patch is not needed and isn't clearly an improvement, so I would
rather it were dropped.

Thanks,
NeilBrown


> 
> commit 062304a815fe10068c478a4a3f28cf091c55cb82
> Author: J. Bruce Fields <bfields@fieldses.org>
> Date:   Sun Jan 2 22:05:33 2011 -0500
> 
>     nfsd: stop translating EAGAIN to nfserr_dropit
> 
> diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
> index dc9c2e3fd1b8..fd608a27a8d5 100644
> --- a/fs/nfsd/nfsproc.c
> +++ b/fs/nfsd/nfsproc.c
> @@ -735,7 +735,8 @@ nfserrno (int errno)
>                 { nfserr_stale, -ESTALE },
>                 { nfserr_jukebox, -ETIMEDOUT },
>                 { nfserr_jukebox, -ERESTARTSYS },
> -               { nfserr_dropit, -EAGAIN },
> +               { nfserr_jukebox, -EAGAIN },
> +               { nfserr_jukebox, -EWOULDBLOCK },
>                 { nfserr_jukebox, -ENOMEM },
>                 { nfserr_badname, -ESRCH },
>                 { nfserr_io, -ETXTBSY },
> 
> so if fh_verify is failing whatever it is returning, it is not
> nfserr_dropit nor is it nfserr_jukebox which means nlm_fopen() would
> translate it to nlm_failed which with my patch would not trigger
> nlm_lck_denied_grace_period error but resp->status would be set to
> nlm_failed.
> 
> So I circle back to I hope that convinces you that we don't need a
> check for the reclaim lock.
> 
> I believe nlm_drop_reply is nfsd_open's jukebox error, one of which is
> delegation recall. it can be a memory failure. But I'm sure when
> EWOULDBLOCK occurs.
> 
> > At the very least we need to guard against the reclaim flag being set in
> > the above test.  But I would much rather a more clear distinction were
> > made between "drop because of a conflicting delegation" and "drop
> > because of a delay getting upcall response".
> > Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4
> > (and ideally nlm2) handles appropriately.
> 
> 
> > NeilBrown
> >
> >
> > > +                     return rpc_success;
> > > +             }
> > > +             return nlm_drop_reply;
> > > +     default:
> > > +             return rpc_success;
> > > +     }
> > >
> > >       /* Now try to lock the file */
> > >       resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
> > > --
> > > 2.47.1
> > >
> > >
> >
> >
> 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
  2025-08-20 23:15       ` NeilBrown
@ 2025-08-21 13:56         ` Olga Kornievskaia
  2025-08-21 14:01           ` Chuck Lever
                             ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Olga Kornievskaia @ 2025-08-21 13:56 UTC (permalink / raw)
  To: NeilBrown
  Cc: Olga Kornievskaia, chuck.lever, jlayton, linux-nfs, Dai.Ngo, tom

On Wed, Aug 20, 2025 at 7:15 PM NeilBrown <neil@brown.name> wrote:
>
> On Thu, 14 Aug 2025, Olga Kornievskaia wrote:
> > On Tue, Aug 12, 2025 at 8:05 PM NeilBrown <neil@brown.name> wrote:
> > >
> > > On Wed, 13 Aug 2025, Olga Kornievskaia wrote:
> > > > When nfsd is in grace and receives an NLM LOCK request which turns
> > > > out to have a conflicting delegation, return that the server is in
> > > > grace.
> > > >
> > > > Reviewed-by: Jeff Layton <jlayton@redhat.com>
> > > > Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
> > > > ---
> > > >  fs/lockd/svc4proc.c | 15 +++++++++++++--
> > > >  1 file changed, 13 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
> > > > index 109e5caae8c7..7ac4af5c9875 100644
> > > > --- a/fs/lockd/svc4proc.c
> > > > +++ b/fs/lockd/svc4proc.c
> > > > @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp)
> > > >       resp->cookie = argp->cookie;
> > > >
> > > >       /* Obtain client and file */
> > > > -     if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file)))
> > > > -             return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;
> > > > +     resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file);
> > > > +     switch (resp->status) {
> > > > +     case 0:
> > > > +             break;
> > > > +     case nlm_drop_reply:
> > > > +             if (locks_in_grace(SVC_NET(rqstp))) {
> > > > +                     resp->status = nlm_lck_denied_grace_period;
> > >
> > > I think this is wrong.  If the lock request has the "reclaim" flag set,
> > > then nlm_lck_denied_grace_period is not a meaningful error.
> > > nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay
> > > getting a response to an upcall to mountd.  For NLM the request really
> > > must be dropped.
> >
> > Thank you for pointing out this case so we are suggesting to.
> >
> > if (locks_in_grace(SVC_NET(rqstp)) && !argp->reclaim)
> >
> > However, I've been looking and looking but I cannot figure out how
> > nlm4svc_retrieve_args() would ever get nlm_drop_reply. You say it can
> > happen during the upcall to mountd. So that happens within nfsd_open()
> > call and a part of fh_verify().
> > To return nlm_drop_reply, nlm_fopen() must have gotten nfserr_dropit
> > from the nfsd_open().  I have searched and searched but I don't see
> > anything that ever sets nfserr_dropit (NFSERR_DROPIT).
> >
> > I searched the logs and nfserr_dropit was an error that was EAGAIN
> > translated into but then removed by the following patch.
>
> Oh.  I didn't know that.
> We now use RQ_DROPME instead.
> I guess we should remove NFSERR_DROPIT completely as it not used at all
> any more.
>
> Though returning nfserr_jukebox to an v2 client isn't a good idea....

I'll take your word for you.

> So I guess my main complaint isn't valid, but I still don't like this
> patch.  It seems an untidy place to put the locks_in_grace test.
> Other callers of nlm4svc_retrieve_args() check locks_in_grace() before
> making that call.  __nlm4svc_proc_lock probably should too.  Or is there
> a reason that it is delayed until the middle of nlmsvc_lock()..

I thought the same about why not adding the in_grace check and decided
that it was probably because you dont want to deny a lock if there are
no conflicts. If we add it, somebody might notice and will complain
that they can't get their lock when there are no conflicts.

> The patch is not needed and isn't clearly an improvement, so I would
> rather it were dropped.

I'm not against dropping this patch if the conclusion is that dropping
the packet is no worse than returning in_grace error.


>
> Thanks,
> NeilBrown
>
>
> >
> > commit 062304a815fe10068c478a4a3f28cf091c55cb82
> > Author: J. Bruce Fields <bfields@fieldses.org>
> > Date:   Sun Jan 2 22:05:33 2011 -0500
> >
> >     nfsd: stop translating EAGAIN to nfserr_dropit
> >
> > diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
> > index dc9c2e3fd1b8..fd608a27a8d5 100644
> > --- a/fs/nfsd/nfsproc.c
> > +++ b/fs/nfsd/nfsproc.c
> > @@ -735,7 +735,8 @@ nfserrno (int errno)
> >                 { nfserr_stale, -ESTALE },
> >                 { nfserr_jukebox, -ETIMEDOUT },
> >                 { nfserr_jukebox, -ERESTARTSYS },
> > -               { nfserr_dropit, -EAGAIN },
> > +               { nfserr_jukebox, -EAGAIN },
> > +               { nfserr_jukebox, -EWOULDBLOCK },
> >                 { nfserr_jukebox, -ENOMEM },
> >                 { nfserr_badname, -ESRCH },
> >                 { nfserr_io, -ETXTBSY },
> >
> > so if fh_verify is failing whatever it is returning, it is not
> > nfserr_dropit nor is it nfserr_jukebox which means nlm_fopen() would
> > translate it to nlm_failed which with my patch would not trigger
> > nlm_lck_denied_grace_period error but resp->status would be set to
> > nlm_failed.
> >
> > So I circle back to I hope that convinces you that we don't need a
> > check for the reclaim lock.
> >
> > I believe nlm_drop_reply is nfsd_open's jukebox error, one of which is
> > delegation recall. it can be a memory failure. But I'm sure when
> > EWOULDBLOCK occurs.
> >
> > > At the very least we need to guard against the reclaim flag being set in
> > > the above test.  But I would much rather a more clear distinction were
> > > made between "drop because of a conflicting delegation" and "drop
> > > because of a delay getting upcall response".
> > > Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4
> > > (and ideally nlm2) handles appropriately.
> >
> >
> > > NeilBrown
> > >
> > >
> > > > +                     return rpc_success;
> > > > +             }
> > > > +             return nlm_drop_reply;
> > > > +     default:
> > > > +             return rpc_success;
> > > > +     }
> > > >
> > > >       /* Now try to lock the file */
> > > >       resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
> > > > --
> > > > 2.47.1
> > > >
> > > >
> > >
> > >
> >
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
  2025-08-21 13:56         ` Olga Kornievskaia
@ 2025-08-21 14:01           ` Chuck Lever
  2025-08-21 15:09           ` Chuck Lever
  2025-08-25 15:23           ` Olga Kornievskaia
  2 siblings, 0 replies; 21+ messages in thread
From: Chuck Lever @ 2025-08-21 14:01 UTC (permalink / raw)
  To: Olga Kornievskaia, NeilBrown
  Cc: Olga Kornievskaia, jlayton, linux-nfs, Dai.Ngo, tom

On 8/21/25 9:56 AM, Olga Kornievskaia wrote:
> On Wed, Aug 20, 2025 at 7:15 PM NeilBrown <neil@brown.name> wrote:
>>
>> On Thu, 14 Aug 2025, Olga Kornievskaia wrote:
>>> On Tue, Aug 12, 2025 at 8:05 PM NeilBrown <neil@brown.name> wrote:
>>>>
>>>> On Wed, 13 Aug 2025, Olga Kornievskaia wrote:
>>>>> When nfsd is in grace and receives an NLM LOCK request which turns
>>>>> out to have a conflicting delegation, return that the server is in
>>>>> grace.
>>>>>
>>>>> Reviewed-by: Jeff Layton <jlayton@redhat.com>
>>>>> Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
>>>>> ---
>>>>>  fs/lockd/svc4proc.c | 15 +++++++++++++--
>>>>>  1 file changed, 13 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
>>>>> index 109e5caae8c7..7ac4af5c9875 100644
>>>>> --- a/fs/lockd/svc4proc.c
>>>>> +++ b/fs/lockd/svc4proc.c
>>>>> @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp)
>>>>>       resp->cookie = argp->cookie;
>>>>>
>>>>>       /* Obtain client and file */
>>>>> -     if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file)))
>>>>> -             return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;
>>>>> +     resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file);
>>>>> +     switch (resp->status) {
>>>>> +     case 0:
>>>>> +             break;
>>>>> +     case nlm_drop_reply:
>>>>> +             if (locks_in_grace(SVC_NET(rqstp))) {
>>>>> +                     resp->status = nlm_lck_denied_grace_period;
>>>>
>>>> I think this is wrong.  If the lock request has the "reclaim" flag set,
>>>> then nlm_lck_denied_grace_period is not a meaningful error.
>>>> nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay
>>>> getting a response to an upcall to mountd.  For NLM the request really
>>>> must be dropped.
>>>
>>> Thank you for pointing out this case so we are suggesting to.
>>>
>>> if (locks_in_grace(SVC_NET(rqstp)) && !argp->reclaim)
>>>
>>> However, I've been looking and looking but I cannot figure out how
>>> nlm4svc_retrieve_args() would ever get nlm_drop_reply. You say it can
>>> happen during the upcall to mountd. So that happens within nfsd_open()
>>> call and a part of fh_verify().
>>> To return nlm_drop_reply, nlm_fopen() must have gotten nfserr_dropit
>>> from the nfsd_open().  I have searched and searched but I don't see
>>> anything that ever sets nfserr_dropit (NFSERR_DROPIT).
>>>
>>> I searched the logs and nfserr_dropit was an error that was EAGAIN
>>> translated into but then removed by the following patch.
>>
>> Oh.  I didn't know that.
>> We now use RQ_DROPME instead.
>> I guess we should remove NFSERR_DROPIT completely as it not used at all
>> any more.
>>
>> Though returning nfserr_jukebox to an v2 client isn't a good idea....
> 
> I'll take your word for you.

NFSv2 doesn't have a JUKEBOX or DELAY status code, so NFSD can't return
that kind of error to NFSv2 clients.


>> So I guess my main complaint isn't valid, but I still don't like this
>> patch.  It seems an untidy place to put the locks_in_grace test.
>> Other callers of nlm4svc_retrieve_args() check locks_in_grace() before
>> making that call.  __nlm4svc_proc_lock probably should too.  Or is there
>> a reason that it is delayed until the middle of nlmsvc_lock()..
> 
> I thought the same about why not adding the in_grace check and decided
> that it was probably because you dont want to deny a lock if there are
> no conflicts. If we add it, somebody might notice and will complain
> that they can't get their lock when there are no conflicts.
> 
>> The patch is not needed and isn't clearly an improvement, so I would
>> rather it were dropped.
> 
> I'm not against dropping this patch if the conclusion is that dropping
> the packet is no worse than returning in_grace error.
> 
> 
>>
>> Thanks,
>> NeilBrown
>>
>>
>>>
>>> commit 062304a815fe10068c478a4a3f28cf091c55cb82
>>> Author: J. Bruce Fields <bfields@fieldses.org>
>>> Date:   Sun Jan 2 22:05:33 2011 -0500
>>>
>>>     nfsd: stop translating EAGAIN to nfserr_dropit
>>>
>>> diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
>>> index dc9c2e3fd1b8..fd608a27a8d5 100644
>>> --- a/fs/nfsd/nfsproc.c
>>> +++ b/fs/nfsd/nfsproc.c
>>> @@ -735,7 +735,8 @@ nfserrno (int errno)
>>>                 { nfserr_stale, -ESTALE },
>>>                 { nfserr_jukebox, -ETIMEDOUT },
>>>                 { nfserr_jukebox, -ERESTARTSYS },
>>> -               { nfserr_dropit, -EAGAIN },
>>> +               { nfserr_jukebox, -EAGAIN },
>>> +               { nfserr_jukebox, -EWOULDBLOCK },
>>>                 { nfserr_jukebox, -ENOMEM },
>>>                 { nfserr_badname, -ESRCH },
>>>                 { nfserr_io, -ETXTBSY },
>>>
>>> so if fh_verify is failing whatever it is returning, it is not
>>> nfserr_dropit nor is it nfserr_jukebox which means nlm_fopen() would
>>> translate it to nlm_failed which with my patch would not trigger
>>> nlm_lck_denied_grace_period error but resp->status would be set to
>>> nlm_failed.
>>>
>>> So I circle back to I hope that convinces you that we don't need a
>>> check for the reclaim lock.
>>>
>>> I believe nlm_drop_reply is nfsd_open's jukebox error, one of which is
>>> delegation recall. it can be a memory failure. But I'm sure when
>>> EWOULDBLOCK occurs.
>>>
>>>> At the very least we need to guard against the reclaim flag being set in
>>>> the above test.  But I would much rather a more clear distinction were
>>>> made between "drop because of a conflicting delegation" and "drop
>>>> because of a delay getting upcall response".
>>>> Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4
>>>> (and ideally nlm2) handles appropriately.
>>>
>>>
>>>> NeilBrown
>>>>
>>>>
>>>>> +                     return rpc_success;
>>>>> +             }
>>>>> +             return nlm_drop_reply;
>>>>> +     default:
>>>>> +             return rpc_success;
>>>>> +     }
>>>>>
>>>>>       /* Now try to lock the file */
>>>>>       resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
>>>>> --
>>>>> 2.47.1
>>>>>
>>>>>
>>>>
>>>>
>>>
>>


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
  2025-08-21 13:56         ` Olga Kornievskaia
  2025-08-21 14:01           ` Chuck Lever
@ 2025-08-21 15:09           ` Chuck Lever
  2025-08-21 18:20             ` Olga Kornievskaia
  2025-08-25 15:23           ` Olga Kornievskaia
  2 siblings, 1 reply; 21+ messages in thread
From: Chuck Lever @ 2025-08-21 15:09 UTC (permalink / raw)
  To: Olga Kornievskaia, NeilBrown
  Cc: Olga Kornievskaia, jlayton, linux-nfs, Dai.Ngo, tom

On 8/21/25 9:56 AM, Olga Kornievskaia wrote:
> On Wed, Aug 20, 2025 at 7:15 PM NeilBrown <neil@brown.name> wrote:
>>
>> On Thu, 14 Aug 2025, Olga Kornievskaia wrote:
>>> On Tue, Aug 12, 2025 at 8:05 PM NeilBrown <neil@brown.name> wrote:
>>>>
>>>> On Wed, 13 Aug 2025, Olga Kornievskaia wrote:
>>>>> When nfsd is in grace and receives an NLM LOCK request which turns
>>>>> out to have a conflicting delegation, return that the server is in
>>>>> grace.
>>>>>
>>>>> Reviewed-by: Jeff Layton <jlayton@redhat.com>
>>>>> Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
>>>>> ---
>>>>>  fs/lockd/svc4proc.c | 15 +++++++++++++--
>>>>>  1 file changed, 13 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
>>>>> index 109e5caae8c7..7ac4af5c9875 100644
>>>>> --- a/fs/lockd/svc4proc.c
>>>>> +++ b/fs/lockd/svc4proc.c
>>>>> @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp)
>>>>>       resp->cookie = argp->cookie;
>>>>>
>>>>>       /* Obtain client and file */
>>>>> -     if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file)))
>>>>> -             return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;
>>>>> +     resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file);
>>>>> +     switch (resp->status) {
>>>>> +     case 0:
>>>>> +             break;
>>>>> +     case nlm_drop_reply:
>>>>> +             if (locks_in_grace(SVC_NET(rqstp))) {
>>>>> +                     resp->status = nlm_lck_denied_grace_period;
>>>>
>>>> I think this is wrong.  If the lock request has the "reclaim" flag set,
>>>> then nlm_lck_denied_grace_period is not a meaningful error.
>>>> nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay
>>>> getting a response to an upcall to mountd.  For NLM the request really
>>>> must be dropped.
>>>
>>> Thank you for pointing out this case so we are suggesting to.
>>>
>>> if (locks_in_grace(SVC_NET(rqstp)) && !argp->reclaim)
>>>
>>> However, I've been looking and looking but I cannot figure out how
>>> nlm4svc_retrieve_args() would ever get nlm_drop_reply. You say it can
>>> happen during the upcall to mountd. So that happens within nfsd_open()
>>> call and a part of fh_verify().
>>> To return nlm_drop_reply, nlm_fopen() must have gotten nfserr_dropit
>>> from the nfsd_open().  I have searched and searched but I don't see
>>> anything that ever sets nfserr_dropit (NFSERR_DROPIT).
>>>
>>> I searched the logs and nfserr_dropit was an error that was EAGAIN
>>> translated into but then removed by the following patch.
>>
>> Oh.  I didn't know that.
>> We now use RQ_DROPME instead.
>> I guess we should remove NFSERR_DROPIT completely as it not used at all
>> any more.
>>
>> Though returning nfserr_jukebox to an v2 client isn't a good idea....
> 
> I'll take your word for you.
> 
>> So I guess my main complaint isn't valid, but I still don't like this
>> patch.  It seems an untidy place to put the locks_in_grace test.
>> Other callers of nlm4svc_retrieve_args() check locks_in_grace() before
>> making that call.  __nlm4svc_proc_lock probably should too.  Or is there
>> a reason that it is delayed until the middle of nlmsvc_lock()..
> 
> I thought the same about why not adding the in_grace check and decided
> that it was probably because you dont want to deny a lock if there are
> no conflicts. If we add it, somebody might notice and will complain
> that they can't get their lock when there are no conflicts.
> 
>> The patch is not needed and isn't clearly an improvement, so I would
>> rather it were dropped.
> 
> I'm not against dropping this patch if the conclusion is that dropping
> the packet is no worse than returning in_grace error.

I dropped both of these from nfsd-testing. If a fix is still needed,
let's start again with fresh patches.


>> Thanks,
>> NeilBrown
>>
>>
>>>
>>> commit 062304a815fe10068c478a4a3f28cf091c55cb82
>>> Author: J. Bruce Fields <bfields@fieldses.org>
>>> Date:   Sun Jan 2 22:05:33 2011 -0500
>>>
>>>     nfsd: stop translating EAGAIN to nfserr_dropit
>>>
>>> diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
>>> index dc9c2e3fd1b8..fd608a27a8d5 100644
>>> --- a/fs/nfsd/nfsproc.c
>>> +++ b/fs/nfsd/nfsproc.c
>>> @@ -735,7 +735,8 @@ nfserrno (int errno)
>>>                 { nfserr_stale, -ESTALE },
>>>                 { nfserr_jukebox, -ETIMEDOUT },
>>>                 { nfserr_jukebox, -ERESTARTSYS },
>>> -               { nfserr_dropit, -EAGAIN },
>>> +               { nfserr_jukebox, -EAGAIN },
>>> +               { nfserr_jukebox, -EWOULDBLOCK },
>>>                 { nfserr_jukebox, -ENOMEM },
>>>                 { nfserr_badname, -ESRCH },
>>>                 { nfserr_io, -ETXTBSY },
>>>
>>> so if fh_verify is failing whatever it is returning, it is not
>>> nfserr_dropit nor is it nfserr_jukebox which means nlm_fopen() would
>>> translate it to nlm_failed which with my patch would not trigger
>>> nlm_lck_denied_grace_period error but resp->status would be set to
>>> nlm_failed.
>>>
>>> So I circle back to I hope that convinces you that we don't need a
>>> check for the reclaim lock.
>>>
>>> I believe nlm_drop_reply is nfsd_open's jukebox error, one of which is
>>> delegation recall. it can be a memory failure. But I'm sure when
>>> EWOULDBLOCK occurs.
>>>
>>>> At the very least we need to guard against the reclaim flag being set in
>>>> the above test.  But I would much rather a more clear distinction were
>>>> made between "drop because of a conflicting delegation" and "drop
>>>> because of a delay getting upcall response".
>>>> Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4
>>>> (and ideally nlm2) handles appropriately.
>>>
>>>
>>>> NeilBrown
>>>>
>>>>
>>>>> +                     return rpc_success;
>>>>> +             }
>>>>> +             return nlm_drop_reply;
>>>>> +     default:
>>>>> +             return rpc_success;
>>>>> +     }
>>>>>
>>>>>       /* Now try to lock the file */
>>>>>       resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
>>>>> --
>>>>> 2.47.1
>>>>>
>>>>>
>>>>
>>>>
>>>
>>


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
  2025-08-21 15:09           ` Chuck Lever
@ 2025-08-21 18:20             ` Olga Kornievskaia
  2025-08-21 18:24               ` Chuck Lever
  0 siblings, 1 reply; 21+ messages in thread
From: Olga Kornievskaia @ 2025-08-21 18:20 UTC (permalink / raw)
  To: Chuck Lever
  Cc: NeilBrown, Olga Kornievskaia, jlayton, linux-nfs, Dai.Ngo, tom

On Thu, Aug 21, 2025 at 11:09 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> On 8/21/25 9:56 AM, Olga Kornievskaia wrote:
> > On Wed, Aug 20, 2025 at 7:15 PM NeilBrown <neil@brown.name> wrote:
> >>
> >> On Thu, 14 Aug 2025, Olga Kornievskaia wrote:
> >>> On Tue, Aug 12, 2025 at 8:05 PM NeilBrown <neil@brown.name> wrote:
> >>>>
> >>>> On Wed, 13 Aug 2025, Olga Kornievskaia wrote:
> >>>>> When nfsd is in grace and receives an NLM LOCK request which turns
> >>>>> out to have a conflicting delegation, return that the server is in
> >>>>> grace.
> >>>>>
> >>>>> Reviewed-by: Jeff Layton <jlayton@redhat.com>
> >>>>> Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
> >>>>> ---
> >>>>>  fs/lockd/svc4proc.c | 15 +++++++++++++--
> >>>>>  1 file changed, 13 insertions(+), 2 deletions(-)
> >>>>>
> >>>>> diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
> >>>>> index 109e5caae8c7..7ac4af5c9875 100644
> >>>>> --- a/fs/lockd/svc4proc.c
> >>>>> +++ b/fs/lockd/svc4proc.c
> >>>>> @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp)
> >>>>>       resp->cookie = argp->cookie;
> >>>>>
> >>>>>       /* Obtain client and file */
> >>>>> -     if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file)))
> >>>>> -             return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;
> >>>>> +     resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file);
> >>>>> +     switch (resp->status) {
> >>>>> +     case 0:
> >>>>> +             break;
> >>>>> +     case nlm_drop_reply:
> >>>>> +             if (locks_in_grace(SVC_NET(rqstp))) {
> >>>>> +                     resp->status = nlm_lck_denied_grace_period;
> >>>>
> >>>> I think this is wrong.  If the lock request has the "reclaim" flag set,
> >>>> then nlm_lck_denied_grace_period is not a meaningful error.
> >>>> nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay
> >>>> getting a response to an upcall to mountd.  For NLM the request really
> >>>> must be dropped.
> >>>
> >>> Thank you for pointing out this case so we are suggesting to.
> >>>
> >>> if (locks_in_grace(SVC_NET(rqstp)) && !argp->reclaim)
> >>>
> >>> However, I've been looking and looking but I cannot figure out how
> >>> nlm4svc_retrieve_args() would ever get nlm_drop_reply. You say it can
> >>> happen during the upcall to mountd. So that happens within nfsd_open()
> >>> call and a part of fh_verify().
> >>> To return nlm_drop_reply, nlm_fopen() must have gotten nfserr_dropit
> >>> from the nfsd_open().  I have searched and searched but I don't see
> >>> anything that ever sets nfserr_dropit (NFSERR_DROPIT).
> >>>
> >>> I searched the logs and nfserr_dropit was an error that was EAGAIN
> >>> translated into but then removed by the following patch.
> >>
> >> Oh.  I didn't know that.
> >> We now use RQ_DROPME instead.
> >> I guess we should remove NFSERR_DROPIT completely as it not used at all
> >> any more.
> >>
> >> Though returning nfserr_jukebox to an v2 client isn't a good idea....
> >
> > I'll take your word for you.
> >
> >> So I guess my main complaint isn't valid, but I still don't like this
> >> patch.  It seems an untidy place to put the locks_in_grace test.
> >> Other callers of nlm4svc_retrieve_args() check locks_in_grace() before
> >> making that call.  __nlm4svc_proc_lock probably should too.  Or is there
> >> a reason that it is delayed until the middle of nlmsvc_lock()..
> >
> > I thought the same about why not adding the in_grace check and decided
> > that it was probably because you dont want to deny a lock if there are
> > no conflicts. If we add it, somebody might notice and will complain
> > that they can't get their lock when there are no conflicts.
> >
> >> The patch is not needed and isn't clearly an improvement, so I would
> >> rather it were dropped.
> >
> > I'm not against dropping this patch if the conclusion is that dropping
> > the packet is no worse than returning in_grace error.
>
> I dropped both of these from nfsd-testing. If a fix is still needed,
> let's start again with fresh patches.

Can you clarify when you said "both"?

What objections are there for the 1st patch in the series. It solves a
problem and a fix is needed. This one I agree is not needed but I
thought was an improvement.

>
>
> >> Thanks,
> >> NeilBrown
> >>
> >>
> >>>
> >>> commit 062304a815fe10068c478a4a3f28cf091c55cb82
> >>> Author: J. Bruce Fields <bfields@fieldses.org>
> >>> Date:   Sun Jan 2 22:05:33 2011 -0500
> >>>
> >>>     nfsd: stop translating EAGAIN to nfserr_dropit
> >>>
> >>> diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
> >>> index dc9c2e3fd1b8..fd608a27a8d5 100644
> >>> --- a/fs/nfsd/nfsproc.c
> >>> +++ b/fs/nfsd/nfsproc.c
> >>> @@ -735,7 +735,8 @@ nfserrno (int errno)
> >>>                 { nfserr_stale, -ESTALE },
> >>>                 { nfserr_jukebox, -ETIMEDOUT },
> >>>                 { nfserr_jukebox, -ERESTARTSYS },
> >>> -               { nfserr_dropit, -EAGAIN },
> >>> +               { nfserr_jukebox, -EAGAIN },
> >>> +               { nfserr_jukebox, -EWOULDBLOCK },
> >>>                 { nfserr_jukebox, -ENOMEM },
> >>>                 { nfserr_badname, -ESRCH },
> >>>                 { nfserr_io, -ETXTBSY },
> >>>
> >>> so if fh_verify is failing whatever it is returning, it is not
> >>> nfserr_dropit nor is it nfserr_jukebox which means nlm_fopen() would
> >>> translate it to nlm_failed which with my patch would not trigger
> >>> nlm_lck_denied_grace_period error but resp->status would be set to
> >>> nlm_failed.
> >>>
> >>> So I circle back to I hope that convinces you that we don't need a
> >>> check for the reclaim lock.
> >>>
> >>> I believe nlm_drop_reply is nfsd_open's jukebox error, one of which is
> >>> delegation recall. it can be a memory failure. But I'm sure when
> >>> EWOULDBLOCK occurs.
> >>>
> >>>> At the very least we need to guard against the reclaim flag being set in
> >>>> the above test.  But I would much rather a more clear distinction were
> >>>> made between "drop because of a conflicting delegation" and "drop
> >>>> because of a delay getting upcall response".
> >>>> Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4
> >>>> (and ideally nlm2) handles appropriately.
> >>>
> >>>
> >>>> NeilBrown
> >>>>
> >>>>
> >>>>> +                     return rpc_success;
> >>>>> +             }
> >>>>> +             return nlm_drop_reply;
> >>>>> +     default:
> >>>>> +             return rpc_success;
> >>>>> +     }
> >>>>>
> >>>>>       /* Now try to lock the file */
> >>>>>       resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
> >>>>> --
> >>>>> 2.47.1
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
>
>
> --
> Chuck Lever

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
  2025-08-21 18:20             ` Olga Kornievskaia
@ 2025-08-21 18:24               ` Chuck Lever
  2025-08-21 18:33                 ` Olga Kornievskaia
  0 siblings, 1 reply; 21+ messages in thread
From: Chuck Lever @ 2025-08-21 18:24 UTC (permalink / raw)
  To: Olga Kornievskaia
  Cc: NeilBrown, Olga Kornievskaia, jlayton, linux-nfs, Dai.Ngo, tom

On 8/21/25 2:20 PM, Olga Kornievskaia wrote:
> On Thu, Aug 21, 2025 at 11:09 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>>
>> On 8/21/25 9:56 AM, Olga Kornievskaia wrote:
>>> On Wed, Aug 20, 2025 at 7:15 PM NeilBrown <neil@brown.name> wrote:
>>>>
>>>> On Thu, 14 Aug 2025, Olga Kornievskaia wrote:
>>>>> On Tue, Aug 12, 2025 at 8:05 PM NeilBrown <neil@brown.name> wrote:
>>>>>>
>>>>>> On Wed, 13 Aug 2025, Olga Kornievskaia wrote:
>>>>>>> When nfsd is in grace and receives an NLM LOCK request which turns
>>>>>>> out to have a conflicting delegation, return that the server is in
>>>>>>> grace.
>>>>>>>
>>>>>>> Reviewed-by: Jeff Layton <jlayton@redhat.com>
>>>>>>> Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
>>>>>>> ---
>>>>>>>  fs/lockd/svc4proc.c | 15 +++++++++++++--
>>>>>>>  1 file changed, 13 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
>>>>>>> index 109e5caae8c7..7ac4af5c9875 100644
>>>>>>> --- a/fs/lockd/svc4proc.c
>>>>>>> +++ b/fs/lockd/svc4proc.c
>>>>>>> @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp)
>>>>>>>       resp->cookie = argp->cookie;
>>>>>>>
>>>>>>>       /* Obtain client and file */
>>>>>>> -     if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file)))
>>>>>>> -             return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;
>>>>>>> +     resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file);
>>>>>>> +     switch (resp->status) {
>>>>>>> +     case 0:
>>>>>>> +             break;
>>>>>>> +     case nlm_drop_reply:
>>>>>>> +             if (locks_in_grace(SVC_NET(rqstp))) {
>>>>>>> +                     resp->status = nlm_lck_denied_grace_period;
>>>>>>
>>>>>> I think this is wrong.  If the lock request has the "reclaim" flag set,
>>>>>> then nlm_lck_denied_grace_period is not a meaningful error.
>>>>>> nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay
>>>>>> getting a response to an upcall to mountd.  For NLM the request really
>>>>>> must be dropped.
>>>>>
>>>>> Thank you for pointing out this case so we are suggesting to.
>>>>>
>>>>> if (locks_in_grace(SVC_NET(rqstp)) && !argp->reclaim)
>>>>>
>>>>> However, I've been looking and looking but I cannot figure out how
>>>>> nlm4svc_retrieve_args() would ever get nlm_drop_reply. You say it can
>>>>> happen during the upcall to mountd. So that happens within nfsd_open()
>>>>> call and a part of fh_verify().
>>>>> To return nlm_drop_reply, nlm_fopen() must have gotten nfserr_dropit
>>>>> from the nfsd_open().  I have searched and searched but I don't see
>>>>> anything that ever sets nfserr_dropit (NFSERR_DROPIT).
>>>>>
>>>>> I searched the logs and nfserr_dropit was an error that was EAGAIN
>>>>> translated into but then removed by the following patch.
>>>>
>>>> Oh.  I didn't know that.
>>>> We now use RQ_DROPME instead.
>>>> I guess we should remove NFSERR_DROPIT completely as it not used at all
>>>> any more.
>>>>
>>>> Though returning nfserr_jukebox to an v2 client isn't a good idea....
>>>
>>> I'll take your word for you.
>>>
>>>> So I guess my main complaint isn't valid, but I still don't like this
>>>> patch.  It seems an untidy place to put the locks_in_grace test.
>>>> Other callers of nlm4svc_retrieve_args() check locks_in_grace() before
>>>> making that call.  __nlm4svc_proc_lock probably should too.  Or is there
>>>> a reason that it is delayed until the middle of nlmsvc_lock()..
>>>
>>> I thought the same about why not adding the in_grace check and decided
>>> that it was probably because you dont want to deny a lock if there are
>>> no conflicts. If we add it, somebody might notice and will complain
>>> that they can't get their lock when there are no conflicts.
>>>
>>>> The patch is not needed and isn't clearly an improvement, so I would
>>>> rather it were dropped.
>>>
>>> I'm not against dropping this patch if the conclusion is that dropping
>>> the packet is no worse than returning in_grace error.
>>
>> I dropped both of these from nfsd-testing. If a fix is still needed,
>> let's start again with fresh patches.
> 
> Can you clarify when you said "both"?
> 
> What objections are there for the 1st patch in the series. It solves a
> problem and a fix is needed.

There are two reasons to drop the first patch.

1. Neil's "remove nfserr_dropit" patch doesn't apply unless both patches
are reverted.

2. As I said above, NFSv2 does not have a mechanism like NFS3ERR_JUKEBOX
to request that the client wait a bit and resend.

So, if 1/2 has been tested with NFSv2 and does not cause NFSD to leak
nfserr_jukebox to NFSv2 clients, then please rebase that one on the
current nfsd-testing branch and post it again.


> This one I agree is not needed but I
> thought was an improvement.
> 
>>
>>
>>>> Thanks,
>>>> NeilBrown
>>>>
>>>>
>>>>>
>>>>> commit 062304a815fe10068c478a4a3f28cf091c55cb82
>>>>> Author: J. Bruce Fields <bfields@fieldses.org>
>>>>> Date:   Sun Jan 2 22:05:33 2011 -0500
>>>>>
>>>>>     nfsd: stop translating EAGAIN to nfserr_dropit
>>>>>
>>>>> diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
>>>>> index dc9c2e3fd1b8..fd608a27a8d5 100644
>>>>> --- a/fs/nfsd/nfsproc.c
>>>>> +++ b/fs/nfsd/nfsproc.c
>>>>> @@ -735,7 +735,8 @@ nfserrno (int errno)
>>>>>                 { nfserr_stale, -ESTALE },
>>>>>                 { nfserr_jukebox, -ETIMEDOUT },
>>>>>                 { nfserr_jukebox, -ERESTARTSYS },
>>>>> -               { nfserr_dropit, -EAGAIN },
>>>>> +               { nfserr_jukebox, -EAGAIN },
>>>>> +               { nfserr_jukebox, -EWOULDBLOCK },
>>>>>                 { nfserr_jukebox, -ENOMEM },
>>>>>                 { nfserr_badname, -ESRCH },
>>>>>                 { nfserr_io, -ETXTBSY },
>>>>>
>>>>> so if fh_verify is failing whatever it is returning, it is not
>>>>> nfserr_dropit nor is it nfserr_jukebox which means nlm_fopen() would
>>>>> translate it to nlm_failed which with my patch would not trigger
>>>>> nlm_lck_denied_grace_period error but resp->status would be set to
>>>>> nlm_failed.
>>>>>
>>>>> So I circle back to I hope that convinces you that we don't need a
>>>>> check for the reclaim lock.
>>>>>
>>>>> I believe nlm_drop_reply is nfsd_open's jukebox error, one of which is
>>>>> delegation recall. it can be a memory failure. But I'm sure when
>>>>> EWOULDBLOCK occurs.
>>>>>
>>>>>> At the very least we need to guard against the reclaim flag being set in
>>>>>> the above test.  But I would much rather a more clear distinction were
>>>>>> made between "drop because of a conflicting delegation" and "drop
>>>>>> because of a delay getting upcall response".
>>>>>> Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4
>>>>>> (and ideally nlm2) handles appropriately.
>>>>>
>>>>>
>>>>>> NeilBrown
>>>>>>
>>>>>>
>>>>>>> +                     return rpc_success;
>>>>>>> +             }
>>>>>>> +             return nlm_drop_reply;
>>>>>>> +     default:
>>>>>>> +             return rpc_success;
>>>>>>> +     }
>>>>>>>
>>>>>>>       /* Now try to lock the file */
>>>>>>>       resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
>>>>>>> --
>>>>>>> 2.47.1
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>
>>
>> --
>> Chuck Lever
> 


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
  2025-08-21 18:24               ` Chuck Lever
@ 2025-08-21 18:33                 ` Olga Kornievskaia
  2025-08-21 18:39                   ` Chuck Lever
  0 siblings, 1 reply; 21+ messages in thread
From: Olga Kornievskaia @ 2025-08-21 18:33 UTC (permalink / raw)
  To: Chuck Lever
  Cc: NeilBrown, Olga Kornievskaia, jlayton, linux-nfs, Dai.Ngo, tom

On Thu, Aug 21, 2025 at 2:24 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> On 8/21/25 2:20 PM, Olga Kornievskaia wrote:
> > On Thu, Aug 21, 2025 at 11:09 AM Chuck Lever <chuck.lever@oracle.com> wrote:
> >>
> >> On 8/21/25 9:56 AM, Olga Kornievskaia wrote:
> >>> On Wed, Aug 20, 2025 at 7:15 PM NeilBrown <neil@brown.name> wrote:
> >>>>
> >>>> On Thu, 14 Aug 2025, Olga Kornievskaia wrote:
> >>>>> On Tue, Aug 12, 2025 at 8:05 PM NeilBrown <neil@brown.name> wrote:
> >>>>>>
> >>>>>> On Wed, 13 Aug 2025, Olga Kornievskaia wrote:
> >>>>>>> When nfsd is in grace and receives an NLM LOCK request which turns
> >>>>>>> out to have a conflicting delegation, return that the server is in
> >>>>>>> grace.
> >>>>>>>
> >>>>>>> Reviewed-by: Jeff Layton <jlayton@redhat.com>
> >>>>>>> Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
> >>>>>>> ---
> >>>>>>>  fs/lockd/svc4proc.c | 15 +++++++++++++--
> >>>>>>>  1 file changed, 13 insertions(+), 2 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
> >>>>>>> index 109e5caae8c7..7ac4af5c9875 100644
> >>>>>>> --- a/fs/lockd/svc4proc.c
> >>>>>>> +++ b/fs/lockd/svc4proc.c
> >>>>>>> @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp)
> >>>>>>>       resp->cookie = argp->cookie;
> >>>>>>>
> >>>>>>>       /* Obtain client and file */
> >>>>>>> -     if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file)))
> >>>>>>> -             return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;
> >>>>>>> +     resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file);
> >>>>>>> +     switch (resp->status) {
> >>>>>>> +     case 0:
> >>>>>>> +             break;
> >>>>>>> +     case nlm_drop_reply:
> >>>>>>> +             if (locks_in_grace(SVC_NET(rqstp))) {
> >>>>>>> +                     resp->status = nlm_lck_denied_grace_period;
> >>>>>>
> >>>>>> I think this is wrong.  If the lock request has the "reclaim" flag set,
> >>>>>> then nlm_lck_denied_grace_period is not a meaningful error.
> >>>>>> nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay
> >>>>>> getting a response to an upcall to mountd.  For NLM the request really
> >>>>>> must be dropped.
> >>>>>
> >>>>> Thank you for pointing out this case so we are suggesting to.
> >>>>>
> >>>>> if (locks_in_grace(SVC_NET(rqstp)) && !argp->reclaim)
> >>>>>
> >>>>> However, I've been looking and looking but I cannot figure out how
> >>>>> nlm4svc_retrieve_args() would ever get nlm_drop_reply. You say it can
> >>>>> happen during the upcall to mountd. So that happens within nfsd_open()
> >>>>> call and a part of fh_verify().
> >>>>> To return nlm_drop_reply, nlm_fopen() must have gotten nfserr_dropit
> >>>>> from the nfsd_open().  I have searched and searched but I don't see
> >>>>> anything that ever sets nfserr_dropit (NFSERR_DROPIT).
> >>>>>
> >>>>> I searched the logs and nfserr_dropit was an error that was EAGAIN
> >>>>> translated into but then removed by the following patch.
> >>>>
> >>>> Oh.  I didn't know that.
> >>>> We now use RQ_DROPME instead.
> >>>> I guess we should remove NFSERR_DROPIT completely as it not used at all
> >>>> any more.
> >>>>
> >>>> Though returning nfserr_jukebox to an v2 client isn't a good idea....
> >>>
> >>> I'll take your word for you.
> >>>
> >>>> So I guess my main complaint isn't valid, but I still don't like this
> >>>> patch.  It seems an untidy place to put the locks_in_grace test.
> >>>> Other callers of nlm4svc_retrieve_args() check locks_in_grace() before
> >>>> making that call.  __nlm4svc_proc_lock probably should too.  Or is there
> >>>> a reason that it is delayed until the middle of nlmsvc_lock()..
> >>>
> >>> I thought the same about why not adding the in_grace check and decided
> >>> that it was probably because you dont want to deny a lock if there are
> >>> no conflicts. If we add it, somebody might notice and will complain
> >>> that they can't get their lock when there are no conflicts.
> >>>
> >>>> The patch is not needed and isn't clearly an improvement, so I would
> >>>> rather it were dropped.
> >>>
> >>> I'm not against dropping this patch if the conclusion is that dropping
> >>> the packet is no worse than returning in_grace error.
> >>
> >> I dropped both of these from nfsd-testing. If a fix is still needed,
> >> let's start again with fresh patches.
> >
> > Can you clarify when you said "both"?
> >
> > What objections are there for the 1st patch in the series. It solves a
> > problem and a fix is needed.
>
> There are two reasons to drop the first patch.
>
> 1. Neil's "remove nfserr_dropit" patch doesn't apply unless both patches
> are reverted.
>
> 2. As I said above, NFSv2 does not have a mechanism like NFS3ERR_JUKEBOX
> to request that the client wait a bit and resend.

ERR_JUKEBOX is not returned to another v2 or v3.

Patch1 (nfsd: nfserr_jukebox in nlm_fopen should lead to a retry)
translates err_jukebox into the nlm_drop_reply and returns to lockd.
As the result, no error is returned to the client but the reply is
dropped all together.


> So, if 1/2 has been tested with NFSv2 and does not cause NFSD to leak
> nfserr_jukebox to NFSv2 clients, then please rebase that one on the
> current nfsd-testing branch and post it again.
>
>
> > This one I agree is not needed but I
> > thought was an improvement.
> >
> >>
> >>
> >>>> Thanks,
> >>>> NeilBrown
> >>>>
> >>>>
> >>>>>
> >>>>> commit 062304a815fe10068c478a4a3f28cf091c55cb82
> >>>>> Author: J. Bruce Fields <bfields@fieldses.org>
> >>>>> Date:   Sun Jan 2 22:05:33 2011 -0500
> >>>>>
> >>>>>     nfsd: stop translating EAGAIN to nfserr_dropit
> >>>>>
> >>>>> diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
> >>>>> index dc9c2e3fd1b8..fd608a27a8d5 100644
> >>>>> --- a/fs/nfsd/nfsproc.c
> >>>>> +++ b/fs/nfsd/nfsproc.c
> >>>>> @@ -735,7 +735,8 @@ nfserrno (int errno)
> >>>>>                 { nfserr_stale, -ESTALE },
> >>>>>                 { nfserr_jukebox, -ETIMEDOUT },
> >>>>>                 { nfserr_jukebox, -ERESTARTSYS },
> >>>>> -               { nfserr_dropit, -EAGAIN },
> >>>>> +               { nfserr_jukebox, -EAGAIN },
> >>>>> +               { nfserr_jukebox, -EWOULDBLOCK },
> >>>>>                 { nfserr_jukebox, -ENOMEM },
> >>>>>                 { nfserr_badname, -ESRCH },
> >>>>>                 { nfserr_io, -ETXTBSY },
> >>>>>
> >>>>> so if fh_verify is failing whatever it is returning, it is not
> >>>>> nfserr_dropit nor is it nfserr_jukebox which means nlm_fopen() would
> >>>>> translate it to nlm_failed which with my patch would not trigger
> >>>>> nlm_lck_denied_grace_period error but resp->status would be set to
> >>>>> nlm_failed.
> >>>>>
> >>>>> So I circle back to I hope that convinces you that we don't need a
> >>>>> check for the reclaim lock.
> >>>>>
> >>>>> I believe nlm_drop_reply is nfsd_open's jukebox error, one of which is
> >>>>> delegation recall. it can be a memory failure. But I'm sure when
> >>>>> EWOULDBLOCK occurs.
> >>>>>
> >>>>>> At the very least we need to guard against the reclaim flag being set in
> >>>>>> the above test.  But I would much rather a more clear distinction were
> >>>>>> made between "drop because of a conflicting delegation" and "drop
> >>>>>> because of a delay getting upcall response".
> >>>>>> Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4
> >>>>>> (and ideally nlm2) handles appropriately.
> >>>>>
> >>>>>
> >>>>>> NeilBrown
> >>>>>>
> >>>>>>
> >>>>>>> +                     return rpc_success;
> >>>>>>> +             }
> >>>>>>> +             return nlm_drop_reply;
> >>>>>>> +     default:
> >>>>>>> +             return rpc_success;
> >>>>>>> +     }
> >>>>>>>
> >>>>>>>       /* Now try to lock the file */
> >>>>>>>       resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
> >>>>>>> --
> >>>>>>> 2.47.1
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>
> >>
> >> --
> >> Chuck Lever
> >
>
>
> --
> Chuck Lever

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
  2025-08-21 18:33                 ` Olga Kornievskaia
@ 2025-08-21 18:39                   ` Chuck Lever
  2025-08-21 18:51                     ` Olga Kornievskaia
  2025-08-21 19:15                     ` Olga Kornievskaia
  0 siblings, 2 replies; 21+ messages in thread
From: Chuck Lever @ 2025-08-21 18:39 UTC (permalink / raw)
  To: Olga Kornievskaia
  Cc: NeilBrown, Olga Kornievskaia, jlayton, linux-nfs, Dai.Ngo, tom

On 8/21/25 2:33 PM, Olga Kornievskaia wrote:
> On Thu, Aug 21, 2025 at 2:24 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>>
>> On 8/21/25 2:20 PM, Olga Kornievskaia wrote:
>>> On Thu, Aug 21, 2025 at 11:09 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>>>>
>>>> On 8/21/25 9:56 AM, Olga Kornievskaia wrote:
>>>>> On Wed, Aug 20, 2025 at 7:15 PM NeilBrown <neil@brown.name> wrote:
>>>>>>
>>>>>> On Thu, 14 Aug 2025, Olga Kornievskaia wrote:
>>>>>>> On Tue, Aug 12, 2025 at 8:05 PM NeilBrown <neil@brown.name> wrote:
>>>>>>>>
>>>>>>>> On Wed, 13 Aug 2025, Olga Kornievskaia wrote:
>>>>>>>>> When nfsd is in grace and receives an NLM LOCK request which turns
>>>>>>>>> out to have a conflicting delegation, return that the server is in
>>>>>>>>> grace.
>>>>>>>>>
>>>>>>>>> Reviewed-by: Jeff Layton <jlayton@redhat.com>
>>>>>>>>> Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
>>>>>>>>> ---
>>>>>>>>>  fs/lockd/svc4proc.c | 15 +++++++++++++--
>>>>>>>>>  1 file changed, 13 insertions(+), 2 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
>>>>>>>>> index 109e5caae8c7..7ac4af5c9875 100644
>>>>>>>>> --- a/fs/lockd/svc4proc.c
>>>>>>>>> +++ b/fs/lockd/svc4proc.c
>>>>>>>>> @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp)
>>>>>>>>>       resp->cookie = argp->cookie;
>>>>>>>>>
>>>>>>>>>       /* Obtain client and file */
>>>>>>>>> -     if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file)))
>>>>>>>>> -             return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;
>>>>>>>>> +     resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file);
>>>>>>>>> +     switch (resp->status) {
>>>>>>>>> +     case 0:
>>>>>>>>> +             break;
>>>>>>>>> +     case nlm_drop_reply:
>>>>>>>>> +             if (locks_in_grace(SVC_NET(rqstp))) {
>>>>>>>>> +                     resp->status = nlm_lck_denied_grace_period;
>>>>>>>>
>>>>>>>> I think this is wrong.  If the lock request has the "reclaim" flag set,
>>>>>>>> then nlm_lck_denied_grace_period is not a meaningful error.
>>>>>>>> nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay
>>>>>>>> getting a response to an upcall to mountd.  For NLM the request really
>>>>>>>> must be dropped.
>>>>>>>
>>>>>>> Thank you for pointing out this case so we are suggesting to.
>>>>>>>
>>>>>>> if (locks_in_grace(SVC_NET(rqstp)) && !argp->reclaim)
>>>>>>>
>>>>>>> However, I've been looking and looking but I cannot figure out how
>>>>>>> nlm4svc_retrieve_args() would ever get nlm_drop_reply. You say it can
>>>>>>> happen during the upcall to mountd. So that happens within nfsd_open()
>>>>>>> call and a part of fh_verify().
>>>>>>> To return nlm_drop_reply, nlm_fopen() must have gotten nfserr_dropit
>>>>>>> from the nfsd_open().  I have searched and searched but I don't see
>>>>>>> anything that ever sets nfserr_dropit (NFSERR_DROPIT).
>>>>>>>
>>>>>>> I searched the logs and nfserr_dropit was an error that was EAGAIN
>>>>>>> translated into but then removed by the following patch.
>>>>>>
>>>>>> Oh.  I didn't know that.
>>>>>> We now use RQ_DROPME instead.
>>>>>> I guess we should remove NFSERR_DROPIT completely as it not used at all
>>>>>> any more.
>>>>>>
>>>>>> Though returning nfserr_jukebox to an v2 client isn't a good idea....
>>>>>
>>>>> I'll take your word for you.
>>>>>
>>>>>> So I guess my main complaint isn't valid, but I still don't like this
>>>>>> patch.  It seems an untidy place to put the locks_in_grace test.
>>>>>> Other callers of nlm4svc_retrieve_args() check locks_in_grace() before
>>>>>> making that call.  __nlm4svc_proc_lock probably should too.  Or is there
>>>>>> a reason that it is delayed until the middle of nlmsvc_lock()..
>>>>>
>>>>> I thought the same about why not adding the in_grace check and decided
>>>>> that it was probably because you dont want to deny a lock if there are
>>>>> no conflicts. If we add it, somebody might notice and will complain
>>>>> that they can't get their lock when there are no conflicts.
>>>>>
>>>>>> The patch is not needed and isn't clearly an improvement, so I would
>>>>>> rather it were dropped.
>>>>>
>>>>> I'm not against dropping this patch if the conclusion is that dropping
>>>>> the packet is no worse than returning in_grace error.
>>>>
>>>> I dropped both of these from nfsd-testing. If a fix is still needed,
>>>> let's start again with fresh patches.
>>>
>>> Can you clarify when you said "both"?
>>>
>>> What objections are there for the 1st patch in the series. It solves a
>>> problem and a fix is needed.
>>
>> There are two reasons to drop the first patch.
>>
>> 1. Neil's "remove nfserr_dropit" patch doesn't apply unless both patches
>> are reverted.
>>
>> 2. As I said above, NFSv2 does not have a mechanism like NFS3ERR_JUKEBOX
>> to request that the client wait a bit and resend.
> 
> ERR_JUKEBOX is not returned to another v2 or v3.
> 
> Patch1 (nfsd: nfserr_jukebox in nlm_fopen should lead to a retry)
> translates err_jukebox into the nlm_drop_reply and returns to lockd.
> As the result, no error is returned to the client but the reply is
> dropped all together.

If you want NLM to drop the response, then set RQ_DROPME. Using
nfserr_jukebox here is confusing -- it means "return a response to the
client that requests a resend". You want NLM to /not send a response/,
and we have a specific mechanism for that.


>> So, if 1/2 has been tested with NFSv2 and does not cause NFSD to leak
>> nfserr_jukebox to NFSv2 clients, then please rebase that one on the
>> current nfsd-testing branch and post it again.
>>
>>
>>> This one I agree is not needed but I
>>> thought was an improvement.
>>>
>>>>
>>>>
>>>>>> Thanks,
>>>>>> NeilBrown
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> commit 062304a815fe10068c478a4a3f28cf091c55cb82
>>>>>>> Author: J. Bruce Fields <bfields@fieldses.org>
>>>>>>> Date:   Sun Jan 2 22:05:33 2011 -0500
>>>>>>>
>>>>>>>     nfsd: stop translating EAGAIN to nfserr_dropit
>>>>>>>
>>>>>>> diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
>>>>>>> index dc9c2e3fd1b8..fd608a27a8d5 100644
>>>>>>> --- a/fs/nfsd/nfsproc.c
>>>>>>> +++ b/fs/nfsd/nfsproc.c
>>>>>>> @@ -735,7 +735,8 @@ nfserrno (int errno)
>>>>>>>                 { nfserr_stale, -ESTALE },
>>>>>>>                 { nfserr_jukebox, -ETIMEDOUT },
>>>>>>>                 { nfserr_jukebox, -ERESTARTSYS },
>>>>>>> -               { nfserr_dropit, -EAGAIN },
>>>>>>> +               { nfserr_jukebox, -EAGAIN },
>>>>>>> +               { nfserr_jukebox, -EWOULDBLOCK },
>>>>>>>                 { nfserr_jukebox, -ENOMEM },
>>>>>>>                 { nfserr_badname, -ESRCH },
>>>>>>>                 { nfserr_io, -ETXTBSY },
>>>>>>>
>>>>>>> so if fh_verify is failing whatever it is returning, it is not
>>>>>>> nfserr_dropit nor is it nfserr_jukebox which means nlm_fopen() would
>>>>>>> translate it to nlm_failed which with my patch would not trigger
>>>>>>> nlm_lck_denied_grace_period error but resp->status would be set to
>>>>>>> nlm_failed.
>>>>>>>
>>>>>>> So I circle back to I hope that convinces you that we don't need a
>>>>>>> check for the reclaim lock.
>>>>>>>
>>>>>>> I believe nlm_drop_reply is nfsd_open's jukebox error, one of which is
>>>>>>> delegation recall. it can be a memory failure. But I'm sure when
>>>>>>> EWOULDBLOCK occurs.
>>>>>>>
>>>>>>>> At the very least we need to guard against the reclaim flag being set in
>>>>>>>> the above test.  But I would much rather a more clear distinction were
>>>>>>>> made between "drop because of a conflicting delegation" and "drop
>>>>>>>> because of a delay getting upcall response".
>>>>>>>> Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4
>>>>>>>> (and ideally nlm2) handles appropriately.
>>>>>>>
>>>>>>>
>>>>>>>> NeilBrown
>>>>>>>>
>>>>>>>>
>>>>>>>>> +                     return rpc_success;
>>>>>>>>> +             }
>>>>>>>>> +             return nlm_drop_reply;
>>>>>>>>> +     default:
>>>>>>>>> +             return rpc_success;
>>>>>>>>> +     }
>>>>>>>>>
>>>>>>>>>       /* Now try to lock the file */
>>>>>>>>>       resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
>>>>>>>>> --
>>>>>>>>> 2.47.1
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>>
>>>> --
>>>> Chuck Lever
>>>
>>
>>
>> --
>> Chuck Lever


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
  2025-08-21 18:39                   ` Chuck Lever
@ 2025-08-21 18:51                     ` Olga Kornievskaia
  2025-08-21 19:15                     ` Olga Kornievskaia
  1 sibling, 0 replies; 21+ messages in thread
From: Olga Kornievskaia @ 2025-08-21 18:51 UTC (permalink / raw)
  To: Chuck Lever
  Cc: NeilBrown, Olga Kornievskaia, jlayton, linux-nfs, Dai.Ngo, tom

On Thu, Aug 21, 2025 at 2:39 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> On 8/21/25 2:33 PM, Olga Kornievskaia wrote:
> > On Thu, Aug 21, 2025 at 2:24 PM Chuck Lever <chuck.lever@oracle.com> wrote:
> >>
> >> On 8/21/25 2:20 PM, Olga Kornievskaia wrote:
> >>> On Thu, Aug 21, 2025 at 11:09 AM Chuck Lever <chuck.lever@oracle.com> wrote:
> >>>>
> >>>> On 8/21/25 9:56 AM, Olga Kornievskaia wrote:
> >>>>> On Wed, Aug 20, 2025 at 7:15 PM NeilBrown <neil@brown.name> wrote:
> >>>>>>
> >>>>>> On Thu, 14 Aug 2025, Olga Kornievskaia wrote:
> >>>>>>> On Tue, Aug 12, 2025 at 8:05 PM NeilBrown <neil@brown.name> wrote:
> >>>>>>>>
> >>>>>>>> On Wed, 13 Aug 2025, Olga Kornievskaia wrote:
> >>>>>>>>> When nfsd is in grace and receives an NLM LOCK request which turns
> >>>>>>>>> out to have a conflicting delegation, return that the server is in
> >>>>>>>>> grace.
> >>>>>>>>>
> >>>>>>>>> Reviewed-by: Jeff Layton <jlayton@redhat.com>
> >>>>>>>>> Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
> >>>>>>>>> ---
> >>>>>>>>>  fs/lockd/svc4proc.c | 15 +++++++++++++--
> >>>>>>>>>  1 file changed, 13 insertions(+), 2 deletions(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
> >>>>>>>>> index 109e5caae8c7..7ac4af5c9875 100644
> >>>>>>>>> --- a/fs/lockd/svc4proc.c
> >>>>>>>>> +++ b/fs/lockd/svc4proc.c
> >>>>>>>>> @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp)
> >>>>>>>>>       resp->cookie = argp->cookie;
> >>>>>>>>>
> >>>>>>>>>       /* Obtain client and file */
> >>>>>>>>> -     if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file)))
> >>>>>>>>> -             return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;
> >>>>>>>>> +     resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file);
> >>>>>>>>> +     switch (resp->status) {
> >>>>>>>>> +     case 0:
> >>>>>>>>> +             break;
> >>>>>>>>> +     case nlm_drop_reply:
> >>>>>>>>> +             if (locks_in_grace(SVC_NET(rqstp))) {
> >>>>>>>>> +                     resp->status = nlm_lck_denied_grace_period;
> >>>>>>>>
> >>>>>>>> I think this is wrong.  If the lock request has the "reclaim" flag set,
> >>>>>>>> then nlm_lck_denied_grace_period is not a meaningful error.
> >>>>>>>> nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay
> >>>>>>>> getting a response to an upcall to mountd.  For NLM the request really
> >>>>>>>> must be dropped.
> >>>>>>>
> >>>>>>> Thank you for pointing out this case so we are suggesting to.
> >>>>>>>
> >>>>>>> if (locks_in_grace(SVC_NET(rqstp)) && !argp->reclaim)
> >>>>>>>
> >>>>>>> However, I've been looking and looking but I cannot figure out how
> >>>>>>> nlm4svc_retrieve_args() would ever get nlm_drop_reply. You say it can
> >>>>>>> happen during the upcall to mountd. So that happens within nfsd_open()
> >>>>>>> call and a part of fh_verify().
> >>>>>>> To return nlm_drop_reply, nlm_fopen() must have gotten nfserr_dropit
> >>>>>>> from the nfsd_open().  I have searched and searched but I don't see
> >>>>>>> anything that ever sets nfserr_dropit (NFSERR_DROPIT).
> >>>>>>>
> >>>>>>> I searched the logs and nfserr_dropit was an error that was EAGAIN
> >>>>>>> translated into but then removed by the following patch.
> >>>>>>
> >>>>>> Oh.  I didn't know that.
> >>>>>> We now use RQ_DROPME instead.
> >>>>>> I guess we should remove NFSERR_DROPIT completely as it not used at all
> >>>>>> any more.
> >>>>>>
> >>>>>> Though returning nfserr_jukebox to an v2 client isn't a good idea....
> >>>>>
> >>>>> I'll take your word for you.
> >>>>>
> >>>>>> So I guess my main complaint isn't valid, but I still don't like this
> >>>>>> patch.  It seems an untidy place to put the locks_in_grace test.
> >>>>>> Other callers of nlm4svc_retrieve_args() check locks_in_grace() before
> >>>>>> making that call.  __nlm4svc_proc_lock probably should too.  Or is there
> >>>>>> a reason that it is delayed until the middle of nlmsvc_lock()..
> >>>>>
> >>>>> I thought the same about why not adding the in_grace check and decided
> >>>>> that it was probably because you dont want to deny a lock if there are
> >>>>> no conflicts. If we add it, somebody might notice and will complain
> >>>>> that they can't get their lock when there are no conflicts.
> >>>>>
> >>>>>> The patch is not needed and isn't clearly an improvement, so I would
> >>>>>> rather it were dropped.
> >>>>>
> >>>>> I'm not against dropping this patch if the conclusion is that dropping
> >>>>> the packet is no worse than returning in_grace error.
> >>>>
> >>>> I dropped both of these from nfsd-testing. If a fix is still needed,
> >>>> let's start again with fresh patches.
> >>>
> >>> Can you clarify when you said "both"?
> >>>
> >>> What objections are there for the 1st patch in the series. It solves a
> >>> problem and a fix is needed.
> >>
> >> There are two reasons to drop the first patch.
> >>
> >> 1. Neil's "remove nfserr_dropit" patch doesn't apply unless both patches
> >> are reverted.
> >>
> >> 2. As I said above, NFSv2 does not have a mechanism like NFS3ERR_JUKEBOX
> >> to request that the client wait a bit and resend.
> >
> > ERR_JUKEBOX is not returned to another v2 or v3.
> >
> > Patch1 (nfsd: nfserr_jukebox in nlm_fopen should lead to a retry)
> > translates err_jukebox into the nlm_drop_reply and returns to lockd.
> > As the result, no error is returned to the client but the reply is
> > dropped all together.
>
> If you want NLM to drop the response, then set RQ_DROPME. Using
> nfserr_jukebox here is confusing -- it means "return a response to the
> client that requests a resend". You want NLM to /not send a response/,
> and we have a specific mechanism for that.

To drop a response lockd expect that it gets nlm_drop_response.

> >> So, if 1/2 has been tested with NFSv2 and does not cause NFSD to leak
> >> nfserr_jukebox to NFSv2 clients, then please rebase that one on the
> >> current nfsd-testing branch and post it again.
> >>
> >>
> >>> This one I agree is not needed but I
> >>> thought was an improvement.
> >>>
> >>>>
> >>>>
> >>>>>> Thanks,
> >>>>>> NeilBrown
> >>>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> commit 062304a815fe10068c478a4a3f28cf091c55cb82
> >>>>>>> Author: J. Bruce Fields <bfields@fieldses.org>
> >>>>>>> Date:   Sun Jan 2 22:05:33 2011 -0500
> >>>>>>>
> >>>>>>>     nfsd: stop translating EAGAIN to nfserr_dropit
> >>>>>>>
> >>>>>>> diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
> >>>>>>> index dc9c2e3fd1b8..fd608a27a8d5 100644
> >>>>>>> --- a/fs/nfsd/nfsproc.c
> >>>>>>> +++ b/fs/nfsd/nfsproc.c
> >>>>>>> @@ -735,7 +735,8 @@ nfserrno (int errno)
> >>>>>>>                 { nfserr_stale, -ESTALE },
> >>>>>>>                 { nfserr_jukebox, -ETIMEDOUT },
> >>>>>>>                 { nfserr_jukebox, -ERESTARTSYS },
> >>>>>>> -               { nfserr_dropit, -EAGAIN },
> >>>>>>> +               { nfserr_jukebox, -EAGAIN },
> >>>>>>> +               { nfserr_jukebox, -EWOULDBLOCK },
> >>>>>>>                 { nfserr_jukebox, -ENOMEM },
> >>>>>>>                 { nfserr_badname, -ESRCH },
> >>>>>>>                 { nfserr_io, -ETXTBSY },
> >>>>>>>
> >>>>>>> so if fh_verify is failing whatever it is returning, it is not
> >>>>>>> nfserr_dropit nor is it nfserr_jukebox which means nlm_fopen() would
> >>>>>>> translate it to nlm_failed which with my patch would not trigger
> >>>>>>> nlm_lck_denied_grace_period error but resp->status would be set to
> >>>>>>> nlm_failed.
> >>>>>>>
> >>>>>>> So I circle back to I hope that convinces you that we don't need a
> >>>>>>> check for the reclaim lock.
> >>>>>>>
> >>>>>>> I believe nlm_drop_reply is nfsd_open's jukebox error, one of which is
> >>>>>>> delegation recall. it can be a memory failure. But I'm sure when
> >>>>>>> EWOULDBLOCK occurs.
> >>>>>>>
> >>>>>>>> At the very least we need to guard against the reclaim flag being set in
> >>>>>>>> the above test.  But I would much rather a more clear distinction were
> >>>>>>>> made between "drop because of a conflicting delegation" and "drop
> >>>>>>>> because of a delay getting upcall response".
> >>>>>>>> Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4
> >>>>>>>> (and ideally nlm2) handles appropriately.
> >>>>>>>
> >>>>>>>
> >>>>>>>> NeilBrown
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> +                     return rpc_success;
> >>>>>>>>> +             }
> >>>>>>>>> +             return nlm_drop_reply;
> >>>>>>>>> +     default:
> >>>>>>>>> +             return rpc_success;
> >>>>>>>>> +     }
> >>>>>>>>>
> >>>>>>>>>       /* Now try to lock the file */
> >>>>>>>>>       resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
> >>>>>>>>> --
> >>>>>>>>> 2.47.1
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>>
> >>>> --
> >>>> Chuck Lever
> >>>
> >>
> >>
> >> --
> >> Chuck Lever
>
>
> --
> Chuck Lever

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
  2025-08-21 18:39                   ` Chuck Lever
  2025-08-21 18:51                     ` Olga Kornievskaia
@ 2025-08-21 19:15                     ` Olga Kornievskaia
  1 sibling, 0 replies; 21+ messages in thread
From: Olga Kornievskaia @ 2025-08-21 19:15 UTC (permalink / raw)
  To: Chuck Lever
  Cc: NeilBrown, Olga Kornievskaia, jlayton, linux-nfs, Dai.Ngo, tom

On Thu, Aug 21, 2025 at 2:39 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> On 8/21/25 2:33 PM, Olga Kornievskaia wrote:
> > On Thu, Aug 21, 2025 at 2:24 PM Chuck Lever <chuck.lever@oracle.com> wrote:
> >>
> >> On 8/21/25 2:20 PM, Olga Kornievskaia wrote:
> >>> On Thu, Aug 21, 2025 at 11:09 AM Chuck Lever <chuck.lever@oracle.com> wrote:
> >>>>
> >>>> On 8/21/25 9:56 AM, Olga Kornievskaia wrote:
> >>>>> On Wed, Aug 20, 2025 at 7:15 PM NeilBrown <neil@brown.name> wrote:
> >>>>>>
> >>>>>> On Thu, 14 Aug 2025, Olga Kornievskaia wrote:
> >>>>>>> On Tue, Aug 12, 2025 at 8:05 PM NeilBrown <neil@brown.name> wrote:
> >>>>>>>>
> >>>>>>>> On Wed, 13 Aug 2025, Olga Kornievskaia wrote:
> >>>>>>>>> When nfsd is in grace and receives an NLM LOCK request which turns
> >>>>>>>>> out to have a conflicting delegation, return that the server is in
> >>>>>>>>> grace.
> >>>>>>>>>
> >>>>>>>>> Reviewed-by: Jeff Layton <jlayton@redhat.com>
> >>>>>>>>> Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
> >>>>>>>>> ---
> >>>>>>>>>  fs/lockd/svc4proc.c | 15 +++++++++++++--
> >>>>>>>>>  1 file changed, 13 insertions(+), 2 deletions(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
> >>>>>>>>> index 109e5caae8c7..7ac4af5c9875 100644
> >>>>>>>>> --- a/fs/lockd/svc4proc.c
> >>>>>>>>> +++ b/fs/lockd/svc4proc.c
> >>>>>>>>> @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp)
> >>>>>>>>>       resp->cookie = argp->cookie;
> >>>>>>>>>
> >>>>>>>>>       /* Obtain client and file */
> >>>>>>>>> -     if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file)))
> >>>>>>>>> -             return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;
> >>>>>>>>> +     resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file);
> >>>>>>>>> +     switch (resp->status) {
> >>>>>>>>> +     case 0:
> >>>>>>>>> +             break;
> >>>>>>>>> +     case nlm_drop_reply:
> >>>>>>>>> +             if (locks_in_grace(SVC_NET(rqstp))) {
> >>>>>>>>> +                     resp->status = nlm_lck_denied_grace_period;
> >>>>>>>>
> >>>>>>>> I think this is wrong.  If the lock request has the "reclaim" flag set,
> >>>>>>>> then nlm_lck_denied_grace_period is not a meaningful error.
> >>>>>>>> nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay
> >>>>>>>> getting a response to an upcall to mountd.  For NLM the request really
> >>>>>>>> must be dropped.
> >>>>>>>
> >>>>>>> Thank you for pointing out this case so we are suggesting to.
> >>>>>>>
> >>>>>>> if (locks_in_grace(SVC_NET(rqstp)) && !argp->reclaim)
> >>>>>>>
> >>>>>>> However, I've been looking and looking but I cannot figure out how
> >>>>>>> nlm4svc_retrieve_args() would ever get nlm_drop_reply. You say it can
> >>>>>>> happen during the upcall to mountd. So that happens within nfsd_open()
> >>>>>>> call and a part of fh_verify().
> >>>>>>> To return nlm_drop_reply, nlm_fopen() must have gotten nfserr_dropit
> >>>>>>> from the nfsd_open().  I have searched and searched but I don't see
> >>>>>>> anything that ever sets nfserr_dropit (NFSERR_DROPIT).
> >>>>>>>
> >>>>>>> I searched the logs and nfserr_dropit was an error that was EAGAIN
> >>>>>>> translated into but then removed by the following patch.
> >>>>>>
> >>>>>> Oh.  I didn't know that.
> >>>>>> We now use RQ_DROPME instead.
> >>>>>> I guess we should remove NFSERR_DROPIT completely as it not used at all
> >>>>>> any more.
> >>>>>>
> >>>>>> Though returning nfserr_jukebox to an v2 client isn't a good idea....
> >>>>>
> >>>>> I'll take your word for you.
> >>>>>
> >>>>>> So I guess my main complaint isn't valid, but I still don't like this
> >>>>>> patch.  It seems an untidy place to put the locks_in_grace test.
> >>>>>> Other callers of nlm4svc_retrieve_args() check locks_in_grace() before
> >>>>>> making that call.  __nlm4svc_proc_lock probably should too.  Or is there
> >>>>>> a reason that it is delayed until the middle of nlmsvc_lock()..
> >>>>>
> >>>>> I thought the same about why not adding the in_grace check and decided
> >>>>> that it was probably because you dont want to deny a lock if there are
> >>>>> no conflicts. If we add it, somebody might notice and will complain
> >>>>> that they can't get their lock when there are no conflicts.
> >>>>>
> >>>>>> The patch is not needed and isn't clearly an improvement, so I would
> >>>>>> rather it were dropped.
> >>>>>
> >>>>> I'm not against dropping this patch if the conclusion is that dropping
> >>>>> the packet is no worse than returning in_grace error.
> >>>>
> >>>> I dropped both of these from nfsd-testing. If a fix is still needed,
> >>>> let's start again with fresh patches.
> >>>
> >>> Can you clarify when you said "both"?
> >>>
> >>> What objections are there for the 1st patch in the series. It solves a
> >>> problem and a fix is needed.
> >>
> >> There are two reasons to drop the first patch.
> >>
> >> 1. Neil's "remove nfserr_dropit" patch doesn't apply unless both patches
> >> are reverted.
> >>
> >> 2. As I said above, NFSv2 does not have a mechanism like NFS3ERR_JUKEBOX
> >> to request that the client wait a bit and resend.
> >
> > ERR_JUKEBOX is not returned to another v2 or v3.
> >
> > Patch1 (nfsd: nfserr_jukebox in nlm_fopen should lead to a retry)
> > translates err_jukebox into the nlm_drop_reply and returns to lockd.
> > As the result, no error is returned to the client but the reply is
> > dropped all together.
>
> If you want NLM to drop the response, then set RQ_DROPME. Using
> nfserr_jukebox here is confusing -- it means "return a response to the
> client that requests a resend". You want NLM to /not send a response/,
> and we have a specific mechanism for that.

I don't understand why the suggestion holds value.

nlm_open() is going to call nfsd_open() which will return
nfserr_jukebox (for among other conditions) when there is a
conflicting delegation. Currently, nlm_fopen() would return nlm_failed
(it does not recognize nfserr_jukebox error code). nlm_fopen() has to
identify that nfserr_jukebox error as special to mean that the request
needs to be dropped. There is already a mechanism for it and that is
to return nlm_drop_reply.

What purpose would be to set RQ_DROPME for the nfserr_jukebox in
nlm_fopen() (note that nfserr_jukebox needs to be identified) . What
error should one be returning instead of "nlm_drop_reply"?  Why is
returning some other error code + setting RQ_DROPME be more "clear"?

>
>
> >> So, if 1/2 has been tested with NFSv2 and does not cause NFSD to leak
> >> nfserr_jukebox to NFSv2 clients, then please rebase that one on the
> >> current nfsd-testing branch and post it again.
> >>
> >>
> >>> This one I agree is not needed but I
> >>> thought was an improvement.
> >>>
> >>>>
> >>>>
> >>>>>> Thanks,
> >>>>>> NeilBrown
> >>>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> commit 062304a815fe10068c478a4a3f28cf091c55cb82
> >>>>>>> Author: J. Bruce Fields <bfields@fieldses.org>
> >>>>>>> Date:   Sun Jan 2 22:05:33 2011 -0500
> >>>>>>>
> >>>>>>>     nfsd: stop translating EAGAIN to nfserr_dropit
> >>>>>>>
> >>>>>>> diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
> >>>>>>> index dc9c2e3fd1b8..fd608a27a8d5 100644
> >>>>>>> --- a/fs/nfsd/nfsproc.c
> >>>>>>> +++ b/fs/nfsd/nfsproc.c
> >>>>>>> @@ -735,7 +735,8 @@ nfserrno (int errno)
> >>>>>>>                 { nfserr_stale, -ESTALE },
> >>>>>>>                 { nfserr_jukebox, -ETIMEDOUT },
> >>>>>>>                 { nfserr_jukebox, -ERESTARTSYS },
> >>>>>>> -               { nfserr_dropit, -EAGAIN },
> >>>>>>> +               { nfserr_jukebox, -EAGAIN },
> >>>>>>> +               { nfserr_jukebox, -EWOULDBLOCK },
> >>>>>>>                 { nfserr_jukebox, -ENOMEM },
> >>>>>>>                 { nfserr_badname, -ESRCH },
> >>>>>>>                 { nfserr_io, -ETXTBSY },
> >>>>>>>
> >>>>>>> so if fh_verify is failing whatever it is returning, it is not
> >>>>>>> nfserr_dropit nor is it nfserr_jukebox which means nlm_fopen() would
> >>>>>>> translate it to nlm_failed which with my patch would not trigger
> >>>>>>> nlm_lck_denied_grace_period error but resp->status would be set to
> >>>>>>> nlm_failed.
> >>>>>>>
> >>>>>>> So I circle back to I hope that convinces you that we don't need a
> >>>>>>> check for the reclaim lock.
> >>>>>>>
> >>>>>>> I believe nlm_drop_reply is nfsd_open's jukebox error, one of which is
> >>>>>>> delegation recall. it can be a memory failure. But I'm sure when
> >>>>>>> EWOULDBLOCK occurs.
> >>>>>>>
> >>>>>>>> At the very least we need to guard against the reclaim flag being set in
> >>>>>>>> the above test.  But I would much rather a more clear distinction were
> >>>>>>>> made between "drop because of a conflicting delegation" and "drop
> >>>>>>>> because of a delay getting upcall response".
> >>>>>>>> Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4
> >>>>>>>> (and ideally nlm2) handles appropriately.
> >>>>>>>
> >>>>>>>
> >>>>>>>> NeilBrown
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> +                     return rpc_success;
> >>>>>>>>> +             }
> >>>>>>>>> +             return nlm_drop_reply;
> >>>>>>>>> +     default:
> >>>>>>>>> +             return rpc_success;
> >>>>>>>>> +     }
> >>>>>>>>>
> >>>>>>>>>       /* Now try to lock the file */
> >>>>>>>>>       resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
> >>>>>>>>> --
> >>>>>>>>> 2.47.1
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>>
> >>>> --
> >>>> Chuck Lever
> >>>
> >>
> >>
> >> --
> >> Chuck Lever
>
>
> --
> Chuck Lever

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/2] nfsd: nfserr_jukebox in nlm_fopen should lead to a retry
  2025-08-12 16:03 [PATCH v2 1/2] nfsd: nfserr_jukebox in nlm_fopen should lead to a retry Olga Kornievskaia
  2025-08-12 16:03 ` [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period Olga Kornievskaia
  2025-08-12 17:23 ` [PATCH v2 1/2] nfsd: nfserr_jukebox in nlm_fopen should lead to a retry Chuck Lever
@ 2025-08-21 19:18 ` Chuck Lever
  2025-08-21 19:28   ` Olga Kornievskaia
  2 siblings, 1 reply; 21+ messages in thread
From: Chuck Lever @ 2025-08-21 19:18 UTC (permalink / raw)
  To: Olga Kornievskaia, neil; +Cc: linux-nfs, Dai.Ngo, tom, jlayton

OK, let's reset this discussion.


On 8/12/25 12:03 PM, Olga Kornievskaia wrote:
> When v3 NLM request finds a conflicting delegation, it triggers
> a delegation recall and nfsd_open fails with EAGAIN. nfsd_open
> then translates EAGAIN into nfserr_jukebox. In nlm_fopen, instead
> of returning nlm_failed for when there is a conflicting delegation,
> drop this NLM request so that the client retries. Once delegation
> is recalled and if a local lock is claimed, a retry would lead to
> nfsd returning a nlm_lck_blocked error or a successful nlm lock.

If this patch "... solves a problem and a fix is needed" then we need a
Fixes: tag so the patch is prioritized and considered for LTS.


> Reviewed-by: Jeff Layton <jlayton@kernel.org>
> Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
> ---
>  fs/nfsd/lockd.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/fs/nfsd/lockd.c b/fs/nfsd/lockd.c
> index edc9f75dc75c..8fdc769d392e 100644
> --- a/fs/nfsd/lockd.c
> +++ b/fs/nfsd/lockd.c
> @@ -57,6 +57,20 @@ nlm_fopen(struct svc_rqst *rqstp, struct nfs_fh *f, struct file **filp,
>  	switch (nfserr) {
>  	case nfs_ok:
>  		return 0;
> +	case nfserr_jukebox:
> +		/* this error can indicate a presence of a conflicting
> +		 * delegation to an NLM lock request. Options are:
> +		 * (1) For now, drop this request and make the client
> +		 * retry. When delegation is returned, client's retry will
> +		 * complete.
> +		 * (2) NLM4_DENIED as per "spec" signals to the client
> +		 * that the lock is unavaiable now but client can retry.
> +		 * Linux client implementation does not. It treats
> +		 * NLM4_DENIED same as NLM4_FAILED and errors the request.
> +		 * (3) For the future, treat this as blocked lock and try
> +		 * to callback when the delegation is returned but might
> +		 * not have a proper lock request to block on.
> +		 */

s/unavaiable/unavailable

Since 2020, kernel coding style uses the "fallthrough;" keyword for
switch cases with no "break;".

Although, instead of "fallthrough;", you could simply remove the
nfserr_dropit case in this patch. That would remove the conflict with
Neil's patch (if it were to be postponed until after this one).


>  	case nfserr_dropit:
>  		return nlm_drop_reply;
>  	case nfserr_stale:



-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/2] nfsd: nfserr_jukebox in nlm_fopen should lead to a retry
  2025-08-21 19:18 ` Chuck Lever
@ 2025-08-21 19:28   ` Olga Kornievskaia
  2025-08-21 19:44     ` Olga Kornievskaia
  2025-08-21 19:44     ` Chuck Lever
  0 siblings, 2 replies; 21+ messages in thread
From: Olga Kornievskaia @ 2025-08-21 19:28 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Olga Kornievskaia, neil, linux-nfs, Dai.Ngo, tom, jlayton

On Thu, Aug 21, 2025 at 3:18 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> OK, let's reset this discussion.
>
>
> On 8/12/25 12:03 PM, Olga Kornievskaia wrote:
> > When v3 NLM request finds a conflicting delegation, it triggers
> > a delegation recall and nfsd_open fails with EAGAIN. nfsd_open
> > then translates EAGAIN into nfserr_jukebox. In nlm_fopen, instead
> > of returning nlm_failed for when there is a conflicting delegation,
> > drop this NLM request so that the client retries. Once delegation
> > is recalled and if a local lock is claimed, a retry would lead to
> > nfsd returning a nlm_lck_blocked error or a successful nlm lock.
>
> If this patch "... solves a problem and a fix is needed" then we need a
> Fixes: tag so the patch is prioritized and considered for LTS.

What fixes tag is appropriate here? This is day 0 behaviour but it's
only a problem since additions of write delegations I believe.

> > Reviewed-by: Jeff Layton <jlayton@kernel.org>
> > Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
> > ---
> >  fs/nfsd/lockd.c | 14 ++++++++++++++
> >  1 file changed, 14 insertions(+)
> >
> > diff --git a/fs/nfsd/lockd.c b/fs/nfsd/lockd.c
> > index edc9f75dc75c..8fdc769d392e 100644
> > --- a/fs/nfsd/lockd.c
> > +++ b/fs/nfsd/lockd.c
> > @@ -57,6 +57,20 @@ nlm_fopen(struct svc_rqst *rqstp, struct nfs_fh *f, struct file **filp,
> >       switch (nfserr) {
> >       case nfs_ok:
> >               return 0;
> > +     case nfserr_jukebox:
> > +             /* this error can indicate a presence of a conflicting
> > +              * delegation to an NLM lock request. Options are:
> > +              * (1) For now, drop this request and make the client
> > +              * retry. When delegation is returned, client's retry will
> > +              * complete.
> > +              * (2) NLM4_DENIED as per "spec" signals to the client
> > +              * that the lock is unavaiable now but client can retry.
> > +              * Linux client implementation does not. It treats
> > +              * NLM4_DENIED same as NLM4_FAILED and errors the request.
> > +              * (3) For the future, treat this as blocked lock and try
> > +              * to callback when the delegation is returned but might
> > +              * not have a proper lock request to block on.
> > +              */
>
> s/unavaiable/unavailable
>
> Since 2020, kernel coding style uses the "fallthrough;" keyword for
> switch cases with no "break;".
>
> Although, instead of "fallthrough;",

I'll add that.

> you could simply remove the
> nfserr_dropit case in this patch. That would remove the conflict with
> Neil's patch (if it were to be postponed until after this one).

I can re-send the patch with the fallthrough added on top of Neil's patch.

>
>
> >       case nfserr_dropit:
> >               return nlm_drop_reply;
> >       case nfserr_stale:
>
>
>
> --
> Chuck Lever
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/2] nfsd: nfserr_jukebox in nlm_fopen should lead to a retry
  2025-08-21 19:28   ` Olga Kornievskaia
@ 2025-08-21 19:44     ` Olga Kornievskaia
  2025-08-21 19:44     ` Chuck Lever
  1 sibling, 0 replies; 21+ messages in thread
From: Olga Kornievskaia @ 2025-08-21 19:44 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Olga Kornievskaia, neil, linux-nfs, Dai.Ngo, tom, jlayton

On Thu, Aug 21, 2025 at 3:28 PM Olga Kornievskaia <aglo@umich.edu> wrote:
>
> On Thu, Aug 21, 2025 at 3:18 PM Chuck Lever <chuck.lever@oracle.com> wrote:
> >
> > OK, let's reset this discussion.
> >
> >
> > On 8/12/25 12:03 PM, Olga Kornievskaia wrote:
> > > When v3 NLM request finds a conflicting delegation, it triggers
> > > a delegation recall and nfsd_open fails with EAGAIN. nfsd_open
> > > then translates EAGAIN into nfserr_jukebox. In nlm_fopen, instead
> > > of returning nlm_failed for when there is a conflicting delegation,
> > > drop this NLM request so that the client retries. Once delegation
> > > is recalled and if a local lock is claimed, a retry would lead to
> > > nfsd returning a nlm_lck_blocked error or a successful nlm lock.
> >
> > If this patch "... solves a problem and a fix is needed" then we need a
> > Fixes: tag so the patch is prioritized and considered for LTS.
>
> What fixes tag is appropriate here? This is day 0 behaviour but it's
> only a problem since additions of write delegations I believe.
>
> > > Reviewed-by: Jeff Layton <jlayton@kernel.org>
> > > Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
> > > ---
> > >  fs/nfsd/lockd.c | 14 ++++++++++++++
> > >  1 file changed, 14 insertions(+)
> > >
> > > diff --git a/fs/nfsd/lockd.c b/fs/nfsd/lockd.c
> > > index edc9f75dc75c..8fdc769d392e 100644
> > > --- a/fs/nfsd/lockd.c
> > > +++ b/fs/nfsd/lockd.c
> > > @@ -57,6 +57,20 @@ nlm_fopen(struct svc_rqst *rqstp, struct nfs_fh *f, struct file **filp,
> > >       switch (nfserr) {
> > >       case nfs_ok:
> > >               return 0;
> > > +     case nfserr_jukebox:
> > > +             /* this error can indicate a presence of a conflicting
> > > +              * delegation to an NLM lock request. Options are:
> > > +              * (1) For now, drop this request and make the client
> > > +              * retry. When delegation is returned, client's retry will
> > > +              * complete.
> > > +              * (2) NLM4_DENIED as per "spec" signals to the client
> > > +              * that the lock is unavaiable now but client can retry.
> > > +              * Linux client implementation does not. It treats
> > > +              * NLM4_DENIED same as NLM4_FAILED and errors the request.
> > > +              * (3) For the future, treat this as blocked lock and try
> > > +              * to callback when the delegation is returned but might
> > > +              * not have a proper lock request to block on.
> > > +              */
> >
> > s/unavaiable/unavailable
> >
> > Since 2020, kernel coding style uses the "fallthrough;" keyword for
> > switch cases with no "break;".
> >
> > Although, instead of "fallthrough;",
>
> I'll add that.
>
> > you could simply remove the
> > nfserr_dropit case in this patch. That would remove the conflict with
> > Neil's patch (if it were to be postponed until after this one).
>
> I can re-send the patch with the fallthrough added on top of Neil's patch.

Scratch the fallthrough. On top of Neil's patch no fallthrough it
needed.  Content is the original patch, just rebased.

>
> >
> >
> > >       case nfserr_dropit:
> > >               return nlm_drop_reply;
> > >       case nfserr_stale:
> >
> >
> >
> > --
> > Chuck Lever
> >

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/2] nfsd: nfserr_jukebox in nlm_fopen should lead to a retry
  2025-08-21 19:28   ` Olga Kornievskaia
  2025-08-21 19:44     ` Olga Kornievskaia
@ 2025-08-21 19:44     ` Chuck Lever
  2025-08-21 19:57       ` Olga Kornievskaia
  1 sibling, 1 reply; 21+ messages in thread
From: Chuck Lever @ 2025-08-21 19:44 UTC (permalink / raw)
  To: Olga Kornievskaia
  Cc: Olga Kornievskaia, neil, linux-nfs, Dai.Ngo, tom, jlayton

On 8/21/25 3:28 PM, Olga Kornievskaia wrote:
> On Thu, Aug 21, 2025 at 3:18 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>>
>> OK, let's reset this discussion.
>>
>>
>> On 8/12/25 12:03 PM, Olga Kornievskaia wrote:
>>> When v3 NLM request finds a conflicting delegation, it triggers
>>> a delegation recall and nfsd_open fails with EAGAIN. nfsd_open
>>> then translates EAGAIN into nfserr_jukebox. In nlm_fopen, instead
>>> of returning nlm_failed for when there is a conflicting delegation,
>>> drop this NLM request so that the client retries. Once delegation
>>> is recalled and if a local lock is claimed, a retry would lead to
>>> nfsd returning a nlm_lck_blocked error or a successful nlm lock.
>>
>> If this patch "... solves a problem and a fix is needed" then we need a
>> Fixes: tag so the patch is prioritized and considered for LTS.
> 
> What fixes tag is appropriate here? This is day 0 behaviour but it's
> only a problem since additions of write delegations I believe.

How about:

Fixes: d343fce148a4 ("[PATCH] knfsd: Allow lockd to drop replies as
appropriate")
Suggested-Cc: stable@vger.kernel.org # v6.6

(I'll remove the "Suggested-" when applying the patch)


>>> Reviewed-by: Jeff Layton <jlayton@kernel.org>
>>> Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
>>> ---
>>>  fs/nfsd/lockd.c | 14 ++++++++++++++
>>>  1 file changed, 14 insertions(+)
>>>
>>> diff --git a/fs/nfsd/lockd.c b/fs/nfsd/lockd.c
>>> index edc9f75dc75c..8fdc769d392e 100644
>>> --- a/fs/nfsd/lockd.c
>>> +++ b/fs/nfsd/lockd.c
>>> @@ -57,6 +57,20 @@ nlm_fopen(struct svc_rqst *rqstp, struct nfs_fh *f, struct file **filp,
>>>       switch (nfserr) {
>>>       case nfs_ok:
>>>               return 0;
>>> +     case nfserr_jukebox:
>>> +             /* this error can indicate a presence of a conflicting
>>> +              * delegation to an NLM lock request. Options are:
>>> +              * (1) For now, drop this request and make the client
>>> +              * retry. When delegation is returned, client's retry will
>>> +              * complete.
>>> +              * (2) NLM4_DENIED as per "spec" signals to the client
>>> +              * that the lock is unavaiable now but client can retry.
>>> +              * Linux client implementation does not. It treats
>>> +              * NLM4_DENIED same as NLM4_FAILED and errors the request.
>>> +              * (3) For the future, treat this as blocked lock and try
>>> +              * to callback when the delegation is returned but might
>>> +              * not have a proper lock request to block on.
>>> +              */
>>
>> s/unavaiable/unavailable
>>
>> Since 2020, kernel coding style uses the "fallthrough;" keyword for
>> switch cases with no "break;".
>>
>> Although, instead of "fallthrough;",
> 
> I'll add that.
> 
>> you could simply remove the
>> nfserr_dropit case in this patch. That would remove the conflict with
>> Neil's patch (if it were to be postponed until after this one).
> 
> I can re-send the patch with the fallthrough added on top of Neil's patch.

If you agree that this fix is appropriate for LTS, then Neil's patch
needs to be rebased on top of this one to facilitate automated LTS
backporting.

I can drop Neil's patch from nfsd-testing and have him submit a
rebased one, once this one is re-applied to nfsd-testing.


>>>       case nfserr_dropit:
>>>               return nlm_drop_reply;
>>>       case nfserr_stale:
>>
>>
>>
>> --
>> Chuck Lever
>>


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/2] nfsd: nfserr_jukebox in nlm_fopen should lead to a retry
  2025-08-21 19:44     ` Chuck Lever
@ 2025-08-21 19:57       ` Olga Kornievskaia
  0 siblings, 0 replies; 21+ messages in thread
From: Olga Kornievskaia @ 2025-08-21 19:57 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Olga Kornievskaia, neil, linux-nfs, Dai.Ngo, tom, jlayton

On Thu, Aug 21, 2025 at 3:45 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> On 8/21/25 3:28 PM, Olga Kornievskaia wrote:
> > On Thu, Aug 21, 2025 at 3:18 PM Chuck Lever <chuck.lever@oracle.com> wrote:
> >>
> >> OK, let's reset this discussion.
> >>
> >>
> >> On 8/12/25 12:03 PM, Olga Kornievskaia wrote:
> >>> When v3 NLM request finds a conflicting delegation, it triggers
> >>> a delegation recall and nfsd_open fails with EAGAIN. nfsd_open
> >>> then translates EAGAIN into nfserr_jukebox. In nlm_fopen, instead
> >>> of returning nlm_failed for when there is a conflicting delegation,
> >>> drop this NLM request so that the client retries. Once delegation
> >>> is recalled and if a local lock is claimed, a retry would lead to
> >>> nfsd returning a nlm_lck_blocked error or a successful nlm lock.
> >>
> >> If this patch "... solves a problem and a fix is needed" then we need a
> >> Fixes: tag so the patch is prioritized and considered for LTS.
> >
> > What fixes tag is appropriate here? This is day 0 behaviour but it's
> > only a problem since additions of write delegations I believe.
>
> How about:
>
> Fixes: d343fce148a4 ("[PATCH] knfsd: Allow lockd to drop replies as
> appropriate")

That sounds fine to me.

> Suggested-Cc: stable@vger.kernel.org # v6.6
>
> (I'll remove the "Suggested-" when applying the patch)

I will re-send original patch with the Fixes and Suggested-Cc added to
the commit message.

> >>> Reviewed-by: Jeff Layton <jlayton@kernel.org>
> >>> Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
> >>> ---
> >>>  fs/nfsd/lockd.c | 14 ++++++++++++++
> >>>  1 file changed, 14 insertions(+)
> >>>
> >>> diff --git a/fs/nfsd/lockd.c b/fs/nfsd/lockd.c
> >>> index edc9f75dc75c..8fdc769d392e 100644
> >>> --- a/fs/nfsd/lockd.c
> >>> +++ b/fs/nfsd/lockd.c
> >>> @@ -57,6 +57,20 @@ nlm_fopen(struct svc_rqst *rqstp, struct nfs_fh *f, struct file **filp,
> >>>       switch (nfserr) {
> >>>       case nfs_ok:
> >>>               return 0;
> >>> +     case nfserr_jukebox:
> >>> +             /* this error can indicate a presence of a conflicting
> >>> +              * delegation to an NLM lock request. Options are:
> >>> +              * (1) For now, drop this request and make the client
> >>> +              * retry. When delegation is returned, client's retry will
> >>> +              * complete.
> >>> +              * (2) NLM4_DENIED as per "spec" signals to the client
> >>> +              * that the lock is unavaiable now but client can retry.
> >>> +              * Linux client implementation does not. It treats
> >>> +              * NLM4_DENIED same as NLM4_FAILED and errors the request.
> >>> +              * (3) For the future, treat this as blocked lock and try
> >>> +              * to callback when the delegation is returned but might
> >>> +              * not have a proper lock request to block on.
> >>> +              */
> >>
> >> s/unavaiable/unavailable
> >>
> >> Since 2020, kernel coding style uses the "fallthrough;" keyword for
> >> switch cases with no "break;".
> >>
> >> Although, instead of "fallthrough;",
> >
> > I'll add that.
> >
> >> you could simply remove the
> >> nfserr_dropit case in this patch. That would remove the conflict with
> >> Neil's patch (if it were to be postponed until after this one).
> >
> > I can re-send the patch with the fallthrough added on top of Neil's patch.
>
> If you agree that this fix is appropriate for LTS, then Neil's patch
> needs to be rebased on top of this one to facilitate automated LTS
> backporting.

I believe this fix is appropriate for LTS.

> I can drop Neil's patch from nfsd-testing and have him submit a
> rebased one, once this one is re-applied to nfsd-testing.
>
> >>>       case nfserr_dropit:
> >>>               return nlm_drop_reply;
> >>>       case nfserr_stale:
> >>
> >>
> >>
> >> --
> >> Chuck Lever
> >>
>
>
> --
> Chuck Lever

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period
  2025-08-21 13:56         ` Olga Kornievskaia
  2025-08-21 14:01           ` Chuck Lever
  2025-08-21 15:09           ` Chuck Lever
@ 2025-08-25 15:23           ` Olga Kornievskaia
  2 siblings, 0 replies; 21+ messages in thread
From: Olga Kornievskaia @ 2025-08-25 15:23 UTC (permalink / raw)
  To: NeilBrown
  Cc: Olga Kornievskaia, chuck.lever, jlayton, linux-nfs, Dai.Ngo, tom

On Thu, Aug 21, 2025 at 9:56 AM Olga Kornievskaia <aglo@umich.edu> wrote:
>
> On Wed, Aug 20, 2025 at 7:15 PM NeilBrown <neil@brown.name> wrote:
> >
> > On Thu, 14 Aug 2025, Olga Kornievskaia wrote:
> > > On Tue, Aug 12, 2025 at 8:05 PM NeilBrown <neil@brown.name> wrote:
> > > >
> > > > On Wed, 13 Aug 2025, Olga Kornievskaia wrote:
> > > > > When nfsd is in grace and receives an NLM LOCK request which turns
> > > > > out to have a conflicting delegation, return that the server is in
> > > > > grace.

My last attempt for restoring (prior to write delegations) behaviour
vs what happens now (even with patch#1). I bring it up again because
there has been misunderstanding of what the patch series is doing and
I believe (some) technical concerns (regarding nfsv2 client) have been
discussed and resolved.

In the old code (prior to write delegations), when a v3 client sent a
lock request (during grace period) and when a v4 client reclaimed the
same lock after the server rebooted, the v3 client got the in_grace
error.

With patch #1 (not in question here), the code at least isn't failing
the request with nlm_failed and instead drops the request. No lock
failure but the overall behaviour of the nfs client is different
(details below).

> > > > > Reviewed-by: Jeff Layton <jlayton@redhat.com>
> > > > > Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
> > > > > ---
> > > > >  fs/lockd/svc4proc.c | 15 +++++++++++++--
> > > > >  1 file changed, 13 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
> > > > > index 109e5caae8c7..7ac4af5c9875 100644
> > > > > --- a/fs/lockd/svc4proc.c
> > > > > +++ b/fs/lockd/svc4proc.c
> > > > > @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp)
> > > > >       resp->cookie = argp->cookie;
> > > > >
> > > > >       /* Obtain client and file */
> > > > > -     if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file)))
> > > > > -             return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;
> > > > > +     resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file);
> > > > > +     switch (resp->status) {
> > > > > +     case 0:
> > > > > +             break;
> > > > > +     case nlm_drop_reply:
> > > > > +             if (locks_in_grace(SVC_NET(rqstp))) {
> > > > > +                     resp->status = nlm_lck_denied_grace_period;
> > > >
> > > > I think this is wrong.  If the lock request has the "reclaim" flag set,
> > > > then nlm_lck_denied_grace_period is not a meaningful error.
> > > > nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay
> > > > getting a response to an upcall to mountd.  For NLM the request really
> > > > must be dropped.
> > >
> > > Thank you for pointing out this case so we are suggesting to.
> > >
> > > if (locks_in_grace(SVC_NET(rqstp)) && !argp->reclaim)
> > >
> > > However, I've been looking and looking but I cannot figure out how
> > > nlm4svc_retrieve_args() would ever get nlm_drop_reply. You say it can
> > > happen during the upcall to mountd. So that happens within nfsd_open()
> > > call and a part of fh_verify().
> > > To return nlm_drop_reply, nlm_fopen() must have gotten nfserr_dropit
> > > from the nfsd_open().  I have searched and searched but I don't see
> > > anything that ever sets nfserr_dropit (NFSERR_DROPIT).
> > >
> > > I searched the logs and nfserr_dropit was an error that was EAGAIN
> > > translated into but then removed by the following patch.
> >
> > Oh.  I didn't know that.
> > We now use RQ_DROPME instead.
> > I guess we should remove NFSERR_DROPIT completely as it not used at all
> > any more.
> >
> > Though returning nfserr_jukebox to an v2 client isn't a good idea....
>
> I'll take your word for you.

I didn't write the correct words here. I have said the correct words
in the other patch. v2 (or v3) client does not get an nfserr_jukebox.

> > So I guess my main complaint isn't valid, but I still don't like this
> > patch.  It seems an untidy place to put the locks_in_grace test.
> > Other callers of nlm4svc_retrieve_args() check locks_in_grace() before
> > making that call.  __nlm4svc_proc_lock probably should too.  Or is there
> > a reason that it is delayed until the middle of nlmsvc_lock()..
>
> I thought the same about why not adding the in_grace check and decided
> that it was probably because you dont want to deny a lock if there are
> no conflicts. If we add it, somebody might notice and will complain
> that they can't get their lock when there are no conflicts.

The disadvantage of unconditionally adding in_grace check() in the
lock call is that again it would produce a difference in behaviour
where some lock would not be granted as fast as before (ie., some
environments might notice the change?).

> > The patch is not needed and isn't clearly an improvement, so I would
> > rather it were dropped.
>
> I'm not against dropping this patch if the conclusion is that dropping
> the packet is no worse than returning in_grace error.

My argument for inclusion of this patch is that it restores previous
behaviour that the nfs client experienced before (ie., receiving
in_grace error). When we rely on drop-retry behavior, can there be
problems that the client runs out of re-tries and fails the request
(nlm has a hard coded retry limit of 5 and then its own timeout
value)?

I admit my reasons are hypothetical, we can wait until somebody complains.


> > Thanks,
> > NeilBrown
> >
> >
> > >
> > > commit 062304a815fe10068c478a4a3f28cf091c55cb82
> > > Author: J. Bruce Fields <bfields@fieldses.org>
> > > Date:   Sun Jan 2 22:05:33 2011 -0500
> > >
> > >     nfsd: stop translating EAGAIN to nfserr_dropit
> > >
> > > diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
> > > index dc9c2e3fd1b8..fd608a27a8d5 100644
> > > --- a/fs/nfsd/nfsproc.c
> > > +++ b/fs/nfsd/nfsproc.c
> > > @@ -735,7 +735,8 @@ nfserrno (int errno)
> > >                 { nfserr_stale, -ESTALE },
> > >                 { nfserr_jukebox, -ETIMEDOUT },
> > >                 { nfserr_jukebox, -ERESTARTSYS },
> > > -               { nfserr_dropit, -EAGAIN },
> > > +               { nfserr_jukebox, -EAGAIN },
> > > +               { nfserr_jukebox, -EWOULDBLOCK },
> > >                 { nfserr_jukebox, -ENOMEM },
> > >                 { nfserr_badname, -ESRCH },
> > >                 { nfserr_io, -ETXTBSY },
> > >
> > > so if fh_verify is failing whatever it is returning, it is not
> > > nfserr_dropit nor is it nfserr_jukebox which means nlm_fopen() would
> > > translate it to nlm_failed which with my patch would not trigger
> > > nlm_lck_denied_grace_period error but resp->status would be set to
> > > nlm_failed.
> > >
> > > So I circle back to I hope that convinces you that we don't need a
> > > check for the reclaim lock.
> > >
> > > I believe nlm_drop_reply is nfsd_open's jukebox error, one of which is
> > > delegation recall. it can be a memory failure. But I'm sure when
> > > EWOULDBLOCK occurs.
> > >
> > > > At the very least we need to guard against the reclaim flag being set in
> > > > the above test.  But I would much rather a more clear distinction were
> > > > made between "drop because of a conflicting delegation" and "drop
> > > > because of a delay getting upcall response".
> > > > Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4
> > > > (and ideally nlm2) handles appropriately.
> > >
> > >
> > > > NeilBrown
> > > >
> > > >
> > > > > +                     return rpc_success;
> > > > > +             }
> > > > > +             return nlm_drop_reply;
> > > > > +     default:
> > > > > +             return rpc_success;
> > > > > +     }
> > > > >
> > > > >       /* Now try to lock the file */
> > > > >       resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock,
> > > > > --
> > > > > 2.47.1
> > > > >
> > > > >
> > > >
> > > >
> > >
> >

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2025-08-25 15:24 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-12 16:03 [PATCH v2 1/2] nfsd: nfserr_jukebox in nlm_fopen should lead to a retry Olga Kornievskaia
2025-08-12 16:03 ` [PATCH v2 2/2] lockd: while grace prefer to fail with nlm_lck_denied_grace_period Olga Kornievskaia
2025-08-12 23:49   ` NeilBrown
2025-08-13 15:32     ` Olga Kornievskaia
2025-08-20 23:15       ` NeilBrown
2025-08-21 13:56         ` Olga Kornievskaia
2025-08-21 14:01           ` Chuck Lever
2025-08-21 15:09           ` Chuck Lever
2025-08-21 18:20             ` Olga Kornievskaia
2025-08-21 18:24               ` Chuck Lever
2025-08-21 18:33                 ` Olga Kornievskaia
2025-08-21 18:39                   ` Chuck Lever
2025-08-21 18:51                     ` Olga Kornievskaia
2025-08-21 19:15                     ` Olga Kornievskaia
2025-08-25 15:23           ` Olga Kornievskaia
2025-08-12 17:23 ` [PATCH v2 1/2] nfsd: nfserr_jukebox in nlm_fopen should lead to a retry Chuck Lever
2025-08-21 19:18 ` Chuck Lever
2025-08-21 19:28   ` Olga Kornievskaia
2025-08-21 19:44     ` Olga Kornievskaia
2025-08-21 19:44     ` Chuck Lever
2025-08-21 19:57       ` Olga Kornievskaia

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).