[PATCH v1] NFSD: Fix crash in nfsd4_read

public inbox for linux-nfs@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH v1] NFSD: Fix crash in nfsd4_read_release()
@ 2025-09-30 14:05 Chuck Lever
  2025-09-30 14:45 ` Jeff Layton
  2025-10-01  0:53 ` NeilBrown
  0 siblings, 2 replies; 9+ messages in thread
From: Chuck Lever @ 2025-09-30 14:05 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

When tracing is enabled, the trace_nfsd_read_done trace point
crashes during the pynfs read.testNoFh test.

Fixes: 87c5942e8fae ("nfsd: Add I/O trace points in the NFSv4 read proc")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/nfs4proc.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index e466cf52d7d7..f9aeefc0da73 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -988,10 +988,11 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 static void
 nfsd4_read_release(union nfsd4_op_u *u)
 {
-	if (u->read.rd_nf)
+	if (u->read.rd_nf) {
+		trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp,
+				     u->read.rd_offset, u->read.rd_length);
 		nfsd_file_put(u->read.rd_nf);
-	trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp,
-			     u->read.rd_offset, u->read.rd_length);
+	}
 }
 
 static __be32
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] NFSD: Fix crash in nfsd4_read_release()
  2025-09-30 14:05 [PATCH v1] NFSD: Fix crash in nfsd4_read_release() Chuck Lever
@ 2025-09-30 14:45 ` Jeff Layton
  2025-10-01  0:53 ` NeilBrown
  1 sibling, 0 replies; 9+ messages in thread
From: Jeff Layton @ 2025-09-30 14:45 UTC (permalink / raw)
  To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, Chuck Lever

On Tue, 2025-09-30 at 10:05 -0400, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> When tracing is enabled, the trace_nfsd_read_done trace point
> crashes during the pynfs read.testNoFh test.
> 
> Fixes: 87c5942e8fae ("nfsd: Add I/O trace points in the NFSv4 read proc")
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  fs/nfsd/nfs4proc.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index e466cf52d7d7..f9aeefc0da73 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -988,10 +988,11 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  static void
>  nfsd4_read_release(union nfsd4_op_u *u)
>  {
> -	if (u->read.rd_nf)
> +	if (u->read.rd_nf) {
> +		trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp,
> +				     u->read.rd_offset, u->read.rd_length);
>  		nfsd_file_put(u->read.rd_nf);
> -	trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp,
> -			     u->read.rd_offset, u->read.rd_length);
> +	}
>  }
>  
>  static __be32

Not sending a filehandle is pretty rare so I guess we won't miss the
lack of a tracepoint in that case.

Reviewed-by: Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] NFSD: Fix crash in nfsd4_read_release()
  2025-09-30 14:05 [PATCH v1] NFSD: Fix crash in nfsd4_read_release() Chuck Lever
  2025-09-30 14:45 ` Jeff Layton
@ 2025-10-01  0:53 ` NeilBrown
  2025-10-01 13:16   ` Chuck Lever
  1 sibling, 1 reply; 9+ messages in thread
From: NeilBrown @ 2025-10-01  0:53 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs,
	Chuck Lever

On Wed, 01 Oct 2025, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> When tracing is enabled, the trace_nfsd_read_done trace point
> crashes during the pynfs read.testNoFh test.
> 
> Fixes: 87c5942e8fae ("nfsd: Add I/O trace points in the NFSv4 read proc")
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  fs/nfsd/nfs4proc.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index e466cf52d7d7..f9aeefc0da73 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -988,10 +988,11 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  static void
>  nfsd4_read_release(union nfsd4_op_u *u)
>  {
> -	if (u->read.rd_nf)
> +	if (u->read.rd_nf) {
> +		trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp,
> +				     u->read.rd_offset, u->read.rd_length);
>  		nfsd_file_put(u->read.rd_nf);
> -	trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp,
> -			     u->read.rd_offset, u->read.rd_length);
> +	}
>  }

I must say this looks a bit weird.  rd_nf isn't used in the trace but if
it isn't set, you say the trace crashes...

That is because rd_fhp being NULL (because there is no current_fh) is
one thing that results in rd_nf being NULL.  Seems a bit indirect.
Did you consider
   if (u->read.rd_fhp)
       trace .....
   if (u->read.rd_nf)
       nfsd_file_put....

??
Or maybe
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -444,7 +444,7 @@ DECLARE_EVENT_CLASS(nfsd_io_class,
 	),
 	TP_fast_assign(
 		__entry->xid = be32_to_cpu(rqstp->rq_xid);
-		__entry->fh_hash = knfsd_fh_hash(&fhp->fh_handle);
+		__entry->fh_hash = fhp ? 0 : knfsd_fh_hash(&fhp->fh_handle);
 		__entry->offset = offset;
 		__entry->len = len;
 	),


That would make all IO event trace points robust.

I'm almost tempted to suggest knfsd_fh_hash() be changed to receive a
svc_fh and test for NULL, but there are 2 places where there isn't a
svc_fh handy.

NeilBrown


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] NFSD: Fix crash in nfsd4_read_release()
  2025-10-01  0:53 ` NeilBrown
@ 2025-10-01 13:16   ` Chuck Lever
  2025-10-03  7:08     ` NeilBrown
  0 siblings, 1 reply; 9+ messages in thread
From: Chuck Lever @ 2025-10-01 13:16 UTC (permalink / raw)
  To: NeilBrown
  Cc: Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs,
	Chuck Lever

On 9/30/25 8:53 PM, NeilBrown wrote:
> On Wed, 01 Oct 2025, Chuck Lever wrote:
>> From: Chuck Lever <chuck.lever@oracle.com>
>>
>> When tracing is enabled, the trace_nfsd_read_done trace point
>> crashes during the pynfs read.testNoFh test.
>>
>> Fixes: 87c5942e8fae ("nfsd: Add I/O trace points in the NFSv4 read proc")
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>>  fs/nfsd/nfs4proc.c | 7 ++++---
>>  1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>> index e466cf52d7d7..f9aeefc0da73 100644
>> --- a/fs/nfsd/nfs4proc.c
>> +++ b/fs/nfsd/nfs4proc.c
>> @@ -988,10 +988,11 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>>  static void
>>  nfsd4_read_release(union nfsd4_op_u *u)
>>  {
>> -	if (u->read.rd_nf)
>> +	if (u->read.rd_nf) {
>> +		trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp,
>> +				     u->read.rd_offset, u->read.rd_length);
>>  		nfsd_file_put(u->read.rd_nf);
>> -	trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp,
>> -			     u->read.rd_offset, u->read.rd_length);
>> +	}
>>  }
> 
> I must say this looks a bit weird.  rd_nf isn't used in the trace but if
> it isn't set, you say the trace crashes...
> 
> That is because rd_fhp being NULL (because there is no current_fh) is
> one thing that results in rd_nf being NULL.  Seems a bit indirect.

When rd_nf is NULL, no file or file handle is present. That's really all
there is to it. You can read it as "when rd_nf is NULL, no processing is
needed".

The other read trace points are already skipped in this case as well, so
there is no need to protect them.


> Did you consider
>    if (u->read.rd_fhp)
>        trace .....
>    if (u->read.rd_nf)
>        nfsd_file_put....
> 
> ??

I can't think of a legitimate case where fhp is not NULL but nf is
NULL. Or vice versa.


> Or maybe
> --- a/fs/nfsd/trace.h
> +++ b/fs/nfsd/trace.h
> @@ -444,7 +444,7 @@ DECLARE_EVENT_CLASS(nfsd_io_class,
>  	),
>  	TP_fast_assign(
>  		__entry->xid = be32_to_cpu(rqstp->rq_xid);
> -		__entry->fh_hash = knfsd_fh_hash(&fhp->fh_handle);
> +		__entry->fh_hash = fhp ? 0 : knfsd_fh_hash(&fhp->fh_handle);
>  		__entry->offset = offset;
>  		__entry->len = len;
>  	),
> 
> 
> That would make all IO event trace points robust.
> I'm almost tempted to suggest knfsd_fh_hash() be changed to receive a
> svc_fh and test for NULL, but there are 2 places where there isn't a
> svc_fh handy.
> 
> NeilBrown
> 


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] NFSD: Fix crash in nfsd4_read_release()
  2025-10-01 13:16   ` Chuck Lever
@ 2025-10-03  7:08     ` NeilBrown
  2025-10-03 13:57       ` Chuck Lever
  0 siblings, 1 reply; 9+ messages in thread
From: NeilBrown @ 2025-10-03  7:08 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs,
	Chuck Lever

On Wed, 01 Oct 2025, Chuck Lever wrote:
> On 9/30/25 8:53 PM, NeilBrown wrote:
> > On Wed, 01 Oct 2025, Chuck Lever wrote:
> >> From: Chuck Lever <chuck.lever@oracle.com>
> >>
> >> When tracing is enabled, the trace_nfsd_read_done trace point
> >> crashes during the pynfs read.testNoFh test.
> >>
> >> Fixes: 87c5942e8fae ("nfsd: Add I/O trace points in the NFSv4 read proc")
> >> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> >> ---
> >>  fs/nfsd/nfs4proc.c | 7 ++++---
> >>  1 file changed, 4 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> >> index e466cf52d7d7..f9aeefc0da73 100644
> >> --- a/fs/nfsd/nfs4proc.c
> >> +++ b/fs/nfsd/nfs4proc.c
> >> @@ -988,10 +988,11 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> >>  static void
> >>  nfsd4_read_release(union nfsd4_op_u *u)
> >>  {
> >> -	if (u->read.rd_nf)
> >> +	if (u->read.rd_nf) {
> >> +		trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp,
> >> +				     u->read.rd_offset, u->read.rd_length);
> >>  		nfsd_file_put(u->read.rd_nf);
> >> -	trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp,
> >> -			     u->read.rd_offset, u->read.rd_length);
> >> +	}
> >>  }
> > 
> > I must say this looks a bit weird.  rd_nf isn't used in the trace but if
> > it isn't set, you say the trace crashes...
> > 
> > That is because rd_fhp being NULL (because there is no current_fh) is
> > one thing that results in rd_nf being NULL.  Seems a bit indirect.
> 
> When rd_nf is NULL, no file or file handle is present. That's really all
> there is to it. You can read it as "when rd_nf is NULL, no processing is
> needed".
> 
> The other read trace points are already skipped in this case as well, so
> there is no need to protect them.

They are skipped because ALLOWED_WITHOUT_FH is not set so ->op_func
isn't called.  But ->op_release *is* called.  That seems inconsistent.
I wonder if we could move the ->op_release call out of
nfsd4_encode_operation() and only call it in cases were ->op_func
was called.

Something like the following?  It isn't as elegant as I would have
liked, but I think it is better than hiding the ->op_release cxall in
nfsd4_encode_replay().

Thanks,
NeilBrown

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 71b428efcbb5..4ed823d1e6be 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -2837,6 +2837,8 @@ nfsd4_proc_compound(struct svc_rqst *rqstp)
 
 	trace_nfsd_compound(rqstp, args->tag, args->taglen, args->opcnt);
 	while (!status && resp->opcnt < args->opcnt) {
+		bool called_func = false;
+
 		op = &args->ops[resp->opcnt++];
 
 		/*
@@ -2886,6 +2888,7 @@ nfsd4_proc_compound(struct svc_rqst *rqstp)
 		if (op->opdesc->op_get_currentstateid)
 			op->opdesc->op_get_currentstateid(cstate, &op->u);
 		op->status = op->opdesc->op_func(rqstp, cstate, &op->u);
+		called_func = true;
 		trace_nfsd_compound_op_err(rqstp, op->opnum, op->status);
 
 		/* Only from SEQUENCE */
@@ -2914,6 +2917,9 @@ nfsd4_proc_compound(struct svc_rqst *rqstp)
 			nfsd4_encode_operation(resp, op);
 			status = op->status;
 		}
+		if (called_func && op->opdesc->op_release)
+			op->opdesc->op_release(&op->u);
+
 
 		trace_nfsd_compound_status(args->opcnt, resp->opcnt,
 					   status, nfsd4_op_name(op->opnum));
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index ea91bad4eee2..e7b4363aadb0 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -5885,10 +5885,10 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
 	nfsd4_enc encoder;
 
 	if (xdr_stream_encode_u32(xdr, op->opnum) != XDR_UNIT)
-		goto release;
+		goto out;
 	op_status_offset = xdr->buf->len;
 	if (!xdr_reserve_space(xdr, XDR_UNIT))
-		goto release;
+		goto out;
 
 	if (op->opnum == OP_ILLEGAL)
 		goto status;
@@ -5944,10 +5944,7 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
 				      resp->cstate.minorversion);
 	write_bytes_to_xdr_buf(xdr->buf, op_status_offset,
 			       &op->status, XDR_UNIT);
-release:
-	if (opdesc && opdesc->op_release)
-		opdesc->op_release(&op->u);
-
+out:
 	/*
 	 * Account for pages consumed while encoding this operation.
 	 * The xdr_stream primitives don't manage rq_next_page.



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] NFSD: Fix crash in nfsd4_read_release()
  2025-10-03  7:08     ` NeilBrown
@ 2025-10-03 13:57       ` Chuck Lever
  2025-10-04  6:34         ` NeilBrown
  0 siblings, 1 reply; 9+ messages in thread
From: Chuck Lever @ 2025-10-03 13:57 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton
  Cc: Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs, Chuck Lever

On 10/3/25 3:08 AM, NeilBrown wrote:
> On Wed, 01 Oct 2025, Chuck Lever wrote:
>> On 9/30/25 8:53 PM, NeilBrown wrote:
>>> On Wed, 01 Oct 2025, Chuck Lever wrote:
>>>> From: Chuck Lever <chuck.lever@oracle.com>
>>>>
>>>> When tracing is enabled, the trace_nfsd_read_done trace point
>>>> crashes during the pynfs read.testNoFh test.
>>>>
>>>> Fixes: 87c5942e8fae ("nfsd: Add I/O trace points in the NFSv4 read proc")
>>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>>> ---
>>>>  fs/nfsd/nfs4proc.c | 7 ++++---
>>>>  1 file changed, 4 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>>>> index e466cf52d7d7..f9aeefc0da73 100644
>>>> --- a/fs/nfsd/nfs4proc.c
>>>> +++ b/fs/nfsd/nfs4proc.c
>>>> @@ -988,10 +988,11 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>>>>  static void
>>>>  nfsd4_read_release(union nfsd4_op_u *u)
>>>>  {
>>>> -	if (u->read.rd_nf)
>>>> +	if (u->read.rd_nf) {
>>>> +		trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp,
>>>> +				     u->read.rd_offset, u->read.rd_length);
>>>>  		nfsd_file_put(u->read.rd_nf);
>>>> -	trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp,
>>>> -			     u->read.rd_offset, u->read.rd_length);
>>>> +	}
>>>>  }
>>>
>>> I must say this looks a bit weird.  rd_nf isn't used in the trace but if
>>> it isn't set, you say the trace crashes...
>>>
>>> That is because rd_fhp being NULL (because there is no current_fh) is
>>> one thing that results in rd_nf being NULL.  Seems a bit indirect.
>>
>> When rd_nf is NULL, no file or file handle is present. That's really all
>> there is to it. You can read it as "when rd_nf is NULL, no processing is
>> needed".
>>
>> The other read trace points are already skipped in this case as well, so
>> there is no need to protect them.
> 
> They are skipped because ALLOWED_WITHOUT_FH is not set so ->op_func
> isn't called.  But ->op_release *is* called.  That seems inconsistent.
> I wonder if we could move the ->op_release call out of
> nfsd4_encode_operation() and only call it in cases were ->op_func
> was called.
> 
> Something like the following?  It isn't as elegant as I would have
> liked, but I think it is better than hiding the ->op_release cxall in
> nfsd4_encode_replay().
> 
> Thanks,
> NeilBrown
> 
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index 71b428efcbb5..4ed823d1e6be 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -2837,6 +2837,8 @@ nfsd4_proc_compound(struct svc_rqst *rqstp)
>  
>  	trace_nfsd_compound(rqstp, args->tag, args->taglen, args->opcnt);
>  	while (!status && resp->opcnt < args->opcnt) {
> +		bool called_func = false;
> +

Or call it "release_needed" ?

This change is not something I'd care to have backported into stable
kernels... So maybe keep the modification of nfsd4_read_release for
backport, and then subsequently do this change too?

I was reminded of a release-related change that Jeff did recently, maybe
commit 15a8b55dbb1b ("nfsd: call op_release, even when op_func returns
an error")... might be relevant, might not.


>  		op = &args->ops[resp->opcnt++];
>  
>  		/*
> @@ -2886,6 +2888,7 @@ nfsd4_proc_compound(struct svc_rqst *rqstp)
>  		if (op->opdesc->op_get_currentstateid)
>  			op->opdesc->op_get_currentstateid(cstate, &op->u);
>  		op->status = op->opdesc->op_func(rqstp, cstate, &op->u);
> +		called_func = true;
>  		trace_nfsd_compound_op_err(rqstp, op->opnum, op->status);
>  
>  		/* Only from SEQUENCE */
> @@ -2914,6 +2917,9 @@ nfsd4_proc_compound(struct svc_rqst *rqstp)
>  			nfsd4_encode_operation(resp, op);
>  			status = op->status;
>  		}
> +		if (called_func && op->opdesc->op_release)
> +			op->opdesc->op_release(&op->u);
> +
>  
>  		trace_nfsd_compound_status(args->opcnt, resp->opcnt,
>  					   status, nfsd4_op_name(op->opnum));
> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> index ea91bad4eee2..e7b4363aadb0 100644
> --- a/fs/nfsd/nfs4xdr.c
> +++ b/fs/nfsd/nfs4xdr.c
> @@ -5885,10 +5885,10 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
>  	nfsd4_enc encoder;
>  
>  	if (xdr_stream_encode_u32(xdr, op->opnum) != XDR_UNIT)
> -		goto release;
> +		goto out;
>  	op_status_offset = xdr->buf->len;
>  	if (!xdr_reserve_space(xdr, XDR_UNIT))
> -		goto release;
> +		goto out;
>  
>  	if (op->opnum == OP_ILLEGAL)
>  		goto status;
> @@ -5944,10 +5944,7 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
>  				      resp->cstate.minorversion);
>  	write_bytes_to_xdr_buf(xdr->buf, op_status_offset,
>  			       &op->status, XDR_UNIT);
> -release:
> -	if (opdesc && opdesc->op_release)
> -		opdesc->op_release(&op->u);
> -
> +out:
>  	/*
>  	 * Account for pages consumed while encoding this operation.
>  	 * The xdr_stream primitives don't manage rq_next_page.
> 
> 


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] NFSD: Fix crash in nfsd4_read_release()
  2025-10-03 13:57       ` Chuck Lever
@ 2025-10-04  6:34         ` NeilBrown
  2025-10-04 15:51           ` Chuck Lever
  2025-10-06 16:23           ` Jeff Layton
  0 siblings, 2 replies; 9+ messages in thread
From: NeilBrown @ 2025-10-04  6:34 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs,
	Chuck Lever

On Fri, 03 Oct 2025, Chuck Lever wrote:
> On 10/3/25 3:08 AM, NeilBrown wrote:
> > On Wed, 01 Oct 2025, Chuck Lever wrote:
> >> On 9/30/25 8:53 PM, NeilBrown wrote:
> >>> On Wed, 01 Oct 2025, Chuck Lever wrote:
> >>>> From: Chuck Lever <chuck.lever@oracle.com>
> >>>>
> >>>> When tracing is enabled, the trace_nfsd_read_done trace point
> >>>> crashes during the pynfs read.testNoFh test.
> >>>>
> >>>> Fixes: 87c5942e8fae ("nfsd: Add I/O trace points in the NFSv4 read proc")
> >>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> >>>> ---
> >>>>  fs/nfsd/nfs4proc.c | 7 ++++---
> >>>>  1 file changed, 4 insertions(+), 3 deletions(-)
> >>>>
> >>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> >>>> index e466cf52d7d7..f9aeefc0da73 100644
> >>>> --- a/fs/nfsd/nfs4proc.c
> >>>> +++ b/fs/nfsd/nfs4proc.c
> >>>> @@ -988,10 +988,11 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> >>>>  static void
> >>>>  nfsd4_read_release(union nfsd4_op_u *u)
> >>>>  {
> >>>> -	if (u->read.rd_nf)
> >>>> +	if (u->read.rd_nf) {
> >>>> +		trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp,
> >>>> +				     u->read.rd_offset, u->read.rd_length);
> >>>>  		nfsd_file_put(u->read.rd_nf);
> >>>> -	trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp,
> >>>> -			     u->read.rd_offset, u->read.rd_length);
> >>>> +	}
> >>>>  }
> >>>
> >>> I must say this looks a bit weird.  rd_nf isn't used in the trace but if
> >>> it isn't set, you say the trace crashes...
> >>>
> >>> That is because rd_fhp being NULL (because there is no current_fh) is
> >>> one thing that results in rd_nf being NULL.  Seems a bit indirect.
> >>
> >> When rd_nf is NULL, no file or file handle is present. That's really all
> >> there is to it. You can read it as "when rd_nf is NULL, no processing is
> >> needed".
> >>
> >> The other read trace points are already skipped in this case as well, so
> >> there is no need to protect them.
> > 
> > They are skipped because ALLOWED_WITHOUT_FH is not set so ->op_func
> > isn't called.  But ->op_release *is* called.  That seems inconsistent.
> > I wonder if we could move the ->op_release call out of
> > nfsd4_encode_operation() and only call it in cases were ->op_func
> > was called.
> > 
> > Something like the following?  It isn't as elegant as I would have
> > liked, but I think it is better than hiding the ->op_release cxall in
> > nfsd4_encode_replay().
> > 
> > Thanks,
> > NeilBrown
> > 
> > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> > index 71b428efcbb5..4ed823d1e6be 100644
> > --- a/fs/nfsd/nfs4proc.c
> > +++ b/fs/nfsd/nfs4proc.c
> > @@ -2837,6 +2837,8 @@ nfsd4_proc_compound(struct svc_rqst *rqstp)
> >  
> >  	trace_nfsd_compound(rqstp, args->tag, args->taglen, args->opcnt);
> >  	while (!status && resp->opcnt < args->opcnt) {
> > +		bool called_func = false;
> > +
> 
> Or call it "release_needed" ?

That might be a bit better - yes.

> 
> This change is not something I'd care to have backported into stable
> kernels... So maybe keep the modification of nfsd4_read_release for
> backport, and then subsequently do this change too?

That seems like a sensible safe approach.

> 
> I was reminded of a release-related change that Jeff did recently, maybe
> commit 15a8b55dbb1b ("nfsd: call op_release, even when op_func returns
> an error")... might be relevant, might not.

Definitely relevant.  I think this is the commit to mention in the
Fixes: tag.  Prior to this commit nfsd4_read_release wouldn't have been
called if there was no filehandle, so the trace point was perfectly
safe.

I wonder if this fix by Jeff was the best fix for the problem.
An alternative would be to require ->op_func to release any resources
before returning an error..... except that wouldn't work for 
OP_NONTRIVIAL_ERROR_ENCODE functions like OP_LOCK as the ->op_release
frees the "deny" info which is needed to encode the reply....

But wait ...nfs4_encode_operation - which currently calls ->op_release -
is also called from nfsd4_enc_sequence_replay().  The status is set to
an error so ->op_release previously would not have been called, but now
it is.  Could that mean nfsd4_lock_release() can get called there even
tough nfsd4_lock wasn't called.  Could that be a problem?  It isn't
clear to me that is *isn't* a problem, should maybe there is a subtlety
that saves it.

I'm beginning to think that we don't want an op_release at all.
Anything allocated in ->op_func should either be freed in ->op_func when
an error is detected, or in the corresponding nfsd4_enc_ops function
after the allocated value is used for encoding.

Looking further...
 ->op_get_currentstateid is only called immediately before the one time
 that ->op_func is called.  So ->op_func could do that
 get_currentstateid work.
 and ->op_set_currentstateid looks like it could be folded in to
 ->op_func as well.

Does anyone have any thoughts about that?

Thanks,
NeilBrown


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] NFSD: Fix crash in nfsd4_read_release()
  2025-10-04  6:34         ` NeilBrown
@ 2025-10-04 15:51           ` Chuck Lever
  2025-10-06 16:23           ` Jeff Layton
  1 sibling, 0 replies; 9+ messages in thread
From: Chuck Lever @ 2025-10-04 15:51 UTC (permalink / raw)
  To: NeilBrown
  Cc: Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs,
	Chuck Lever

On 10/4/25 2:34 AM, NeilBrown wrote:
> On Fri, 03 Oct 2025, Chuck Lever wrote:
>> I was reminded of a release-related change that Jeff did recently, maybe
>> commit 15a8b55dbb1b ("nfsd: call op_release, even when op_func returns
>> an error")... might be relevant, might not.
> 
> Definitely relevant.  I think this is the commit to mention in the
> Fixes: tag.  Prior to this commit nfsd4_read_release wouldn't have been
> called if there was no filehandle, so the trace point was perfectly
> safe.

I'll replace the Fixes: tag with that one.


> I wonder if this fix by Jeff was the best fix for the problem.
> An alternative would be to require ->op_func to release any resources
> before returning an error..... except that wouldn't work for 
> OP_NONTRIVIAL_ERROR_ENCODE functions like OP_LOCK as the ->op_release
> frees the "deny" info which is needed to encode the reply....
> 
> But wait ...nfs4_encode_operation - which currently calls ->op_release -
> is also called from nfsd4_enc_sequence_replay().  The status is set to
> an error so ->op_release previously would not have been called, but now
> it is.  Could that mean nfsd4_lock_release() can get called there even
> tough nfsd4_lock wasn't called.  Could that be a problem?  It isn't
> clear to me that is *isn't* a problem, should maybe there is a subtlety
> that saves it.

One approach to getting experimental data would be to hack together a
pynfs test for this case. (If it proves useful, please contribute it).


> I'm beginning to think that we don't want an op_release at all.
> Anything allocated in ->op_func should either be freed in ->op_func when
> an error is detected, or in the corresponding nfsd4_enc_ops function
> after the allocated value is used for encoding.
> 
> Looking further...
>  ->op_get_currentstateid is only called immediately before the one time
>  that ->op_func is called.  So ->op_func could do that
>  get_currentstateid work.
>  and ->op_set_currentstateid looks like it could be folded in to
>  ->op_func as well.
> 
> Does anyone have any thoughts about that?

Careful review needed, I think. We are getting ready for a week of
bake-a-thon next week... that might delay any deep thinking on my
part for a few days.


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] NFSD: Fix crash in nfsd4_read_release()
  2025-10-04  6:34         ` NeilBrown
  2025-10-04 15:51           ` Chuck Lever
@ 2025-10-06 16:23           ` Jeff Layton
  1 sibling, 0 replies; 9+ messages in thread
From: Jeff Layton @ 2025-10-06 16:23 UTC (permalink / raw)
  To: NeilBrown, Chuck Lever
  Cc: Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs, Chuck Lever

On Sat, 2025-10-04 at 16:34 +1000, NeilBrown wrote:
> On Fri, 03 Oct 2025, Chuck Lever wrote:
> > On 10/3/25 3:08 AM, NeilBrown wrote:
> > > On Wed, 01 Oct 2025, Chuck Lever wrote:
> > > > On 9/30/25 8:53 PM, NeilBrown wrote:
> > > > > On Wed, 01 Oct 2025, Chuck Lever wrote:
> > > > > > From: Chuck Lever <chuck.lever@oracle.com>
> > > > > > 
> > > > > > When tracing is enabled, the trace_nfsd_read_done trace point
> > > > > > crashes during the pynfs read.testNoFh test.
> > > > > > 
> > > > > > Fixes: 87c5942e8fae ("nfsd: Add I/O trace points in the NFSv4 read proc")
> > > > > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > > > > > ---
> > > > > >  fs/nfsd/nfs4proc.c | 7 ++++---
> > > > > >  1 file changed, 4 insertions(+), 3 deletions(-)
> > > > > > 
> > > > > > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> > > > > > index e466cf52d7d7..f9aeefc0da73 100644
> > > > > > --- a/fs/nfsd/nfs4proc.c
> > > > > > +++ b/fs/nfsd/nfs4proc.c
> > > > > > @@ -988,10 +988,11 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> > > > > >  static void
> > > > > >  nfsd4_read_release(union nfsd4_op_u *u)
> > > > > >  {
> > > > > > -	if (u->read.rd_nf)
> > > > > > +	if (u->read.rd_nf) {
> > > > > > +		trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp,
> > > > > > +				     u->read.rd_offset, u->read.rd_length);
> > > > > >  		nfsd_file_put(u->read.rd_nf);
> > > > > > -	trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp,
> > > > > > -			     u->read.rd_offset, u->read.rd_length);
> > > > > > +	}
> > > > > >  }
> > > > > 
> > > > > I must say this looks a bit weird.  rd_nf isn't used in the trace but if
> > > > > it isn't set, you say the trace crashes...
> > > > > 
> > > > > That is because rd_fhp being NULL (because there is no current_fh) is
> > > > > one thing that results in rd_nf being NULL.  Seems a bit indirect.
> > > > 
> > > > When rd_nf is NULL, no file or file handle is present. That's really all
> > > > there is to it. You can read it as "when rd_nf is NULL, no processing is
> > > > needed".
> > > > 
> > > > The other read trace points are already skipped in this case as well, so
> > > > there is no need to protect them.
> > > 
> > > They are skipped because ALLOWED_WITHOUT_FH is not set so ->op_func
> > > isn't called.  But ->op_release *is* called.  That seems inconsistent.
> > > I wonder if we could move the ->op_release call out of
> > > nfsd4_encode_operation() and only call it in cases were ->op_func
> > > was called.
> > > 
> > > Something like the following?  It isn't as elegant as I would have
> > > liked, but I think it is better than hiding the ->op_release cxall in
> > > nfsd4_encode_replay().
> > > 
> > > Thanks,
> > > NeilBrown
> > > 
> > > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> > > index 71b428efcbb5..4ed823d1e6be 100644
> > > --- a/fs/nfsd/nfs4proc.c
> > > +++ b/fs/nfsd/nfs4proc.c
> > > @@ -2837,6 +2837,8 @@ nfsd4_proc_compound(struct svc_rqst *rqstp)
> > >  
> > >  	trace_nfsd_compound(rqstp, args->tag, args->taglen, args->opcnt);
> > >  	while (!status && resp->opcnt < args->opcnt) {
> > > +		bool called_func = false;
> > > +
> > 
> > Or call it "release_needed" ?
> 
> That might be a bit better - yes.
> 
> > 
> > This change is not something I'd care to have backported into stable
> > kernels... So maybe keep the modification of nfsd4_read_release for
> > backport, and then subsequently do this change too?
> 
> That seems like a sensible safe approach.
> 
> > 
> > I was reminded of a release-related change that Jeff did recently, maybe
> > commit 15a8b55dbb1b ("nfsd: call op_release, even when op_func returns
> > an error")... might be relevant, might not.
> 
> Definitely relevant.  I think this is the commit to mention in the
> Fixes: tag.  Prior to this commit nfsd4_read_release wouldn't have been
> called if there was no filehandle, so the trace point was perfectly
> safe.
> 
> I wonder if this fix by Jeff was the best fix for the problem.
> An alternative would be to require ->op_func to release any resources
> before returning an error..... except that wouldn't work for 
> OP_NONTRIVIAL_ERROR_ENCODE functions like OP_LOCK as the ->op_release
> frees the "deny" info which is needed to encode the reply....
> 
> But wait ...nfs4_encode_operation - which currently calls ->op_release -
> is also called from nfsd4_enc_sequence_replay().  The status is set to
> an error so ->op_release previously would not have been called, but now
> it is.  Could that mean nfsd4_lock_release() can get called there even
> tough nfsd4_lock wasn't called.  Could that be a problem?  It isn't
> clear to me that is *isn't* a problem, should maybe there is a subtlety
> that saves it.
> 
> I'm beginning to think that we don't want an op_release at all.
> Anything allocated in ->op_func should either be freed in ->op_func when
> an error is detected, or in the corresponding nfsd4_enc_ops function
> after the allocated value is used for encoding.
> 
> Looking further...
>  ->op_get_currentstateid is only called immediately before the one time
>  that ->op_func is called.  So ->op_func could do that
>  get_currentstateid work.
>  and ->op_set_currentstateid looks like it could be folded in to
>  ->op_func as well.
> 
> Does anyone have any thoughts about that?

I think that sounds reasonable. I don't see anything that would
preclude that anyway, at least with a cursory look. All of the
references that get released in the op_release calls are acquired
during op_func, AFAICT.
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2025-10-06 16:23 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-30 14:05 [PATCH v1] NFSD: Fix crash in nfsd4_read_release() Chuck Lever
2025-09-30 14:45 ` Jeff Layton
2025-10-01  0:53 ` NeilBrown
2025-10-01 13:16   ` Chuck Lever
2025-10-03  7:08     ` NeilBrown
2025-10-03 13:57       ` Chuck Lever
2025-10-04  6:34         ` NeilBrown
2025-10-04 15:51           ` Chuck Lever
2025-10-06 16:23           ` Jeff Layton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox