* [PATCH v1] NFSD: Fix crash in nfsd4_read_release()
@ 2025-09-30 14:05 Chuck Lever
2025-09-30 14:45 ` Jeff Layton
2025-10-01 0:53 ` NeilBrown
0 siblings, 2 replies; 9+ messages in thread
From: Chuck Lever @ 2025-09-30 14:05 UTC (permalink / raw)
To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
When tracing is enabled, the trace_nfsd_read_done trace point
crashes during the pynfs read.testNoFh test.
Fixes: 87c5942e8fae ("nfsd: Add I/O trace points in the NFSv4 read proc")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/nfsd/nfs4proc.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index e466cf52d7d7..f9aeefc0da73 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -988,10 +988,11 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
static void
nfsd4_read_release(union nfsd4_op_u *u)
{
- if (u->read.rd_nf)
+ if (u->read.rd_nf) {
+ trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp,
+ u->read.rd_offset, u->read.rd_length);
nfsd_file_put(u->read.rd_nf);
- trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp,
- u->read.rd_offset, u->read.rd_length);
+ }
}
static __be32
--
2.51.0
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH v1] NFSD: Fix crash in nfsd4_read_release() 2025-09-30 14:05 [PATCH v1] NFSD: Fix crash in nfsd4_read_release() Chuck Lever @ 2025-09-30 14:45 ` Jeff Layton 2025-10-01 0:53 ` NeilBrown 1 sibling, 0 replies; 9+ messages in thread From: Jeff Layton @ 2025-09-30 14:45 UTC (permalink / raw) To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey Cc: linux-nfs, Chuck Lever On Tue, 2025-09-30 at 10:05 -0400, Chuck Lever wrote: > From: Chuck Lever <chuck.lever@oracle.com> > > When tracing is enabled, the trace_nfsd_read_done trace point > crashes during the pynfs read.testNoFh test. > > Fixes: 87c5942e8fae ("nfsd: Add I/O trace points in the NFSv4 read proc") > Signed-off-by: Chuck Lever <chuck.lever@oracle.com> > --- > fs/nfsd/nfs4proc.c | 7 ++++--- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c > index e466cf52d7d7..f9aeefc0da73 100644 > --- a/fs/nfsd/nfs4proc.c > +++ b/fs/nfsd/nfs4proc.c > @@ -988,10 +988,11 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, > static void > nfsd4_read_release(union nfsd4_op_u *u) > { > - if (u->read.rd_nf) > + if (u->read.rd_nf) { > + trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp, > + u->read.rd_offset, u->read.rd_length); > nfsd_file_put(u->read.rd_nf); > - trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp, > - u->read.rd_offset, u->read.rd_length); > + } > } > > static __be32 Not sending a filehandle is pretty rare so I guess we won't miss the lack of a tracepoint in that case. Reviewed-by: Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v1] NFSD: Fix crash in nfsd4_read_release() 2025-09-30 14:05 [PATCH v1] NFSD: Fix crash in nfsd4_read_release() Chuck Lever 2025-09-30 14:45 ` Jeff Layton @ 2025-10-01 0:53 ` NeilBrown 2025-10-01 13:16 ` Chuck Lever 1 sibling, 1 reply; 9+ messages in thread From: NeilBrown @ 2025-10-01 0:53 UTC (permalink / raw) To: Chuck Lever Cc: Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs, Chuck Lever On Wed, 01 Oct 2025, Chuck Lever wrote: > From: Chuck Lever <chuck.lever@oracle.com> > > When tracing is enabled, the trace_nfsd_read_done trace point > crashes during the pynfs read.testNoFh test. > > Fixes: 87c5942e8fae ("nfsd: Add I/O trace points in the NFSv4 read proc") > Signed-off-by: Chuck Lever <chuck.lever@oracle.com> > --- > fs/nfsd/nfs4proc.c | 7 ++++--- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c > index e466cf52d7d7..f9aeefc0da73 100644 > --- a/fs/nfsd/nfs4proc.c > +++ b/fs/nfsd/nfs4proc.c > @@ -988,10 +988,11 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, > static void > nfsd4_read_release(union nfsd4_op_u *u) > { > - if (u->read.rd_nf) > + if (u->read.rd_nf) { > + trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp, > + u->read.rd_offset, u->read.rd_length); > nfsd_file_put(u->read.rd_nf); > - trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp, > - u->read.rd_offset, u->read.rd_length); > + } > } I must say this looks a bit weird. rd_nf isn't used in the trace but if it isn't set, you say the trace crashes... That is because rd_fhp being NULL (because there is no current_fh) is one thing that results in rd_nf being NULL. Seems a bit indirect. Did you consider if (u->read.rd_fhp) trace ..... if (u->read.rd_nf) nfsd_file_put.... ?? Or maybe --- a/fs/nfsd/trace.h +++ b/fs/nfsd/trace.h @@ -444,7 +444,7 @@ DECLARE_EVENT_CLASS(nfsd_io_class, ), TP_fast_assign( __entry->xid = be32_to_cpu(rqstp->rq_xid); - __entry->fh_hash = knfsd_fh_hash(&fhp->fh_handle); + __entry->fh_hash = fhp ? 0 : knfsd_fh_hash(&fhp->fh_handle); __entry->offset = offset; __entry->len = len; ), That would make all IO event trace points robust. I'm almost tempted to suggest knfsd_fh_hash() be changed to receive a svc_fh and test for NULL, but there are 2 places where there isn't a svc_fh handy. NeilBrown ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v1] NFSD: Fix crash in nfsd4_read_release() 2025-10-01 0:53 ` NeilBrown @ 2025-10-01 13:16 ` Chuck Lever 2025-10-03 7:08 ` NeilBrown 0 siblings, 1 reply; 9+ messages in thread From: Chuck Lever @ 2025-10-01 13:16 UTC (permalink / raw) To: NeilBrown Cc: Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs, Chuck Lever On 9/30/25 8:53 PM, NeilBrown wrote: > On Wed, 01 Oct 2025, Chuck Lever wrote: >> From: Chuck Lever <chuck.lever@oracle.com> >> >> When tracing is enabled, the trace_nfsd_read_done trace point >> crashes during the pynfs read.testNoFh test. >> >> Fixes: 87c5942e8fae ("nfsd: Add I/O trace points in the NFSv4 read proc") >> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> >> --- >> fs/nfsd/nfs4proc.c | 7 ++++--- >> 1 file changed, 4 insertions(+), 3 deletions(-) >> >> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c >> index e466cf52d7d7..f9aeefc0da73 100644 >> --- a/fs/nfsd/nfs4proc.c >> +++ b/fs/nfsd/nfs4proc.c >> @@ -988,10 +988,11 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, >> static void >> nfsd4_read_release(union nfsd4_op_u *u) >> { >> - if (u->read.rd_nf) >> + if (u->read.rd_nf) { >> + trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp, >> + u->read.rd_offset, u->read.rd_length); >> nfsd_file_put(u->read.rd_nf); >> - trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp, >> - u->read.rd_offset, u->read.rd_length); >> + } >> } > > I must say this looks a bit weird. rd_nf isn't used in the trace but if > it isn't set, you say the trace crashes... > > That is because rd_fhp being NULL (because there is no current_fh) is > one thing that results in rd_nf being NULL. Seems a bit indirect. When rd_nf is NULL, no file or file handle is present. That's really all there is to it. You can read it as "when rd_nf is NULL, no processing is needed". The other read trace points are already skipped in this case as well, so there is no need to protect them. > Did you consider > if (u->read.rd_fhp) > trace ..... > if (u->read.rd_nf) > nfsd_file_put.... > > ?? I can't think of a legitimate case where fhp is not NULL but nf is NULL. Or vice versa. > Or maybe > --- a/fs/nfsd/trace.h > +++ b/fs/nfsd/trace.h > @@ -444,7 +444,7 @@ DECLARE_EVENT_CLASS(nfsd_io_class, > ), > TP_fast_assign( > __entry->xid = be32_to_cpu(rqstp->rq_xid); > - __entry->fh_hash = knfsd_fh_hash(&fhp->fh_handle); > + __entry->fh_hash = fhp ? 0 : knfsd_fh_hash(&fhp->fh_handle); > __entry->offset = offset; > __entry->len = len; > ), > > > That would make all IO event trace points robust. > I'm almost tempted to suggest knfsd_fh_hash() be changed to receive a > svc_fh and test for NULL, but there are 2 places where there isn't a > svc_fh handy. > > NeilBrown > -- Chuck Lever ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v1] NFSD: Fix crash in nfsd4_read_release() 2025-10-01 13:16 ` Chuck Lever @ 2025-10-03 7:08 ` NeilBrown 2025-10-03 13:57 ` Chuck Lever 0 siblings, 1 reply; 9+ messages in thread From: NeilBrown @ 2025-10-03 7:08 UTC (permalink / raw) To: Chuck Lever Cc: Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs, Chuck Lever On Wed, 01 Oct 2025, Chuck Lever wrote: > On 9/30/25 8:53 PM, NeilBrown wrote: > > On Wed, 01 Oct 2025, Chuck Lever wrote: > >> From: Chuck Lever <chuck.lever@oracle.com> > >> > >> When tracing is enabled, the trace_nfsd_read_done trace point > >> crashes during the pynfs read.testNoFh test. > >> > >> Fixes: 87c5942e8fae ("nfsd: Add I/O trace points in the NFSv4 read proc") > >> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> > >> --- > >> fs/nfsd/nfs4proc.c | 7 ++++--- > >> 1 file changed, 4 insertions(+), 3 deletions(-) > >> > >> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c > >> index e466cf52d7d7..f9aeefc0da73 100644 > >> --- a/fs/nfsd/nfs4proc.c > >> +++ b/fs/nfsd/nfs4proc.c > >> @@ -988,10 +988,11 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, > >> static void > >> nfsd4_read_release(union nfsd4_op_u *u) > >> { > >> - if (u->read.rd_nf) > >> + if (u->read.rd_nf) { > >> + trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp, > >> + u->read.rd_offset, u->read.rd_length); > >> nfsd_file_put(u->read.rd_nf); > >> - trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp, > >> - u->read.rd_offset, u->read.rd_length); > >> + } > >> } > > > > I must say this looks a bit weird. rd_nf isn't used in the trace but if > > it isn't set, you say the trace crashes... > > > > That is because rd_fhp being NULL (because there is no current_fh) is > > one thing that results in rd_nf being NULL. Seems a bit indirect. > > When rd_nf is NULL, no file or file handle is present. That's really all > there is to it. You can read it as "when rd_nf is NULL, no processing is > needed". > > The other read trace points are already skipped in this case as well, so > there is no need to protect them. They are skipped because ALLOWED_WITHOUT_FH is not set so ->op_func isn't called. But ->op_release *is* called. That seems inconsistent. I wonder if we could move the ->op_release call out of nfsd4_encode_operation() and only call it in cases were ->op_func was called. Something like the following? It isn't as elegant as I would have liked, but I think it is better than hiding the ->op_release cxall in nfsd4_encode_replay(). Thanks, NeilBrown diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c index 71b428efcbb5..4ed823d1e6be 100644 --- a/fs/nfsd/nfs4proc.c +++ b/fs/nfsd/nfs4proc.c @@ -2837,6 +2837,8 @@ nfsd4_proc_compound(struct svc_rqst *rqstp) trace_nfsd_compound(rqstp, args->tag, args->taglen, args->opcnt); while (!status && resp->opcnt < args->opcnt) { + bool called_func = false; + op = &args->ops[resp->opcnt++]; /* @@ -2886,6 +2888,7 @@ nfsd4_proc_compound(struct svc_rqst *rqstp) if (op->opdesc->op_get_currentstateid) op->opdesc->op_get_currentstateid(cstate, &op->u); op->status = op->opdesc->op_func(rqstp, cstate, &op->u); + called_func = true; trace_nfsd_compound_op_err(rqstp, op->opnum, op->status); /* Only from SEQUENCE */ @@ -2914,6 +2917,9 @@ nfsd4_proc_compound(struct svc_rqst *rqstp) nfsd4_encode_operation(resp, op); status = op->status; } + if (called_func && op->opdesc->op_release) + op->opdesc->op_release(&op->u); + trace_nfsd_compound_status(args->opcnt, resp->opcnt, status, nfsd4_op_name(op->opnum)); diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c index ea91bad4eee2..e7b4363aadb0 100644 --- a/fs/nfsd/nfs4xdr.c +++ b/fs/nfsd/nfs4xdr.c @@ -5885,10 +5885,10 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op) nfsd4_enc encoder; if (xdr_stream_encode_u32(xdr, op->opnum) != XDR_UNIT) - goto release; + goto out; op_status_offset = xdr->buf->len; if (!xdr_reserve_space(xdr, XDR_UNIT)) - goto release; + goto out; if (op->opnum == OP_ILLEGAL) goto status; @@ -5944,10 +5944,7 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op) resp->cstate.minorversion); write_bytes_to_xdr_buf(xdr->buf, op_status_offset, &op->status, XDR_UNIT); -release: - if (opdesc && opdesc->op_release) - opdesc->op_release(&op->u); - +out: /* * Account for pages consumed while encoding this operation. * The xdr_stream primitives don't manage rq_next_page. ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v1] NFSD: Fix crash in nfsd4_read_release() 2025-10-03 7:08 ` NeilBrown @ 2025-10-03 13:57 ` Chuck Lever 2025-10-04 6:34 ` NeilBrown 0 siblings, 1 reply; 9+ messages in thread From: Chuck Lever @ 2025-10-03 13:57 UTC (permalink / raw) To: NeilBrown, Jeff Layton Cc: Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs, Chuck Lever On 10/3/25 3:08 AM, NeilBrown wrote: > On Wed, 01 Oct 2025, Chuck Lever wrote: >> On 9/30/25 8:53 PM, NeilBrown wrote: >>> On Wed, 01 Oct 2025, Chuck Lever wrote: >>>> From: Chuck Lever <chuck.lever@oracle.com> >>>> >>>> When tracing is enabled, the trace_nfsd_read_done trace point >>>> crashes during the pynfs read.testNoFh test. >>>> >>>> Fixes: 87c5942e8fae ("nfsd: Add I/O trace points in the NFSv4 read proc") >>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> >>>> --- >>>> fs/nfsd/nfs4proc.c | 7 ++++--- >>>> 1 file changed, 4 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c >>>> index e466cf52d7d7..f9aeefc0da73 100644 >>>> --- a/fs/nfsd/nfs4proc.c >>>> +++ b/fs/nfsd/nfs4proc.c >>>> @@ -988,10 +988,11 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, >>>> static void >>>> nfsd4_read_release(union nfsd4_op_u *u) >>>> { >>>> - if (u->read.rd_nf) >>>> + if (u->read.rd_nf) { >>>> + trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp, >>>> + u->read.rd_offset, u->read.rd_length); >>>> nfsd_file_put(u->read.rd_nf); >>>> - trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp, >>>> - u->read.rd_offset, u->read.rd_length); >>>> + } >>>> } >>> >>> I must say this looks a bit weird. rd_nf isn't used in the trace but if >>> it isn't set, you say the trace crashes... >>> >>> That is because rd_fhp being NULL (because there is no current_fh) is >>> one thing that results in rd_nf being NULL. Seems a bit indirect. >> >> When rd_nf is NULL, no file or file handle is present. That's really all >> there is to it. You can read it as "when rd_nf is NULL, no processing is >> needed". >> >> The other read trace points are already skipped in this case as well, so >> there is no need to protect them. > > They are skipped because ALLOWED_WITHOUT_FH is not set so ->op_func > isn't called. But ->op_release *is* called. That seems inconsistent. > I wonder if we could move the ->op_release call out of > nfsd4_encode_operation() and only call it in cases were ->op_func > was called. > > Something like the following? It isn't as elegant as I would have > liked, but I think it is better than hiding the ->op_release cxall in > nfsd4_encode_replay(). > > Thanks, > NeilBrown > > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c > index 71b428efcbb5..4ed823d1e6be 100644 > --- a/fs/nfsd/nfs4proc.c > +++ b/fs/nfsd/nfs4proc.c > @@ -2837,6 +2837,8 @@ nfsd4_proc_compound(struct svc_rqst *rqstp) > > trace_nfsd_compound(rqstp, args->tag, args->taglen, args->opcnt); > while (!status && resp->opcnt < args->opcnt) { > + bool called_func = false; > + Or call it "release_needed" ? This change is not something I'd care to have backported into stable kernels... So maybe keep the modification of nfsd4_read_release for backport, and then subsequently do this change too? I was reminded of a release-related change that Jeff did recently, maybe commit 15a8b55dbb1b ("nfsd: call op_release, even when op_func returns an error")... might be relevant, might not. > op = &args->ops[resp->opcnt++]; > > /* > @@ -2886,6 +2888,7 @@ nfsd4_proc_compound(struct svc_rqst *rqstp) > if (op->opdesc->op_get_currentstateid) > op->opdesc->op_get_currentstateid(cstate, &op->u); > op->status = op->opdesc->op_func(rqstp, cstate, &op->u); > + called_func = true; > trace_nfsd_compound_op_err(rqstp, op->opnum, op->status); > > /* Only from SEQUENCE */ > @@ -2914,6 +2917,9 @@ nfsd4_proc_compound(struct svc_rqst *rqstp) > nfsd4_encode_operation(resp, op); > status = op->status; > } > + if (called_func && op->opdesc->op_release) > + op->opdesc->op_release(&op->u); > + > > trace_nfsd_compound_status(args->opcnt, resp->opcnt, > status, nfsd4_op_name(op->opnum)); > diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c > index ea91bad4eee2..e7b4363aadb0 100644 > --- a/fs/nfsd/nfs4xdr.c > +++ b/fs/nfsd/nfs4xdr.c > @@ -5885,10 +5885,10 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op) > nfsd4_enc encoder; > > if (xdr_stream_encode_u32(xdr, op->opnum) != XDR_UNIT) > - goto release; > + goto out; > op_status_offset = xdr->buf->len; > if (!xdr_reserve_space(xdr, XDR_UNIT)) > - goto release; > + goto out; > > if (op->opnum == OP_ILLEGAL) > goto status; > @@ -5944,10 +5944,7 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op) > resp->cstate.minorversion); > write_bytes_to_xdr_buf(xdr->buf, op_status_offset, > &op->status, XDR_UNIT); > -release: > - if (opdesc && opdesc->op_release) > - opdesc->op_release(&op->u); > - > +out: > /* > * Account for pages consumed while encoding this operation. > * The xdr_stream primitives don't manage rq_next_page. > > -- Chuck Lever ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v1] NFSD: Fix crash in nfsd4_read_release() 2025-10-03 13:57 ` Chuck Lever @ 2025-10-04 6:34 ` NeilBrown 2025-10-04 15:51 ` Chuck Lever 2025-10-06 16:23 ` Jeff Layton 0 siblings, 2 replies; 9+ messages in thread From: NeilBrown @ 2025-10-04 6:34 UTC (permalink / raw) To: Chuck Lever Cc: Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs, Chuck Lever On Fri, 03 Oct 2025, Chuck Lever wrote: > On 10/3/25 3:08 AM, NeilBrown wrote: > > On Wed, 01 Oct 2025, Chuck Lever wrote: > >> On 9/30/25 8:53 PM, NeilBrown wrote: > >>> On Wed, 01 Oct 2025, Chuck Lever wrote: > >>>> From: Chuck Lever <chuck.lever@oracle.com> > >>>> > >>>> When tracing is enabled, the trace_nfsd_read_done trace point > >>>> crashes during the pynfs read.testNoFh test. > >>>> > >>>> Fixes: 87c5942e8fae ("nfsd: Add I/O trace points in the NFSv4 read proc") > >>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> > >>>> --- > >>>> fs/nfsd/nfs4proc.c | 7 ++++--- > >>>> 1 file changed, 4 insertions(+), 3 deletions(-) > >>>> > >>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c > >>>> index e466cf52d7d7..f9aeefc0da73 100644 > >>>> --- a/fs/nfsd/nfs4proc.c > >>>> +++ b/fs/nfsd/nfs4proc.c > >>>> @@ -988,10 +988,11 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, > >>>> static void > >>>> nfsd4_read_release(union nfsd4_op_u *u) > >>>> { > >>>> - if (u->read.rd_nf) > >>>> + if (u->read.rd_nf) { > >>>> + trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp, > >>>> + u->read.rd_offset, u->read.rd_length); > >>>> nfsd_file_put(u->read.rd_nf); > >>>> - trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp, > >>>> - u->read.rd_offset, u->read.rd_length); > >>>> + } > >>>> } > >>> > >>> I must say this looks a bit weird. rd_nf isn't used in the trace but if > >>> it isn't set, you say the trace crashes... > >>> > >>> That is because rd_fhp being NULL (because there is no current_fh) is > >>> one thing that results in rd_nf being NULL. Seems a bit indirect. > >> > >> When rd_nf is NULL, no file or file handle is present. That's really all > >> there is to it. You can read it as "when rd_nf is NULL, no processing is > >> needed". > >> > >> The other read trace points are already skipped in this case as well, so > >> there is no need to protect them. > > > > They are skipped because ALLOWED_WITHOUT_FH is not set so ->op_func > > isn't called. But ->op_release *is* called. That seems inconsistent. > > I wonder if we could move the ->op_release call out of > > nfsd4_encode_operation() and only call it in cases were ->op_func > > was called. > > > > Something like the following? It isn't as elegant as I would have > > liked, but I think it is better than hiding the ->op_release cxall in > > nfsd4_encode_replay(). > > > > Thanks, > > NeilBrown > > > > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c > > index 71b428efcbb5..4ed823d1e6be 100644 > > --- a/fs/nfsd/nfs4proc.c > > +++ b/fs/nfsd/nfs4proc.c > > @@ -2837,6 +2837,8 @@ nfsd4_proc_compound(struct svc_rqst *rqstp) > > > > trace_nfsd_compound(rqstp, args->tag, args->taglen, args->opcnt); > > while (!status && resp->opcnt < args->opcnt) { > > + bool called_func = false; > > + > > Or call it "release_needed" ? That might be a bit better - yes. > > This change is not something I'd care to have backported into stable > kernels... So maybe keep the modification of nfsd4_read_release for > backport, and then subsequently do this change too? That seems like a sensible safe approach. > > I was reminded of a release-related change that Jeff did recently, maybe > commit 15a8b55dbb1b ("nfsd: call op_release, even when op_func returns > an error")... might be relevant, might not. Definitely relevant. I think this is the commit to mention in the Fixes: tag. Prior to this commit nfsd4_read_release wouldn't have been called if there was no filehandle, so the trace point was perfectly safe. I wonder if this fix by Jeff was the best fix for the problem. An alternative would be to require ->op_func to release any resources before returning an error..... except that wouldn't work for OP_NONTRIVIAL_ERROR_ENCODE functions like OP_LOCK as the ->op_release frees the "deny" info which is needed to encode the reply.... But wait ...nfs4_encode_operation - which currently calls ->op_release - is also called from nfsd4_enc_sequence_replay(). The status is set to an error so ->op_release previously would not have been called, but now it is. Could that mean nfsd4_lock_release() can get called there even tough nfsd4_lock wasn't called. Could that be a problem? It isn't clear to me that is *isn't* a problem, should maybe there is a subtlety that saves it. I'm beginning to think that we don't want an op_release at all. Anything allocated in ->op_func should either be freed in ->op_func when an error is detected, or in the corresponding nfsd4_enc_ops function after the allocated value is used for encoding. Looking further... ->op_get_currentstateid is only called immediately before the one time that ->op_func is called. So ->op_func could do that get_currentstateid work. and ->op_set_currentstateid looks like it could be folded in to ->op_func as well. Does anyone have any thoughts about that? Thanks, NeilBrown ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v1] NFSD: Fix crash in nfsd4_read_release() 2025-10-04 6:34 ` NeilBrown @ 2025-10-04 15:51 ` Chuck Lever 2025-10-06 16:23 ` Jeff Layton 1 sibling, 0 replies; 9+ messages in thread From: Chuck Lever @ 2025-10-04 15:51 UTC (permalink / raw) To: NeilBrown Cc: Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs, Chuck Lever On 10/4/25 2:34 AM, NeilBrown wrote: > On Fri, 03 Oct 2025, Chuck Lever wrote: >> I was reminded of a release-related change that Jeff did recently, maybe >> commit 15a8b55dbb1b ("nfsd: call op_release, even when op_func returns >> an error")... might be relevant, might not. > > Definitely relevant. I think this is the commit to mention in the > Fixes: tag. Prior to this commit nfsd4_read_release wouldn't have been > called if there was no filehandle, so the trace point was perfectly > safe. I'll replace the Fixes: tag with that one. > I wonder if this fix by Jeff was the best fix for the problem. > An alternative would be to require ->op_func to release any resources > before returning an error..... except that wouldn't work for > OP_NONTRIVIAL_ERROR_ENCODE functions like OP_LOCK as the ->op_release > frees the "deny" info which is needed to encode the reply.... > > But wait ...nfs4_encode_operation - which currently calls ->op_release - > is also called from nfsd4_enc_sequence_replay(). The status is set to > an error so ->op_release previously would not have been called, but now > it is. Could that mean nfsd4_lock_release() can get called there even > tough nfsd4_lock wasn't called. Could that be a problem? It isn't > clear to me that is *isn't* a problem, should maybe there is a subtlety > that saves it. One approach to getting experimental data would be to hack together a pynfs test for this case. (If it proves useful, please contribute it). > I'm beginning to think that we don't want an op_release at all. > Anything allocated in ->op_func should either be freed in ->op_func when > an error is detected, or in the corresponding nfsd4_enc_ops function > after the allocated value is used for encoding. > > Looking further... > ->op_get_currentstateid is only called immediately before the one time > that ->op_func is called. So ->op_func could do that > get_currentstateid work. > and ->op_set_currentstateid looks like it could be folded in to > ->op_func as well. > > Does anyone have any thoughts about that? Careful review needed, I think. We are getting ready for a week of bake-a-thon next week... that might delay any deep thinking on my part for a few days. -- Chuck Lever ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v1] NFSD: Fix crash in nfsd4_read_release() 2025-10-04 6:34 ` NeilBrown 2025-10-04 15:51 ` Chuck Lever @ 2025-10-06 16:23 ` Jeff Layton 1 sibling, 0 replies; 9+ messages in thread From: Jeff Layton @ 2025-10-06 16:23 UTC (permalink / raw) To: NeilBrown, Chuck Lever Cc: Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs, Chuck Lever On Sat, 2025-10-04 at 16:34 +1000, NeilBrown wrote: > On Fri, 03 Oct 2025, Chuck Lever wrote: > > On 10/3/25 3:08 AM, NeilBrown wrote: > > > On Wed, 01 Oct 2025, Chuck Lever wrote: > > > > On 9/30/25 8:53 PM, NeilBrown wrote: > > > > > On Wed, 01 Oct 2025, Chuck Lever wrote: > > > > > > From: Chuck Lever <chuck.lever@oracle.com> > > > > > > > > > > > > When tracing is enabled, the trace_nfsd_read_done trace point > > > > > > crashes during the pynfs read.testNoFh test. > > > > > > > > > > > > Fixes: 87c5942e8fae ("nfsd: Add I/O trace points in the NFSv4 read proc") > > > > > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com> > > > > > > --- > > > > > > fs/nfsd/nfs4proc.c | 7 ++++--- > > > > > > 1 file changed, 4 insertions(+), 3 deletions(-) > > > > > > > > > > > > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c > > > > > > index e466cf52d7d7..f9aeefc0da73 100644 > > > > > > --- a/fs/nfsd/nfs4proc.c > > > > > > +++ b/fs/nfsd/nfs4proc.c > > > > > > @@ -988,10 +988,11 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, > > > > > > static void > > > > > > nfsd4_read_release(union nfsd4_op_u *u) > > > > > > { > > > > > > - if (u->read.rd_nf) > > > > > > + if (u->read.rd_nf) { > > > > > > + trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp, > > > > > > + u->read.rd_offset, u->read.rd_length); > > > > > > nfsd_file_put(u->read.rd_nf); > > > > > > - trace_nfsd_read_done(u->read.rd_rqstp, u->read.rd_fhp, > > > > > > - u->read.rd_offset, u->read.rd_length); > > > > > > + } > > > > > > } > > > > > > > > > > I must say this looks a bit weird. rd_nf isn't used in the trace but if > > > > > it isn't set, you say the trace crashes... > > > > > > > > > > That is because rd_fhp being NULL (because there is no current_fh) is > > > > > one thing that results in rd_nf being NULL. Seems a bit indirect. > > > > > > > > When rd_nf is NULL, no file or file handle is present. That's really all > > > > there is to it. You can read it as "when rd_nf is NULL, no processing is > > > > needed". > > > > > > > > The other read trace points are already skipped in this case as well, so > > > > there is no need to protect them. > > > > > > They are skipped because ALLOWED_WITHOUT_FH is not set so ->op_func > > > isn't called. But ->op_release *is* called. That seems inconsistent. > > > I wonder if we could move the ->op_release call out of > > > nfsd4_encode_operation() and only call it in cases were ->op_func > > > was called. > > > > > > Something like the following? It isn't as elegant as I would have > > > liked, but I think it is better than hiding the ->op_release cxall in > > > nfsd4_encode_replay(). > > > > > > Thanks, > > > NeilBrown > > > > > > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c > > > index 71b428efcbb5..4ed823d1e6be 100644 > > > --- a/fs/nfsd/nfs4proc.c > > > +++ b/fs/nfsd/nfs4proc.c > > > @@ -2837,6 +2837,8 @@ nfsd4_proc_compound(struct svc_rqst *rqstp) > > > > > > trace_nfsd_compound(rqstp, args->tag, args->taglen, args->opcnt); > > > while (!status && resp->opcnt < args->opcnt) { > > > + bool called_func = false; > > > + > > > > Or call it "release_needed" ? > > That might be a bit better - yes. > > > > > This change is not something I'd care to have backported into stable > > kernels... So maybe keep the modification of nfsd4_read_release for > > backport, and then subsequently do this change too? > > That seems like a sensible safe approach. > > > > > I was reminded of a release-related change that Jeff did recently, maybe > > commit 15a8b55dbb1b ("nfsd: call op_release, even when op_func returns > > an error")... might be relevant, might not. > > Definitely relevant. I think this is the commit to mention in the > Fixes: tag. Prior to this commit nfsd4_read_release wouldn't have been > called if there was no filehandle, so the trace point was perfectly > safe. > > I wonder if this fix by Jeff was the best fix for the problem. > An alternative would be to require ->op_func to release any resources > before returning an error..... except that wouldn't work for > OP_NONTRIVIAL_ERROR_ENCODE functions like OP_LOCK as the ->op_release > frees the "deny" info which is needed to encode the reply.... > > But wait ...nfs4_encode_operation - which currently calls ->op_release - > is also called from nfsd4_enc_sequence_replay(). The status is set to > an error so ->op_release previously would not have been called, but now > it is. Could that mean nfsd4_lock_release() can get called there even > tough nfsd4_lock wasn't called. Could that be a problem? It isn't > clear to me that is *isn't* a problem, should maybe there is a subtlety > that saves it. > > I'm beginning to think that we don't want an op_release at all. > Anything allocated in ->op_func should either be freed in ->op_func when > an error is detected, or in the corresponding nfsd4_enc_ops function > after the allocated value is used for encoding. > > Looking further... > ->op_get_currentstateid is only called immediately before the one time > that ->op_func is called. So ->op_func could do that > get_currentstateid work. > and ->op_set_currentstateid looks like it could be folded in to > ->op_func as well. > > Does anyone have any thoughts about that? I think that sounds reasonable. I don't see anything that would preclude that anyway, at least with a cursory look. All of the references that get released in the op_release calls are acquired during op_func, AFAICT. -- Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2025-10-06 16:23 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-09-30 14:05 [PATCH v1] NFSD: Fix crash in nfsd4_read_release() Chuck Lever 2025-09-30 14:45 ` Jeff Layton 2025-10-01 0:53 ` NeilBrown 2025-10-01 13:16 ` Chuck Lever 2025-10-03 7:08 ` NeilBrown 2025-10-03 13:57 ` Chuck Lever 2025-10-04 6:34 ` NeilBrown 2025-10-04 15:51 ` Chuck Lever 2025-10-06 16:23 ` Jeff Layton
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox