Linux NFS development
 help / color / mirror / Atom feed
* [RFC PATCH] NFSD: Remove WARN_ON_ONCE in nfsd_iter_read()
@ 2025-09-11 20:18 Chuck Lever
  2025-09-12 13:25 ` Mike Snitzer
  2025-09-13  5:37 ` NeilBrown
  0 siblings, 2 replies; 5+ messages in thread
From: Chuck Lever @ 2025-09-11 20:18 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

The *count parameter does not appear to be explicitly restricted
to being smaller than rsize, so it might be possible to overrun
the rq_bvec array.

Rather than overrunning the array (damage done!) and then WARNING
once, let's harden the loop so that it terminates before the end of
rq_bvec. This should result in a short read, which is OK (clients
recover by sending additional READ requests for the remaining unread
bytes).

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/vfs.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

There might be a similar issue with rq_next_page in this loop?

Suppose that nfsd4_encode_readv() encounters a second READ operation
in a COMPOUND, and the two READ operations together comprise more
than "rsize" total bytes of payload. Each rq_bvec is under the page
limit, but the total number of pages consumed from rq_pages might
exceed rq_maxpages.

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 714777c221ed..e2f0fe3f82c0 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1120,13 +1120,13 @@ __be32 nfsd_iter_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		bvec_set_page(&rqstp->rq_bvec[v], *(rqstp->rq_next_page++),
 			      len, base);
 		total -= len;
-		++v;
 		base = 0;
+		if (++v >= rqstp->rq_maxpages)
+			break;
 	}
-	WARN_ON_ONCE(v > rqstp->rq_maxpages);
 
-	trace_nfsd_read_vector(rqstp, fhp, offset, *count);
-	iov_iter_bvec(&iter, ITER_DEST, rqstp->rq_bvec, v, *count);
+	trace_nfsd_read_vector(rqstp, fhp, offset, *count - total);
+	iov_iter_bvec(&iter, ITER_DEST, rqstp->rq_bvec, v, *count - total);
 	host_err = vfs_iocb_iter_read(file, &kiocb, &iter);
 	return nfsd_finish_read(rqstp, fhp, file, offset, count, eof, host_err);
 }
-- 
2.50.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH] NFSD: Remove WARN_ON_ONCE in nfsd_iter_read()
  2025-09-11 20:18 [RFC PATCH] NFSD: Remove WARN_ON_ONCE in nfsd_iter_read() Chuck Lever
@ 2025-09-12 13:25 ` Mike Snitzer
  2025-09-13  5:37 ` NeilBrown
  1 sibling, 0 replies; 5+ messages in thread
From: Mike Snitzer @ 2025-09-12 13:25 UTC (permalink / raw)
  To: Chuck Lever
  Cc: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	linux-nfs, Chuck Lever

On Thu, Sep 11, 2025 at 04:18:58PM -0400, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> The *count parameter does not appear to be explicitly restricted
> to being smaller than rsize, so it might be possible to overrun
> the rq_bvec array.
> 
> Rather than overrunning the array (damage done!) and then WARNING
> once, let's harden the loop so that it terminates before the end of
> rq_bvec. This should result in a short read, which is OK (clients
> recover by sending additional READ requests for the remaining unread
> bytes).
> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  fs/nfsd/vfs.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> There might be a similar issue with rq_next_page in this loop?
> 
> Suppose that nfsd4_encode_readv() encounters a second READ operation
> in a COMPOUND, and the two READ operations together comprise more
> than "rsize" total bytes of payload. Each rq_bvec is under the page
> limit, but the total number of pages consumed from rq_pages might
> exceed rq_maxpages.

This concern would appear well-founded; but probably best to deal with
it independently.
 
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index 714777c221ed..e2f0fe3f82c0 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -1120,13 +1120,13 @@ __be32 nfsd_iter_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
>  		bvec_set_page(&rqstp->rq_bvec[v], *(rqstp->rq_next_page++),
>  			      len, base);
>  		total -= len;
> -		++v;
>  		base = 0;
> +		if (++v >= rqstp->rq_maxpages)
> +			break;

Shouldn't this be == instead of >= ?
Not seeing how it could ever become greater without first being equal.

Other than that, this patch is a welcome obvious improvement:

Reviewed-by: Mike Snitzer <snitzer@kernel.org>

>  	}
> -	WARN_ON_ONCE(v > rqstp->rq_maxpages);
>  
> -	trace_nfsd_read_vector(rqstp, fhp, offset, *count);
> -	iov_iter_bvec(&iter, ITER_DEST, rqstp->rq_bvec, v, *count);
> +	trace_nfsd_read_vector(rqstp, fhp, offset, *count - total);
> +	iov_iter_bvec(&iter, ITER_DEST, rqstp->rq_bvec, v, *count - total);
>  	host_err = vfs_iocb_iter_read(file, &kiocb, &iter);
>  	return nfsd_finish_read(rqstp, fhp, file, offset, count, eof, host_err);
>  }
> -- 
> 2.50.0
> 
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH] NFSD: Remove WARN_ON_ONCE in nfsd_iter_read()
  2025-09-11 20:18 [RFC PATCH] NFSD: Remove WARN_ON_ONCE in nfsd_iter_read() Chuck Lever
  2025-09-12 13:25 ` Mike Snitzer
@ 2025-09-13  5:37 ` NeilBrown
  2025-09-13 16:01   ` Chuck Lever
  1 sibling, 1 reply; 5+ messages in thread
From: NeilBrown @ 2025-09-13  5:37 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs,
	Chuck Lever

On Fri, 12 Sep 2025, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> The *count parameter does not appear to be explicitly restricted
> to being smaller than rsize, so it might be possible to overrun
> the rq_bvec array.
> 
> Rather than overrunning the array (damage done!) and then WARNING
> once, let's harden the loop so that it terminates before the end of
> rq_bvec. This should result in a short read, which is OK (clients
> recover by sending additional READ requests for the remaining unread
> bytes).
> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  fs/nfsd/vfs.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> There might be a similar issue with rq_next_page in this loop?
> 
> Suppose that nfsd4_encode_readv() encounters a second READ operation
> in a COMPOUND, and the two READ operations together comprise more
> than "rsize" total bytes of payload. Each rq_bvec is under the page
> limit, but the total number of pages consumed from rq_pages might
> exceed rq_maxpages.
> 
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index 714777c221ed..e2f0fe3f82c0 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -1120,13 +1120,13 @@ __be32 nfsd_iter_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
>  		bvec_set_page(&rqstp->rq_bvec[v], *(rqstp->rq_next_page++),
>  			      len, base);
>  		total -= len;
> -		++v;
>  		base = 0;
> +		if (++v >= rqstp->rq_maxpages)
> +			break;

I would have changed the head of the while loop to be

  while (total > 0 && rqstp->rq_next_page < ....)

and then realised that I don't know what to put there.
I don't think that counting up to rq_maxpages is safe when there could
be two READ ops in an NFSv4 Compound.

And if we are cleaning up this function I woul put the increments of v
and rq_next_page next to each other..
</bikeshed>

The patch is an improvement.  I wonder if it is enough.

Thanks,
NeilBrown



>  	}
> -	WARN_ON_ONCE(v > rqstp->rq_maxpages);
>  
> -	trace_nfsd_read_vector(rqstp, fhp, offset, *count);
> -	iov_iter_bvec(&iter, ITER_DEST, rqstp->rq_bvec, v, *count);
> +	trace_nfsd_read_vector(rqstp, fhp, offset, *count - total);
> +	iov_iter_bvec(&iter, ITER_DEST, rqstp->rq_bvec, v, *count - total);
>  	host_err = vfs_iocb_iter_read(file, &kiocb, &iter);
>  	return nfsd_finish_read(rqstp, fhp, file, offset, count, eof, host_err);
>  }
> -- 
> 2.50.0
> 
> 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH] NFSD: Remove WARN_ON_ONCE in nfsd_iter_read()
  2025-09-13  5:37 ` NeilBrown
@ 2025-09-13 16:01   ` Chuck Lever
  2025-09-14  0:34     ` NeilBrown
  0 siblings, 1 reply; 5+ messages in thread
From: Chuck Lever @ 2025-09-13 16:01 UTC (permalink / raw)
  To: NeilBrown
  Cc: Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs,
	Chuck Lever

On 9/13/25 1:37 AM, NeilBrown wrote:
>> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
>> index 714777c221ed..e2f0fe3f82c0 100644
>> --- a/fs/nfsd/vfs.c
>> +++ b/fs/nfsd/vfs.c
>> @@ -1120,13 +1120,13 @@ __be32 nfsd_iter_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>  		bvec_set_page(&rqstp->rq_bvec[v], *(rqstp->rq_next_page++),
>>  			      len, base);
>>  		total -= len;
>> -		++v;
>>  		base = 0;
>> +		if (++v >= rqstp->rq_maxpages)
>> +			break;
> I would have changed the head of the while loop to be
> 
>   while (total > 0 && rqstp->rq_next_page < ....)
> 
> and then realised that I don't know what to put there.
> I don't think that counting up to rq_maxpages is safe when there could
> be two READ ops in an NFSv4 Compound.
> 
> And if we are cleaning up this function I woul put the increments of v
> and rq_next_page next to each other..
> </bikeshed>
> 
> The patch is an improvement.  I wonder if it is enough.

I'm planning a second patch that adds protection against overrunning
the rq_pages array. Is there anything else you feel is needed?


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH] NFSD: Remove WARN_ON_ONCE in nfsd_iter_read()
  2025-09-13 16:01   ` Chuck Lever
@ 2025-09-14  0:34     ` NeilBrown
  0 siblings, 0 replies; 5+ messages in thread
From: NeilBrown @ 2025-09-14  0:34 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs,
	Chuck Lever

On Sun, 14 Sep 2025, Chuck Lever wrote:
> On 9/13/25 1:37 AM, NeilBrown wrote:
> >> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> >> index 714777c221ed..e2f0fe3f82c0 100644
> >> --- a/fs/nfsd/vfs.c
> >> +++ b/fs/nfsd/vfs.c
> >> @@ -1120,13 +1120,13 @@ __be32 nfsd_iter_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
> >>  		bvec_set_page(&rqstp->rq_bvec[v], *(rqstp->rq_next_page++),
> >>  			      len, base);
> >>  		total -= len;
> >> -		++v;
> >>  		base = 0;
> >> +		if (++v >= rqstp->rq_maxpages)
> >> +			break;
> > I would have changed the head of the while loop to be
> > 
> >   while (total > 0 && rqstp->rq_next_page < ....)
> > 
> > and then realised that I don't know what to put there.
> > I don't think that counting up to rq_maxpages is safe when there could
> > be two READ ops in an NFSv4 Compound.
> > 
> > And if we are cleaning up this function I woul put the increments of v
> > and rq_next_page next to each other..
> > </bikeshed>
> > 
> > The patch is an improvement.  I wonder if it is enough.
> 
> I'm planning a second patch that adds protection against overrunning
> the rq_pages array. Is there anything else you feel is needed?

It might be easier to review the two patches together, but if we decide
that we aren't worrying about multiple users of the page list I would
rather the code looked like:

	while (total > 0 && v < rqstp->rq_maxpages) {
		len = min_t(size_t, total, PAGE_SIZE - base);
		bvec_set_page(&rqstp->rq_bvec[v], *(rqstp->rq_next_page),
			      len, base);
		total -= len;
		v += 1;
		rqstp->rq_nextpage += 1
		base = 0;
	}

So all the tests are together, all the increments are together.
I don't mind if you keep "++" but as I personally don't much like it
I changed it here.
A possible alternative would be

		bvec_set_page(&rqstp->rq_bvec[v++], *(rqstp->rq_next_page++),
			      len, base);

then again all the increments would be together - I don't mind ++
in this context, I just don't much like it stand-alone.

But if you disagree, leave it as it is.  Probably the part I like the
least is the "break" and the end of the while().  Having multiple exits
from a loop is sometimes needed, but should be avoided when not needed.

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2025-09-14  0:34 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-11 20:18 [RFC PATCH] NFSD: Remove WARN_ON_ONCE in nfsd_iter_read() Chuck Lever
2025-09-12 13:25 ` Mike Snitzer
2025-09-13  5:37 ` NeilBrown
2025-09-13 16:01   ` Chuck Lever
2025-09-14  0:34     ` NeilBrown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox