public inbox for linux-nfs@vger.kernel.org
 help / color / mirror / Atom feed
From: NeilBrown <neilb@suse.com>
To: Murphy Zhou <jencce.kernel@gmail.com>, linux-nfs@vger.kernel.org
Cc: bfields@fieldses.org, jlayton@kernel.org,
	Murphy Zhou <jencce.kernel@gmail.com>
Subject: Re: [PATCH] nfsd/nfsd3_proc_readdir: check before encoding on temporary page
Date: Mon, 01 Apr 2019 12:20:28 +1100	[thread overview]
Message-ID: <875zry34ir.fsf@notabene.neil.brown.name> (raw)
In-Reply-To: <20190329142332.12010-1-jencce.kernel@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 5435 bytes --]

On Fri, Mar 29 2019, Murphy Zhou wrote:

> After this commit
>   f875a79 nfsd: allow nfsv3 readdir request to be larger.
> nfsv3 readdir request size can be larger than PAGE_SIZE. So if the
> request and the directory are large enough, we can run out of pages
> in rq_respages. Then the temporary page we are encoding on is a
> random page, Oops happen.
>
> Listing a directory with 30000 files in it can trigger the panic.
>
> Fixing this by ensuring the temporary page resides in rq_respages.
> And when counting how many bytes left currently from buffer to the
> end of the page which buffer is pointing to, aka len1, do not
> assume that curr_page_addr is at the beginning of the same page.
> Also, update resp->count by going through all pages. Because the
> pages may not be sequential, the old way is not safe.
>
> Fixes: f875a79 "nfsd: allow nfsv3 readdir request to be larger"
> Signed-off-by: Murphy Zhou <jencce.kernel@gmail.com>

Hi,
 thanks for finding these problems and submitting a patch.  Clearly I
 should have tested better :-(

> ---
>  fs/nfsd/nfs3proc.c | 17 +++++++++++++++--
>  fs/nfsd/nfs3xdr.c  |  8 +++++---
>  2 files changed, 20 insertions(+), 5 deletions(-)
>
> diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
> index 8f933e84cec1..9bc32af4e2da 100644
> --- a/fs/nfsd/nfs3proc.c
> +++ b/fs/nfsd/nfs3proc.c
> @@ -442,7 +442,9 @@ nfsd3_proc_readdir(struct svc_rqst *rqstp)
>  	struct nfsd3_readdirargs *argp = rqstp->rq_argp;
>  	struct nfsd3_readdirres  *resp = rqstp->rq_resp;
>  	__be32		nfserr;
> -	int		count;
> +	int		count = 0;
> +	struct page	**p;
> +	caddr_t		page_addr = NULL;
>  
>  	dprintk("nfsd: READDIR(3)  %s %d bytes at %d\n",
>  				SVCFH_fmt(&argp->fh),
> @@ -462,7 +464,18 @@ nfsd3_proc_readdir(struct svc_rqst *rqstp)
>  	nfserr = nfsd_readdir(rqstp, &resp->fh, (loff_t*) &argp->cookie, 
>  					&resp->common, nfs3svc_encode_entry);
>  	memcpy(resp->verf, argp->verf, 8);
> -	resp->count = resp->buffer - argp->buffer;
> +	count = 0;
> +	for (p = rqstp->rq_respages + 1; p < rqstp->rq_next_page; p++) {
> +		page_addr = page_address(*p);
> +
> +		if (((caddr_t)resp->buffer >= page_addr) &&
> +		    ((caddr_t)resp->buffer < page_addr + PAGE_SIZE)) {
> +			count += (caddr_t)resp->buffer - page_addr;
> +			break;
> +		}
> +		count += PAGE_SIZE;
> +	}
> +	resp->count = count >> 2;
>  	if (resp->offset) {
>  		loff_t offset = argp->cookie;

This section makes perfect sense.  You have copied code from
nfsd3_proc_readdirplus() int nfsd3_proc_readdir().  This is needed
because readdir doesn't limit replies to PAGE_SIZE any more.

>  
> diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
> index 93fea246f676..1fabf1952bdb 100644
> --- a/fs/nfsd/nfs3xdr.c
> +++ b/fs/nfsd/nfs3xdr.c
> @@ -953,6 +953,7 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen,
>  			break;
>  	}
>  
> +

Adding blank lines is best avoided.

>  	if ((caddr_t)(cd->buffer + elen) < (curr_page_addr + PAGE_SIZE)) {
>  		/* encode entry in current page */
>  
> @@ -961,11 +962,12 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen,
>  		if (plus)
>  			p = encode_entryplus_baggage(cd, p, name, namlen, ino);
>  		num_entry_words = p - cd->buffer;
> -	} else if (*(page+1) != NULL) {
> +	} else if (*(page+1) != NULL && (page+1) < cd->rqstp->rq_next_page) {

Adding the test against rq_next_page looks correct.
nfs3svc_decode_readdirplusargs increments ->rq_next_page over enough
pages to cover the request count.
However nfs4svc_decode_readdirargs doesn't!  So it will not advance
rq_next_page properly.  We need to fix that to make it like
nfs3svc_decode_readdirplusargs().

But I don't think this test should be needed.  svc_alloc_arg() NULL-
terminates the ->rq_pages array which ->rq_next_page points in to.

So I think this test is correct as it stands, but
nfs4svc_decode_readdirargs() needs to be fixed.

>  		/* temporarily encode entry into next page, then move back to
>  		 * current and next page in rq_respages[] */
>  		__be32 *p1, *tmp;
>  		int len1, len2;
> +		caddr_t	tmp_page_addr = NULL;
>  
>  		/* grab next page for temporary storage of entry */
>  		p1 = tmp = page_address(*(page+1));
> @@ -977,7 +979,8 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen,
>  
>  		/* determine entry word length and lengths to go in pages */
>  		num_entry_words = p1 - tmp;
> -		len1 = curr_page_addr + PAGE_SIZE - (caddr_t)cd->buffer;
> +		tmp_page_addr = (caddr_t)((unsigned long)cd->buffer & PAGE_MASK);
> +		len1 = tmp_page_addr + PAGE_SIZE - (caddr_t)cd->buffer;

I think tmp_page_addr will always be the same as curr_page_addr.
At least, it will when nfs4svc_decode_readdirargs() is fixed.

Could you please revert the changed you've made to nfs3xdr.c (keeping
the change to nfsd3_proc_readdir()), and fix nfs4svc_decode_readdirargs
to be more like nfs4svc_decode_readdirplusargs, and see if that fixed
the problem?

Thanks a lot,
NeilBrown


>  		if ((num_entry_words << 2) < len1) {
>  			/* the actual number of words in the entry is less
>  			 * than elen and can still fit in the current page
> @@ -1026,7 +1029,6 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen,
>  	cd->buffer = p;
>  	cd->common.err = nfs_ok;
>  	return 0;
> -
>  }
>  
>  int
> -- 
> 2.21.0

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

  reply	other threads:[~2019-04-01  1:20 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-29 14:23 [PATCH] nfsd/nfsd3_proc_readdir: check before encoding on temporary page Murphy Zhou
2019-04-01  1:20 ` NeilBrown [this message]
2019-04-04  2:55   ` Murphy Zhou

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=875zry34ir.fsf@notabene.neil.brown.name \
    --to=neilb@suse.com \
    --cc=bfields@fieldses.org \
    --cc=jencce.kernel@gmail.com \
    --cc=jlayton@kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox