linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "J. Bruce Fields" <bfields@fieldses.org>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: linux-rdma@vger.kernel.org, linux-nfs@vger.kernel.org
Subject: Re: [PATCH v1 4/4] NFSD: Handle full-length symlinks
Date: Wed, 1 Aug 2018 10:14:06 -0400	[thread overview]
Message-ID: <20180801141406.GD16651@fieldses.org> (raw)
In-Reply-To: <20180727151910.21878.49021.stgit@klimt.1015granger.net>

On Fri, Jul 27, 2018 at 11:19:10AM -0400, Chuck Lever wrote:
> I've given up on the idea of zero-copy handling of SYMLINK on the
> server side. This is because the Linux VFS symlink API requires the
> symlink pathname to be in a NUL-terminated kmalloc'd buffer. The
> NUL-termination is going to be problematic (watching out for
> landing on a page boundary and dealing with a 4096-byte pathname).
> 
> I don't believe that SYMLINK creation is on a performance path or is
> requested frequently enough that it will cause noticeable CPU cache
> pollution due to data copies.

Sounds fine.

The limits here are weird.  In the v2 case the maximum length is
NFS_MAXPATHLEN == 1024.  OK, I see, I guess that limit was imposed by
the NFSv2 protocol.  In the v3 case it's NFS3_MAXPATHLEN == PATH_MAX ==
4096.  But we were imposing an artificial max of 4095 just so we could
fit the result in one page with null termination.

OK, makes sense to me (though I can't recall anyone actually complaining
about that limit).

--b.


> 
> There will be two places where a transport callout will be necessary
> to fill in the rqstp: one will be in the svc_fill_symlink_pathname()
> helper that is used by NFSv2 and NFSv3, and the other will be in
> nfsd4_decode_create().
> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  fs/nfsd/nfs3proc.c         |    2 +
>  fs/nfsd/nfsproc.c          |    2 +
>  include/linux/sunrpc/svc.h |    3 +-
>  net/sunrpc/svc.c           |   67 ++++++++++++++++----------------------------
>  4 files changed, 31 insertions(+), 43 deletions(-)
> 
> diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
> index 8d1c2d1a..9eb8086 100644
> --- a/fs/nfsd/nfs3proc.c
> +++ b/fs/nfsd/nfs3proc.c
> @@ -290,6 +290,7 @@
>  		RETURN_STATUS(nfserr_nametoolong);
>  
>  	argp->tname = svc_fill_symlink_pathname(rqstp, &argp->first,
> +						page_address(rqstp->rq_arg.pages[0]),
>  						argp->tlen);
>  	if (IS_ERR(argp->tname))
>  		RETURN_STATUS(nfserrno(PTR_ERR(argp->tname)));
> @@ -303,6 +304,7 @@
>  	fh_init(&resp->fh, NFS3_FHSIZE);
>  	nfserr = nfsd_symlink(rqstp, &resp->dirfh, argp->fname, argp->flen,
>  						   argp->tname, &resp->fh);
> +	kfree(argp->tname);
>  	RETURN_STATUS(nfserr);
>  }
>  
> diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
> index a6faee5..0d20fd1 100644
> --- a/fs/nfsd/nfsproc.c
> +++ b/fs/nfsd/nfsproc.c
> @@ -454,6 +454,7 @@
>  		return nfserr_nametoolong;
>  
>  	argp->tname = svc_fill_symlink_pathname(rqstp, &argp->first,
> +						page_address(rqstp->rq_arg.pages[0]),
>  						argp->tlen);
>  	if (IS_ERR(argp->tname))
>  		return nfserrno(PTR_ERR(argp->tname));
> @@ -466,6 +467,7 @@
>  	nfserr = nfsd_symlink(rqstp, &argp->ffh, argp->fname, argp->flen,
>  						 argp->tname, &newfh);
>  
> +	kfree(argp->tname);
>  	fh_put(&argp->ffh);
>  	fh_put(&newfh);
>  	return nfserr;
> diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
> index 43f88bd..73e130a 100644
> --- a/include/linux/sunrpc/svc.h
> +++ b/include/linux/sunrpc/svc.h
> @@ -499,7 +499,8 @@ unsigned int	   svc_fill_write_vector(struct svc_rqst *rqstp,
>  					 struct page **pages,
>  					 struct kvec *first, size_t total);
>  char		  *svc_fill_symlink_pathname(struct svc_rqst *rqstp,
> -					     struct kvec *first, size_t total);
> +					     struct kvec *first, void *p,
> +					     size_t total);
>  
>  #define	RPC_MAX_ADDRBUFLEN	(63U)
>  
> diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
> index 2194ed5..d13e05f 100644
> --- a/net/sunrpc/svc.c
> +++ b/net/sunrpc/svc.c
> @@ -1577,65 +1577,48 @@ unsigned int svc_fill_write_vector(struct svc_rqst *rqstp, struct page **pages,
>   * svc_fill_symlink_pathname - Construct pathname argument for VFS symlink call
>   * @rqstp: svc_rqst to operate on
>   * @first: buffer containing first section of pathname
> + * @p: buffer containing remaining section of pathname
>   * @total: total length of the pathname argument
>   *
> - * Returns pointer to a NUL-terminated string, or an ERR_PTR. The buffer is
> - * released automatically when @rqstp is recycled.
> + * The VFS symlink API demands a NUL-terminated pathname in mapped memory.
> + * Returns pointer to a NUL-terminated string, or an ERR_PTR. Caller must free
> + * the returned string.
>   */
>  char *svc_fill_symlink_pathname(struct svc_rqst *rqstp, struct kvec *first,
> -				size_t total)
> +				void *p, size_t total)
>  {
> -	struct xdr_buf *arg = &rqstp->rq_arg;
> -	struct page **pages;
> -	char *result;
> -
> -	/* VFS API demands a NUL-terminated pathname. This function
> -	 * uses a page from @rqstp as the pathname buffer, to enable
> -	 * direct placement. Thus the total buffer size is PAGE_SIZE.
> -	 * Space in this buffer for NUL-termination requires that we
> -	 * cap the size of the returned symlink pathname just a
> -	 * little early.
> -	 */
> -	if (total > PAGE_SIZE - 1)
> -		return ERR_PTR(-ENAMETOOLONG);
> +	size_t len, remaining;
> +	char *result, *dst;
>  
> -	/* Some types of transport can present the pathname entirely
> -	 * in rq_arg.pages. If not, then copy the pathname into one
> -	 * page.
> -	 */
> -	pages = arg->pages;
> -	WARN_ON_ONCE(arg->page_base != 0);
> -	if (first->iov_base == 0) {
> -		result = page_address(*pages);
> -		result[total] = '\0';
> -	} else {
> -		size_t len, remaining;
> -		char *dst;
> +	result = kmalloc(total + 1, GFP_KERNEL);
> +	if (!result)
> +		return ERR_PTR(-ESERVERFAULT);
>  
> -		result = page_address(*(rqstp->rq_next_page++));
> -		dst = result;
> -		remaining = total;
> +	dst = result;
> +	remaining = total;
>  
> -		len = min_t(size_t, total, first->iov_len);
> +	len = min_t(size_t, total, first->iov_len);
> +	if (len) {
>  		memcpy(dst, first->iov_base, len);
>  		dst += len;
>  		remaining -= len;
> +	}
>  
> -		/* No more than one page left */
> -		if (remaining) {
> -			len = min_t(size_t, remaining, PAGE_SIZE);
> -			memcpy(dst, page_address(*pages), len);
> -			dst += len;
> -		}
> -
> -		*dst = '\0';
> +	if (remaining) {
> +		len = min_t(size_t, remaining, PAGE_SIZE);
> +		memcpy(dst, p, len);
> +		dst += len;
>  	}
>  
> -	/* Sanity check: we don't allow the pathname argument to
> +	*dst = '\0';
> +
> +	/* Sanity check: Linux doesn't allow the pathname argument to
>  	 * contain a NUL byte.
>  	 */
> -	if (strlen(result) != total)
> +	if (strlen(result) != total) {
> +		kfree(result);
>  		return ERR_PTR(-EINVAL);
> +	}
>  	return result;
>  }
>  EXPORT_SYMBOL_GPL(svc_fill_symlink_pathname);

  reply	other threads:[~2018-08-01 16:00 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-27 15:18 [PATCH v1 0/4] NFS/RDMA server patches for v4.19 Chuck Lever
2018-07-27 15:18 ` [PATCH v1 1/4] svcrdma: Avoid releasing a page in svc_xprt_release() Chuck Lever
2018-07-27 15:18 ` [PATCH v1 2/4] svcrdma: Clean up Read chunk path Chuck Lever
2018-07-27 15:19 ` [PATCH v1 3/4] NFSD: Refactor the generic write vector fill helper Chuck Lever
2018-11-15 16:32   ` J. Bruce Fields
2018-11-16  0:19     ` Chuck Lever
2018-11-19 19:54       ` Bruce Fields
2018-11-19 19:58         ` Chuck Lever
2018-11-19 19:59           ` Bruce Fields
2018-07-27 15:19 ` [PATCH v1 4/4] NFSD: Handle full-length symlinks Chuck Lever
2018-08-01 14:14   ` J. Bruce Fields [this message]
2018-08-01 14:16     ` Chuck Lever
2018-08-01 14:36 ` [PATCH v1 0/4] NFS/RDMA server patches for v4.19 J. Bruce Fields
2018-08-01 15:14   ` Chuck Lever
2018-08-01 15:20     ` Bruce Fields

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180801141406.GD16651@fieldses.org \
    --to=bfields@fieldses.org \
    --cc=chuck.lever@oracle.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).