From: Christoph Hellwig <hch@infradead.org>
To: Chuck Lever <cel@kernel.org>
Cc: NeilBrown <neil@brown.name>, Jeff Layton <jlayton@kernel.org>,
	Olga Kornievskaia <okorniev@redhat.com>,
	Dai Ngo <dai.ngo@oracle.com>, Tom Talpey <tom@talpey.com>,
	linux-nfs@vger.kernel.org, Chuck Lever <chuck.lever@oracle.com>,
	Mike Snitzer <snitzer@kernel.org>
Subject: Re: [PATCH v4 4/4] NFSD: Implement NFSD_IO_DIRECT for NFS READ
Date: Mon, 29 Sep 2025 00:29:01 -0700
Message-ID: <aNo1PdbeHsd_rpgl@infradead.org>
In-Reply-To: <20250926145151.59941-5-cel@kernel.org>

On Fri, Sep 26, 2025 at 10:51:51AM -0400, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> Add an experimental option that forces NFS READ operations to use
> direct I/O instead of reading through the NFS server's page cache.
> 
> There are already other layers of caching:
>  - The page cache on NFS clients
>  - The block device underlying the exported file system

What layer of caching is in the "block device"?

> +nfsd_direct_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
> +		 struct nfsd_file *nf, loff_t offset, unsigned long *count,
> +		 u32 *eof)
> +{
> +	loff_t dio_start, dio_end;
> +	unsigned long v, total;
> +	struct iov_iter iter;
> +	struct kiocb kiocb;
> +	ssize_t host_err;
> +	size_t len;
> +
> +	init_sync_kiocb(&kiocb, nf->nf_file);
> +	kiocb.ki_flags |= IOCB_DIRECT;
> +
> +	/* Read a properly-aligned region of bytes into rq_bvec */
> +	dio_start = round_down(offset, nf->nf_dio_read_offset_align);
> +	dio_end = round_up(offset + *count, nf->nf_dio_read_offset_align);
> +
> +	kiocb.ki_pos = dio_start;
> +
> +	v = 0;
> +	total = dio_end - dio_start;
> +	while (total && v < rqstp->rq_maxpages &&
> +	       rqstp->rq_next_page < rqstp->rq_page_end) {
> +		len = min_t(size_t, total, PAGE_SIZE);
> +		bvec_set_page(&rqstp->rq_bvec[v], *rqstp->rq_next_page,
> +			      len, 0);
> +
> +		total -= len;
> +		++rqstp->rq_next_page;
> +		++v;
> +	}
> +
> +	trace_nfsd_read_direct(rqstp, fhp, offset, *count - total);
> +	iov_iter_bvec(&iter, ITER_DEST, rqstp->rq_bvec, v,
> +		      dio_end - dio_start - total);
> +
> +	host_err = vfs_iocb_iter_read(nf->nf_file, &kiocb, &iter);
> +	if (host_err >= 0) {
> +		unsigned int pad = offset - dio_start;
> +
> +		/* The returned payload starts after the pad */
> +		rqstp->rq_res.page_base = pad;
> +
> +		/* Compute the count of bytes to be returned */
> +		if (host_err > pad + *count) {
> +			host_err = *count;
> +		} else if (host_err > pad) {
> +			host_err -= pad;
> +		} else {
> +			host_err = 0;
> +		}

No need for the braces here.
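
For illustration only, here is the same three-way clamp without the
braces, pulled out into a userspace helper so it can be compiled on its
own.  The names dio_payload_len() and round_down_pow2() are made up for
this sketch and are not in the patch; nread stands in for a
non-negative host_err.

```c
#include <stddef.h>
#include <sys/types.h>

/*
 * Userspace stand-in for the kernel's round_down().
 * Assumption: align is a power of two.
 */
static long long round_down_pow2(long long x, long long align)
{
	return x & ~(align - 1);
}

/*
 * Hypothetical helper, not part of the patch.  Given the byte count
 * returned by the aligned read (nread, assumed >= 0), the leading pad
 * and the count the client asked for, return the payload byte count.
 */
static ssize_t dio_payload_len(ssize_t nread, size_t pad, size_t count)
{
	if ((size_t)nread > pad + count)
		return count;		/* read covered pad + payload */
	else if ((size_t)nread > pad)
		return nread - pad;	/* short read inside the payload */
	else
		return 0;		/* read ended inside the pad */
}
```

With offset 5000, count 1000 and a 4096-byte alignment, the pad is 904,
so an aligned read that returns 4096 bytes yields a 1000-byte payload,
one that returns 1500 bytes yields 596, and one that returns 500
yields 0.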

> +	} else if (unlikely(host_err == -EINVAL)) {
> +		pr_info_ratelimited("nfsd: Unexpected direct I/O alignment failure\n");
> +		host_err = -ESERVERFAULT;
> +	}

You'll probably want to print s_id to identify the file system for which
this happened.  What is -ESERVERFAULT supposed to mean, btw?

> +	case NFSD_IO_DIRECT:
> +		if (nf->nf_dio_read_offset_align &&

I guess this is the "is direct I/O actually supported" check?  I guess
it'll work, but will underreport as very few file systems actually
report the requirement at the moment.  Can you add a comment explaining
the check?

