Linux NFS development
From: Mike Snitzer <snitzer@kernel.org>
To: Chuck Lever <cel@kernel.org>
Cc: NeilBrown <neil@brown.name>,
	jlayton@kernel.org, okorniev@redhat.com, dai.ngo@oracle.com,
	tom@talpey.com, linux-nfs@vger.kernel.org
Subject: Re: [PATCH v2 4/4] NFSD: Implement NFSD_IO_DIRECT for NFS READ
Date: Thu, 18 Sep 2025 15:01:41 -0400
Message-ID: <aMxXFcJuTKYbixFB@kernel.org>
In-Reply-To: <0ab1138f-9085-444a-9e8a-822c29e404bd@kernel.org>

On Thu, Sep 18, 2025 at 11:42:03AM -0700, Chuck Lever wrote:
> On 9/17/25 4:29 PM, NeilBrown wrote:
> >> +/*
> >> + * The byte range of the client's READ request is expanded on both
> >> + * ends until it meets the underlying file system's direct I/O
> >> + * alignment requirements. After the internal read is complete, the
> >> + * byte range of the NFS READ payload is reduced to the byte range
> >> + * that was originally requested.
> >> + *
> >> + * Note that a direct read can be done only when the xdr_buf
> >> + * containing the NFS READ reply does not already have contents in
> >> + * its .pages array. This is due to potentially restrictive
> >> + * alignment requirements on the read buffer. When .page_len and
> >> + * @base are zero, the .pages array is guaranteed to be page-
> >> + * aligned.
> > This para is confusing.
> > It starts talking about the xdr_buf not having any contents.  Then it
> > transitions to a guarantee of page alignment.
> > 
> > If the start of the read requests isn't sufficiently aligned then a gap
> > will be created in the xdr_buf and that can only be handled at the start
> > (using page_base).
> > 
> > So as you say we need page_len to be zero.  But nowhere in the code is
> > this condition tested.
> 
> Despite what the comment claims, I had thought that things would work if
> the payload started at a page boundary in xdr_buf.pages. But I can see
> that page_offset applies only to the first entry in xdr_buf.pages.
> 
> So xdr_buf.page_len does need to be zero. That check can be added in
> nfsd_iter_read().

I had a look at trying to do that; it wasn't obvious, given that
fs/nfsd/vfs.c doesn't have any direct access to, or need for, xdr_buf.

But I agree that just adding that check is ideal at this point.
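
For readers following along, the range expansion described in the quoted
comment and the page_len test agreed on above can be illustrated roughly
as follows. This is a userspace sketch, not code from the patch: the
sketch_* struct and function names are hypothetical, the struct only
mirrors the two xdr_buf fields mentioned in this thread, and the
alignment is assumed to be a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two xdr_buf fields discussed here. */
struct sketch_xdr_buf {
	size_t page_base;	/* offset of the payload in the first page */
	size_t page_len;	/* bytes already present in .pages */
};

/* Result of widening a READ to the direct I/O alignment. */
struct sketch_dio_range {
	uint64_t start;		/* aligned-down file offset */
	uint64_t end;		/* aligned-up end offset (exclusive) */
	size_t front_pad;	/* bytes to skip before the requested data */
};

/* A direct read is attempted only when .pages is still empty. */
static bool sketch_can_use_direct_read(const struct sketch_xdr_buf *buf)
{
	return buf->page_len == 0;
}

/* Expand [offset, offset + len) outward on both ends. */
static struct sketch_dio_range
sketch_expand_range(uint64_t offset, size_t len, uint32_t align)
{
	struct sketch_dio_range r;
	uint64_t mask = (uint64_t)align - 1;

	/* Assumption: align is a non-zero power of two. */
	assert(align != 0 && (align & (align - 1)) == 0);

	r.start = offset & ~mask;
	r.end = (offset + len + mask) & ~mask;
	r.front_pad = (size_t)(offset - r.start);
	return r;
}
```

In this sketch, front_pad is the gap at the start of the buffer that
would be skipped via page_base, which is why it can only be absorbed
when nothing else already occupies .pages.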

> I prefer this approach over more elaborate checking against the
> dio_mem_alignment parameter because for the overwhelmingly common cases
> of both NFSv3 READ and NFSv4 COMPOUND with one READ operation, page_len
> will be zero. The extra complication is hard to unit-test and will
> almost never be used.

Extra checking might be interesting so that nfsd_iov_iter_aligned_bvec()
is used (I've found it's needed for the WRITE support).  Maybe an
additional CONFIG_NFSD_DIRECT_VERIFY_DIO_ALIGNMENT option? It might
seem like overkill, but I've found the iov_iter checking to really save
me. It's one of those things that you really only need during
development, to verify you didn't somehow miss something. But for
release we don't need/want it.
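
To make the idea concrete, a development-only alignment check of this
kind might look roughly like the userspace sketch below. Everything
here is an assumption for illustration: the sketch_* names and the
SKETCH_VERIFY_DIO_ALIGNMENT macro are hypothetical, the macro merely
stands in for the config option floated above, and the segment walk is
not the actual nfsd_iov_iter_aligned_bvec() implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Defined here so the sketch exercises the check. A release build would
 * leave it undefined, much like a disabled Kconfig option.
 */
#define SKETCH_VERIFY_DIO_ALIGNMENT 1

/* Hypothetical userspace stand-in for one bio_vec segment. */
struct sketch_bvec {
	size_t offset;	/* offset of the data within its page */
	size_t len;	/* length of the segment in bytes */
};

/* Return true when every segment satisfies both alignment masks. */
static bool sketch_bvecs_aligned(const struct sketch_bvec *bv, size_t nr,
				 size_t addr_mask, size_t len_mask)
{
	for (size_t i = 0; i < nr; i++) {
		if ((bv[i].offset & addr_mask) || (bv[i].len & len_mask))
			return false;
	}
	return true;
}

/* Development-only guard: compiled out when the option is off. */
static void sketch_verify_dio_alignment(const struct sketch_bvec *bv,
					size_t nr, size_t addr_mask,
					size_t len_mask)
{
#ifdef SKETCH_VERIFY_DIO_ALIGNMENT
	assert(sketch_bvecs_aligned(bv, nr, addr_mask, len_mask));
#else
	(void)bv; (void)nr; (void)addr_mask; (void)len_mask;
#endif
}
```

With the macro undefined, the guard costs nothing at run time, which
matches the "development only" intent described above.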

Anyway, just a "futures" tangent... nothing actionable for this patch ;)

Mike

Thread overview: 13+ messages
2025-09-17 14:31 [PATCH v2 0/4] NFSD direct I/O read Chuck Lever
2025-09-17 14:31 ` [PATCH v2 1/4] NFSD: Add array bounds-checking in nfsd_iter_read() Chuck Lever
2025-09-17 17:51   ` Mike Snitzer
2025-09-17 14:31 ` [PATCH v2 2/4] NFSD: filecache: add STATX_DIOALIGN and STATX_DIO_READ_ALIGN support Chuck Lever
2025-09-17 14:31 ` [PATCH v2 3/4] NFSD: pass nfsd_file to nfsd_iter_read() Chuck Lever
2025-09-17 14:32 ` [PATCH v2 4/4] NFSD: Implement NFSD_IO_DIRECT for NFS READ Chuck Lever
2025-09-17 23:29   ` NeilBrown
2025-09-18 14:50     ` Mike Snitzer
2025-09-18 15:20       ` Mike Snitzer
2025-09-18 18:42     ` Chuck Lever
2025-09-18 19:01       ` Mike Snitzer [this message]
2025-09-18 16:29   ` Mike Snitzer
2025-09-18 18:27     ` Chuck Lever
