From: Mike Snitzer <snitzer@kernel.org>
To: Chuck Lever <cel@kernel.org>
Cc: NeilBrown <neil@brown.name>,
jlayton@kernel.org, okorniev@redhat.com, dai.ngo@oracle.com,
tom@talpey.com, linux-nfs@vger.kernel.org
Subject: Re: [PATCH v2 4/4] NFSD: Implement NFSD_IO_DIRECT for NFS READ
Date: Thu, 18 Sep 2025 15:01:41 -0400
Message-ID: <aMxXFcJuTKYbixFB@kernel.org>
In-Reply-To: <0ab1138f-9085-444a-9e8a-822c29e404bd@kernel.org>

On Thu, Sep 18, 2025 at 11:42:03AM -0700, Chuck Lever wrote:
> On 9/17/25 4:29 PM, NeilBrown wrote:
> >> +/*
> >> + * The byte range of the client's READ request is expanded on both
> >> + * ends until it meets the underlying file system's direct I/O
> >> + * alignment requirements. After the internal read is complete, the
> >> + * byte range of the NFS READ payload is reduced to the byte range
> >> + * that was originally requested.
> >> + *
> >> + * Note that a direct read can be done only when the xdr_buf
> >> + * containing the NFS READ reply does not already have contents in
> >> + * its .pages array. This is due to potentially restrictive
> >> + * alignment requirements on the read buffer. When .page_len and
> >> + * @base are zero, the .pages array is guaranteed to be page-
> >> + * aligned.
> > This para is confusing.
> > It starts talking about the xdr_buf not having any contents. Then it
> > transitions to a guarantee of page alignment.
> >
> > If the start of the read request isn't sufficiently aligned then a gap
> > will be created in the xdr_buf and that can only be handled at the start
> > (using page_base).
> >
> > So as you say we need page_len to be zero. But nowhere in the code is
> > this condition tested.
>
> Despite what the comment claims, I had thought that things would work if
> the payload started at a page boundary in xdr_buf.pages. But I can see
> that page_offset applies only to the first entry in xdr_buf.pages.
>
> So xdr_buf.page_len does need to be zero. That check can be added in
> nfsd_iter_read().

I looked at doing that, but it wasn't obvious how, given that
fs/nfsd/vfs.c has no direct access to (or need for) the xdr_buf.
But I agree that simply adding the check is ideal at this point.

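To make the two conditions under discussion concrete, here is a minimal
userspace sketch, not the code from the patch. It models the range
expansion described in the quoted comment and the "page_len and base
must be zero" gate. The names xdr_buf_model, dio_range,
can_read_direct() and expand_range() are hypothetical, and the
alignment is assumed to be a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the xdr_buf fields discussed in this thread. */
struct xdr_buf_model {
	unsigned int page_base;	/* offset into the first page of .pages */
	unsigned int page_len;	/* bytes of payload already in .pages */
};

/* Result of widening a client byte range to the DIO alignment. */
struct dio_range {
	uint64_t start;	/* expanded start offset, rounded down */
	uint64_t end;	/* expanded end offset, rounded up */
	uint64_t skip;	/* leading bytes to drop from the reply payload */
};

/*
 * Assumption for this sketch: a direct read is attempted only when
 * .pages holds no earlier payload and the caller's base is zero.
 */
static bool can_read_direct(const struct xdr_buf_model *buf, unsigned int base)
{
	return buf->page_len == 0 && base == 0;
}

/* Expand [offset, offset + len) on both ends to a power-of-two alignment. */
static struct dio_range expand_range(uint64_t offset, uint64_t len,
				     uint64_t align)
{
	struct dio_range r;

	/* The mask arithmetic below is valid only for powers of two. */
	assert(align != 0 && (align & (align - 1)) == 0);

	r.start = offset & ~(align - 1);
	r.end = (offset + len + align - 1) & ~(align - 1);
	r.skip = offset - r.start;
	return r;
}
```

With a 4096-byte alignment, a 3000-byte read at offset 5000 expands to
[4096, 8192), and the first 904 bytes of the internal read are trimmed
from the payload afterwards.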
> I prefer this approach over more elaborate checking against the
> dio_mem_alignment parameter because for the overwhelmingly common cases
> of both NFSv3 READ and NFSv4 COMPOUND with one READ operation, page_len
> will be zero. The extra complication is hard to unit-test and will
> almost never be used.

Extra checking might still be worthwhile, so that
nfsd_iov_iter_aligned_bvec() gets used (I've found it's needed for the
WRITE support). Maybe an additional
CONFIG_NFSD_DIRECT_VERIFY_DIO_ALIGNMENT option? It might seem like
overkill, but the iov_iter checking has really saved me. It's one of
those things you only need during development, to verify you didn't
somehow miss something; for release we don't need or want it.

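As a rough illustration of the kind of development-only check meant
here, a minimal userspace sketch follows. It is not
nfsd_iov_iter_aligned_bvec() itself: seg_model, segs_aligned() and the
VERIFY_DIO_ALIGNMENT switch are all hypothetical, and both alignments
are assumed to be powers of two.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for one bio_vec-style segment. */
struct seg_model {
	uintptr_t addr;	/* memory address where the segment starts */
	size_t len;	/* segment length in bytes */
};

/*
 * Return true when every segment starts on an addr_align boundary and
 * has a length that is a multiple of len_align. Both alignments must
 * be powers of two.
 */
static bool segs_aligned(const struct seg_model *segs, size_t nr,
			 uintptr_t addr_align, size_t len_align)
{
	for (size_t i = 0; i < nr; i++) {
		if (segs[i].addr & (addr_align - 1))
			return false;
		if (segs[i].len & (len_align - 1))
			return false;
	}
	return true;
}

/*
 * VERIFY_DIO_ALIGNMENT is a made-up build switch for this sketch: the
 * walk over the segments happens only in development builds, and
 * release builds treat every buffer as already verified.
 */
#ifdef VERIFY_DIO_ALIGNMENT
#define verify_dio_alignment(segs, nr, a, l) segs_aligned(segs, nr, a, l)
#else
#define verify_dio_alignment(segs, nr, a, l) true
#endif
```

Keeping the check behind a build switch means the per-segment walk
costs nothing in release builds, while a development build still
catches a misaligned buffer before it is handed to direct I/O.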
Anyway, just a "futures" tangent... nothing actionable for this patch ;)

Mike