linux-nfs.vger.kernel.org archive mirror
From: Mike Snitzer <snitzer@kernel.org>
To: Jeff Layton <jlayton@kernel.org>
Cc: Chuck Lever <chuck.lever@oracle.com>,
	linux-nfs@vger.kernel.org, hch@lst.de
Subject: Re: [PATCH v2 4/4] NFSD: handle unaligned DIO for NFS reexport
Date: Thu, 31 Jul 2025 18:14:55 -0400
Message-ID: <aIvq36wplMN_Rsu7@kernel.org>
In-Reply-To: <4f12862ab8560f788210b0c2d0c7b13a5dffcd70.camel@kernel.org>

On Thu, Jul 31, 2025 at 05:45:31PM -0400, Jeff Layton wrote:
> On Thu, 2025-07-31 at 17:28 -0400, Mike Snitzer wrote:
> > On Thu, Jul 31, 2025 at 04:58:00PM -0400, Jeff Layton wrote:
> > > On Thu, 2025-07-31 at 15:44 -0400, Mike Snitzer wrote:
> > > > NFS doesn't have any DIO alignment constraints but it doesn't support
> > > > STATX_DIOALIGN, so update NFSD such that it doesn't disable the use of
> > > > NFSD_IO_DIRECT if it is reexporting NFS.
> > > > 
> > > > Signed-off-by: Mike Snitzer <snitzer@kernel.org>
> > > > ---
> > > >  fs/nfs/export.c          |  3 ++-
> > > >  fs/nfsd/filecache.c      | 11 +++++++++++
> > > >  include/linux/exportfs.h | 13 +++++++++++++
> > > >  3 files changed, 26 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/fs/nfs/export.c b/fs/nfs/export.c
> > > > index e9c233b6fd209..2cae75ba6b35d 100644
> > > > --- a/fs/nfs/export.c
> > > > +++ b/fs/nfs/export.c
> > > > @@ -155,5 +155,6 @@ const struct export_operations nfs_export_ops = {
> > > >  		 EXPORT_OP_REMOTE_FS		|
> > > >  		 EXPORT_OP_NOATOMIC_ATTR	|
> > > >  		 EXPORT_OP_FLUSH_ON_CLOSE	|
> > > > -		 EXPORT_OP_NOLOCKS,
> > > > +		 EXPORT_OP_NOLOCKS		|
> > > > +		 EXPORT_OP_NO_DIOALIGN_NEEDED,
> > > >  };
> > > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> > > > index 5601e839a72da..ea489dd44fd9a 100644
> > > > --- a/fs/nfsd/filecache.c
> > > > +++ b/fs/nfsd/filecache.c
> > > > @@ -1066,6 +1066,17 @@ nfsd_file_getattr(const struct svc_fh *fhp, struct nfsd_file *nf)
> > > >  	     nfsd_io_cache_write != NFSD_IO_DIRECT))
> > > >  		return nfs_ok;
> > > >  
> > > > +	if (exportfs_handles_unaligned_dio(nf->nf_file->f_path.mnt->mnt_sb->s_export_op)) {
> > > > +		/* Underlying filesystem doesn't support STATX_DIOALIGN
> > > > +		 * but it can handle all unaligned DIO, so establish
> > > > +		 * DIO alignment that is accommodating.
> > > > +		 */
> > > > +		nf->nf_dio_mem_align = 4;
> > > > +		nf->nf_dio_offset_align = PAGE_SIZE;
> > > > +		nf->nf_dio_read_offset_align = nf->nf_dio_offset_align;
> > > > +		return nfs_ok;
> > > > +	}
> > > > +
> > > >  	status = fh_getattr(fhp, &stat);
> > > >  	if (status != nfs_ok)
> > > >  		return status;
> > > > diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
> > > > index 9369a607224c1..626b8486dd985 100644
> > > > --- a/include/linux/exportfs.h
> > > > +++ b/include/linux/exportfs.h
> > > > @@ -247,6 +247,7 @@ struct export_operations {
> > > >  						*/
> > > >  #define EXPORT_OP_FLUSH_ON_CLOSE	(0x20) /* fs flushes file data on close */
> > > >  #define EXPORT_OP_NOLOCKS		(0x40) /* no file locking support */
> > > > +#define EXPORT_OP_NO_DIOALIGN_NEEDED	(0x80) /* fs can handle unaligned DIO */
> > > >  	unsigned long	flags;
> > > >  };
> > > >  
> > > > @@ -262,6 +263,18 @@ exportfs_cannot_lock(const struct export_operations *export_ops)
> > > >  	return export_ops->flags & EXPORT_OP_NOLOCKS;
> > > >  }
> > > >  
> > > > +/**
> > > > + * exportfs_handles_unaligned_dio() - check if export can handle unaligned DIO
> > > > + * @export_ops:	the nfs export operations to check
> > > > + *
> > > > + * Returns true if the export can handle unaligned DIO.
> > > > + */
> > > > +static inline bool
> > > > +exportfs_handles_unaligned_dio(const struct export_operations *export_ops)
> > > > +{
> > > > +	return export_ops->flags & EXPORT_OP_NO_DIOALIGN_NEEDED;
> > > > +}
> > > > +
> > > >  extern int exportfs_encode_inode_fh(struct inode *inode, struct fid *fid,
> > > >  				    int *max_len, struct inode *parent,
> > > >  				    int flags);
> > > 
> > > 
> > > Would it not be simpler (better?) to add support for STATX_DIOALIGN to
> > > NFS, and just have it report '1' for both values?
> > 
> > I suppose adding NFS support for STATX_DIOALIGN, that doesn't actually
> > go over the wire, does make sense.
> > 
> 
> The NFS protocol doesn't have any alignment restrictions. The NFS
> client supports DIO, but doesn't enforce any sort of alignment
> restriction on userland.
> 
> > But I wouldn't think setting them to 1 is valid.  Pretty sure they
> > need to be a power-of-2 (since they are used as masks passed to
> > iov_iter_is_aligned).
> > 
> 
> 2^0 == 1   :-)
> 
> This might be a good thing to bring up to the greater fsdevel
> community. What should filesystems that support DIO but don't enforce
> any alignment restrictions report for that attribute?
> 
> '1' would seem to be the natural thing to return in that case. Maybe we
> need to special case that in some of the helpers?
> 
> > In addition, we want to make sure NFS's default DIO alignment (which
> > isn't informed by actual DIO alignment advertised by NFSD's underlying
> > filesystem and hardware, e.g. XFS and NVMe) is able to be compatible
> > with both finer (512b) and coarser (4096b) grained DIO alignment.
> > Only way to achieve that would be to skew toward the coarse-grained
> > end of the spectrum, right?
> > 
> > More conservative but more likely to work with everything.
> > 
> 
> 
> I don't think NFS has ever enforced a particular alignment on userland,
> at least not with regular network transport. Does RDMA change this?

Not _exactly_ sure what you're asking.  But no, as you mentioned, NFS
doesn't have any DIO alignment constraints -- so it certainly isn't
imposing any.  I'm not looking to impose any either.  I'm just trying
to have NFSD and NFS offer a sane response in the reexport case ;)

One that doesn't limit the utility of NFSD doing work to shape the IO
so that it is compatible with the remote NFSD(s) by the time it gets
to an NFSD that _actually_ sits on top of a local filesystem like XFS.
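
The alignment arithmetic debated in the quoted text can be checked with a
minimal userspace sketch (not kernel code). The helper names demo_is_pow2()
and demo_is_aligned() are hypothetical and only model the usual mask-based
test. Under that model, 1 is a power of two whose mask is 0, so every value
passes, and anything aligned to the coarse 4096-byte granularity is also
aligned to 512 bytes, while the reverse does not hold.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if n is a nonzero power of two. */
static bool demo_is_pow2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical helper: mask-based alignment test.
 * Only meaningful when align is a power of two.
 */
static bool demo_is_aligned(uint64_t value, uint64_t align)
{
	return (value & (align - 1)) == 0;
}
```

With these helpers, demo_is_aligned(12345, 1) is true because the mask is 0,
demo_is_aligned(8192, 512) is true, and demo_is_aligned(512, 4096) is false.
That asymmetry is the reason a coarser default is the conservative choice.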
 
> In any case, I'm fine with taking this for now as a stopgap fix, but we
> should aim to plumb proper support for STATX_DIOALIGN in the client
> sometime soon. Applications are going to start using that attribute,
> and if they get back that it's unsupported, some may fail or fall back
> on buffered I/O on NFS.

That is a valid concern.  Maybe we'd do well to make it possible for
both NFSD _and_ NFS to avoid going over the wire if all they are
asked to provide is STATX_DIOALIGN | STATX_DIO_READ_ALIGN (via
request_mask).
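
A minimal sketch of such a request_mask test, written as standalone C rather
than kernel code. The DEMO_ constants are local stand-ins assumed to match
the values in include/uapi/linux/stat.h, and demo_only_dio_align_requested()
is a hypothetical name, not an existing helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the uapi STATX_* bit values; local stand-ins only. */
#define DEMO_STATX_SIZE			0x00000200U
#define DEMO_STATX_DIOALIGN		0x00002000U
#define DEMO_STATX_DIO_READ_ALIGN	0x00020000U

/*
 * Hypothetical helper: true when the caller asked for at least one DIO
 * alignment attribute and for nothing else, i.e. the request could be
 * answered locally without a round trip to the server.
 */
static bool demo_only_dio_align_requested(uint32_t request_mask)
{
	const uint32_t dio_bits = DEMO_STATX_DIOALIGN |
				  DEMO_STATX_DIO_READ_ALIGN;

	return (request_mask & dio_bits) != 0 &&
	       (request_mask & ~dio_bits) == 0;
}
```

A mask that also carries DEMO_STATX_SIZE fails the test, since size still has
to come from the server.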

Currently, fh_getattr() is used (which expects to be querying a local
filesystem), so it is heavier than we need it to be, given that we're
just looking to populate the nfsd_file's DIO alignment attrs in
nfsd_file_getattr().
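
The cost difference can be modelled in userspace as follows. This is a sketch
only: every demo_ name is hypothetical, the flag value 0x80 and the alignment
values 4 and PAGE_SIZE come from the quoted patch, and a 4 KiB page size and a
512-byte local filesystem alignment are assumptions made for illustration.

```c
#include <stdbool.h>

#define DEMO_EXPORT_OP_NO_DIOALIGN_NEEDED	0x80UL	/* value from the patch */
#define DEMO_PAGE_SIZE				4096U	/* assumption: 4 KiB pages */

struct demo_export_ops {
	unsigned long flags;
};

struct demo_file {
	unsigned int dio_mem_align;
	unsigned int dio_offset_align;
	unsigned int dio_read_offset_align;
};

/* Counts how often the expensive attribute query is issued. */
static int demo_getattr_calls;

/* Stand-in for the heavier getattr path against a local filesystem. */
static void demo_slow_getattr(struct demo_file *f)
{
	demo_getattr_calls++;
	/* Assumption: the local filesystem reports 512-byte alignment. */
	f->dio_mem_align = 512;
	f->dio_offset_align = 512;
	f->dio_read_offset_align = 512;
}

/* Fill in DIO alignment, skipping the query when the export allows it. */
static void demo_file_set_dio_align(const struct demo_export_ops *ops,
				    struct demo_file *f)
{
	if (ops->flags & DEMO_EXPORT_OP_NO_DIOALIGN_NEEDED) {
		/* Accommodating defaults, as in the quoted patch. */
		f->dio_mem_align = 4;
		f->dio_offset_align = DEMO_PAGE_SIZE;
		f->dio_read_offset_align = f->dio_offset_align;
		return;
	}
	demo_slow_getattr(f);
}
```

In this model a reexported NFS mount never reaches demo_slow_getattr(), while
an export without the flag pays for exactly one query per call.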

Mike


Thread overview: 19+ messages
2025-07-31 19:44 [PATCH v2 0/4] NFSD DIRECT: add handling for misaligned WRITEs Mike Snitzer
2025-07-31 19:44 ` [PATCH v2 1/4] NFSD: refactor nfsd_read_vector_dio to EVENT_CLASS useful for READ and WRITE Mike Snitzer
2025-07-31 20:28   ` Jeff Layton
2025-07-31 19:44 ` [PATCH v2 2/4] NFSD: prepare nfsd_vfs_write() to use O_DIRECT on misaligned WRITEs Mike Snitzer
2025-07-31 20:28   ` Jeff Layton
2025-07-31 20:49     ` Mike Snitzer
2025-07-31 20:54   ` Jeff Layton
2025-07-31 19:44 ` [PATCH v2 3/4] NFSD: issue WRITEs using O_DIRECT even if IO is misaligned Mike Snitzer
2025-07-31 20:53   ` Jeff Layton
2025-07-31 19:44 ` [PATCH v2 4/4] NFSD: handle unaligned DIO for NFS reexport Mike Snitzer
2025-07-31 20:58   ` Jeff Layton
2025-07-31 21:28     ` Mike Snitzer
2025-07-31 21:45       ` Jeff Layton
2025-07-31 22:14         ` Mike Snitzer [this message]
2025-08-01 23:17         ` Tom Talpey
2025-07-31 21:48       ` Mike Snitzer
2025-08-01 14:07         ` Chuck Lever
2025-08-01 14:33           ` Jeff Layton
2025-08-01 16:06             ` Mike Snitzer
