From: Mike Snitzer <snitzer@kernel.org>
To: NeilBrown <neilb@suse.de>
Cc: Chuck Lever <chuck.lever@oracle.com>,
linux-nfs@vger.kernel.org, Jeff Layton <jlayton@kernel.org>,
Trond Myklebust <trondmy@hammerspace.com>,
snitzer@hammerspace.com
Subject: Re: [PATCH v6 17/18] nfs: add Documentation/filesystems/nfs/localio.rst
Date: Thu, 20 Jun 2024 20:38:45 -0400
Message-ID: <ZnTLlRaD7-28onLr@kernel.org>
In-Reply-To: <ZnTJjKRiAHLz9GxG@kernel.org>
On Thu, Jun 20, 2024 at 08:30:04PM -0400, Mike Snitzer wrote:
> On Fri, Jun 21, 2024 at 09:42:26AM +1000, NeilBrown wrote:
> > On Fri, 21 Jun 2024, Chuck Lever wrote:
> > > On Thu, Jun 20, 2024 at 06:35:38PM -0400, Mike Snitzer wrote:
> > > > On Fri, Jun 21, 2024 at 08:12:56AM +1000, NeilBrown wrote:
> > > > > On Thu, 20 Jun 2024, Chuck Lever wrote:
> > > > > > On Wed, Jun 19, 2024 at 04:40:31PM -0400, Mike Snitzer wrote:
> > > > > > > This document gives an overview of the LOCALIO auxiliary RPC protocol
> > > > > > > added to the Linux NFS client and server (both v3 and v4) to allow a
> > > > > > > client and server to reliably handshake to determine if they are on the
> > > > > > > same host. The LOCALIO auxiliary protocol's implementation, which uses
> > > > > > > the same connection as NFS traffic, follows the pattern established by
> > > > > > > the NFS ACL protocol extension.
> > > > > > >
> > > > > > > > The robust handshake between local client and server is just the
> > > > > > > > beginning; the ultimate use case this locality makes possible is that
> > > > > > > > the client can issue reads, writes and commits directly to the server
> > > > > > > > without having to go over the network. This is particularly useful for
> > > > > > > > container use cases (e.g. Kubernetes) where it is possible to run an IO
> > > > > > > > job local to the server.
> > > > > > >
> > > > > > > Signed-off-by: Mike Snitzer <snitzer@kernel.org>
> > > > > > > ---
> > > > > > > Documentation/filesystems/nfs/localio.rst | 148 ++++++++++++++++++++++
> > > > > > > include/linux/nfslocalio.h | 2 +
> > > > > > > 2 files changed, 150 insertions(+)
> > > > > > > create mode 100644 Documentation/filesystems/nfs/localio.rst
> > > > > > >
> > > > > > > diff --git a/Documentation/filesystems/nfs/localio.rst b/Documentation/filesystems/nfs/localio.rst
> > > > > > > new file mode 100644
> > > > > > > index 000000000000..a43c3dab2cab
> > > > > > > --- /dev/null
> > > > > > > +++ b/Documentation/filesystems/nfs/localio.rst
> > > > > > > @@ -0,0 +1,148 @@
> > > > > > > +===========
> > > > > > > +NFS localio
> > > > > > > +===========
> > > > > > > +
> > > > > > > +This document gives an overview of the LOCALIO auxiliary RPC protocol
> > > > > > > +added to the Linux NFS client and server (both v3 and v4) to allow a
> > > > > > > +client and server to reliably handshake to determine if they are on the
> > > > > > > +same host. The LOCALIO auxiliary protocol's implementation, which uses
> > > > > > > +the same connection as NFS traffic, follows the pattern established by
> > > > > > > +the NFS ACL protocol extension.
> > > > > > > +
> > > > > > > +The LOCALIO auxiliary protocol is needed to allow robust discovery of
> > > > > > > +clients local to their servers. Prior to this LOCALIO protocol a
> > > > > > > +fragile sockaddr network address based match against all local network
> > > > > > > +interfaces was attempted. But unlike the LOCALIO protocol, the
> > > > > > > +sockaddr-based matching didn't handle use of iptables or containers.
> > > > > > > +
> > > > > > > > +The robust handshake between local client and server is just the
> > > > > > > > +beginning; the ultimate use case this locality makes possible is that
> > > > > > > > +the client can issue reads, writes and commits directly to the server
> > > > > > > > +without having to go over the network. This is particularly useful for
> > > > > > > > +container use cases (e.g. Kubernetes) where it is possible to run an IO
> > > > > > > > +job local to the server.
> > > > > > > +
> > > > > > > +The performance advantage realized from localio's ability to bypass
> > > > > > > +using XDR and RPC for reads, writes and commits can be extreme, e.g.:
> > > > > > > +fio for 20 secs with 24 libaio threads, 64k directio reads, qd of 8,
> > > > > > > +- With localio:
> > > > > > > + read: IOPS=691k, BW=42.2GiB/s (45.3GB/s)(843GiB/20002msec)
> > > > > > > +- Without localio:
> > > > > > > + read: IOPS=15.7k, BW=984MiB/s (1032MB/s)(19.2GiB/20013msec)
> > > > > > > +
> > > > > > > +RPC
> > > > > > > +---
> > > > > > > +
> > > > > > > +The LOCALIO auxiliary RPC protocol consists of a single "GETUUID" RPC
> > > > > > > +method that allows the Linux nfs client to retrieve a Linux nfs server's
> > > > > > > > +uuid. This protocol isn't part of an IETF standard, nor does it need to
> > > > > > > > +be, considering it is a Linux-to-Linux auxiliary RPC protocol that
> > > > > > > > +amounts to an implementation detail.
> > > > > > > +
> > > > > > > +The GETUUID method encodes the server's uuid_t in terms of the fixed
> > > > > > > +UUID_SIZE (16 bytes). The fixed size opaque encode and decode XDR
> > > > > > > +methods are used instead of the less efficient variable sized methods.
> > > > > > > +
> > > > > > > +The RPC program number for the NFS_LOCALIO_PROGRAM is currently defined
> > > > > > > +as 0x20000002 (but a request for a unique RPC program number assignment
> > > > > > > +has been submitted to IANA.org).
> > > > > > > +
> > > > > > > > +The following approximately describes the LOCALIO protocol in a pseudo
> > > > > > > > +rpcgen .x syntax:
> > > > > > > +
> > > > > > > +#define UUID_SIZE 16
> > > > > > > +typedef u8 uuid_t<UUID_SIZE>;
> > > > > > > +
> > > > > > > > +program NFS_LOCALIO_PROGRAM {
> > > > > > > > +    version NULLVERS {
> > > > > > > > +        void NULL(void) = 0;
> > > > > > > > +    } = 1;
> > > > > > > > +    version GETUUIDVERS {
> > > > > > > > +        uuid_t GETUUID(void) = 1;
> > > > > > > > +    } = 1;
> > > > > > > > +} = 0x20000002;
> > > > > > > +
> > > > > > > > +The above is the skeleton for the LOCALIO protocol; it doesn't account
> > > > > > > > +for NFS v3 and v4 RPC boilerplate (which also marshals RPC status) that
> > > > > > > > +is used to implement GETUUID.
> > > > > > > +
> > > > > > > +Here are the respective XDR results for nfsd and nfs:
> > > > > >
> > > > > > Hi Mike!
> > > > > >
> > > > > > A protocol spec describes the on-the-wire data formats, not the
> > > > > > in-memory structure layouts. The below C structures are not
> > > > > > relevant to this specification. This should be all you need here,
> > > > > > if I understand your protocol correctly:
> > > > > >
> > > > > > /* raw RFC 9562 UUID */
> > > > > > #define UUID_SIZE 16
> > > > > > typedef u8 uuid_t<UUID_SIZE>;
> > > > > >
> > > > > > union GETUUID1res switch (uint32 status) {
> > > > >
> > > > > I don't think we need a status in the protocol. GETUUID always returns
> > > > > a UUID. There is no possible error condition.
> > > >
> > > > By having localio use NFS's XDR we're able to piggyback on a status
> > > > being returned by standard NFS RPC handling.
> > > >
> > > > See:
> > > > nfs3svc_encode_getuuidres and nfs4svc_encode_getuuidres.
> > > > nfs3_xdr_dec_getuuidres and nfs4_xdr_dec_getuuidres (and note the
> > > > FIXME comment about abusing nfs_opnum4).
> > >
> > > No, let's not piggyback like that. Please make it a separate
> > > XDR implementation just like NFSACL is. Again, LOCALIO is not
> > > an extension of the NFS protocol. Making that claim confuses
> > > people for whom the term "extension" has a very precise meaning.
> > > If we were extending NFS, then yes, adding the new procedures
> > > to the NFS XDR implementation is appropriate, but that's not
> > > what you are doing: you are adding a new side-band protocol.
> >
> > I'm currently working through the LOCALIO protocol code to make it a
> > single version rather than '3' and '4'. In the process I'm making it
> > completely separate from the NFS protocol implementation and cleaning up
> > some other bits. e.g. it shouldn't register with rpcbind.
> >
> > I'll hopefully post patches in a few hours. I'm writing this now to
> > discourage Mike from starting work on this.
>
> Cool, thanks Neil!

Oh, please base your changes on my latest nfs-localio-for-6.11 branch:
https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/log/?h=nfs-localio-for-6.11
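
For anyone following the fixed-size versus variable-sized opaque point in
the quoted documentation, here is a rough userspace sketch of the two XDR
wire encodings for a 16-byte UUID. It is only an illustration under the
usual XDR rules (RFC 4506): the sketch_* function names and
xdr_padded_len() are made up for this example and are not the kernel's
helpers, and the sample UUID bytes are arbitrary.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UUID_SIZE 16

/* XDR rounds opaque data up to a multiple of 4 bytes (RFC 4506). */
static size_t xdr_padded_len(size_t len)
{
	return (len + 3) & ~(size_t)3;
}

/*
 * Fixed-length opaque: the payload bytes plus zero padding, with no
 * length word. Returns the number of bytes written to buf.
 */
static size_t sketch_encode_opaque_fixed(uint8_t *buf, const uint8_t *data,
					 size_t len)
{
	size_t padded = xdr_padded_len(len);

	memcpy(buf, data, len);
	memset(buf + len, 0, padded - len);
	return padded;
}

/*
 * Variable-length opaque: a 4-byte big-endian length word followed by
 * the same padded payload. Returns the number of bytes written to buf.
 */
static size_t sketch_encode_opaque_var(uint8_t *buf, const uint8_t *data,
				       uint32_t len)
{
	buf[0] = (uint8_t)(len >> 24);
	buf[1] = (uint8_t)(len >> 16);
	buf[2] = (uint8_t)(len >> 8);
	buf[3] = (uint8_t)len;
	return 4 + sketch_encode_opaque_fixed(buf + 4, data, len);
}

/*
 * Decode a fixed-length opaque. Returns the number of bytes consumed,
 * or 0 if fewer than the padded length are available.
 */
static size_t sketch_decode_opaque_fixed(const uint8_t *buf, size_t avail,
					 uint8_t *out, size_t len)
{
	size_t padded = xdr_padded_len(len);

	if (avail < padded)
		return 0;
	memcpy(out, buf, len);
	return padded;
}

/*
 * Encode an arbitrary sample UUID both ways, then round-trip the fixed
 * encoding. Reports both wire sizes; returns 1 if the decoded bytes
 * match the original, 0 otherwise.
 */
static int sketch_uuid_roundtrip(size_t *fixed_len, size_t *var_len)
{
	static const uint8_t uuid[UUID_SIZE] = {
		0x12, 0x34, 0x56, 0x78, 0x9a, 0xbc, 0xde, 0xf0,
		0x0f, 0xed, 0xcb, 0xa9, 0x87, 0x65, 0x43, 0x21,
	};
	uint8_t wire[4 + UUID_SIZE];
	uint8_t out[UUID_SIZE];

	*var_len = sketch_encode_opaque_var(wire, uuid, UUID_SIZE);
	*fixed_len = sketch_encode_opaque_fixed(wire, uuid, UUID_SIZE);

	if (sketch_decode_opaque_fixed(wire, *fixed_len, out, UUID_SIZE) !=
	    UUID_SIZE)
		return 0;
	return memcmp(uuid, out, UUID_SIZE) == 0;
}
```

Calling sketch_uuid_roundtrip() reports 16 bytes for the fixed encoding
and 20 for the variable one: since UUID_SIZE is already a multiple of 4
there is no padding, and the fixed form simply drops the length word (and
the length check on decode).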
Thread overview: 38+ messages
2024-06-19 20:40 [PATCH v6 00/18] nfs/nfsd: add support for localio Mike Snitzer
2024-06-19 20:40 ` [PATCH v6 01/18] nfs: pass nfs_client to nfs_initiate_pgio Mike Snitzer
2024-06-19 20:40 ` [PATCH v6 02/18] nfs: pass descriptor thru nfs_initiate_pgio path Mike Snitzer
2024-06-19 20:40 ` [PATCH v6 03/18] nfs: pass struct file to nfs_init_pgio and nfs_init_commit Mike Snitzer
2024-06-19 20:40 ` [PATCH v6 04/18] sunrpc: add rpcauth_map_to_svc_cred_local Mike Snitzer
2024-06-19 20:40 ` [PATCH v6 05/18] nfs_common: add NFS LOCALIO auxiliary protocol enablement Mike Snitzer
2024-06-21 4:43 ` Jeff Johnson
2024-06-19 20:40 ` [PATCH v6 06/18] nfs/nfsd: add "localio" support Mike Snitzer
2024-06-21 6:08 ` NeilBrown
2024-06-21 23:28 ` Mike Snitzer
2024-06-23 22:27 ` NeilBrown
2024-06-25 4:59 ` Mike Snitzer
2024-06-19 20:40 ` [PATCH v6 07/18] nfsd/localio: manage netns reference in nfsd_open_local_fh Mike Snitzer
2024-06-19 20:40 ` [PATCH v6 08/18] NFS: Enable localio for non-pNFS I/O Mike Snitzer
2024-06-19 20:40 ` [PATCH v6 09/18] pnfs/flexfiles: Enable localio for flexfiles I/O Mike Snitzer
2024-06-19 20:40 ` [PATCH v6 10/18] nfs/localio: use dedicated workqueues for filesystem read and write Mike Snitzer
2024-06-19 20:40 ` [PATCH v6 11/18] nfs: implement v3 and v4 client support for NFS_LOCALIO_PROGRAM Mike Snitzer
2024-06-19 20:40 ` [PATCH v6 12/18] nfsd: implement v3 and v4 server " Mike Snitzer
2024-06-19 20:40 ` [PATCH v6 13/18] nfs/nfsd: consolidate {encode,decode}_opaque_fixed in nfs_xdr.h Mike Snitzer
2024-06-19 20:40 ` [PATCH v6 14/18] nfsd: prepare to use SRCU to dereference nn->nfsd_serv Mike Snitzer
2024-06-19 20:40 ` [PATCH v6 15/18] nfsd: " Mike Snitzer
2024-06-21 6:35 ` NeilBrown
2024-06-21 23:58 ` Mike Snitzer
2024-06-19 20:40 ` [PATCH v6 16/18] nfsd/localio: use SRCU to dereference nn->nfsd_serv in nfsd_open_local_fh Mike Snitzer
2024-06-19 20:40 ` [PATCH v6 17/18] nfs: add Documentation/filesystems/nfs/localio.rst Mike Snitzer
2024-06-20 13:52 ` Chuck Lever
2024-06-20 14:33 ` Mike Snitzer
2024-06-20 14:45 ` Chuck Lever
2024-06-20 22:12 ` NeilBrown
2024-06-20 22:35 ` Mike Snitzer
2024-06-20 23:28 ` Chuck Lever
2024-06-20 23:42 ` NeilBrown
2024-06-21 0:30 ` Mike Snitzer
2024-06-21 0:38 ` Mike Snitzer [this message]
2024-06-21 0:28 ` Mike Snitzer
2024-06-21 2:18 ` Chuck Lever III
2024-06-19 20:40 ` [PATCH v6 18/18] nfs/nfsd: add Kconfig options to allow localio to be enabled Mike Snitzer
2024-06-20 5:04 ` [PATCH v6 00/18] nfs/nfsd: add support for localio Mike Snitzer