* Security issue in NFS localio
@ 2024-07-03 22:24 NeilBrown
2024-07-03 23:35 ` Jeff Layton
` (3 more replies)
0 siblings, 4 replies; 12+ messages in thread
From: NeilBrown @ 2024-07-03 22:24 UTC (permalink / raw)
To: Mike Snitzer, Jeff Layton, Chuck Lever
Cc: Anna Schumaker, Trond Myklebust, Christoph Hellwig, linux-nfs
I've been pondering security questions with localio - particularly
wondering what questions I need to ask. I've found three focal points
which overlap but help me organise my thoughts:
1- the LOCALIO RPC protocol
2- the 'auth_domain' that nfsd uses to authorise access
3- the credential that is used to access the file
1/ It occurs to me that I could find out the UUID reported by a given
local server (just ask it over the RPC connection), find out the
filehandle for some file that I don't have write access to (not too
hard), and create a private NFS server (hacking nfs-ganesha?) which
reports the same uuid and reports that I have access to a file with
that filehandle. If I then mount from that server inside a private
container on the same host that is running the local server, I would get
localio access to the target file.
I might not be able to write to it because of credential checking, but I
think that is getting a lot closer to unauthorised access than I would
like.
I would much prefer it if there was no credible way to subvert the
LOCALIO protocol.
My current idea goes like this:
- NFS client tells nfs_common it is going to probe for localio
and gets back a nonce. nfs_common records that this probe is happening
- NFS client sends the nonce to the server over LOCALIO.
- server tells nfs_common "I just got this nonce - does it mean
anything?". If it does, the server gets connected with the client
through nfs_common. The server reports success over LOCALIO.
If it doesn't the server reports failure of LOCALIO.
- NFS client gets the reply and tells nfs_common that it has a reply
so the nonce is invalidated. If the reply was success and nfs_local
confirms there is a connection, then the two stay connected.
I think that having a nonce (single-use uuid) is better than using the
same uuid for the life of the server, and I think that sending it
proactively from the client rather than reactively from the server is
also safer.
2/ The localio access should use exactly the same auth_domain as the
network access uses. This ensures that the credentials implied by
rootsquash and allsquash are used correctly. I think the current
code has the client guessing what IP address the server will see and
finding an auth_domain based on that. I'm not comfortable with that.
In the new LOCALIO protocol I suggest above, the server registers
with nfs_common at the moment it receives an RPC request. At that
moment it knows the characteristics of the connection - remote IP?
krb5? tls? - and can determine an auth_domain and give it to
nfs_common and so make it available to the client.
Jeff wondered about an export option to explicitly enable LOCALIO. I
had wondered about that too. But I think that if we firmly tie the
localio auth_domain to the connection as above, that shouldn't be needed.
3/ The current code uses the 'struct cred' of the application to look up
the file in the server code. When a request goes over the wire the
credential is translated to uid/gid (or krb identity) and this is
mapped back to a credential on the server which might be in a
different uid name space (might it? Does that even work for nfsd?)
I think that if rootsquash or allsquash is in effect the correct
server-side credential is used but otherwise the client-side
credential is used. That is likely correct in many cases but I'd
like to be convinced that it is correct in all cases. Maybe it is
time to get a deeper understanding of uid name spaces.
Have I missed anything? Any other thoughts?
Thanks,
NeilBrown
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: Security issue in NFS localio 2024-07-03 22:24 Security issue in NFS localio NeilBrown @ 2024-07-03 23:35 ` Jeff Layton 2024-07-04 4:31 ` Mike Snitzer ` (2 subsequent siblings) 3 siblings, 0 replies; 12+ messages in thread From: Jeff Layton @ 2024-07-03 23:35 UTC (permalink / raw) To: NeilBrown, Mike Snitzer, Chuck Lever Cc: Anna Schumaker, Trond Myklebust, Christoph Hellwig, linux-nfs On Thu, 2024-07-04 at 08:24 +1000, NeilBrown wrote: > > I've been pondering security questions with localio - particularly > wondering what questions I need to ask. I've found three focal points > which overlap but help me organise my thoughts: > 1- the LOCALIO RPC protocol > 2- the 'auth_domain' that nfsd uses to authorise access > 3- the credential that is used to access the file > > 1/ It occurs to me that I could find out the UUID reported by a given > local server (just ask it over the RPC connection), find out the > filehandle for some file that I don't have write access to (not too > hard), and create a private NFS server (hacking nfs-ganasha?) which > reports the same uuid and reports that I have access to a file with > that filehandle. If I then mount from that server inside a private > container on the same host that is running the local server, I would get > localio access to the target file. > > I might not be able to write to it because of credential checking, but I > think that is getting a lot closer to unauthorised access than I would > like. > > I would much prefer it if there was no credible way to subvert the > LOCALIO protocol. > > My current idea goes like this: > - NFS client tells nfs_common it is going to probe for localio > and gets back a nonce. nfs_common records that this probe is happening > - NFS client sends the nonce to the server over LOCALIO. > - server tells nfs_common "I just got this nonce - does it mean > anything?". 
If it does, the server gets connected with the client > through nfs_common. The server reports success over LOCALIO. > If it doesn't the server reports failure of LOCALIO. > - NFS client gets the reply and tells nfs_common that it has a reply > so the nonce is invalidated. If the reply was success and nfs_local > confirms there is a connection, then the two stay connected. > > I think that having a nonce (single-use uuid) is better than using the > same uuid for the life of the server, and I think that sending it > proactively by client rather than reactively by the server is also > safer. > I like this idea. That does sound much safer. > 2/ The localio access should use exactly the same auth_domain as the > network access uses. This ensure the credentials implied by > rootsquash and allsquash are used correctly. I think the current > code has the client guessing what IP address the server will see and > finding an auth_domain based on that. I'm not comfortable with that. > > In the new LOCALIO protocol I suggest above, the server registers > with nfs_common at the moment it receives an RPC request. At that > moment it knows the characteristics of the connection - remote IP? > krb5? tls? - and can determine an auth_domain and give it to > nfs_common and so make it available to the client. > The current localio code does this: + /* Note: we're connecting to ourself, so source addr == peer addr */ + rqstp->rq_addrlen = rpc_peeraddr(rpc_clnt, + (struct sockaddr *)&rqstp->rq_addr, + sizeof(rqstp->rq_addr)); ...which I guess means that we're setting this to the server's address? That does seem like it might allow a client in another namespace to bypass export permissions. I think your idea about associating an auth_domain should fix that. > Jeff wondered about an export option to explicitly enable LOCALIO. I > had wondered about that too. But I think that if we firmly tie the > localio auth_domain to the connection as above, that shouldn't be needed. 
> > 3/ The current code uses the 'struct cred' of the application to look up > the file in the server code. When a request goes over the wire the > credential is translated to uid/gid (or krb identity) and this is > mapped back to a credential on the server which might be in a > different uid name space (might it? Does that even work for nfsd?) > > I think that if rootsquash or allsquash is in effect the correct > server-side credential is used but otherwise the client-side > credential is used. That is likely correct in many cases but I'd > like to be convinced that it is correct in all case. Maybe it is > time to get a deeper understanding of uid name spaces. > I'll have to consider this #3 over some July 4th libations! -- Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Security issue in NFS localio 2024-07-03 22:24 Security issue in NFS localio NeilBrown 2024-07-03 23:35 ` Jeff Layton @ 2024-07-04 4:31 ` Mike Snitzer 2024-07-05 21:39 ` NeilBrown 2024-07-04 5:45 ` Christoph Hellwig 2024-07-04 19:00 ` Chuck Lever III 3 siblings, 1 reply; 12+ messages in thread From: Mike Snitzer @ 2024-07-04 4:31 UTC (permalink / raw) To: NeilBrown Cc: Jeff Layton, Chuck Lever, Anna Schumaker, Trond Myklebust, Christoph Hellwig, linux-nfs On Thu, Jul 04, 2024 at 08:24:44AM +1000, NeilBrown wrote: > > I've been pondering security questions with localio - particularly > wondering what questions I need to ask. I've found three focal points > which overlap but help me organise my thoughts: > 1- the LOCALIO RPC protocol > 2- the 'auth_domain' that nfsd uses to authorise access > 3- the credential that is used to access the file I'm missing some details that are critical to giving your hypothetical attack vector teeth. But I do think no matter what we need to be precise with documenting localio's security in localio.rst. > 1/ It occurs to me that I could find out the UUID reported by a given > local server (just ask it over the RPC connection), find out the > filehandle for some file that I don't have write access to (not too > hard), and create a private NFS server (hacking nfs-ganasha?) which > reports the same uuid and reports that I have access to a file with > that filehandle. If I then mount from that server inside a private > container on the same host that is running the local server, I would get > localio access to the target file. Even if nfs-ganasha were modified to allow the nfs client's nfs_init_localioclient() to establish access to the 'struct net' associated with the knfsd (rpc_bind_new_program -> GETUUID -> nfsd_uuid_is_local): what does that _really_ give the nfs client? The nfs client still needs to call into nfsd with nfsd_open_local_fh() which requires rpc_clnt->cl_auth to allow access to the server's files. 
In an earlier email Chuck asked why localio's svcauth_unix_set_client() would ever fail, and he correctly answered his question with: "Wouldn't it only be because the local application is trying to open a file it doesn't have permission to?" Yes, if a client never actually negotiated with the NFS server that provides access to protected files, then the client must not be granted access. _This_ is why having a "fake" svc_rqst is important. [Also "fake" isn't a great name considering it is still meant to enforce required established NFS credentials.] > I might not be able to write to it because of credential checking, but I > think that is getting a lot closer to unauthorised access than I would > like. Kind of hand-wavy on the finale... but I'm also relieved ;) That said, I appreciate the desire to avoid the "fake" svc_rqst based access control. So I still agree it is worthwhile to carry your nfsd_file_acquire_local() series through to completion. > I would much prefer it if there was no credible way to subvert the > LOCALIO protocol. > > My current idea goes like this: > - NFS client tells nfs_common it is going to probe for localio > and gets back a nonce. nfs_common records that this probe is happening > - NFS client sends the nonce to the server over LOCALIO. > - server tells nfs_common "I just got this nonce - does it mean > anything?". If it does, the server gets connected with the client > through nfs_common. The server reports success over LOCALIO. > If it doesn't the server reports failure of LOCALIO. > - NFS client gets the reply and tells nfs_common that it has a reply > so the nonce is invalidated. If the reply was success and nfs_local > confirms there is a connection, then the two stay connected. > I think that having a nonce (single-use uuid) is better than using the > same uuid for the life of the server, and I think that sending it > proactively by client rather than reactively by the server is also > safer. 
Inverting and tweaking the localio protocol like this is clever, but it still strikes me as unnecessary (given above). Having it be less widely accessible is a good idea in general though. > 2/ The localio access should use exactly the same auth_domain as the > network access uses. This ensure the credentials implied by > rootsquash and allsquash are used correctly. I think the current > code has the client guessing what IP address the server will see and > finding an auth_domain based on that. I'm not comfortable with that. nfsd_local_fakerqst_create() isn't guessing. rpc_peeraddr() returns the IP address of the server because the rpc_clnt is the same as established for traditional network access. > In the new LOCALIO protocol I suggest above, the server registers > with nfs_common at the moment it receives an RPC request. At that > moment it knows the characteristics of the connection - remote IP? > krb5? tls? - and can determine an auth_domain and give it to > nfs_common and so make it available to the client. > > Jeff wondered about an export option to explicitly enable LOCALIO. I > had wondered about that too. But I think that if we firmly tie the > localio auth_domain to the connection as above, that shouldn't be needed. I do have concerns that your approach to use "exactly the same auth_domain" isn't so much better than what the current localio code does. But I'll concede it sounds better than the hackish "fake" svc_rqst based security of the current localio code. Worth doing just to see how it all shakes out in benchmarks, and on a "better approach" level, but I do currently have "speed kills" concerns. > 3/ The current code uses the 'struct cred' of the application to look up > the file in the server code. When a request goes over the wire the > credential is translated to uid/gid (or krb identity) and this is > mapped back to a credential on the server which might be in a > different uid name space (might it? Does that even work for nfsd?) 
> > I think that if rootsquash or allsquash is in effect the correct > server-side credential is used but otherwise the client-side > credential is used. That is likely correct in many cases but I'd > like to be convinced that it is correct in all case. Maybe it is > time to get a deeper understanding of uid name spaces. You just made me feel slightly better about not knowing enough about user namespaces. > Have I missed anything? Any other thoughts? Here is the branch again that I'd like to use as a base for continued _incremental_ localio development (make code evolution clearer, if you want to rip and replace code, just remove in a separate commit.. I can rebase to cleanup when the dust settles): https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/log/?h=nfs-localio-for-next Thanks for spending time on all of this localio stuff! It's now 4th of July for me, so I'm with Jeff: I need a drink! ;) Mike ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Security issue in NFS localio
  2024-07-04  4:31 ` Mike Snitzer
@ 2024-07-05 21:39   ` NeilBrown
  0 siblings, 0 replies; 12+ messages in thread
From: NeilBrown @ 2024-07-05 21:39 UTC (permalink / raw)
To: Mike Snitzer
Cc: Jeff Layton, Chuck Lever, Anna Schumaker, Trond Myklebust,
    Christoph Hellwig, linux-nfs
On Thu, 04 Jul 2024, Mike Snitzer wrote:
> On Thu, Jul 04, 2024 at 08:24:44AM +1000, NeilBrown wrote:
> >
> > 2/ The localio access should use exactly the same auth_domain as the
> >    network access uses.  This ensure the credentials implied by
> >    rootsquash and allsquash are used correctly.  I think the current
> >    code has the client guessing what IP address the server will see and
> >    finding an auth_domain based on that.  I'm not comfortable with that.
>
> nfsd_local_fakerqst_create() isn't guessing.  rpc_peeraddr() returns the
> IP address of the server because the rpc_clnt is the same as
> established for traditional network access.
I think it is guessing in exactly the same way that your previous
internal code tried to use IP addresses to guess whether the server was
on the same host or not.  Whatever reasons you had for thinking that was
fragile and needed to be replaced - apply those reasons to the mechanism
for choosing an 'auth_domain' (which is what the IP address is used for).
I am confident that we need to choose the auth_domain when the client is
making a LOCALIO RPC request to the server, and to use that auth_domain
for subsequent interactions with that client (and possibly a different
auth_domain for a different client).
>
> It's now 4th of July for me, so I'm with Jeff: I need a drink! ;)
>
Hope you enjoyed your drink!
NeilBrown
^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: Security issue in NFS localio
  2024-07-03 22:24 Security issue in NFS localio NeilBrown
  2024-07-03 23:35 ` Jeff Layton
  2024-07-04  4:31 ` Mike Snitzer
@ 2024-07-04  5:45 ` Christoph Hellwig
  2024-07-04 19:00 ` Chuck Lever III
  3 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2024-07-04 5:45 UTC (permalink / raw)
To: NeilBrown
Cc: Mike Snitzer, Jeff Layton, Chuck Lever, Anna Schumaker,
    Trond Myklebust, Christoph Hellwig, linux-nfs
On Thu, Jul 04, 2024 at 08:24:44AM +1000, NeilBrown wrote:
> 3/ The current code uses the 'struct cred' of the application to look up
>    the file in the server code.  When a request goes over the wire the
>    credential is translated to uid/gid (or krb identity) and this is
>    mapped back to a credential on the server which might be in a
>    different uid name space (might it?  Does that even work for nfsd?)
>
>    I think that if rootsquash or allsquash is in effect the correct
>    server-side credential is used but otherwise the client-side
>    credential is used.  That is likely correct in many cases but I'd
>    like to be convinced that it is correct in all case.  Maybe it is
>    time to get a deeper understanding of uid name spaces.
>
> Have I missed anything?  Any other thoughts?
Still getting up to speed with the code (and only the current one
posted by Mike, I haven't looked at your series yet), and the
fundamental underlying problem seems to be that while the code to open
the file resides in the nfsd code, it is called from client context.
If opening the file was done in normal nfsd context with all the usual
limitations and then handed out it would work exactly like FD passing
and avoid almost all possibility of privilege escalation.  I'm not sure
how to best achieve that, though.
^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: Security issue in NFS localio 2024-07-03 22:24 Security issue in NFS localio NeilBrown ` (2 preceding siblings ...) 2024-07-04 5:45 ` Christoph Hellwig @ 2024-07-04 19:00 ` Chuck Lever III 2024-07-04 23:25 ` Dave Chinner 2024-07-05 13:45 ` Christoph Hellwig 3 siblings, 2 replies; 12+ messages in thread From: Chuck Lever III @ 2024-07-04 19:00 UTC (permalink / raw) To: Neil Brown Cc: Mike Snitzer, Jeff Layton, Anna Schumaker, Trond Myklebust, Christoph Hellwig, Linux NFS Mailing List, Linux FS Devel > On Jul 3, 2024, at 6:24 PM, NeilBrown <neilb@suse.de> wrote: > > > I've been pondering security questions with localio - particularly > wondering what questions I need to ask. I've found three focal points > which overlap but help me organise my thoughts: > 1- the LOCALIO RPC protocol > 2- the 'auth_domain' that nfsd uses to authorise access > 3- the credential that is used to access the file > > 1/ It occurs to me that I could find out the UUID reported by a given > local server (just ask it over the RPC connection), find out the > filehandle for some file that I don't have write access to (not too > hard), and create a private NFS server (hacking nfs-ganasha?) which > reports the same uuid and reports that I have access to a file with > that filehandle. If I then mount from that server inside a private > container on the same host that is running the local server, I would get > localio access to the target file. > > I might not be able to write to it because of credential checking, but I > think that is getting a lot closer to unauthorised access than I would > like. > > I would much prefer it if there was no credible way to subvert the > LOCALIO protocol. > > My current idea goes like this: > - NFS client tells nfs_common it is going to probe for localio > and gets back a nonce. nfs_common records that this probe is happening > - NFS client sends the nonce to the server over LOCALIO. 
> - server tells nfs_common "I just got this nonce - does it mean > anything?". If it does, the server gets connected with the client > through nfs_common. The server reports success over LOCALIO. > If it doesn't the server reports failure of LOCALIO. > - NFS client gets the reply and tells nfs_common that it has a reply > so the nonce is invalidated. If the reply was success and nfs_local > confirms there is a connection, then the two stay connected. > > I think that having a nonce (single-use uuid) is better than using the > same uuid for the life of the server, and I think that sending it > proactively by client rather than reactively by the server is also > safer. This echoes typical security approaches, and I think it would be very easy to convert the current LOCALIO protocol to operate this way. > 2/ The localio access should use exactly the same auth_domain as the > network access uses. This ensure the credentials implied by > rootsquash and allsquash are used correctly. I think the current > code has the client guessing what IP address the server will see and > finding an auth_domain based on that. I'm not comfortable with that. > > In the new LOCALIO protocol I suggest above, the server registers > with nfs_common at the moment it receives an RPC request. At that > moment it knows the characteristics of the connection - remote IP? > krb5? tls? - and can determine an auth_domain and give it to > nfs_common and so make it available to the client. I wasn't sure about the "copy the IP address" logic. It doesn't seem like it would provide adequate security across network namespaces on the same physical host, but I haven't studied it closely. My sense is an administrator would want authorization for localIO to behave just like it would if this were a normal NFS access. So the export's IP address settings, secure/insecure, GSS, TLS, and squashing options should all be effective and work exactly the same without regard to the I/O method. 
auth_domains are an area I still don't know much about, but that seems like it would get us closer to having access authorization behave the same in both cases. It looked like your __fh_verify() clean-up was headed in that direction. > Jeff wondered about an export option to explicitly enable LOCALIO. I > had wondered about that too. But I think that if we firmly tie the > localio auth_domain to the connection as above, that shouldn't be needed. > > 3/ The current code uses the 'struct cred' of the application to look up > the file in the server code. When a request goes over the wire the > credential is translated to uid/gid (or krb identity) and this is > mapped back to a credential on the server which might be in a > different uid name space (might it? Does that even work for nfsd?) > > I think that if rootsquash or allsquash is in effect the correct > server-side credential is used but otherwise the client-side > credential is used. That is likely correct in many cases but I'd > like to be convinced that it is correct in all case. Maybe it is > time to get a deeper understanding of uid name spaces. I've wondered about the idmapping issues, actually. Thanks for bringing that up. I think Christian and linux-fsdevel need to be involved in this conversation; added. -- Chuck Lever ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Security issue in NFS localio 2024-07-04 19:00 ` Chuck Lever III @ 2024-07-04 23:25 ` Dave Chinner 2024-07-05 1:42 ` Mike Snitzer 2024-07-05 21:51 ` NeilBrown 2024-07-05 13:45 ` Christoph Hellwig 1 sibling, 2 replies; 12+ messages in thread From: Dave Chinner @ 2024-07-04 23:25 UTC (permalink / raw) To: Chuck Lever III Cc: Neil Brown, Mike Snitzer, Jeff Layton, Anna Schumaker, Trond Myklebust, Christoph Hellwig, Linux NFS Mailing List, Linux FS Devel On Thu, Jul 04, 2024 at 07:00:23PM +0000, Chuck Lever III wrote: > > > > On Jul 3, 2024, at 6:24 PM, NeilBrown <neilb@suse.de> wrote: > > > > > > I've been pondering security questions with localio - particularly > > wondering what questions I need to ask. I've found three focal points > > which overlap but help me organise my thoughts: > > 1- the LOCALIO RPC protocol > > 2- the 'auth_domain' that nfsd uses to authorise access > > 3- the credential that is used to access the file > > > > 1/ It occurs to me that I could find out the UUID reported by a given > > local server (just ask it over the RPC connection), find out the > > filehandle for some file that I don't have write access to (not too > > hard), and create a private NFS server (hacking nfs-ganasha?) which > > reports the same uuid and reports that I have access to a file with > > that filehandle. If I then mount from that server inside a private > > container on the same host that is running the local server, I would get > > localio access to the target file. This seems amazingly complex for something that is actually really simple. Keep in mind that I am speaking from having direct experience with developing and maintaining NFS client IO bypass infrastructure from when I worked at SGI as an NFS engineer. So, let's look at the Irix NFS client/server and the "Bulk Data Service" protocol extensions that SGI wrote for NFSv3 back in the mid 1990s. 
Here's an overview from the 1996 product documentation "Getting Started with BDSpro": https://irix7.com/techpubs/007-3274-001.pdf At least read chapter 1 so you grok the fundamentals of how the IO bypass worked. It should look familiar, because it isn't very different to how NFS over RDMA or client side IO for pNFS works. Essentially, The NFS client transparently sent all the data IO (read and write) over a separate communications channel for any IO that met the size and alignment constraints. This was effectively a "remote-IO" bypass that streamed data rather than packetised it (NFS_READ/NFS_WRITE is packetised data with RTT latency issues). By getting rid of the round trip latency penalty, data could be sent/recieved at full network throughput rates. [ As an aside, the BDS side channel was also the mechanism that used by SGI for NFS over RDMA with custom full stack network offload hardware back in the mid 1990s. NFS w/ BDS ran at about 800MB/s on those networks on machines with 200MHz CPUs (think MIPS r10k). ] The client side userspace has no idea this low level protocol hijacking occurs, and it doesn't need to because all it changes is the read/write IO speed. The NFS protocol is still used for all authorisation, access checks, metadata operations, etc, and all that changes is how NFS_READ and NFS_WRITE operations are performed. The local-io stuff is no different - we're just using a different client side IO path in kernel. We don't need a new protocol, nor do we need userspace to be involved *at all*. The kernel NFS client can easily discover that it is on the same host as the server. The server already does this "client is on the same host", so both will then know they can *transparently* enable the localio bypass without involving userspace at all. The NFS protocol still provides all the auth, creds, etc to allow the NFS client read and write access to the file. 
The NFS server provides the client with a filehandle build by the underlying filesystem for the file the NFS client has been permission to access. The local filesystem will accept that filehandle from any kernel side context via the export ops for that filesystem. This provides a mechanism for the NFS client to convert that to a dentry and so open the file directly from the file handle. This is what the server already does, so it should be able to share the filehandle decode and open code from the server, maybe even just reach into the server export table directly.... IOWs, we don't need to care about whether the mount is visible to the NFS client - the filesystem *export* is visible to the *kernel* and the export ops allow unfettered filehandle decoding. Containers are irrelevant - the server has granted access to the file, and so the NFS client has effective permissions to resolve the filehandle directly.. Fundamentally, this is the same permission and access model that pNFS is built on. Hence I don't understand why this local-io bypass needs something completely new and seemingly very complex... -Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Security issue in NFS localio 2024-07-04 23:25 ` Dave Chinner @ 2024-07-05 1:42 ` Mike Snitzer 2024-07-05 21:51 ` NeilBrown 1 sibling, 0 replies; 12+ messages in thread From: Mike Snitzer @ 2024-07-05 1:42 UTC (permalink / raw) To: Dave Chinner Cc: Chuck Lever III, Neil Brown, Jeff Layton, Anna Schumaker, Trond Myklebust, Christoph Hellwig, Linux NFS Mailing List, Linux FS Devel On Fri, Jul 05, 2024 at 09:25:56AM +1000, Dave Chinner wrote: > On Thu, Jul 04, 2024 at 07:00:23PM +0000, Chuck Lever III wrote: > > > > > > > On Jul 3, 2024, at 6:24 PM, NeilBrown <neilb@suse.de> wrote: > > > > > > > > > I've been pondering security questions with localio - particularly > > > wondering what questions I need to ask. I've found three focal points > > > which overlap but help me organise my thoughts: > > > 1- the LOCALIO RPC protocol > > > 2- the 'auth_domain' that nfsd uses to authorise access > > > 3- the credential that is used to access the file > > > > > > 1/ It occurs to me that I could find out the UUID reported by a given > > > local server (just ask it over the RPC connection), find out the > > > filehandle for some file that I don't have write access to (not too > > > hard), and create a private NFS server (hacking nfs-ganasha?) which > > > reports the same uuid and reports that I have access to a file with > > > that filehandle. If I then mount from that server inside a private > > > container on the same host that is running the local server, I would get > > > localio access to the target file. > > This seems amazingly complex for something that is actually really > simple. Could be completely wrong, but I'm inferring you've read more linux-nfs email (particularly about alternative directions for implementation) than looked at the localio code. But more below. > Keep in mind that I am speaking from having direct > experience with developing and maintaining NFS client IO bypass > infrastructure from when I worked at SGI as an NFS engineer. 
Thanks for sharing all this about IRIX, really helpful. > So, let's look at the Irix NFS client/server and the "Bulk Data > Service" protocol extensions that SGI wrote for NFSv3 back in the > mid 1990s. Here's an overview from the 1996 product documentation > "Getting Started with BDSpro": > > https://irix7.com/techpubs/007-3274-001.pdf > > At least read chapter 1 so you grok the fundamentals of how the IO > bypass worked. It should look familiar, because it isn't very > different to how NFS over RDMA or client side IO for pNFS works. > > Essentially, The NFS client transparently sent all the data IO (read > and write) over a separate communications channel for any IO that > met the size and alignment constraints. This was effectively a > "remote-IO" bypass that streamed data rather than packetised it > (NFS_READ/NFS_WRITE is packetised data with RTT latency issues). > By getting rid of the round trip latency penalty, data could be > sent/recieved at full network throughput rates. > > [ As an aside, the BDS side channel was also the mechanism that used > by SGI for NFS over RDMA with custom full stack network offload > hardware back in the mid 1990s. NFS w/ BDS ran at about 800MB/s on > those networks on machines with 200MHz CPUs (think MIPS r10k). ] > > The client side userspace has no idea this low level protocol > hijacking occurs, and it doesn't need to because all it changes > is the read/write IO speed. The NFS protocol is still used for all > authorisation, access checks, metadata operations, etc, and all that > changes is how NFS_READ and NFS_WRITE operations are performed. > > The local-io stuff is no different - we're just using a different > client side IO path in kernel. We don't need a new protocol, nor do > we need userspace to be involved *at all*. The kernel NFS client > can easily discover that it is on the same host as the server. 
> The server already does this "client is on the same host" check, so both will > then know they can *transparently* enable the localio bypass without > involving userspace at all. > > The NFS protocol still provides all the auth, creds, etc to allow > the NFS client read and write access to the file. The NFS server > provides the client with a filehandle built by the underlying > filesystem for the file the NFS client has been given permission to > access. > > The local filesystem will accept that filehandle from any kernel > side context via the export ops for that filesystem. This provides > a mechanism for the NFS client to convert that to a dentry > and so open the file directly from the file handle. This is what the > server already does, so it should be able to share the filehandle > decode and open code from the server, maybe even just reach into the > server export table directly.... > > IOWs, we don't need to care about whether the mount is visible to > the NFS client - the filesystem *export* is visible to the *kernel* > and the export ops allow unfettered filehandle decoding. Containers > are irrelevant - the server has granted access to the file, and so > the NFS client has effective permissions to resolve the filehandle > directly. IRIX sounds well engineered. The Linux NFS code's interfaces aren't so clean/precise. The data structures are pretty tightly coupled (one big struct nfs_client, nfs_server, nfsd_net, etc. sunrpc's svc_rqst in particular carries auth info that is used as part of the wire protocol business end -- so NFS auth is in the layer localio wants to bypass). And as such, the interfaces tend to do a lot of different tasks on behalf of structures that carry the kitchen sink. So retrofitting the Linux NFS and RPC code to allow a subset of the NFS client and server code to be used isn't so clean.
But what you described IRIX doing is pretty much what my localio series provides, see: https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/log/?h=nfs-localio-for-next (Always room for improvement, like I said to Christoph, especially on the IO submission and handling side.. as you've seen it is doing buffered IO and is synchronous.. really leaving performance lackluster but lots of upside to be had making it async and supporting DIO). > Fundamentally, this is the same permission and access model that > pNFS is built on. Hence I don't understand why this local-io bypass > needs something completely new and seemingly very complex... pNFS doesn't need to have a direct role in any of this localio code, but pNFS can use localio if it is enabled, e.g.: https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/commit/?h=nfs-localio-for-next&id=5e7bf77fbbecdbea0e6ae7174c97a69b11e3098a Localio not being tightly coupled to pNFS enables localio to easily support NFSv3. NFSv3 support is a requirement and there is no reason not to support it. Anyway, I really think Neil's ideas for localio improvement are solid. Especially factoring out the auth_domain to ensure that bog-standard NFS authentication and security mechanisms are used. Though IMO the proposed localio protocol changes aren't _really_ needed, but I also won't fight to stop localio nfsd UUID sharing being more ephemeral and risk-averse... The only reason there is a sideband/auxiliary "localio protocol" is that the NFS protocol is very focused on enabling NFS spec implementation. I actually framed it in terms of NFS encode and decode on the server side and Chuck wanted me to make sure to decouple localio so that it stood on its own (I agree with him, I just didn't think to do it that way).
I just needed a means to generate and get a UUID from the server to anchor the mechanism for nfs_common to allow the client and server to rendezvous, see: nfs_common: https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/commit/?h=nfs-localio-for-next&id=cb542e791eda114adcc9291feb6c66a5ea338f7c nfs server: https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/commit/?h=nfs-localio-for-next&id=877a8212c3af37b5ba32959275f4c49bfe805f24 nfs client: https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/commit/?h=nfs-localio-for-next&id=572b36de2bb1dde06d6da4488686c9fbbc79d7e1 Really quite simple. And the pgio hooks used to branch to localio handling of READ, WRITE and COMMIT are the interface point for then generating a kiocb and issuing the IO accordingly (last 2 commits below): https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/commit/?h=nfs-localio-for-next&id=46569e6d92a074188bb1f0090d36c327729ab418 But even the buffered and direct IO in nfsd are really tightly coupled to the wire protocol interface. So localio hooks pgio and calls down to the underlying filesystem with its own side channel (that uses .read_iter and .write_iter), see fs/nfs/localio.c https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/commit/?h=nfs-localio-for-next&id=877a8212c3af37b5ba32959275f4c49bfe805f24 https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/commit/?h=nfs-localio-for-next&id=4222309dac70e485f089738d0ffe9113b9a5a1e1 ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Security issue in NFS localio 2024-07-04 23:25 ` Dave Chinner 2024-07-05 1:42 ` Mike Snitzer @ 2024-07-05 21:51 ` NeilBrown 1 sibling, 0 replies; 12+ messages in thread From: NeilBrown @ 2024-07-05 21:51 UTC (permalink / raw) To: Dave Chinner Cc: Chuck Lever III, Mike Snitzer, Jeff Layton, Anna Schumaker, Trond Myklebust, Christoph Hellwig, Linux NFS Mailing List, Linux FS Devel On Fri, 05 Jul 2024, Dave Chinner wrote: > On Thu, Jul 04, 2024 at 07:00:23PM +0000, Chuck Lever III wrote: > > > > > > > On Jul 3, 2024, at 6:24 PM, NeilBrown <neilb@suse.de> wrote: > > > > > > > > > I've been pondering security questions with localio - particularly > > > wondering what questions I need to ask. I've found three focal points > > > which overlap but help me organise my thoughts: > > > 1- the LOCALIO RPC protocol > > > 2- the 'auth_domain' that nfsd uses to authorise access > > > 3- the credential that is used to access the file > > > > > > 1/ It occurs to me that I could find out the UUID reported by a given > > > local server (just ask it over the RPC connection), find out the > > > filehandle for some file that I don't have write access to (not too > > > hard), and create a private NFS server (hacking nfs-ganasha?) which > > > reports the same uuid and reports that I have access to a file with > > > that filehandle. If I then mount from that server inside a private > > > container on the same host that is running the local server, I would get > > > localio access to the target file. > > This seems amazingly complex for something that is actually really > simple. Keep in mind that I am speaking from having direct > experience with developing and maintaining NFS client IO bypass > infrastructure from when I worked at SGI as an NFS engineer. > > So, let's look at the Irix NFS client/server and the "Bulk Data > Service" protocol extensions that SGI wrote for NFSv3 back in the > mid 1990s. 
> Here's an overview from the 1996 product documentation > "Getting Started with BDSpro": > > https://irix7.com/techpubs/007-3274-001.pdf Interesting work. Thanks for the pointer. It appears to me that BDS uses a separate network protocol - possibly over a separate TCP connection or even a separate fabric - to connect client to server, and this protocol is tuned for high throughput data transfer and nothing else. Makes perfect sense. It would seem to still use the IP address (or similar NFS-style mechanism) to authenticate each party to the other and to establish a path for the data to flow over. This is the question facing localio in the text of mine that you quote above. We don't want a network data flow. We want to hand over a file descriptor (or 'struct file'). There is no standard way to achieve this over an IP-connected channel. So we are creating one. The proposed protocol is to send a unique number over the IP-connected channel, and use that to achieve rendezvous between the in-kernel client and the in-kernel server. The interesting questions are around how unique this number should be, which direction it should travel, and whether anything else other than the file descriptor should be passed through when the kernel sides rendezvous. I don't think the documentation on BDS sheds any particular light on this question. Thanks, NeilBrown ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Security issue in NFS localio 2024-07-04 19:00 ` Chuck Lever III 2024-07-04 23:25 ` Dave Chinner @ 2024-07-05 13:45 ` Christoph Hellwig 2024-07-05 13:50 ` Chuck Lever III 1 sibling, 1 reply; 12+ messages in thread From: Christoph Hellwig @ 2024-07-05 13:45 UTC (permalink / raw) To: Chuck Lever III Cc: Neil Brown, Mike Snitzer, Jeff Layton, Anna Schumaker, Trond Myklebust, Christoph Hellwig, Linux NFS Mailing List, Linux FS Devel On Thu, Jul 04, 2024 at 07:00:23PM +0000, Chuck Lever III wrote: > > 3/ The current code uses the 'struct cred' of the application to look up > > the file in the server code. When a request goes over the wire the > > credential is translated to uid/gid (or krb identity) and this is > > mapped back to a credential on the server which might be in a > > different uid name space (might it? Does that even work for nfsd?) > > > > I think that if rootsquash or allsquash is in effect the correct > > server-side credential is used but otherwise the client-side > > credential is used. That is likely correct in many cases but I'd > > like to be convinced that it is correct in all cases. Maybe it is > > time to get a deeper understanding of uid name spaces. > > I've wondered about the idmapping issues, actually. Thanks > for bringing that up. I think Christian and linux-fsdevel > need to be involved in this conversation; added. There are a lot more issues than just idmapping. That's why I don't think the current approach where the open is executed in the client can work. The right way is to ensure the open always happens in an nfsd thread context, which just passes the open file to the client for doing I/O. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Security issue in NFS localio 2024-07-05 13:45 ` Christoph Hellwig @ 2024-07-05 13:50 ` Chuck Lever III 2024-07-05 21:56 ` NeilBrown 0 siblings, 1 reply; 12+ messages in thread From: Chuck Lever III @ 2024-07-05 13:50 UTC (permalink / raw) To: Christoph Hellwig Cc: Neil Brown, Mike Snitzer, Jeff Layton, Anna Schumaker, Trond Myklebust, Linux NFS Mailing List, Linux FS Devel > On Jul 5, 2024, at 9:45 AM, Christoph Hellwig <hch@infradead.org> wrote: > > On Thu, Jul 04, 2024 at 07:00:23PM +0000, Chuck Lever III wrote: >>> 3/ The current code uses the 'struct cred' of the application to look up >>> the file in the server code. When a request goes over the wire the >>> credential is translated to uid/gid (or krb identity) and this is >>> mapped back to a credential on the server which might be in a >>> different uid name space (might it? Does that even work for nfsd?) >>> >>> I think that if rootsquash or allsquash is in effect the correct >>> server-side credential is used but otherwise the client-side >>> credential is used. That is likely correct in many cases but I'd >>> like to be convinced that it is correct in all case. Maybe it is >>> time to get a deeper understanding of uid name spaces. >> >> I've wondered about the idmapping issues, actually. Thanks >> for bringing that up. I think Christian and linux-fsdevel >> need to be involved in this conversation; added. > > There is a lot more issues than just idmapping. That's why I don't > think the current approach where the open is executed in the client > can work. The right way is to ensure the open always happens in and > nfsd thread context which just pases the open file to client for doing > I/O. I have considered that approach, but I don't yet have a clear enough understanding of the idmapping issues to say whether it is going to be necessary. Just in terms of primitives, the server has svc_wake_up() which can enqueue non-transport work on an nfsd thread. 
Question then is what is the form of the response -- how does it get back to the "client" side. -- Chuck Lever ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Security issue in NFS localio 2024-07-05 13:50 ` Chuck Lever III @ 2024-07-05 21:56 ` NeilBrown 0 siblings, 0 replies; 12+ messages in thread From: NeilBrown @ 2024-07-05 21:56 UTC (permalink / raw) To: Chuck Lever III Cc: Christoph Hellwig, Mike Snitzer, Jeff Layton, Anna Schumaker, Trond Myklebust, Linux NFS Mailing List, Linux FS Devel On Fri, 05 Jul 2024, Chuck Lever III wrote: > > > > On Jul 5, 2024, at 9:45 AM, Christoph Hellwig <hch@infradead.org> wrote: > > > > On Thu, Jul 04, 2024 at 07:00:23PM +0000, Chuck Lever III wrote: > >>> 3/ The current code uses the 'struct cred' of the application to look up > >>> the file in the server code. When a request goes over the wire the > >>> credential is translated to uid/gid (or krb identity) and this is > >>> mapped back to a credential on the server which might be in a > >>> different uid name space (might it? Does that even work for nfsd?) > >>> > >>> I think that if rootsquash or allsquash is in effect the correct > >>> server-side credential is used but otherwise the client-side > >>> credential is used. That is likely correct in many cases but I'd > >>> like to be convinced that it is correct in all case. Maybe it is > >>> time to get a deeper understanding of uid name spaces. > >> > >> I've wondered about the idmapping issues, actually. Thanks > >> for bringing that up. I think Christian and linux-fsdevel > >> need to be involved in this conversation; added. > > > > There is a lot more issues than just idmapping. That's why I don't > > think the current approach where the open is executed in the client > > can work. The right way is to ensure the open always happens in and > > nfsd thread context which just pases the open file to client for doing > > I/O. > > I have considered that approach, but I don't yet have a clear > enough understanding of the idmapping issues to say whether > it is going to be necessary. I can't see that nfsd pays any attention to uid namespaces. 
I think it always uses the init_ns. So we would need to ensure the client credential was interpreted against the init namespace. I still haven't made time to understand what this means in practice. > > Just in terms of primitives, the server has svc_wake_up() which > can enqueue non-transport work on an nfsd thread. Question then > is what is the form of the response -- how does it get back > to the "client" side. I think it would be quite easy to establish a new work-queue (lwq?) that clients can attach a request structure to before calling svc_wake_up(), and which the nfsd thread would examine after svc_recv() returns. This request structure would include the filehandle, credential, possibly other context, a place to store the result, and a completion. The server thread would complete the completion and the client, after queuing the request, would wait for the completion. So: easy enough to do if it turns out to be necessary. NeilBrown > > > -- > Chuck Lever > > > ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2024-07-05 21:57 UTC | newest] Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2024-07-03 22:24 Security issue in NFS localio NeilBrown 2024-07-03 23:35 ` Jeff Layton 2024-07-04 4:31 ` Mike Snitzer 2024-07-05 21:39 ` NeilBrown 2024-07-04 5:45 ` Christoph Hellwig 2024-07-04 19:00 ` Chuck Lever III 2024-07-04 23:25 ` Dave Chinner 2024-07-05 1:42 ` Mike Snitzer 2024-07-05 21:51 ` NeilBrown 2024-07-05 13:45 ` Christoph Hellwig 2024-07-05 13:50 ` Chuck Lever III 2024-07-05 21:56 ` NeilBrown
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox