public inbox for linux-nfs@vger.kernel.org
* Security issue in NFS localio
@ 2024-07-03 22:24 NeilBrown
  2024-07-03 23:35 ` Jeff Layton
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: NeilBrown @ 2024-07-03 22:24 UTC (permalink / raw)
  To: Mike Snitzer, Jeff Layton, Chuck Lever
  Cc: Anna Schumaker, Trond Myklebust, Christoph Hellwig, linux-nfs


I've been pondering security questions with localio - particularly
wondering what questions I need to ask.  I've found three focal points
which overlap but help me organise my thoughts:
1- the LOCALIO RPC protocol
2- the 'auth_domain' that nfsd uses to authorise access
3- the credential that is used to access the file

1/ It occurs to me that I could find out the UUID reported by a given
local server (just ask it over the RPC connection), find out the
filehandle for some file that I don't have write access to (not too
hard), and create a private NFS server (hacking nfs-ganesha?) which
reports the same uuid and reports that I have access to a file with
that filehandle.  If I then mount from that server inside a private
container on the same host that is running the local server, I would get
localio access to the target file.

I might not be able to write to it because of credential checking, but I
think that is getting a lot closer to unauthorised access than I would
like.

I would much prefer it if there was no credible way to subvert the
LOCALIO protocol.

My current idea goes like this:
 - NFS client tells nfs_common it is going to probe for localio
   and gets back a nonce.  nfs_common records that this probe is happening
 - NFS client sends the nonce to the server over LOCALIO.
 - server tells nfs_common "I just got this nonce - does it mean
   anything?".  If it does, the server gets connected with the client
   through nfs_common.  The server reports success over LOCALIO.
   If it doesn't the server reports failure of LOCALIO.
 - NFS client gets the reply and tells nfs_common that it has a reply
   so the nonce is invalidated.  If the reply was success and nfs_local
   confirms there is a connection, then the two stay connected.
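
To make that concrete, here is a hand-waving sketch of what the
nfs_common side of the handshake might look like.  Every name below is
invented - none of this is existing code:

struct nfs_localio_nonce {
	uuid_t			uuid;	/* random, single-use */
	struct list_head	list;	/* on nfs_common's active-probe list */
	struct nfs_client	*clp;	/* the client doing the probe */
	struct nfsd_net		*nn;	/* set if a local server claims it */
};

/* client: start a probe; generate a random uuid and record it */
struct nfs_localio_nonce *nfs_localio_probe_begin(struct nfs_client *clp);

/* server: "I just got this nonce - does it mean anything?" */
int nfs_localio_claim(const uuid_t *uuid, struct nfsd_net *nn);

/* client: reply received; invalidate the nonce either way, and learn
 * whether a local server really did claim it */
bool nfs_localio_probe_end(struct nfs_localio_nonce *nonce, bool success);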

I think that having a nonce (single-use uuid) is better than using the
same uuid for the life of the server, and I think that sending it
proactively by client rather than reactively by the server is also
safer.

2/ The localio access should use exactly the same auth_domain as the
   network access uses.  This ensures the credentials implied by
   rootsquash and allsquash are used correctly.  I think the current
   code has the client guessing what IP address the server will see and
   finding an auth_domain based on that.  I'm not comfortable with that.
   
   In the new LOCALIO protocol I suggest above, the server registers
   with nfs_common at the moment it receives an RPC request.  At that
   moment it knows the characteristics of the connection - remote IP?
   krb5?  tls?  - and can determine an auth_domain and give it to
   nfs_common and so make it available to the client.
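
   Concretely, the claim step sketched under 1/ could grow an argument
   for exactly that - again, an invented signature:

	int nfs_localio_claim(const uuid_t *uuid, struct nfsd_net *nn,
			      struct auth_domain *dom);

   nfs_common would stash the auth_domain with the nonce and hand it to
   the client side at rendezvous, rather than the client guessing it
   from an IP address.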

   Jeff wondered about an export option to explicitly enable LOCALIO.  I
   had wondered about that too.  But I think that if we firmly tie the
   localio auth_domain to the connection as above, that shouldn't be needed.

3/ The current code uses the 'struct cred' of the application to look up
   the file in the server code.  When a request goes over the wire the
   credential is translated to uid/gid (or krb identity) and this is
   mapped back to a credential on the server which might be in a
   different uid name space (might it?  Does that even work for nfsd?)

   I think that if rootsquash or allsquash is in effect the correct
   server-side credential is used but otherwise the client-side
   credential is used.  That is likely correct in many cases but I'd
   like to be convinced that it is correct in all cases.  Maybe it is
   time to get a deeper understanding of uid name spaces.

Have I missed anything?  Any other thoughts?

Thanks,
NeilBrown
  

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Security issue in NFS localio
  2024-07-03 22:24 Security issue in NFS localio NeilBrown
@ 2024-07-03 23:35 ` Jeff Layton
  2024-07-04  4:31 ` Mike Snitzer
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Jeff Layton @ 2024-07-03 23:35 UTC (permalink / raw)
  To: NeilBrown, Mike Snitzer, Chuck Lever
  Cc: Anna Schumaker, Trond Myklebust, Christoph Hellwig, linux-nfs

On Thu, 2024-07-04 at 08:24 +1000, NeilBrown wrote:
> 
> I've been pondering security questions with localio - particularly
> wondering what questions I need to ask.  I've found three focal points
> which overlap but help me organise my thoughts:
> 1- the LOCALIO RPC protocol
> 2- the 'auth_domain' that nfsd uses to authorise access
> 3- the credential that is used to access the file
> 
> 1/ It occurs to me that I could find out the UUID reported by a given
> local server (just ask it over the RPC connection), find out the
> filehandle for some file that I don't have write access to (not too
> hard), and create a private NFS server (hacking nfs-ganesha?) which
> reports the same uuid and reports that I have access to a file with
> that filehandle.  If I then mount from that server inside a private
> container on the same host that is running the local server, I would get
> localio access to the target file.
> 
> I might not be able to write to it because of credential checking, but I
> think that is getting a lot closer to unauthorised access than I would
> like.
> 
> I would much prefer it if there was no credible way to subvert the
> LOCALIO protocol.
> 
> My current idea goes like this:
>  - NFS client tells nfs_common it is going to probe for localio
>    and gets back a nonce.  nfs_common records that this probe is happening
>  - NFS client sends the nonce to the server over LOCALIO.
>  - server tells nfs_common "I just got this nonce - does it mean
>    anything?".  If it does, the server gets connected with the client
>    through nfs_common.  The server reports success over LOCALIO.
>    If it doesn't the server reports failure of LOCALIO.
>  - NFS client gets the reply and tells nfs_common that it has a reply
>    so the nonce is invalidated.  If the reply was success and nfs_local
>    confirms there is a connection, then the two stay connected.
> 
> I think that having a nonce (single-use uuid) is better than using the
> same uuid for the life of the server, and I think that sending it
> proactively by client rather than reactively by the server is also
> safer.
> 

I like this idea. That does sound much safer.

> 2/ The localio access should use exactly the same auth_domain as the
>    network access uses.  This ensures the credentials implied by
>    rootsquash and allsquash are used correctly.  I think the current
>    code has the client guessing what IP address the server will see and
>    finding an auth_domain based on that.  I'm not comfortable with that.
>    
>    In the new LOCALIO protocol I suggest above, the server registers
>    with nfs_common at the moment it receives an RPC request.  At that
>    moment it knows the characteristics of the connection - remote IP?
>    krb5?  tls?  - and can determine an auth_domain and give it to
>    nfs_common and so make it available to the client.
> 

The current localio code does this:

+	/* Note: we're connecting to ourself, so source addr == peer addr */
+	rqstp->rq_addrlen = rpc_peeraddr(rpc_clnt,
+			(struct sockaddr *)&rqstp->rq_addr,
+			sizeof(rqstp->rq_addr));

...which I guess means that we're setting this to the server's address?
 
That does seem like it might allow a client in another namespace to
bypass export permissions.

I think your idea about associating an auth_domain should fix that.

>    Jeff wondered about an export option to explicitly enable LOCALIO.  I
>    had wondered about that too.  But I think that if we firmly tie the
>    localio auth_domain to the connection as above, that shouldn't be needed.
>
> 3/ The current code uses the 'struct cred' of the application to look up
>    the file in the server code.  When a request goes over the wire the
>    credential is translated to uid/gid (or krb identity) and this is
>    mapped back to a credential on the server which might be in a
>    different uid name space (might it?  Does that even work for nfsd?)
> 
>    I think that if rootsquash or allsquash is in effect the correct
>    server-side credential is used but otherwise the client-side
>    credential is used.  That is likely correct in many cases but I'd
>    like to be convinced that it is correct in all cases.  Maybe it is
>    time to get a deeper understanding of uid name spaces.
> 


I'll have to consider this #3 over some July 4th libations!
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Security issue in NFS localio
  2024-07-03 22:24 Security issue in NFS localio NeilBrown
  2024-07-03 23:35 ` Jeff Layton
@ 2024-07-04  4:31 ` Mike Snitzer
  2024-07-05 21:39   ` NeilBrown
  2024-07-04  5:45 ` Christoph Hellwig
  2024-07-04 19:00 ` Chuck Lever III
  3 siblings, 1 reply; 12+ messages in thread
From: Mike Snitzer @ 2024-07-04  4:31 UTC (permalink / raw)
  To: NeilBrown
  Cc: Jeff Layton, Chuck Lever, Anna Schumaker, Trond Myklebust,
	Christoph Hellwig, linux-nfs

On Thu, Jul 04, 2024 at 08:24:44AM +1000, NeilBrown wrote:
> 
> I've been pondering security questions with localio - particularly
> wondering what questions I need to ask.  I've found three focal points
> which overlap but help me organise my thoughts:
> 1- the LOCALIO RPC protocol
> 2- the 'auth_domain' that nfsd uses to authorise access
> 3- the credential that is used to access the file

I'm missing some details that are critical to giving your hypothetical
attack vector teeth.  But I do think that, no matter what, we need to
be precise in documenting localio's security in localio.rst.

> 1/ It occurs to me that I could find out the UUID reported by a given
> local server (just ask it over the RPC connection), find out the
> filehandle for some file that I don't have write access to (not too
> hard), and create a private NFS server (hacking nfs-ganesha?) which
> reports the same uuid and reports that I have access to a file with
> that filehandle.  If I then mount from that server inside a private
> container on the same host that is running the local server, I would get
> localio access to the target file.

Even if nfs-ganesha were modified to allow the nfs client's
nfs_init_localioclient() to establish access to the 'struct net'
associated with the knfsd (rpc_bind_new_program -> GETUUID ->
nfsd_uuid_is_local): what does that _really_ give the nfs client?

The nfs client still needs to call into nfsd with nfsd_open_local_fh()
which requires rpc_clnt->cl_auth to allow access to the server's
files.

In an earlier email Chuck asked why localio's svcauth_unix_set_client()
would ever fail, and he correctly answered his question with:
"Wouldn't it only be because the local application is trying to open a
file it doesn't have permission to?"

Yes, if a client never actually negotiated with the NFS server that
provides access to protected files, then the client must not be
granted access.  _This_ is why having a "fake" svc_rqst is important.
[Also "fake" isn't a great name considering it is still meant to
enforce required established NFS credentials.]

> I might not be able to write to it because of credential checking, but I
> think that is getting a lot closer to unauthorised access than I would
> like.

Kind of hand-wavy on the finale... but I'm also relieved ;)

That said, I appreciate the desire to avoid the "fake" svc_rqst based
access control.  So I still agree it is worthwhile to carry your
nfsd_file_acquire_local() series through to completion.

> I would much prefer it if there was no credible way to subvert the
> LOCALIO protocol.
>
> My current idea goes like this:
>  - NFS client tells nfs_common it is going to probe for localio
>    and gets back a nonce.  nfs_common records that this probe is happening
>  - NFS client sends the nonce to the server over LOCALIO.
>  - server tells nfs_common "I just got this nonce - does it mean
>    anything?".  If it does, the server gets connected with the client
>    through nfs_common.  The server reports success over LOCALIO.
>    If it doesn't the server reports failure of LOCALIO.
>  - NFS client gets the reply and tells nfs_common that it has a reply
>    so the nonce is invalidated.  If the reply was success and nfs_local
>    confirms there is a connection, then the two stay connected.
> 
> I think that having a nonce (single-use uuid) is better than using the
> same uuid for the life of the server, and I think that sending it
> proactively by client rather than reactively by the server is also
> safer.

Inverting and tweaking the localio protocol like this is clever, but
it still strikes me as unnecessary (given above).

Having it be less widely accessible is a good idea in general though.

> 2/ The localio access should use exactly the same auth_domain as the
>    network access uses.  This ensures the credentials implied by
>    rootsquash and allsquash are used correctly.  I think the current
>    code has the client guessing what IP address the server will see and
>    finding an auth_domain based on that.  I'm not comfortable with that.

nfsd_local_fakerqst_create() isn't guessing.  rpc_peeraddr() returns the
IP address of the server because the rpc_clnt is the same as
established for traditional network access.

>    In the new LOCALIO protocol I suggest above, the server registers
>    with nfs_common at the moment it receives an RPC request.  At that
>    moment it knows the characteristics of the connection - remote IP?
>    krb5?  tls?  - and can determine an auth_domain and give it to
>    nfs_common and so make it available to the client.
> 
>    Jeff wondered about an export option to explicitly enable LOCALIO.  I
>    had wondered about that too.  But I think that if we firmly tie the
>    localio auth_domain to the connection as above, that shouldn't be needed.

I do have concerns that your approach to use "exactly the same
auth_domain" isn't so much better than what the current localio code
does.  But I'll concede it sounds better than the hackish "fake"
svc_rqst based security of the current localio code.

It's worth doing, just to see how it all shakes out in benchmarks and on
a "better approach" level, but I do currently have "speed kills" concerns.

> 3/ The current code uses the 'struct cred' of the application to look up
>    the file in the server code.  When a request goes over the wire the
>    credential is translated to uid/gid (or krb identity) and this is
>    mapped back to a credential on the server which might be in a
>    different uid name space (might it?  Does that even work for nfsd?)
>
>    I think that if rootsquash or allsquash is in effect the correct
>    server-side credential is used but otherwise the client-side
>    credential is used.  That is likely correct in many cases but I'd
>    like to be convinced that it is correct in all cases.  Maybe it is
>    time to get a deeper understanding of uid name spaces.

You just made me feel slightly better about not knowing enough about
user namespaces.
 
> Have I missed anything?  Any other thoughts?

Here is the branch again that I'd like to use as a base for continued
_incremental_ localio development (to make code evolution clearer; if you
want to rip and replace code, just remove it in a separate commit... I
can rebase to clean up when the dust settles):
https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/log/?h=nfs-localio-for-next

Thanks for spending time on all of this localio stuff!

It's now 4th of July for me, so I'm with Jeff: I need a drink! ;)

Mike

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Security issue in NFS localio
  2024-07-03 22:24 Security issue in NFS localio NeilBrown
  2024-07-03 23:35 ` Jeff Layton
  2024-07-04  4:31 ` Mike Snitzer
@ 2024-07-04  5:45 ` Christoph Hellwig
  2024-07-04 19:00 ` Chuck Lever III
  3 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2024-07-04  5:45 UTC (permalink / raw)
  To: NeilBrown
  Cc: Mike Snitzer, Jeff Layton, Chuck Lever, Anna Schumaker,
	Trond Myklebust, Christoph Hellwig, linux-nfs

On Thu, Jul 04, 2024 at 08:24:44AM +1000, NeilBrown wrote:
> 3/ The current code uses the 'struct cred' of the application to look up
>    the file in the server code.  When a request goes over the wire the
>    credential is translated to uid/gid (or krb identity) and this is
>    mapped back to a credential on the server which might be in a
>    different uid name space (might it?  Does that even work for nfsd?)
> 
>    I think that if rootsquash or allsquash is in effect the correct
>    server-side credential is used but otherwise the client-side
>    credential is used.  That is likely correct in many cases but I'd
>    like to be convinced that it is correct in all cases.  Maybe it is
>    time to get a deeper understanding of uid name spaces.
> 
> Have I missed anything?  Any other thoughts?

Still getting up to speed with the code (and only the current one
posted by Mike, I haven't looked at your series yet), and the
fundamental underlying problem seems to be that while the code to
open the file resides in the nfsd code, it is called from client
context.  If opening the file was done in normal nfsd context with
all the usual limitations and then handed out it would work exactly
like FD passing and avoid almost all possibility of privilege
escalation.  I'm not sure how best to achieve that, though.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Security issue in NFS localio
  2024-07-03 22:24 Security issue in NFS localio NeilBrown
                   ` (2 preceding siblings ...)
  2024-07-04  5:45 ` Christoph Hellwig
@ 2024-07-04 19:00 ` Chuck Lever III
  2024-07-04 23:25   ` Dave Chinner
  2024-07-05 13:45   ` Christoph Hellwig
  3 siblings, 2 replies; 12+ messages in thread
From: Chuck Lever III @ 2024-07-04 19:00 UTC (permalink / raw)
  To: Neil Brown
  Cc: Mike Snitzer, Jeff Layton, Anna Schumaker, Trond Myklebust,
	Christoph Hellwig, Linux NFS Mailing List, Linux FS Devel



> On Jul 3, 2024, at 6:24 PM, NeilBrown <neilb@suse.de> wrote:
> 
> 
> I've been pondering security questions with localio - particularly
> wondering what questions I need to ask.  I've found three focal points
> which overlap but help me organise my thoughts:
> 1- the LOCALIO RPC protocol
> 2- the 'auth_domain' that nfsd uses to authorise access
> 3- the credential that is used to access the file
> 
> 1/ It occurs to me that I could find out the UUID reported by a given
> local server (just ask it over the RPC connection), find out the
> filehandle for some file that I don't have write access to (not too
> hard), and create a private NFS server (hacking nfs-ganesha?) which
> reports the same uuid and reports that I have access to a file with
> that filehandle.  If I then mount from that server inside a private
> container on the same host that is running the local server, I would get
> localio access to the target file.
> 
> I might not be able to write to it because of credential checking, but I
> think that is getting a lot closer to unauthorised access than I would
> like.
> 
> I would much prefer it if there was no credible way to subvert the
> LOCALIO protocol.
> 
> My current idea goes like this:
> - NFS client tells nfs_common it is going to probe for localio
>   and gets back a nonce.  nfs_common records that this probe is happening
> - NFS client sends the nonce to the server over LOCALIO.
> - server tells nfs_common "I just got this nonce - does it mean
>   anything?".  If it does, the server gets connected with the client
>   through nfs_common.  The server reports success over LOCALIO.
>   If it doesn't the server reports failure of LOCALIO.
> - NFS client gets the reply and tells nfs_common that it has a reply
>   so the nonce is invalidated.  If the reply was success and nfs_local
>   confirms there is a connection, then the two stay connected.
> 
> I think that having a nonce (single-use uuid) is better than using the
> same uuid for the life of the server, and I think that sending it
> proactively by client rather than reactively by the server is also
> safer.

This echoes typical security approaches, and I think it
would be very easy to convert the current LOCALIO protocol
to operate this way.


> 2/ The localio access should use exactly the same auth_domain as the
>   network access uses.  This ensures the credentials implied by
>   rootsquash and allsquash are used correctly.  I think the current
>   code has the client guessing what IP address the server will see and
>   finding an auth_domain based on that.  I'm not comfortable with that.
> 
>   In the new LOCALIO protocol I suggest above, the server registers
>   with nfs_common at the moment it receives an RPC request.  At that
>   moment it knows the characteristics of the connection - remote IP?
>   krb5?  tls?  - and can determine an auth_domain and give it to
>   nfs_common and so make it available to the client.

I wasn't sure about the "copy the IP address" logic. It doesn't
seem like it would provide adequate security across network
namespaces on the same physical host, but I haven't studied it
closely.

My sense is an administrator would want authorization for localIO
to behave just like it would if this were a normal NFS access.
So the export's IP address settings, secure/insecure, GSS, TLS,
and squashing options should all be effective and work exactly
the same without regard to the I/O method.

auth_domains are an area I still don't know much about, but that
seems like it would get us closer to having access authorization
behave the same in both cases. It looked like your __fh_verify()
clean-up was headed in that direction.


>   Jeff wondered about an export option to explicitly enable LOCALIO.  I
>   had wondered about that too.  But I think that if we firmly tie the
>   localio auth_domain to the connection as above, that shouldn't be needed.
> 
> 3/ The current code uses the 'struct cred' of the application to look up
>   the file in the server code.  When a request goes over the wire the
>   credential is translated to uid/gid (or krb identity) and this is
>   mapped back to a credential on the server which might be in a
>   different uid name space (might it?  Does that even work for nfsd?)
> 
>   I think that if rootsquash or allsquash is in effect the correct
>   server-side credential is used but otherwise the client-side
>   credential is used.  That is likely correct in many cases but I'd
>   like to be convinced that it is correct in all cases.  Maybe it is
>   time to get a deeper understanding of uid name spaces.

I've wondered about the idmapping issues, actually. Thanks
for bringing that up. I think Christian and linux-fsdevel
need to be involved in this conversation; added.

--
Chuck Lever



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Security issue in NFS localio
  2024-07-04 19:00 ` Chuck Lever III
@ 2024-07-04 23:25   ` Dave Chinner
  2024-07-05  1:42     ` Mike Snitzer
  2024-07-05 21:51     ` NeilBrown
  2024-07-05 13:45   ` Christoph Hellwig
  1 sibling, 2 replies; 12+ messages in thread
From: Dave Chinner @ 2024-07-04 23:25 UTC (permalink / raw)
  To: Chuck Lever III
  Cc: Neil Brown, Mike Snitzer, Jeff Layton, Anna Schumaker,
	Trond Myklebust, Christoph Hellwig, Linux NFS Mailing List,
	Linux FS Devel

On Thu, Jul 04, 2024 at 07:00:23PM +0000, Chuck Lever III wrote:
> 
> 
> > On Jul 3, 2024, at 6:24 PM, NeilBrown <neilb@suse.de> wrote:
> > 
> > 
> > I've been pondering security questions with localio - particularly
> > wondering what questions I need to ask.  I've found three focal points
> > which overlap but help me organise my thoughts:
> > 1- the LOCALIO RPC protocol
> > 2- the 'auth_domain' that nfsd uses to authorise access
> > 3- the credential that is used to access the file
> > 
> > 1/ It occurs to me that I could find out the UUID reported by a given
> > local server (just ask it over the RPC connection), find out the
> > filehandle for some file that I don't have write access to (not too
> > hard), and create a private NFS server (hacking nfs-ganesha?) which
> > reports the same uuid and reports that I have access to a file with
> > that filehandle.  If I then mount from that server inside a private
> > container on the same host that is running the local server, I would get
> > localio access to the target file.

This seems amazingly complex for something that is actually really
simple.  Keep in mind that I am speaking from having direct
experience with developing and maintaining NFS client IO bypass
infrastructure from when I worked at SGI as an NFS engineer.

So, let's look at the Irix NFS client/server and the "Bulk Data
Service" protocol extensions that SGI wrote for NFSv3 back in the
mid 1990s.  Here's an overview from the 1996 product documentation
"Getting Started with BDSpro":

https://irix7.com/techpubs/007-3274-001.pdf

At least read chapter 1 so you grok the fundamentals of how the IO
bypass worked. It should look familiar, because it isn't very
different to how NFS over RDMA or client side IO for pNFS works.

Essentially, the NFS client transparently sent all the data IO (read
and write) over a separate communications channel for any IO that
met the size and alignment constraints. This was effectively a
"remote-IO" bypass that streamed data rather than packetised it
(NFS_READ/NFS_WRITE is packetised data with RTT latency issues).
By getting rid of the round trip latency penalty, data could be
sent/received at full network throughput rates.

[ As an aside, the BDS side channel was also the mechanism used
by SGI for NFS over RDMA with custom full stack network offload
hardware back in the mid 1990s. NFS w/ BDS ran at about 800MB/s on
those networks on machines with 200MHz CPUs (think MIPS r10k). ]

The client side userspace has no idea this low level protocol
hijacking occurs, and it doesn't need to because all it changes
is the read/write IO speed. The NFS protocol is still used for all
authorisation, access checks, metadata operations, etc, and all that
changes is how NFS_READ and NFS_WRITE operations are performed.

The local-io stuff is no different - we're just using a different
client side IO path in kernel. We don't need a new protocol, nor do
we need userspace to be involved *at all*.  The kernel NFS client
can easily discover that it is on the same host as the server. The
server already does this "client is on the same host" check, so both will
then know they can *transparently* enable the localio bypass without
involving userspace at all.

The NFS protocol still provides all the auth, creds, etc to allow
the NFS client read and write access to the file. The NFS server
provides the client with a filehandle built by the underlying
filesystem for the file the NFS client has been given permission to
access.

The local filesystem will accept that filehandle from any kernel
side context via the export ops for that filesystem. This provides
a mechanism for the NFS client to convert that to a dentry
and so open the file directly from the file handle. This is what the
server already does, so it should be able to share the filehandle
decode and open code from the server, maybe even just reach into the
server export table directly....
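
As a rough illustration - untested, and glossing over the export-table
lookup and permission checking that nfsd's fh_verify() does - the
kernel-side decode and open could look something like this:

#include <linux/exportfs.h>
#include <linux/mount.h>
#include <linux/fs.h>
#include <linux/err.h>

/* accept any dentry: open-by-handle has no connectability requirement */
static int localio_acceptable(void *context, struct dentry *dentry)
{
	return 1;
}

/*
 * Decode a raw NFS filehandle against an exported vfsmount and open it.
 * fh_len is in 32-bit words, as exportfs expects.  The open flags and
 * credential would come from the NFS client's original open intent.
 */
static struct file *localio_open_by_fh(struct vfsmount *mnt, void *fh,
				       int fh_len, int fh_type,
				       int open_flags,
				       const struct cred *cred)
{
	struct dentry *dentry;
	struct path path;
	struct file *filp;

	dentry = exportfs_decode_fh(mnt, (struct fid *)fh, fh_len, fh_type,
				    localio_acceptable, NULL);
	if (IS_ERR_OR_NULL(dentry))
		return dentry ? ERR_CAST(dentry) : ERR_PTR(-ESTALE);

	path.mnt = mntget(mnt);
	path.dentry = dentry;
	filp = dentry_open(&path, open_flags, cred);
	path_put(&path);	/* dentry_open takes its own references */
	return filp;
}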

IOWs, we don't need to care about whether the mount is visible to
the NFS client - the filesystem *export* is visible to the *kernel*
and the export ops allow unfettered filehandle decoding. Containers
are irrelevant - the server has granted access to the file, and so
the NFS client has effective permissions to resolve the filehandle
directly..

Fundamentally, this is the same permission and access model that
pNFS is built on. Hence I don't understand why this local-io bypass
needs something completely new and seemingly very complex...

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Security issue in NFS localio
  2024-07-04 23:25   ` Dave Chinner
@ 2024-07-05  1:42     ` Mike Snitzer
  2024-07-05 21:51     ` NeilBrown
  1 sibling, 0 replies; 12+ messages in thread
From: Mike Snitzer @ 2024-07-05  1:42 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Chuck Lever III, Neil Brown, Jeff Layton, Anna Schumaker,
	Trond Myklebust, Christoph Hellwig, Linux NFS Mailing List,
	Linux FS Devel

On Fri, Jul 05, 2024 at 09:25:56AM +1000, Dave Chinner wrote:
> On Thu, Jul 04, 2024 at 07:00:23PM +0000, Chuck Lever III wrote:
> > 
> > 
> > > On Jul 3, 2024, at 6:24 PM, NeilBrown <neilb@suse.de> wrote:
> > > 
> > > 
> > > I've been pondering security questions with localio - particularly
> > > wondering what questions I need to ask.  I've found three focal points
> > > which overlap but help me organise my thoughts:
> > > 1- the LOCALIO RPC protocol
> > > 2- the 'auth_domain' that nfsd uses to authorise access
> > > 3- the credential that is used to access the file
> > > 
> > > 1/ It occurs to me that I could find out the UUID reported by a given
> > > local server (just ask it over the RPC connection), find out the
> > > filehandle for some file that I don't have write access to (not too
> > > hard), and create a private NFS server (hacking nfs-ganesha?) which
> > > reports the same uuid and reports that I have access to a file with
> > > that filehandle.  If I then mount from that server inside a private
> > > container on the same host that is running the local server, I would get
> > > localio access to the target file.
> 
> This seems amazingly complex for something that is actually really
> simple.

Could be completely wrong, but I'm inferring you've read more
linux-nfs email (particularly about alternative directions for
implementation) than you've looked at the localio code.  But more below.

> Keep in mind that I am speaking from having direct
> experience with developing and maintaining NFS client IO bypass
> infrastructure from when I worked at SGI as an NFS engineer.

Thanks for sharing all this about IRIX, really helpful.

> So, let's look at the Irix NFS client/server and the "Bulk Data
> Service" protocol extensions that SGI wrote for NFSv3 back in the
> mid 1990s.  Here's an overview from the 1996 product documentation
> "Getting Started with BDSpro":
> 
> https://irix7.com/techpubs/007-3274-001.pdf
> 
> At least read chapter 1 so you grok the fundamentals of how the IO
> bypass worked. It should look familiar, because it isn't very
> different to how NFS over RDMA or client side IO for pNFS works.
> 
> Essentially, the NFS client transparently sent all the data IO (read
> and write) over a separate communications channel for any IO that
> met the size and alignment constraints. This was effectively a
> "remote-IO" bypass that streamed data rather than packetised it
> (NFS_READ/NFS_WRITE is packetised data with RTT latency issues).
> By getting rid of the round trip latency penalty, data could be
> sent/received at full network throughput rates.
> 
> [ As an aside, the BDS side channel was also the mechanism used
> by SGI for NFS over RDMA with custom full stack network offload
> hardware back in the mid 1990s. NFS w/ BDS ran at about 800MB/s on
> those networks on machines with 200MHz CPUs (think MIPS r10k). ]
> 
> The client side userspace has no idea this low level protocol
> hijacking occurs, and it doesn't need to because all it changes
> is the read/write IO speed. The NFS protocol is still used for all
> authorisation, access checks, metadata operations, etc, and all that
> changes is how NFS_READ and NFS_WRITE operations are performed.
> 
> The local-io stuff is no different - we're just using a different
> client side IO path in kernel. We don't need a new protocol, nor do
> we need userspace to be involved *at all*.  The kernel NFS client
> can easily discover that it is on the same host as the server. The
> server already does this "client is on the same host" check, so both will
> then know they can *transparently* enable the localio bypass without
> involving userspace at all.
> 
> The NFS protocol still provides all the auth, creds, etc to allow
> the NFS client read and write access to the file. The NFS server
> provides the client with a filehandle built by the underlying
> filesystem for the file the NFS client has been given permission to
> access.
> 
> The local filesystem will accept that filehandle from any kernel
> side context via the export ops for that filesystem. This provides
> a mechanism for the NFS client to convert that to a dentry
> and so open the file directly from the file handle. This is what the
> server already does, so it should be able to share the filehandle
> decode and open code from the server, maybe even just reach into the
> server export table directly....
> 
> IOWs, we don't need to care about whether the mount is visible to
> the NFS client - the filesystem *export* is visible to the *kernel*
> and the export ops allow unfettered filehandle decoding. Containers
> are irrelevant - the server has granted access to the file, and so
> the NFS client has effective permissions to resolve the filehandle
> directly..

IRIX sounds well engineered.  The Linux NFS code's interfaces aren't
so clean/precise.  The data structures are pretty tightly coupled (one
big struct nfs_client, nfs_server, nfsd_net, etc.; sunrpc's svc_rqst
in particular carries auth info that is used as part of the wire
protocol's business end -- so NFS auth is in the layer localio wants to
bypass).  And as such, the interfaces tend to do a lot of different
tasks on behalf of structures that carry the kitchen sink.

So retrofitting the Linux NFS and RPC code to allow a subset of the
NFS client and server code to be used isn't so clean.

But what you described IRIX doing is pretty much what my localio series
provides, see:
https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/log/?h=nfs-localio-for-next

(There's always room for improvement, like I said to Christoph, especially
on the IO submission and handling side... as you've seen it is doing
buffered IO and is synchronous, really leaving performance lackluster,
but there is lots of upside to be had in making it async and supporting DIO).

> Fundamentally, this is the same permission and access model that
> pNFS is built on. Hence I don't understand why this local-io bypass
> needs something completely new and seemingly very complex...

pNFS doesn't need to have a direct role in any of this localio code,
but pNFS can use localio if it is enabled, e.g.:
https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/commit/?h=nfs-localio-for-next&id=5e7bf77fbbecdbea0e6ae7174c97a69b11e3098a

Localio not being tightly coupled to pNFS enables localio to easily
support NFSv3.  NFSv3 support is a requirement and there is no reason
not to support it.

Anyway, I really think Neil's ideas for localio improvement are solid.
Especially factoring out the auth_domain to ensure bog standard NFS
authentication and security mechanisms are used.  Though IMO the proposed
localio protocol changes aren't _really_ needed, but I also won't
fight to stop localio nfsd UUID sharing being more ephemeral and
risk-averse...

The only reason there is a sideband/auxiliary "localio protocol" is
that the NFS protocol is very focused on enabling NFS spec implementation.
I actually framed it in terms of NFS encode and decode on the server
side and Chuck wanted me to make sure to decouple localio so that it
stood on its own (I agree with him, I just didn't think to do it that
way).  I just needed a means to generate and get a UUID from the
server to anchor the mechanism for nfs_common to allow the client and
server to rendezvous, see:

nfs_common:
https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/commit/?h=nfs-localio-for-next&id=cb542e791eda114adcc9291feb6c66a5ea338f7c
nfs server:
https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/commit/?h=nfs-localio-for-next&id=877a8212c3af37b5ba32959275f4c49bfe805f24
nfs client:
https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/commit/?h=nfs-localio-for-next&id=572b36de2bb1dde06d6da4488686c9fbbc79d7e1

Really quite simple.

And the pgio hooks used to branch to localio handling of READ, WRITE and
COMMIT are the interface point for then generating a kiocb and issuing
the IO accordingly (last 2 commits below):
https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/commit/?h=nfs-localio-for-next&id=46569e6d92a074188bb1f0090d36c327729ab418

But even the buffered and direct IO in nfsd are really tightly coupled
to the wire protocol interface.  So localio hooks pgio and calls down
to the underlying filesystem with its own side channel (that uses
.read_iter and .write_iter), see fs/nfs/localio.c
https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/commit/?h=nfs-localio-for-next&id=877a8212c3af37b5ba32959275f4c49bfe805f24
https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/commit/?h=nfs-localio-for-next&id=4222309dac70e485f089738d0ffe9113b9a5a1e1

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Security issue in NFS localio
  2024-07-04 19:00 ` Chuck Lever III
  2024-07-04 23:25   ` Dave Chinner
@ 2024-07-05 13:45   ` Christoph Hellwig
  2024-07-05 13:50     ` Chuck Lever III
  1 sibling, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2024-07-05 13:45 UTC (permalink / raw)
  To: Chuck Lever III
  Cc: Neil Brown, Mike Snitzer, Jeff Layton, Anna Schumaker,
	Trond Myklebust, Christoph Hellwig, Linux NFS Mailing List,
	Linux FS Devel

On Thu, Jul 04, 2024 at 07:00:23PM +0000, Chuck Lever III wrote:
> > 3/ The current code uses the 'struct cred' of the application to look up
> >   the file in the server code.  When a request goes over the wire the
> >   credential is translated to uid/gid (or krb identity) and this is
> >   mapped back to a credential on the server which might be in a
> >   different uid name space (might it?  Does that even work for nfsd?)
> > 
> >   I think that if rootsquash or allsquash is in effect the correct
> >   server-side credential is used but otherwise the client-side
> >   credential is used.  That is likely correct in many cases but I'd
> >   like to be convinced that it is correct in all cases.  Maybe it is
> >   time to get a deeper understanding of uid name spaces.
> 
> I've wondered about the idmapping issues, actually. Thanks
> for bringing that up. I think Christian and linux-fsdevel
> need to be involved in this conversation; added.

There are a lot more issues than just idmapping.  That's why I don't
think the current approach where the open is executed in the client
can work.  The right way is to ensure the open always happens in an
nfsd thread context, which then just passes the open file to the client
for doing I/O.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Security issue in NFS localio
  2024-07-05 13:45   ` Christoph Hellwig
@ 2024-07-05 13:50     ` Chuck Lever III
  2024-07-05 21:56       ` NeilBrown
  0 siblings, 1 reply; 12+ messages in thread
From: Chuck Lever III @ 2024-07-05 13:50 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Neil Brown, Mike Snitzer, Jeff Layton, Anna Schumaker,
	Trond Myklebust, Linux NFS Mailing List, Linux FS Devel



> On Jul 5, 2024, at 9:45 AM, Christoph Hellwig <hch@infradead.org> wrote:
> 
> On Thu, Jul 04, 2024 at 07:00:23PM +0000, Chuck Lever III wrote:
>>> 3/ The current code uses the 'struct cred' of the application to look up
>>>  the file in the server code.  When a request goes over the wire the
>>>  credential is translated to uid/gid (or krb identity) and this is
>>>  mapped back to a credential on the server which might be in a
>>>  different uid name space (might it?  Does that even work for nfsd?)
>>> 
>>>  I think that if rootsquash or allsquash is in effect the correct
>>>  server-side credential is used but otherwise the client-side
>>>  credential is used.  That is likely correct in many cases but I'd
>>>  like to be convinced that it is correct in all cases.  Maybe it is
>>>  time to get a deeper understanding of uid name spaces.
>> 
>> I've wondered about the idmapping issues, actually. Thanks
>> for bringing that up. I think Christian and linux-fsdevel
>> need to be involved in this conversation; added.
> 
> There are a lot more issues than just idmapping.  That's why I don't
> think the current approach where the open is executed in the client
> can work.  The right way is to ensure the open always happens in an
> nfsd thread context, which then just passes the open file to the client
> for doing I/O.

I have considered that approach, but I don't yet have a clear
enough understanding of the idmapping issues to say whether
it is going to be necessary.

Just in terms of primitives, the server has svc_wake_up() which
can enqueue non-transport work on an nfsd thread. The question then
is what form the response takes -- how does it get back
to the "client" side.


--
Chuck Lever



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Security issue in NFS localio
  2024-07-04  4:31 ` Mike Snitzer
@ 2024-07-05 21:39   ` NeilBrown
  0 siblings, 0 replies; 12+ messages in thread
From: NeilBrown @ 2024-07-05 21:39 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Jeff Layton, Chuck Lever, Anna Schumaker, Trond Myklebust,
	Christoph Hellwig, linux-nfs

On Thu, 04 Jul 2024, Mike Snitzer wrote:
> On Thu, Jul 04, 2024 at 08:24:44AM +1000, NeilBrown wrote:
> 
> > 2/ The localio access should use exactly the same auth_domain as the
> >    network access uses.  This ensures the credentials implied by
> >    rootsquash and allsquash are used correctly.  I think the current
> >    code has the client guessing what IP address the server will see and
> >    finding an auth_domain based on that.  I'm not comfortable with that.
> 
> nfsd_local_fakerqst_create() isn't guessing.  rpc_peeraddr() returns the
> IP address of the server because the rpc_clnt is the same as
> established for traditional network access.

I think it is guessing in exactly the same way that your previous
internal code tried to use IP addresses to guess whether the server was
on the same host or not.

Whatever reasons you had for thinking that was fragile and needed to be
replaced - apply those reasons to the mechanism for choosing an
'auth_domain' (which is what the IP address is used for).  I am
confident that we need to choose the auth_domain when the client is
making a LOCALIO RPC request to the server, and to use that auth_domain
for subsequent interactions with that client (and possibly a different
auth_domain for a different client).

> 
> It's now 4th of July for me, so I'm with Jeff: I need a drink! ;)
> 

Hope you enjoyed your drink!

NeilBrown

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Security issue in NFS localio
  2024-07-04 23:25   ` Dave Chinner
  2024-07-05  1:42     ` Mike Snitzer
@ 2024-07-05 21:51     ` NeilBrown
  1 sibling, 0 replies; 12+ messages in thread
From: NeilBrown @ 2024-07-05 21:51 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Chuck Lever III, Mike Snitzer, Jeff Layton, Anna Schumaker,
	Trond Myklebust, Christoph Hellwig, Linux NFS Mailing List,
	Linux FS Devel

On Fri, 05 Jul 2024, Dave Chinner wrote:
> On Thu, Jul 04, 2024 at 07:00:23PM +0000, Chuck Lever III wrote:
> > 
> > 
> > > On Jul 3, 2024, at 6:24 PM, NeilBrown <neilb@suse.de> wrote:
> > > 
> > > 
> > > I've been pondering security questions with localio - particularly
> > > wondering what questions I need to ask.  I've found three focal points
> > > which overlap but help me organise my thoughts:
> > > 1- the LOCALIO RPC protocol
> > > 2- the 'auth_domain' that nfsd uses to authorise access
> > > 3- the credential that is used to access the file
> > > 
> > > 1/ It occurs to me that I could find out the UUID reported by a given
> > > local server (just ask it over the RPC connection), find out the
> > > filehandle for some file that I don't have write access to (not too
> > > hard), and create a private NFS server (hacking nfs-ganesha?) which
> > > reports the same uuid and reports that I have access to a file with
> > > that filehandle.  If I then mount from that server inside a private
> > > container on the same host that is running the local server, I would get
> > > localio access to the target file.
> 
> This seems amazingly complex for something that is actually really
> simple.  Keep in mind that I am speaking from having direct
> experience with developing and maintaining NFS client IO bypass
> infrastructure from when I worked at SGI as an NFS engineer.
> 
> So, let's look at the Irix NFS client/server and the "Bulk Data
> Service" protocol extensions that SGI wrote for NFSv3 back in the
> mid 1990s.  Here's an overview from the 1996 product documentation
> "Getting Started with BDSpro":
> 
> https://irix7.com/techpubs/007-3274-001.pdf

Interesting work.  Thanks for the pointer.

It appears to me that BDS uses a separate network protocol - possibly
over a separate TCP connection or even a separate fabric - to connect
client to server, and this protocol is tuned for high throughput data
transfer and nothing else.  Makes perfect sense.

It would seem to still use the IP address (or similar NFS-style
mechanism) to authenticate each party to the other and to establish a
path for the data to flow over.  This is the question facing localio in
the text of mine that you quote above.  We don't want a network data
flow.  We want to hand over a file descriptor (or 'struct file').  There
is no standard way to achieve this over an IP-connected channel.  So we
are creating one.

The proposed protocol is to send a unique number over the IP-connected
channel, and use that to achieve rendezvous between the in-kernel client
and the in-kernel server.  The interesting questions are around how
unique this number should be, which direction it should travel, and
whether anything else other than the file descriptor should be passed
through when the kernel sides rendezvous.

I don't think the documentation on BDS sheds any particular light on
this question.

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Security issue in NFS localio
  2024-07-05 13:50     ` Chuck Lever III
@ 2024-07-05 21:56       ` NeilBrown
  0 siblings, 0 replies; 12+ messages in thread
From: NeilBrown @ 2024-07-05 21:56 UTC (permalink / raw)
  To: Chuck Lever III
  Cc: Christoph Hellwig, Mike Snitzer, Jeff Layton, Anna Schumaker,
	Trond Myklebust, Linux NFS Mailing List, Linux FS Devel

On Fri, 05 Jul 2024, Chuck Lever III wrote:
> 
> 
> > On Jul 5, 2024, at 9:45 AM, Christoph Hellwig <hch@infradead.org> wrote:
> > 
> > On Thu, Jul 04, 2024 at 07:00:23PM +0000, Chuck Lever III wrote:
> >>> 3/ The current code uses the 'struct cred' of the application to look up
> >>>  the file in the server code.  When a request goes over the wire the
> >>>  credential is translated to uid/gid (or krb identity) and this is
> >>>  mapped back to a credential on the server which might be in a
> >>>  different uid name space (might it?  Does that even work for nfsd?)
> >>> 
> >>>  I think that if rootsquash or allsquash is in effect the correct
> >>>  server-side credential is used but otherwise the client-side
> >>>  credential is used.  That is likely correct in many cases but I'd
> >>>  like to be convinced that it is correct in all cases.  Maybe it is
> >>>  time to get a deeper understanding of uid name spaces.
> >> 
> >> I've wondered about the idmapping issues, actually. Thanks
> >> for bringing that up. I think Christian and linux-fsdevel
> >> need to be involved in this conversation; added.
> > 
> > There are a lot more issues than just idmapping.  That's why I don't
> > think the current approach where the open is executed in the client
> > can work.  The right way is to ensure the open always happens in an
> > nfsd thread context, which then just passes the open file to the client
> > for doing I/O.
> 
> I have considered that approach, but I don't yet have a clear
> enough understanding of the idmapping issues to say whether
> it is going to be necessary.

I can't see that nfsd pays any attention to uid namespaces.  I think it
always uses the init_ns.  So we would need to ensure the client
credential was interpreted against the init namespace.  I still haven't
made time to understand what this means in practice.

> 
> Just in terms of primitives, the server has svc_wake_up() which
> can enqueue non-transport work on an nfsd thread. Question then
> is what is the form of the response -- how does it get back
> to the "client" side.

I think it would be quite easy to establish a new work-queue (lwq?) that
clients can attach a request structure to before calling svc_wake_up(),
and which the nfsd thread would examine after svc_recv() returns.

This request structure would include the filehandle, credential,
possible other context, place to store the result, and a completion.
The server thread would complete the completion and the client, after
queuing the request, would wait for the completion.
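
Something like this - a sketch only, where the request structure, the
lwq added to nfsd_net, and the nfsd_file_acquire_local() signature are
all provisional names:

struct localio_open_req {
	struct llist_node	node;		/* lwq linkage */
	struct svc_fh		fh;		/* filehandle to open */
	const struct cred	*cred;		/* client credential */
	struct nfsd_file	*nf;		/* result */
	__be32			status;
	struct completion	done;
};

/* client side: queue the request and wait for an nfsd thread */
static __be32 localio_open_via_nfsd(struct nfsd_net *nn,
				    struct localio_open_req *req)
{
	init_completion(&req->done);
	lwq_enqueue(&req->node, &nn->localio_open_reqs);
	svc_wake_up(nn->nfsd_serv);
	wait_for_completion(&req->done);
	return req->status;
}

/* nfsd thread: examine the queue after svc_recv() returns */
static void nfsd_handle_localio_reqs(struct nfsd_net *nn)
{
	struct localio_open_req *req;

	while ((req = lwq_dequeue(&nn->localio_open_reqs,
				  struct localio_open_req, node))) {
		/* open in ordinary nfsd context; result stored in req */
		req->status = nfsd_file_acquire_local(req);
		complete(&req->done);
	}
}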

So: easy enough to do if it turns out to be necessary.

NeilBrown


> 
> 
> --
> Chuck Lever
> 
> 
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2024-07-05 21:57 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-07-03 22:24 Security issue in NFS localio NeilBrown
2024-07-03 23:35 ` Jeff Layton
2024-07-04  4:31 ` Mike Snitzer
2024-07-05 21:39   ` NeilBrown
2024-07-04  5:45 ` Christoph Hellwig
2024-07-04 19:00 ` Chuck Lever III
2024-07-04 23:25   ` Dave Chinner
2024-07-05  1:42     ` Mike Snitzer
2024-07-05 21:51     ` NeilBrown
2024-07-05 13:45   ` Christoph Hellwig
2024-07-05 13:50     ` Chuck Lever III
2024-07-05 21:56       ` NeilBrown
