public inbox for linux-nfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] SUNRPC: Avoid address overwrite with eBPF NAT
@ 2023-08-17  1:48 Jordan Rife
  2023-08-17  2:07 ` Chuck Lever III
  2023-08-17  2:09 ` Trond Myklebust
  0 siblings, 2 replies; 4+ messages in thread
From: Jordan Rife @ 2023-08-17  1:48 UTC (permalink / raw)
  To: chuck.lever, jlayton, neilb; +Cc: linux-nfs, Jordan Rife

kernel_connect() will modify the rpc_xprt socket address in contexts
where eBPF programs perform NAT instead of iptables. In these contexts,
it is common for an NFS mount to be mounted to be a static virtual IP
while the server has an ephemeral IP leading to a problem where the
virtual IP gets overwritten and forgotten. When the endpoint IP changes,
reconnect attempts fail and the mount never recovers.

This patch protects addr from being modified in these scenarios, allowing
NFS reconnects to work as intended.

Link: https://github.com/cilium/cilium/issues/21541#issuecomment-1386857338
Signed-off-by: Jordan Rife <jrife@google.com>
---
 include/linux/sunrpc/xprt.h |  1 +
 net/sunrpc/xprtsock.c       | 17 +++++++++++++++--
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index b52411bcfe4e7..ddde79b025c53 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -211,6 +211,7 @@ struct rpc_xprt {
 
 	const struct rpc_timeout *timeout;	/* timeout parms */
 	struct sockaddr_storage	addr;		/* server address */
+	struct sockaddr_storage m_addr;      /* mutable server address */
 	size_t			addrlen;	/* size of server address */
 	int			prot;		/* IP protocol */
 
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 9f010369100a2..4100e0bf5ebba 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -236,6 +236,18 @@ static inline struct sockaddr *xs_addr(struct rpc_xprt *xprt)
 	return (struct sockaddr *) &xprt->addr;
 }
 
+static inline struct sockaddr *xs_m_addr(struct rpc_xprt *xprt)
+{
+    /* kernel_connect() may modify the address in contexts where NAT is
+     * performed by eBPF programs instead of iptables. Make a copy to ensure
+     * that our original address, xprt->addr, is not modified. Without this,
+     * NFS reconnects may fail if the endpoint address changes.
+     */
+	memcpy(&xprt->m_addr, &xprt->addr, xprt->addrlen);
+
+	return (struct sockaddr *) &xprt->m_addr;
+}
+
 static inline struct sockaddr_un *xs_addr_un(struct rpc_xprt *xprt)
 {
 	return (struct sockaddr_un *) &xprt->addr;
@@ -1954,7 +1966,7 @@ static int xs_local_finish_connecting(struct rpc_xprt *xprt,
 
 	xs_stream_start_connect(transport);
 
-	return kernel_connect(sock, xs_addr(xprt), xprt->addrlen, 0);
+	return kernel_connect(sock, xs_m_addr(xprt), xprt->addrlen, 0);
 }
 
 /**
@@ -2334,7 +2346,8 @@ static int xs_tcp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
 
 	/* Tell the socket layer to start connecting... */
 	set_bit(XPRT_SOCK_CONNECTING, &transport->sock_state);
-	return kernel_connect(sock, xs_addr(xprt), xprt->addrlen, O_NONBLOCK);
+
+	return kernel_connect(sock, xs_m_addr(xprt), xprt->addrlen, O_NONBLOCK);
 }
 
 /**
-- 
2.42.0.rc1.204.g551eb34607-goog


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] SUNRPC: Avoid address overwrite with eBPF NAT
  2023-08-17  1:48 [PATCH] SUNRPC: Avoid address overwrite with eBPF NAT Jordan Rife
@ 2023-08-17  2:07 ` Chuck Lever III
  2023-08-17  2:09 ` Trond Myklebust
  1 sibling, 0 replies; 4+ messages in thread
From: Chuck Lever III @ 2023-08-17  2:07 UTC (permalink / raw)
  To: Jordan Rife; +Cc: jlayton@kernel.org, Neil Brown, Linux NFS Mailing List



> On Aug 16, 2023, at 9:48 PM, Jordan Rife <jrife@google.com> wrote:
> 
> kernel_connect() will modify the rpc_xprt socket address in contexts
> where eBPF programs perform NAT instead of iptables. In these contexts,
> it is common for an NFS mount to be mounted to be a static virtual IP
> while the server has an ephemeral IP leading to a problem where the
> virtual IP gets overwritten and forgotten. When the endpoint IP changes,
> reconnect attempts fail and the mount never recovers.
> 
> This patch protects addr from being modified in these scenarios, allowing
> NFS reconnects to work as intended.
> 
> Link: https://github.com/cilium/cilium/issues/21541#issuecomment-1386857338
> Signed-off-by: Jordan Rife <jrife@google.com>

Hello Jordan, since kernel_connect() is used exclusively
by the RPC client, I suggest directing your patch to
Trond and Anna.

<trondmy@hammerspace.com <mailto:trondmy@hammerspace.com>>
<anna@kernel.org <mailto:anna@kernel.org>>

Does the RPC/RDMA client also have this issue? It does
not use kernel_connect().


> ---
> include/linux/sunrpc/xprt.h |  1 +
> net/sunrpc/xprtsock.c       | 17 +++++++++++++++--
> 2 files changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
> index b52411bcfe4e7..ddde79b025c53 100644
> --- a/include/linux/sunrpc/xprt.h
> +++ b/include/linux/sunrpc/xprt.h
> @@ -211,6 +211,7 @@ struct rpc_xprt {
> 
> const struct rpc_timeout *timeout; /* timeout parms */
> struct sockaddr_storage addr; /* server address */
> + struct sockaddr_storage m_addr;      /* mutable server address */
> size_t addrlen; /* size of server address */
> int prot; /* IP protocol */
> 
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index 9f010369100a2..4100e0bf5ebba 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -236,6 +236,18 @@ static inline struct sockaddr *xs_addr(struct rpc_xprt *xprt)
> return (struct sockaddr *) &xprt->addr;
> }
> 
> +static inline struct sockaddr *xs_m_addr(struct rpc_xprt *xprt)
> +{
> +    /* kernel_connect() may modify the address in contexts where NAT is
> +     * performed by eBPF programs instead of iptables. Make a copy to ensure
> +     * that our original address, xprt->addr, is not modified. Without this,
> +     * NFS reconnects may fail if the endpoint address changes.
> +     */
> + memcpy(&xprt->m_addr, &xprt->addr, xprt->addrlen);
> +
> + return (struct sockaddr *) &xprt->m_addr;
> +}
> +
> static inline struct sockaddr_un *xs_addr_un(struct rpc_xprt *xprt)
> {
> return (struct sockaddr_un *) &xprt->addr;
> @@ -1954,7 +1966,7 @@ static int xs_local_finish_connecting(struct rpc_xprt *xprt,
> 
> xs_stream_start_connect(transport);
> 
> - return kernel_connect(sock, xs_addr(xprt), xprt->addrlen, 0);
> + return kernel_connect(sock, xs_m_addr(xprt), xprt->addrlen, 0);
> }
> 
> /**
> @@ -2334,7 +2346,8 @@ static int xs_tcp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
> 
> /* Tell the socket layer to start connecting... */
> set_bit(XPRT_SOCK_CONNECTING, &transport->sock_state);
> - return kernel_connect(sock, xs_addr(xprt), xprt->addrlen, O_NONBLOCK);
> +
> + return kernel_connect(sock, xs_m_addr(xprt), xprt->addrlen, O_NONBLOCK);
> }
> 
> /**
> -- 
> 2.42.0.rc1.204.g551eb34607-goog
> 

--
Chuck Lever



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] SUNRPC: Avoid address overwrite with eBPF NAT
  2023-08-17  1:48 [PATCH] SUNRPC: Avoid address overwrite with eBPF NAT Jordan Rife
  2023-08-17  2:07 ` Chuck Lever III
@ 2023-08-17  2:09 ` Trond Myklebust
  2023-08-17  2:29   ` Trond Myklebust
  1 sibling, 1 reply; 4+ messages in thread
From: Trond Myklebust @ 2023-08-17  2:09 UTC (permalink / raw)
  To: jlayton@kernel.org, neilb@suse.de, jrife@google.com,
	chuck.lever@oracle.com
  Cc: linux-nfs@vger.kernel.org

On Wed, 2023-08-16 at 20:48 -0500, Jordan Rife wrote:
> [You don't often get email from jrife@google.com. Learn why this is
> important at https://aka.ms/LearnAboutSenderIdentification ]
> 
> kernel_connect() will modify the rpc_xprt socket address in contexts
> where eBPF programs perform NAT instead of iptables. In these
> contexts,
> it is common for an NFS mount to be mounted to be a static virtual IP
> while the server has an ephemeral IP leading to a problem where the
> virtual IP gets overwritten and forgotten. When the endpoint IP
> changes,
> reconnect attempts fail and the mount never recovers.
> 
> This patch protects addr from being modified in these scenarios,
> allowing
> NFS reconnects to work as intended.

What? No! A connect() call should not be allowed to modify its own call
parameters.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] SUNRPC: Avoid address overwrite with eBPF NAT
  2023-08-17  2:09 ` Trond Myklebust
@ 2023-08-17  2:29   ` Trond Myklebust
  0 siblings, 0 replies; 4+ messages in thread
From: Trond Myklebust @ 2023-08-17  2:29 UTC (permalink / raw)
  To: jlayton@kernel.org, neilb@suse.de, jrife@google.com,
	chuck.lever@oracle.com
  Cc: linux-nfs@vger.kernel.org

On Thu, 2023-08-17 at 02:09 +0000, Trond Myklebust wrote:
> On Wed, 2023-08-16 at 20:48 -0500, Jordan Rife wrote:
> > [You don't often get email from jrife@google.com. Learn why this is
> > important at https://aka.ms/LearnAboutSenderIdentification ]
> > 
> > kernel_connect() will modify the rpc_xprt socket address in
> > contexts
> > where eBPF programs perform NAT instead of iptables. In these
> > contexts,
> > it is common for an NFS mount to be mounted to be a static virtual
> > IP
> > while the server has an ephemeral IP leading to a problem where the
> > virtual IP gets overwritten and forgotten. When the endpoint IP
> > changes,
> > reconnect attempts fail and the mount never recovers.
> > 
> > This patch protects addr from being modified in these scenarios,
> > allowing
> > NFS reconnects to work as intended.
> 
> What? No! A connect() call should not be allowed to modify its own
> call
> parameters.
> 

To put it more succinctly, the struct rpc_xprt is one of many private
kernel structures. Parts of it can be exposed through public APIs, such
as the sysfs API that we're building, but when you use eBPF to hack
your way around those public APIs, then you're on your own. We're not
going to commit to support your hacks.


-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-08-17  2:30 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-17  1:48 [PATCH] SUNRPC: Avoid address overwrite with eBPF NAT Jordan Rife
2023-08-17  2:07 ` Chuck Lever III
2023-08-17  2:09 ` Trond Myklebust
2023-08-17  2:29   ` Trond Myklebust

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox