* [PATCH] net: Avoid address overwrite in kernel_connect
@ 2023-08-21 10:00 Jordan Rife
2023-08-21 14:59 ` Kuniyuki Iwashima
0 siblings, 1 reply; 5+ messages in thread
From: Jordan Rife @ 2023-08-21 10:00 UTC
To: davem, edumazet, kuba, pabeni; +Cc: netdev, Jordan Rife
BPF programs that run on connect can rewrite the connect address. For
the connect system call this isn't a problem, because a copy of the address
is made when it is moved into kernel space. However, kernel_connect
simply passes through the address it is given, so the caller may observe
its address value unexpectedly change.

A practical example where this is problematic is where NFS is combined
with a system such as Cilium which implements BPF-based load balancing.
A common pattern in software-defined storage systems is to have an NFS
mount that connects to a persistent virtual IP which in turn maps to an
ephemeral server IP. This is usually done to achieve high availability:
if your server goes down you can quickly spin up a replacement and remap
the virtual IP to that endpoint. With BPF-based load balancing, mounts
will forget the virtual IP address when the address rewrite occurs
because a pointer to the only copy of that address is passed down the
stack. Server failover then breaks, because clients have forgotten the
virtual IP address. Reconnects fail and mounts remain broken. This patch
was tested by setting up a scenario like this and ensuring that NFS
reconnects worked after applying the patch.
Signed-off-by: Jordan Rife <jrife@google.com>
---
net/socket.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/net/socket.c b/net/socket.c
index 2b0e54b2405c8..f49edb9b49185 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -3519,7 +3519,11 @@ EXPORT_SYMBOL(kernel_accept);
 int kernel_connect(struct socket *sock, struct sockaddr *addr, int addrlen,
 		   int flags)
 {
-	return sock->ops->connect(sock, addr, addrlen, flags);
+	struct sockaddr_storage address;
+
+	memcpy(&address, addr, addrlen);
+
+	return sock->ops->connect(sock, (struct sockaddr *)&address, addrlen, flags);
 }
 EXPORT_SYMBOL(kernel_connect);
--
2.42.0.rc1.204.g551eb34607-goog
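For illustration, here is a minimal user-space sketch of the aliasing problem the commit message describes, and of the copy-before-connect pattern the patch applies. It is not kernel code. All names (demo_addr, demo_addr_storage, demo_proto_connect, demo_connect_passthrough, demo_connect_copy) are hypothetical stand-ins, and the length check in demo_connect_copy is an addition for this sketch that is not part of the posted patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a socket address; not the kernel's type. */
struct demo_addr {
	uint32_t ip;
	uint16_t port;
};

/* Hypothetical stand-in for sockaddr_storage: big enough for any address. */
union demo_addr_storage {
	struct demo_addr addr;
	unsigned char pad[128];
};

#define DEMO_BACKEND_IP 0x0a000002u

/*
 * Plays the role of a protocol connect path with a hook that rewrites
 * the destination address in place through the pointer it is given.
 */
static int demo_proto_connect(struct demo_addr *addr, int addrlen)
{
	(void)addrlen;
	addr->ip = DEMO_BACKEND_IP;
	return 0;
}

/* Pass-through: the rewrite lands in the caller's own memory. */
static int demo_connect_passthrough(struct demo_addr *addr, int addrlen)
{
	return demo_proto_connect(addr, addrlen);
}

/* Copy first: the rewrite lands in a local copy and is discarded. */
static int demo_connect_copy(struct demo_addr *addr, int addrlen)
{
	union demo_addr_storage storage;

	/* Sketch-only guard; the posted patch does not check addrlen. */
	if (addrlen < 0 || (size_t)addrlen > sizeof(storage))
		return -1;

	memcpy(&storage, addr, (size_t)addrlen);

	return demo_proto_connect(&storage.addr, addrlen);
}
```

Given a caller-owned address holding a virtual IP, demo_connect_passthrough leaves the caller holding DEMO_BACKEND_IP afterwards, while demo_connect_copy leaves the caller's address untouched.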
* Re: [PATCH] net: Avoid address overwrite in kernel_connect
2023-08-21 10:00 [PATCH] net: Avoid address overwrite in kernel_connect Jordan Rife
@ 2023-08-21 14:59 ` Kuniyuki Iwashima
2023-08-21 16:26 ` [PATCH v2] " Jordan Rife
0 siblings, 1 reply; 5+ messages in thread
From: Kuniyuki Iwashima @ 2023-08-21 14:59 UTC
To: jrife; +Cc: davem, edumazet, kuba, netdev, pabeni, kuniyu
From: Jordan Rife <jrife@google.com>
Date: Mon, 21 Aug 2023 05:00:06 -0500
> [...]
> diff --git a/net/socket.c b/net/socket.c
> index 2b0e54b2405c8..f49edb9b49185 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -3519,7 +3519,11 @@ EXPORT_SYMBOL(kernel_accept);
>  int kernel_connect(struct socket *sock, struct sockaddr *addr, int addrlen,
>  		   int flags)
>  {
> -	return sock->ops->connect(sock, addr, addrlen, flags);
> +	struct sockaddr_storage address;
> +
> +	memcpy(&address, addr, addrlen);
> +
> +	return sock->ops->connect(sock, (struct sockaddr *)&address, addrlen, flags);

Could you rebase on net-next.git? I think this patch conflicts with
1ded5e5a5931 ("net: annotate data-races around sock->ops").

> }
> EXPORT_SYMBOL(kernel_connect);
>
> --
> 2.42.0.rc1.204.g551eb34607-goog
* [PATCH v2] net: Avoid address overwrite in kernel_connect
2023-08-21 14:59 ` Kuniyuki Iwashima
@ 2023-08-21 16:26 ` Jordan Rife
2023-08-21 21:45 ` [PATCH net-next v3] " Jordan Rife
0 siblings, 1 reply; 5+ messages in thread
From: Jordan Rife @ 2023-08-21 16:26 UTC
To: kuniyu; +Cc: davem, edumazet, kuba, pabeni, netdev, Jordan Rife
BPF programs that run on connect can rewrite the connect address. For
the connect system call this isn't a problem, because a copy of the address
is made when it is moved into kernel space. However, kernel_connect
simply passes through the address it is given, so the caller may observe
its address value unexpectedly change.

A practical example where this is problematic is where NFS is combined
with a system such as Cilium which implements BPF-based load balancing.
A common pattern in software-defined storage systems is to have an NFS
mount that connects to a persistent virtual IP which in turn maps to an
ephemeral server IP. This is usually done to achieve high availability:
if your server goes down you can quickly spin up a replacement and remap
the virtual IP to that endpoint. With BPF-based load balancing, mounts
will forget the virtual IP address when the address rewrite occurs
because a pointer to the only copy of that address is passed down the
stack. Server failover then breaks, because clients have forgotten the
virtual IP address. Reconnects fail and mounts remain broken. This patch
was tested by setting up a scenario like this and ensuring that NFS
reconnects worked after applying the patch.
Signed-off-by: Jordan Rife <jrife@google.com>
---
V1 -> V2: Rebased on net-next
net/socket.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/net/socket.c b/net/socket.c
index fdb5233bf560c..90c07148835e6 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -3567,7 +3567,11 @@ EXPORT_SYMBOL(kernel_accept);
 int kernel_connect(struct socket *sock, struct sockaddr *addr, int addrlen,
 		   int flags)
 {
-	return READ_ONCE(sock->ops)->connect(sock, addr, addrlen, flags);
+	struct sockaddr_storage address;
+
+	memcpy(&address, addr, addrlen);
+
+	return READ_ONCE(sock->ops)->connect(sock, (struct sockaddr *)&address, addrlen, flags);
 }
 EXPORT_SYMBOL(kernel_connect);
--
2.42.0.rc1.204.g551eb34607-goog
* [PATCH net-next v3] net: Avoid address overwrite in kernel_connect
2023-08-21 16:26 ` [PATCH v2] " Jordan Rife
@ 2023-08-21 21:45 ` Jordan Rife
2023-08-23 8:50 ` patchwork-bot+netdevbpf
0 siblings, 1 reply; 5+ messages in thread
From: Jordan Rife @ 2023-08-21 21:45 UTC
To: kuniyu; +Cc: davem, edumazet, kuba, pabeni, netdev, Jordan Rife
BPF programs that run on connect can rewrite the connect address. For
the connect system call this isn't a problem, because a copy of the address
is made when it is moved into kernel space. However, kernel_connect
simply passes through the address it is given, so the caller may observe
its address value unexpectedly change.

A practical example where this is problematic is where NFS is combined
with a system such as Cilium which implements BPF-based load balancing.
A common pattern in software-defined storage systems is to have an NFS
mount that connects to a persistent virtual IP which in turn maps to an
ephemeral server IP. This is usually done to achieve high availability:
if your server goes down you can quickly spin up a replacement and remap
the virtual IP to that endpoint. With BPF-based load balancing, mounts
will forget the virtual IP address when the address rewrite occurs
because a pointer to the only copy of that address is passed down the
stack. Server failover then breaks, because clients have forgotten the
virtual IP address. Reconnects fail and mounts remain broken. This patch
was tested by setting up a scenario like this and ensuring that NFS
reconnects worked after applying the patch.
Signed-off-by: Jordan Rife <jrife@google.com>
---
V2 -> V3: Broke up long line
V1 -> V2: Rebased on net-next
net/socket.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/socket.c b/net/socket.c
index fdb5233bf560c..848116d06b511 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -3567,7 +3567,12 @@ EXPORT_SYMBOL(kernel_accept);
 int kernel_connect(struct socket *sock, struct sockaddr *addr, int addrlen,
 		   int flags)
 {
-	return READ_ONCE(sock->ops)->connect(sock, addr, addrlen, flags);
+	struct sockaddr_storage address;
+
+	memcpy(&address, addr, addrlen);
+
+	return READ_ONCE(sock->ops)->connect(sock, (struct sockaddr *)&address,
+					     addrlen, flags);
 }
 EXPORT_SYMBOL(kernel_connect);
--
2.42.0.rc1.204.g551eb34607-goog
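The failover scenario from the commit message can also be walked through in a small user-space simulation. This is only a sketch under stated assumptions: the load balancer is modeled as a hook that translates one virtual IP to the current backend in place, and the mount is modeled as a caller that remembers a single address. Every name here (sim_addr, sim_lb_rewrite, sim_connect, sim_reconnect_reaches_new_backend) and every address constant is hypothetical, and none of this reflects how Cilium or the NFS client are actually implemented.

```c
#include <stdint.h>
#include <string.h>

#define SIM_VIP         0x0a000064u /* persistent virtual IP */
#define SIM_OLD_BACKEND 0x0a000001u /* ephemeral server before failover */
#define SIM_NEW_BACKEND 0x0a000002u /* replacement server after failover */

/* Hypothetical stand-in for a socket address. */
struct sim_addr {
	uint32_t ip;
};

/* Backend currently mapped to the virtual IP; changes on failover. */
static uint32_t sim_backend_ip = SIM_OLD_BACKEND;

/* Models the load-balancing hook: translate the VIP in place. */
static void sim_lb_rewrite(struct sim_addr *addr)
{
	if (addr->ip == SIM_VIP)
		addr->ip = sim_backend_ip;
}

/*
 * Models a connect call. With copy_first set, the hook only ever sees
 * a local copy. Returns the IP the attempt actually targeted.
 */
static uint32_t sim_connect(struct sim_addr *addr, int copy_first)
{
	struct sim_addr local;

	if (copy_first) {
		memcpy(&local, addr, sizeof(local));
		addr = &local;
	}
	sim_lb_rewrite(addr);
	return addr->ip;
}

/*
 * Connect, fail over, reconnect. Returns 1 if the reconnect reaches
 * the new backend, 0 if it still targets the old one.
 */
static int sim_reconnect_reaches_new_backend(int copy_first)
{
	struct sim_addr remembered = { SIM_VIP };

	sim_backend_ip = SIM_OLD_BACKEND;
	(void)sim_connect(&remembered, copy_first);

	sim_backend_ip = SIM_NEW_BACKEND; /* server failover */

	return sim_connect(&remembered, copy_first) == SIM_NEW_BACKEND;
}
```

Without the copy, the first connect overwrites the remembered virtual IP with the old backend address, so the reconnect is never translated and keeps targeting the dead server. With the copy, the remembered address stays the virtual IP and the reconnect follows the new mapping.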
* Re: [PATCH net-next v3] net: Avoid address overwrite in kernel_connect
2023-08-21 21:45 ` [PATCH net-next v3] " Jordan Rife
@ 2023-08-23 8:50 ` patchwork-bot+netdevbpf
0 siblings, 0 replies; 5+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-08-23 8:50 UTC
To: Jordan Rife; +Cc: kuniyu, davem, edumazet, kuba, pabeni, netdev
Hello:
This patch was applied to netdev/net-next.git (main)
by David S. Miller <davem@davemloft.net>:
On Mon, 21 Aug 2023 16:45:23 -0500 you wrote:
> BPF programs that run on connect can rewrite the connect address. For
> the connect system call this isn't a problem, because a copy of the address
> is made when it is moved into kernel space. However, kernel_connect
> simply passes through the address it is given, so the caller may observe
> its address value unexpectedly change.
>
> A practical example where this is problematic is where NFS is combined
> with a system such as Cilium which implements BPF-based load balancing.
> A common pattern in software-defined storage systems is to have an NFS
> mount that connects to a persistent virtual IP which in turn maps to an
> ephemeral server IP. This is usually done to achieve high availability:
> if your server goes down you can quickly spin up a replacement and remap
> the virtual IP to that endpoint. With BPF-based load balancing, mounts
> will forget the virtual IP address when the address rewrite occurs
> because a pointer to the only copy of that address is passed down the
> stack. Server failover then breaks, because clients have forgotten the
> virtual IP address. Reconnects fail and mounts remain broken. This patch
> was tested by setting up a scenario like this and ensuring that NFS
> reconnects worked after applying the patch.
>
> [...]
Here is the summary with links:
- [net-next,v3] net: Avoid address overwrite in kernel_connect
https://git.kernel.org/netdev/net-next/c/0bdf399342c5
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html