linux-hyperv.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Bobby Eshleman <bobbyeshleman@gmail.com>
To: "Stefano Garzarella" <sgarzare@redhat.com>,
	"Shuah Khan" <shuah@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	"Eric Dumazet" <edumazet@google.com>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Simon Horman" <horms@kernel.org>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Jason Wang" <jasowang@redhat.com>,
	"Xuan Zhuo" <xuanzhuo@linux.alibaba.com>,
	"Eugenio Pérez" <eperezma@redhat.com>,
	"K. Y. Srinivasan" <kys@microsoft.com>,
	"Haiyang Zhang" <haiyangz@microsoft.com>,
	"Wei Liu" <wei.liu@kernel.org>,
	"Dexuan Cui" <decui@microsoft.com>,
	"Bryan Tan" <bryan-bt.tan@broadcom.com>,
	"Vishnu Dasa" <vishnu.dasa@broadcom.com>,
	"Broadcom internal kernel review list"
	<bcm-kernel-feedback-list@broadcom.com>,
	"Bobby Eshleman" <bobbyeshleman@gmail.com>
Cc: virtualization@lists.linux.dev, netdev@vger.kernel.org,
	 linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org,
	 kvm@vger.kernel.org, linux-hyperv@vger.kernel.org,
	 Sargun Dhillon <sargun@sargun.me>,
	berrange@redhat.com,  Bobby Eshleman <bobbyeshleman@meta.com>
Subject: [PATCH net-next v9 08/14] vsock: reject bad VSOCK_NET_MODE_LOCAL configuration for G2H
Date: Tue, 11 Nov 2025 22:54:50 -0800	[thread overview]
Message-ID: <20251111-vsock-vmtest-v9-8-852787a37bed@meta.com> (raw)
In-Reply-To: <20251111-vsock-vmtest-v9-0-852787a37bed@meta.com>

From: Bobby Eshleman <bobbyeshleman@meta.com>

Reject setting VSOCK_NET_MODE_LOCAL with -EOPNOTSUPP if a G2H transport
is operational. Additionally, reject G2H transport registration if there
already exists a namespace in local mode.

G2H sockets break in local mode because the G2H transports don't support
namespacing yet. The current approach is to coerce packets coming out of
G2H transports into VSOCK_NET_MODE_GLOBAL mode, but it is not possible
to coerce sockets in the same way because it cannot be deduced which
transport will be used by the socket. Specifically, when bound to
VMADDR_CID_ANY in a nested VM (both G2H and H2G available), it is not
until a packet is received and matched to the bound socket that we
assign the transport. This presents a chicken-and-egg problem, because
we need the namespace to lookup the socket and resolve the transport,
but we need the transport to know how to use the namespace during
lookup.

For that reason, this patch prevents VSOCK_NET_MODE_LOCAL from being
used on systems that support G2H, even nested systems that also have H2G
transports.

Local mode is blocked based on detecting the presence of G2H devices
(when possible, as hyperv is special). This means that a host kernel
with G2H support compiled in (or has the module loaded), will still
support local mode because there is no G2H (e.g., virtio-vsock) device
detected. This enables using the same kernel in the host and in the
guest, as we do in kselftest.

Systems with only namespace-aware transports (vhost-vsock, loopback) can
still use both VSOCK_NET_MODE_GLOBAL and VSOCK_NET_MODE_LOCAL modes as
intended.

The hyperv transport must be treated specially. Other G2H transports can
can report presence of a device using get_local_cid(). When a device is
present it returns a valid CID; otherwise, it returns VMADDR_CID_ANY.
THe hyperv transport's get_local_cid() always returns VMADDR_CID_ANY,
however, even when a device is present.

For that reason, this patch adds an always_block_local_mode flag to
struct vsock_transport. When set to true, VSOCK_NET_MODE_LOCAL is
blocked unconditionally whenever the transport is registered, regardless
of device presence. When false, LOCAL mode is only blocked when
get_local_cid() indicates a device is present (!= VMADDR_CID_ANY).

The hyperv transport sets this flag to true to unconditionally block
local mode. Other G2H transports (virtio-vsock, vmci-vsock) leave it
false and continue using device detection via get_local_cid() to block
local mode.

These restrictions can be lifted in a future patch series when G2H
transports gain namespace support.

Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
---
 include/net/af_vsock.h           |  8 +++++++
 net/vmw_vsock/af_vsock.c         | 45 +++++++++++++++++++++++++++++++++++++---
 net/vmw_vsock/hyperv_transport.c |  1 +
 3 files changed, 51 insertions(+), 3 deletions(-)

diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index cfd121bb5ab7..089c61105dda 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -108,6 +108,14 @@ struct vsock_transport_send_notify_data {
 
 struct vsock_transport {
 	struct module *module;
+	/* If true, block VSOCK_NET_MODE_LOCAL unconditionally when this G2H
+	 * transport is registered. If false, only block LOCAL mode when
+	 * get_local_cid() indicates a device is present (!= VMADDR_CID_ANY).
+	 * Hyperv sets this true because it doesn't offer a callback that
+	 * detects device presence. This only applies to G2H transports; H2G
+	 * transports are unaffected.
+	 */
+	bool always_block_local_mode;
 
 	/* Initialize/tear-down socket. */
 	int (*init)(struct vsock_sock *, struct vsock_sock *);
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index c0b5946bdc95..a2da1810b802 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -91,6 +91,11 @@
  *   and locked down by a namespace manager. The default is "global". The mode
  *   is set per-namespace.
  *
+ *   Note: LOCAL mode is only supported when using namespace-aware transports
+ *   (vhost-vsock, loopback). If a guest-to-host transport (virtio-vsock,
+ *   hyperv-vsock, vmci-vsock) is loaded, attempts to set LOCAL mode will fail
+ *   with EOPNOTSUPP, as these transports do not support per-namespace isolation.
+ *
  *   The modes affect the allocation and accessibility of CIDs as follows:
  *
  *   - global - access and allocation are all system-wide
@@ -2757,12 +2762,30 @@ static int vsock_net_mode_string(const struct ctl_table *table, int write,
 		if (*lenp >= sizeof(data))
 			return -EINVAL;
 
-		if (!strncmp(data, VSOCK_NET_MODE_STR_GLOBAL, sizeof(data)))
+		if (!strncmp(data, VSOCK_NET_MODE_STR_GLOBAL, sizeof(data))) {
 			mode = VSOCK_NET_MODE_GLOBAL;
-		else if (!strncmp(data, VSOCK_NET_MODE_STR_LOCAL, sizeof(data)))
+		} else if (!strncmp(data, VSOCK_NET_MODE_STR_LOCAL, sizeof(data))) {
+			/* LOCAL mode is not supported when G2H transports
+			 * (virtio-vsock, hyperv, vmci) are active, because
+			 * these transports don't support namespaces. We must
+			 * stay in GLOBAL mode to avoid bind/lookup mismatches.
+			 *
+			 * Check if G2H transport is present and either:
+			 * 1. Has always_block_local_mode set (hyperv), OR
+			 * 2. Has an actual device present (get_local_cid() != VMADDR_CID_ANY)
+			 */
+			mutex_lock(&vsock_register_mutex);
+			if (transport_g2h &&
+			    (transport_g2h->always_block_local_mode ||
+			     transport_g2h->get_local_cid() != VMADDR_CID_ANY)) {
+				mutex_unlock(&vsock_register_mutex);
+				return -EOPNOTSUPP;
+			}
+			mutex_unlock(&vsock_register_mutex);
 			mode = VSOCK_NET_MODE_LOCAL;
-		else
+		} else {
 			return -EINVAL;
+		}
 
 		if (!vsock_net_write_mode(net, mode))
 			return -EPERM;
@@ -2909,6 +2932,7 @@ int vsock_core_register(const struct vsock_transport *t, int features)
 {
 	const struct vsock_transport *t_h2g, *t_g2h, *t_dgram, *t_local;
 	int err = mutex_lock_interruptible(&vsock_register_mutex);
+	struct net *net;
 
 	if (err)
 		return err;
@@ -2931,6 +2955,21 @@ int vsock_core_register(const struct vsock_transport *t, int features)
 			err = -EBUSY;
 			goto err_busy;
 		}
+
+		/* G2H sockets break in LOCAL mode namespaces because G2H transports
+		 * don't support them yet. Block registering new G2H transports if we
+		 * already have local mode namespaces on the system.
+		 */
+		rcu_read_lock();
+		for_each_net_rcu(net) {
+			if (vsock_net_mode(net) == VSOCK_NET_MODE_LOCAL) {
+				rcu_read_unlock();
+				err = -EOPNOTSUPP;
+				goto err_busy;
+			}
+		}
+		rcu_read_unlock();
+
 		t_g2h = t;
 	}
 
diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
index 432fcbbd14d4..ed48dd1ff19b 100644
--- a/net/vmw_vsock/hyperv_transport.c
+++ b/net/vmw_vsock/hyperv_transport.c
@@ -835,6 +835,7 @@ int hvs_notify_set_rcvlowat(struct vsock_sock *vsk, int val)
 
 static struct vsock_transport hvs_transport = {
 	.module                   = THIS_MODULE,
+	.always_block_local_mode  = true,
 
 	.get_local_cid            = hvs_get_local_cid,
 

-- 
2.47.3


  parent reply	other threads:[~2025-11-12  6:55 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-12  6:54 [PATCH net-next v9 00/14] vsock: add namespace support to vhost-vsock and loopback Bobby Eshleman
2025-11-12  6:54 ` [PATCH net-next v9 01/14] vsock: a per-net vsock NS mode state Bobby Eshleman
2025-11-12 14:13   ` Stefano Garzarella
2025-11-12  6:54 ` [PATCH net-next v9 02/14] vsock: add netns to vsock core Bobby Eshleman
2025-11-12 14:14   ` Stefano Garzarella
2025-11-12  6:54 ` [PATCH net-next v9 03/14] vsock/virtio: add netns support to virtio transport and virtio common Bobby Eshleman
2025-11-12 14:18   ` Stefano Garzarella
2025-11-12 16:13     ` Bobby Eshleman
2025-11-12 17:39       ` Stefano Garzarella
2025-11-12 19:32         ` Bobby Eshleman
2025-11-13 15:31           ` Stefano Garzarella
2025-11-12  6:54 ` [PATCH net-next v9 04/14] vsock/virtio: pack struct virtio_vsock_skb_cb Bobby Eshleman
2025-11-12  6:54 ` [PATCH net-next v9 05/14] vsock: add netns and netns_tracker to vsock skb cb Bobby Eshleman
2025-11-12  6:54 ` [PATCH net-next v9 06/14] vsock/loopback: add netns support Bobby Eshleman
2025-11-12 14:19   ` Stefano Garzarella
2025-11-12 18:27     ` Bobby Eshleman
2025-11-13 15:24       ` Stefano Garzarella
2025-11-13 18:26         ` Bobby Eshleman
2025-11-14  9:33           ` Stefano Garzarella
2025-11-14 22:13             ` Bobby Eshleman
2025-11-17  9:27               ` Stefano Garzarella
2025-11-12  6:54 ` [PATCH net-next v9 07/14] vhost/vsock: " Bobby Eshleman
2025-11-12  6:54 ` Bobby Eshleman [this message]
2025-11-12 14:21   ` [PATCH net-next v9 08/14] vsock: reject bad VSOCK_NET_MODE_LOCAL configuration for G2H Stefano Garzarella
2025-11-12 18:36     ` Bobby Eshleman
2025-11-12  6:54 ` [PATCH net-next v9 09/14] selftests/vsock: add namespace helpers to vmtest.sh Bobby Eshleman
2025-11-12  6:54 ` [PATCH net-next v9 10/14] selftests/vsock: prepare vm management helpers for namespaces Bobby Eshleman
2025-11-12 14:23   ` Stefano Garzarella
2025-11-12  6:54 ` [PATCH net-next v9 11/14] selftests/vsock: add tests for proc sys vsock ns_mode Bobby Eshleman
2025-11-12 14:38   ` Stefano Garzarella
2025-11-12  6:54 ` [PATCH net-next v9 12/14] selftests/vsock: add namespace tests for CID collisions Bobby Eshleman
2025-11-12  6:54 ` [PATCH net-next v9 13/14] selftests/vsock: add tests for host <-> vm connectivity with namespaces Bobby Eshleman
2025-11-12 14:41   ` Stefano Garzarella
2025-11-12  6:54 ` [PATCH net-next v9 14/14] selftests/vsock: add tests for namespace deletion and mode changes Bobby Eshleman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251111-vsock-vmtest-v9-8-852787a37bed@meta.com \
    --to=bobbyeshleman@gmail.com \
    --cc=bcm-kernel-feedback-list@broadcom.com \
    --cc=berrange@redhat.com \
    --cc=bobbyeshleman@meta.com \
    --cc=bryan-bt.tan@broadcom.com \
    --cc=davem@davemloft.net \
    --cc=decui@microsoft.com \
    --cc=edumazet@google.com \
    --cc=eperezma@redhat.com \
    --cc=haiyangz@microsoft.com \
    --cc=horms@kernel.org \
    --cc=jasowang@redhat.com \
    --cc=kuba@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=kys@microsoft.com \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sargun@sargun.me \
    --cc=sgarzare@redhat.com \
    --cc=shuah@kernel.org \
    --cc=stefanha@redhat.com \
    --cc=virtualization@lists.linux.dev \
    --cc=vishnu.dasa@broadcom.com \
    --cc=wei.liu@kernel.org \
    --cc=xuanzhuo@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).