From: Bobby Eshleman <bobbyeshleman@gmail.com>
To: "Stefano Garzarella" <sgarzare@redhat.com>,
"Shuah Khan" <shuah@kernel.org>,
"David S. Miller" <davem@davemloft.net>,
"Eric Dumazet" <edumazet@google.com>,
"Jakub Kicinski" <kuba@kernel.org>,
"Paolo Abeni" <pabeni@redhat.com>,
"Simon Horman" <horms@kernel.org>,
"Stefan Hajnoczi" <stefanha@redhat.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
"Jason Wang" <jasowang@redhat.com>,
"Xuan Zhuo" <xuanzhuo@linux.alibaba.com>,
"Eugenio Pérez" <eperezma@redhat.com>,
"K. Y. Srinivasan" <kys@microsoft.com>,
"Haiyang Zhang" <haiyangz@microsoft.com>,
"Wei Liu" <wei.liu@kernel.org>,
"Dexuan Cui" <decui@microsoft.com>,
"Bryan Tan" <bryan-bt.tan@broadcom.com>,
"Vishnu Dasa" <vishnu.dasa@broadcom.com>,
"Broadcom internal kernel review list"
<bcm-kernel-feedback-list@broadcom.com>,
"Bobby Eshleman" <bobbyeshleman@gmail.com>
Cc: virtualization@lists.linux.dev, netdev@vger.kernel.org,
linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org,
kvm@vger.kernel.org, linux-hyperv@vger.kernel.org,
berrange@redhat.com, Bobby Eshleman <bobbyeshleman@meta.com>
Subject: [PATCH net-next v7 01/26] vsock: a per-net vsock NS mode state
Date: Tue, 21 Oct 2025 16:46:44 -0700
Message-ID: <20251021-vsock-vmtest-v7-1-0661b7b6f081@meta.com>
In-Reply-To: <20251021-vsock-vmtest-v7-0-0661b7b6f081@meta.com>
From: Bobby Eshleman <bobbyeshleman@meta.com>

Add the per-net vsock namespace (NS) mode state. This adds only the
structure that holds the mode and the helpers for reading, writing, and
checking it; nothing uses them yet.

Add a "net_mode" field to vsock_sock to record the mode its namespace
had when the socket was created. Evaluating the namespace mode rules
requires knowing both a) which namespace each endpoint is in, and b)
what mode that namespace had when the endpoint was created. Remembering
the creation-time mode lets a socket created under global mode keep
working after its namespace switches from global to local. If the
lookup used the net's current mode instead, it would fail and the
socket would break.

Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
---
Changes in v7:
- clarify vsock_net_check_mode() comments
- change the cross-namespace check to
  `orig_net_mode == VSOCK_NET_MODE_GLOBAL && orig_net_mode == vsk->orig_net_mode`
- remove extraneous explanation of `orig_net_mode`
- rename `written` to `mode_locked`
- rename `vsock_hdr` to `sysctl_hdr`
- change `orig_net_mode` to `net_mode`
- make vsock_net_check_mode() more generic by taking just net pointers
  and modes, instead of a vsock_sock ptr, for reuse by transports
  (e.g., vhost_vsock)
Changes in v6:
- add orig_net_mode to store mode at creation time which will be used to
  avoid breakage when namespace changes mode during socket/VM lifespan
Changes in v5:
- use /proc/sys/net/vsock/ns_mode instead of /proc/net/vsock_ns_mode
- change from net->vsock.ns_mode to net->vsock.mode
- change vsock_net_set_mode() to vsock_net_write_mode()
- vsock_net_write_mode() returns bool for write success to avoid
  need to use vsock_net_mode_can_set()
- remove vsock_net_mode_can_set()
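
For readers who want to poke at the mode rules outside the kernel, here
is a minimal user-space sketch of the semantics described above. It is a
simplified model, not the kernel code: every model_* name is hypothetical,
namespaces are compared by pointer instead of net_eq(), and the spinlock
that protects the mode in the patch is left out entirely.

```c
/* User-space model only. All model_* names are hypothetical stand-ins;
 * the real helpers live in include/net/af_vsock.h and take a lock.
 */
#include <assert.h>
#include <stdbool.h>

enum model_mode { MODEL_MODE_GLOBAL, MODEL_MODE_LOCAL };

/* Stand-in for the per-net state (struct netns_vsock in the patch). */
struct model_ns {
	enum model_mode mode;
	bool mode_locked;
};

/* Stand-in for a vsock_sock: its namespace plus the mode seen at creation. */
struct model_sock {
	const struct model_ns *ns;
	enum model_mode net_mode;
};

/* Write-once setter: the first write wins, later writes are rejected. */
static bool model_write_mode(struct model_ns *ns, enum model_mode mode)
{
	if (ns->mode_locked)
		return false;

	ns->mode = mode;
	ns->mode_locked = true;
	return true;
}

/* Snapshot the namespace mode when the socket is created. */
static struct model_sock model_sock_create(const struct model_ns *ns)
{
	struct model_sock sk = { .ns = ns, .net_mode = ns->mode };

	return sk;
}

/* Same rule as vsock_net_check_mode(): same namespace is always fine,
 * different namespaces need both modes to be global.
 */
static bool model_check_mode(const struct model_ns *ns0, enum model_mode mode0,
			     const struct model_ns *ns1, enum model_mode mode1)
{
	if (ns0 == ns1)
		return true;

	return mode0 == MODEL_MODE_GLOBAL && mode0 == mode1;
}

/* Apply the rule using the creation-time modes, not the current ones. */
static bool model_reachable(const struct model_sock *a,
			    const struct model_sock *b)
{
	return model_check_mode(a->ns, a->net_mode, b->ns, b->net_mode);
}
```

In this model, a socket created while its namespace was still global stays
reachable from another global namespace after the first namespace is
switched to local, because model_reachable() consults the stored net_mode.
A socket created in that namespace after the switch is reachable only from
within the same namespace.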
---
MAINTAINERS | 1 +
include/net/af_vsock.h | 64 +++++++++++++++++++++++++++++++++++++++++++++
include/net/net_namespace.h | 4 +++
include/net/netns/vsock.h | 20 ++++++++++++++
4 files changed, 89 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 4faa7719bf86..c58f9e38898a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -27062,6 +27062,7 @@ L: netdev@vger.kernel.org
S: Maintained
F: drivers/vhost/vsock.c
F: include/linux/virtio_vsock.h
+F: include/net/netns/vsock.h
F: include/uapi/linux/virtio_vsock.h
F: net/vmw_vsock/virtio_transport.c
F: net/vmw_vsock/virtio_transport_common.c
diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index d40e978126e3..a1053d3668cf 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -10,6 +10,7 @@
#include <linux/kernel.h>
#include <linux/workqueue.h>
+#include <net/netns/vsock.h>
#include <net/sock.h>
#include <uapi/linux/vm_sockets.h>
@@ -65,6 +66,7 @@ struct vsock_sock {
u32 peer_shutdown;
bool sent_request;
bool ignore_connecting_rst;
+ enum vsock_net_mode net_mode;
/* Protected by lock_sock(sk) */
u64 buffer_size;
@@ -256,4 +258,66 @@ static inline bool vsock_msgzerocopy_allow(const struct vsock_transport *t)
{
return t->msgzerocopy_allow && t->msgzerocopy_allow();
}
+
+static inline enum vsock_net_mode vsock_net_mode(struct net *net)
+{
+ enum vsock_net_mode ret;
+
+ spin_lock_bh(&net->vsock.lock);
+ ret = net->vsock.mode;
+ spin_unlock_bh(&net->vsock.lock);
+ return ret;
+}
+
+static inline bool vsock_net_write_mode(struct net *net, u8 mode)
+{
+ bool ret;
+
+ spin_lock_bh(&net->vsock.lock);
+
+ if (net->vsock.mode_locked) {
+ ret = false;
+ goto skip;
+ }
+
+ net->vsock.mode = mode;
+ net->vsock.mode_locked = true;
+ ret = true;
+
+skip:
+ spin_unlock_bh(&net->vsock.lock);
+ return ret;
+}
+
+/* Return true if two namespaces and modes pass the mode rules. Otherwise,
+ * return false.
+ *
+ * ns0 and ns1 are the namespaces being checked.
+ * mode0 and mode1 are the vsock namespace modes of ns0 and ns1.
+ *
+ * Read more about modes in the comment header of net/vmw_vsock/af_vsock.c.
+ */
+static inline bool vsock_net_check_mode(struct net *ns0, enum vsock_net_mode mode0,
+ struct net *ns1, enum vsock_net_mode mode1)
+{
+ /* Any vsocks within the same network namespace are always reachable,
+ * regardless of the mode.
+ */
+ if (net_eq(ns0, ns1))
+ return true;
+
+ /*
+ * If the network namespaces differ, vsocks are only reachable if both
+ * were created in VSOCK_NET_MODE_GLOBAL mode.
+ *
+ * The vsock namespace mode is write-once, and the default is
+ * VSOCK_NET_MODE_GLOBAL. Once set to VSOCK_NET_MODE_LOCAL, it cannot
+ * revert to GLOBAL. It is not possible to have a case where a socket
+ * was created in LOCAL mode, and then the mode switched to GLOBAL.
+ *
+ * As a result, we only need to check if the modes were global at
+ * creation time.
+ */
+ return mode0 == VSOCK_NET_MODE_GLOBAL && mode0 == mode1;
+}
#endif /* __AF_VSOCK_H__ */
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index cb664f6e3558..66d3de1d935f 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -37,6 +37,7 @@
#include <net/netns/smc.h>
#include <net/netns/bpf.h>
#include <net/netns/mctp.h>
+#include <net/netns/vsock.h>
#include <net/net_trackers.h>
#include <linux/ns_common.h>
#include <linux/idr.h>
@@ -196,6 +197,9 @@ struct net {
/* Move to a better place when the config guard is removed. */
struct mutex rtnl_mutex;
#endif
+#if IS_ENABLED(CONFIG_VSOCKETS)
+ struct netns_vsock vsock;
+#endif
} __randomize_layout;
#include <linux/seq_file_net.h>
diff --git a/include/net/netns/vsock.h b/include/net/netns/vsock.h
new file mode 100644
index 000000000000..c9a438ad52f2
--- /dev/null
+++ b/include/net/netns/vsock.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __NET_NET_NAMESPACE_VSOCK_H
+#define __NET_NET_NAMESPACE_VSOCK_H
+
+#include <linux/types.h>
+
+enum vsock_net_mode {
+ VSOCK_NET_MODE_GLOBAL,
+ VSOCK_NET_MODE_LOCAL,
+};
+
+struct netns_vsock {
+ struct ctl_table_header *sysctl_hdr;
+ spinlock_t lock;
+
+ /* protected by lock */
+ enum vsock_net_mode mode;
+ bool mode_locked;
+};
+#endif /* __NET_NET_NAMESPACE_VSOCK_H */
--
2.47.3
Thread overview: 29+ messages
2025-10-21 23:46 [PATCH net-next v7 00/26] vsock: add namespace support to vhost-vsock Bobby Eshleman
2025-10-21 23:46 ` Bobby Eshleman [this message]
2025-10-21 23:46 ` [PATCH net-next v7 02/26] vsock/virtio: pack struct virtio_vsock_skb_cb Bobby Eshleman
2025-10-21 23:46 ` [PATCH net-next v7 03/26] vsock: add netns to vsock skb cb Bobby Eshleman
2025-10-21 23:46 ` [PATCH net-next v7 04/26] vsock: add netns to vsock core Bobby Eshleman
2025-10-21 23:46 ` [PATCH net-next v7 05/26] vsock/loopback: add netns support Bobby Eshleman
2025-10-21 23:46 ` [PATCH net-next v7 06/26] vsock/virtio: add netns to virtio transport common Bobby Eshleman
2025-10-21 23:46 ` [PATCH net-next v7 07/26] vhost/vsock: add netns support Bobby Eshleman
2025-10-21 23:46 ` [PATCH net-next v7 08/26] selftests/vsock: improve logging in vmtest.sh Bobby Eshleman
2025-10-22 0:01 ` Jakub Kicinski
2025-10-22 0:18 ` Bobby Eshleman
2025-10-21 23:46 ` [PATCH net-next v7 09/26] selftests/vsock: make wait_for_listener() work even if pipefail is on Bobby Eshleman
2025-10-21 23:46 ` [PATCH net-next v7 10/26] selftests/vsock: reuse logic for vsock_test through wrapper functions Bobby Eshleman
2025-10-21 23:46 ` [PATCH net-next v7 11/26] selftests/vsock: avoid multi-VM pidfile collisions with QEMU Bobby Eshleman
2025-10-21 23:46 ` [PATCH net-next v7 12/26] selftests/vsock: do not unconditionally die if qemu fails Bobby Eshleman
2025-10-21 23:46 ` [PATCH net-next v7 13/26] selftests/vsock: speed up tests by reducing the QEMU pidfile timeout Bobby Eshleman
2025-10-21 23:46 ` [PATCH net-next v7 14/26] selftests/vsock: add check_result() for pass/fail counting Bobby Eshleman
2025-10-21 23:46 ` [PATCH net-next v7 15/26] selftests/vsock: identify and execute tests that can re-use VM Bobby Eshleman
2025-10-21 23:46 ` [PATCH net-next v7 16/26] selftests/vsock: add namespace initialization function Bobby Eshleman
2025-10-21 23:47 ` [PATCH net-next v7 17/26] selftests/vsock: remove namespaces in cleanup() Bobby Eshleman
2025-10-21 23:47 ` [PATCH net-next v7 18/26] selftests/vsock: prepare vm management helpers for namespaces Bobby Eshleman
2025-10-21 23:47 ` [PATCH net-next v7 19/26] selftests/vsock: add BUILD=0 definition Bobby Eshleman
2025-10-21 23:47 ` [PATCH net-next v7 20/26] selftests/vsock: avoid false-positives when checking dmesg Bobby Eshleman
2025-10-21 23:47 ` [PATCH net-next v7 21/26] selftests/vsock: add tests for proc sys vsock ns_mode Bobby Eshleman
2025-10-21 23:47 ` [PATCH net-next v7 22/26] selftests/vsock: add namespace tests for CID collisions Bobby Eshleman
2025-10-21 23:47 ` [PATCH net-next v7 23/26] selftests/vsock: add tests for host <-> vm connectivity with namespaces Bobby Eshleman
2025-10-21 23:47 ` [PATCH net-next v7 24/26] selftests/vsock: add tests for namespace deletion and mode changes Bobby Eshleman
2025-10-21 23:47 ` [PATCH net-next v7 25/26] selftests/vsock: add tests for module loading order Bobby Eshleman
2025-10-21 23:47 ` [PATCH net-next v7 26/26] selftests/vsock: add 1.37 to tested virtme-ng versions Bobby Eshleman