From: Stefano Garzarella <sgarzare@redhat.com>
To: Alexander Graf <graf@amazon.com>,
Bryan Tan <bryan-bt.tan@broadcom.com>,
Vishnu Dasa <vishnu.dasa@broadcom.com>,
Broadcom internal kernel review list
<bcm-kernel-feedback-list@broadcom.com>
Cc: virtualization@lists.linux.dev, linux-kernel@vger.kernel.org,
netdev@vger.kernel.org, kvm@vger.kernel.org,
eperezma@redhat.com, Jason Wang <jasowang@redhat.com>,
mst@redhat.com, Stefan Hajnoczi <stefanha@redhat.com>,
nh-open-source@amazon.com
Subject: Re: [PATCH] vsock: Enable H2G override
Date: Mon, 2 Mar 2026 13:06:51 +0100
Message-ID: <aaV80wWlpjEtYCQJ@sgarzare-redhat>
In-Reply-To: <aaVrsXMmULivV4Se@sgarzare-redhat>
CCing Bryan, Vishnu, and Broadcom list.
On Mon, Mar 02, 2026 at 12:47:05PM +0100, Stefano Garzarella wrote:
>
>Please target net-next tree for this new feature.
>
>On Mon, Mar 02, 2026 at 10:41:38AM +0000, Alexander Graf wrote:
>>Vsock maintains a single CID number space which can be used to
>>communicate to the host (G2H) or to a child-VM (H2G). The current logic
>>trivially assumes that G2H is only relevant for CID <= 2 because these
>>target the hypervisor. However, in environments like Nitro Enclaves, an
>>instance that hosts vhost_vsock powered VMs may still want to communicate
>>to Enclaves that are reachable at higher CIDs through virtio-vsock-pci.
>>
>>That means that for CID > 2, we really want an overlay. By default, all
>>CIDs are owned by the hypervisor. But if vhost registers a CID, it takes
>>precedence. Implement that logic. Vhost already knows which CIDs it
>>supports anyway.
>>
>>With this logic, I can run a Nitro Enclave as well as a nested VM with
>>vhost-vsock support in parallel, with the parent instance able to
>>communicate to both simultaneously.
>
>I honestly don't understand why VMADDR_FLAG_TO_HOST (added
>specifically for Nitro IIRC) isn't enough for this scenario and we
>have to add this change. Can you elaborate a bit more about the
>relationship between this change and VMADDR_FLAG_TO_HOST we added?
>
>>
>>Signed-off-by: Alexander Graf <graf@amazon.com>
>>---
>>drivers/vhost/vsock.c | 11 +++++++++++
>>include/net/af_vsock.h | 3 +++
>>net/vmw_vsock/af_vsock.c | 3 +++
>>3 files changed, 17 insertions(+)
>>
>>diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
>>index 054f7a718f50..223da817e305 100644
>>--- a/drivers/vhost/vsock.c
>>+++ b/drivers/vhost/vsock.c
>>@@ -91,6 +91,16 @@ static struct vhost_vsock *vhost_vsock_get(u32 guest_cid, struct net *net)
>> 	return NULL;
>>}
>>
>>+static bool vhost_transport_has_cid(u32 cid)
>>+{
>>+	bool found;
>>+
>>+	rcu_read_lock();
>>+	found = vhost_vsock_get(cid) != NULL;
>
>We recently added namespaces support that changed vhost_vsock_get()
>params. This is also in net tree now and in Linus' tree, so not sure
>where this patch is based, but this needs to be rebased since it is
>not building:
>
>../drivers/vhost/vsock.c: In function ‘vhost_transport_has_cid’:
>../drivers/vhost/vsock.c:99:17: error: too few arguments to function ‘vhost_vsock_get’; expected 2, have 1
>   99 |         found = vhost_vsock_get(cid) != NULL;
>      |                 ^~~~~~~~~~~~~~~
>../drivers/vhost/vsock.c:74:28: note: declared here
>   74 | static struct vhost_vsock *vhost_vsock_get(u32 guest_cid, struct net *net)
>      |
>
>>+	rcu_read_unlock();
>>+	return found;
>>+}
>>+
>>static void
>>vhost_transport_do_send_pkt(struct vhost_vsock *vsock,
>> struct vhost_virtqueue *vq)
>>@@ -424,6 +434,7 @@ static struct virtio_transport vhost_transport = {
>> .module = THIS_MODULE,
>>
>> .get_local_cid = vhost_transport_get_local_cid,
>>+ .has_cid = vhost_transport_has_cid,
>>
>> .init = virtio_transport_do_socket_init,
>> .destruct = virtio_transport_destruct,
>>diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
>>index 533d8e75f7bb..4cdcb72f9765 100644
>>--- a/include/net/af_vsock.h
>>+++ b/include/net/af_vsock.h
>>@@ -179,6 +179,9 @@ struct vsock_transport {
>> 	/* Addressing. */
>> 	u32 (*get_local_cid)(void);
>>
>>+	/* Check if this transport serves a specific remote CID. */
>>+	bool (*has_cid)(u32 cid);
>
>What about "has_remote_cid" ?
>
>>+
>> 	/* Read a single skb */
>> 	int (*read_skb)(struct vsock_sock *, skb_read_actor_t);
>>
>>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>>index 2f7d94d682cb..8b34b264b246 100644
>>--- a/net/vmw_vsock/af_vsock.c
>>+++ b/net/vmw_vsock/af_vsock.c
>>@@ -584,6 +584,9 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk)
>> 		else if (remote_cid <= VMADDR_CID_HOST || !transport_h2g ||
>> 			 (remote_flags & VMADDR_FLAG_TO_HOST))
>> 			new_transport = transport_g2h;
>>+		else if (transport_h2g->has_cid &&
>>+			 !transport_h2g->has_cid(remote_cid))
>>+			new_transport = transport_g2h;
>
>We should update the comment on top of this function, and maybe also
>try to support the other H2G transport (i.e. VMCI).
>
>@Bryan @Vishnu can the new has_cid()/has_remote_cid() be supported by
>VMCI too?
Oops, I forgot to CC them; they should be in copy now.
Stefano
>
>
>
>I have a question: until now, transport assignment was based simply on
>analyzing local socket information (vsk->remote_addr), but now we are
>also adding the status of other components (e.g., VMs that have
>started and registered the CID in vhost-vsock).
>
>Could this produce strange behavior?
>For example, two sockets with the same remote_addr communicate with
>the host or with the guest depending on whether or not the VM existed
>when they were created.
>
>Thanks,
>Stefano
>
>> 		else
>> 			new_transport = transport_h2g;
>> 		break;
>>--
>>2.47.1
>>
>>
>>
>>
>>Amazon Web Services Development Center Germany GmbH
>>Tamara-Danz-Str. 13
>>10243 Berlin
>>Geschaeftsfuehrung: Christof Hellmis, Andreas Stieger
>>Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
>>Sitz: Berlin
>>Ust-ID: DE 365 538 597
>>
>>
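
To make the question about state-dependent transport assignment
concrete, the branch touched by the af_vsock.c hunk can be modelled in
plain user space. This is only a sketch under stated assumptions, not
kernel code: model_assign_transport(), model_h2g_has_cid() and
registered_guest_cid are hypothetical names, the single-entry CID table
is a stand-in for the real vhost-vsock lookup, and only the conditions
quoted in the hunk are reproduced (the local-transport check that
precedes them is left out). The two constants carry the values defined
in the uapi header linux/vm_sockets.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/vm_sockets.h. */
#define VMADDR_CID_HOST     2u
#define VMADDR_FLAG_TO_HOST 0x01u

enum transport { TRANSPORT_G2H, TRANSPORT_H2G };

/*
 * Hypothetical stand-in for the H2G transport's CID table:
 * a single registered guest CID, where 0 means "no guest registered".
 */
static uint32_t registered_guest_cid;

/* Models the proposed has_cid() hook: does H2G serve this remote CID? */
static bool model_h2g_has_cid(uint32_t cid)
{
	return registered_guest_cid != 0 && cid == registered_guest_cid;
}

/*
 * Models only the conditions quoted in the hunk, for a connectible
 * socket. have_h2g plays the role of "transport_h2g != NULL".
 */
static enum transport model_assign_transport(uint32_t remote_cid,
					     uint8_t remote_flags,
					     bool have_h2g)
{
	/* Existing rule: low CIDs, no H2G transport, or an explicit flag. */
	if (remote_cid <= VMADDR_CID_HOST || !have_h2g ||
	    (remote_flags & VMADDR_FLAG_TO_HOST))
		return TRANSPORT_G2H;

	/* Proposed rule: fall back to G2H when H2G does not own the CID. */
	if (!model_h2g_has_cid(remote_cid))
		return TRANSPORT_G2H;

	return TRANSPORT_H2G;
}
```

In this model, model_assign_transport(42, 0, true) returns TRANSPORT_G2H
while registered_guest_cid is 0, and TRANSPORT_H2G once
registered_guest_cid has been set to 42. Two calls with the same remote
address therefore pick different transports depending on when they run
relative to the guest registering its CID, which is the behaviour the
question above is asking about. Passing VMADDR_FLAG_TO_HOST selects
TRANSPORT_G2H in both cases.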