* [PATCH v6 RESEND] virtio-vsock: Add support for multi devices
From: Xuewei Niu @ 2025-03-24 6:43 UTC
To: sgarzare, parav, mst, fupan.lfp; +Cc: virtio-comment, Xuewei Niu

This patch brings a new feature, called "multi devices", to virtio
vsock. It introduces a VIRTIO_VSOCK_F_MULTI_DEVICES feature bit and a
"device_order" field in the virtio vsock device configuration.

== Motivation ==

Vsock is a lightweight and widely used data exchange mechanism between
host and guest. Currently, virtio-vsock supports only one device, which
makes it impossible to enable more than one backend. For instance, two
devices may be required: one to transfer data to the VMM via
virtio-vsock, and another to a user process via vhost-user-vsock.

Apart from that, a side gain is that performance might theoretically
improve, since each device has its own queues; this varies depending on
the implementation.

== Typical Usages ==

Assume there are two virtio-vsock devices on the guest, with CIDs 3 and
4 respectively, and that the device with CID 3 is the default.

Connect to the host using the device with CID 3 (pseudo-code):

```c
// use the default device (no bind)
fd = socket(AF_VSOCK);
connect(fd, 2, 1234);
n = write(fd, buffer);

// or bind explicitly
fd = socket(AF_VSOCK);
bind(fd, 3, -1);
connect(fd, 2, 1234);
n = write(fd, buffer);
```

Connect to the host using the device with CID 4:

```c
// must bind explicitly, as the device with CID 4 is not the default
fd = socket(AF_VSOCK);
bind(fd, 4, -1);
connect(fd, 2, 1234);
n = write(fd, buffer);
```

The first version of the multi-devices implementation is available at [1].
[1] https://lore.kernel.org/virtualization/20240517144607.2595798-1-niuxuewei.nxw@antgroup.com

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
---
 device-types/vsock/description.tex | 30 ++++++++++++++++++++++++++++--
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/device-types/vsock/description.tex b/device-types/vsock/description.tex
index 7d91d15..7d0cfe4 100644
--- a/device-types/vsock/description.tex
+++ b/device-types/vsock/description.tex
@@ -20,6 +20,7 @@ \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits}
 \item[VIRTIO_VSOCK_F_STREAM (0)] stream socket type is supported.
 \item[VIRTIO_VSOCK_F_SEQPACKET (1)] seqpacket socket type is supported.
 \item[VIRTIO_VSOCK_F_NO_IMPLIED_STREAM (2)] stream socket type is not implied.
+\item[VIRTIO_VSOCK_F_MULTI_DEVICES (3)] multiple devices feature is supported.
 \end{description}

 \drivernormative{\subsubsection}{Feature bits}{Device Types / Socket Device / Feature bits}
@@ -34,6 +35,12 @@ \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits}
 VIRTIO_VSOCK_F_NO_IMPLIED_STREAM, the driver MAY act as if
 VIRTIO_VSOCK_F_STREAM has also been negotiated.

+The driver SHOULD ignore devices that do not have
+VIRTIO_VSOCK_F_MULTI_DEVICES if the feature has been negotiated.
+
+The driver SHOULD ignore all subsequent devices if a device without
+VIRTIO_VSOCK_F_MULTI_DEVICES is present.
+
 \devicenormative{\subsubsection}{Feature bits}{Device Types / Socket Device / Feature bits}

 The device SHOULD offer the VIRTIO_VSOCK_F_NO_IMPLIED_STREAM feature.
@@ -52,6 +59,7 @@ \subsection{Device configuration layout}\label{sec:Device Types / Socket Device
 \begin{lstlisting}
 struct virtio_vsock_config {
 	le64 guest_cid;
+	le16 device_order;
 };
 \end{lstlisting}

@@ -77,11 +85,27 @@ \subsection{Device configuration layout}\label{sec:Device Types / Socket Device
 \hline
 \end{tabular}

+The \field{device_order} is used to identify the default device. Up to
+65,535 devices can be supported due to the size.
+
+\devicenormative{\subsubsection}{Device configuration layout}{Device Types / Socket Device / Device configuration layout}
+
+The device MUST provide a distinct \field{device_order} if
+VIRTIO_VSOCK_F_MULTI_DEVICES feature has been negotiated.
+
+\drivernormative{\subsubsection}{Device configuration layout}{Device Types / Socket Device / Device configuration layout}
+
+The driver MUST treat the device with the lowest \field{device_order} as
+the default device.
+
 \subsection{Device Initialization}\label{sec:Device Types / Socket Device / Device Initialization}

 \begin{enumerate}
 \item The guest's cid is read from \field{guest_cid}.

+\item If VIRTIO_VSOCK_F_MULTI_DEVICES has been negotiated, the device's
+order will be read from \field{device_order}.
+
 \item Buffers are added to the event virtqueue to receive events from the device.

 \item Buffers are added to the rx virtqueue to start receiving packets.
@@ -233,8 +257,10 @@ \subsubsection{Receive and Transmit}\label{sec:Device Types / Socket Device / De

 \drivernormative{\paragraph}{Device Operation: Receive and Transmit}{Device Types / Socket Device / Device Operation / Receive and Transmit}

-The \field{guest_cid} configuration field MUST be used as the source CID when
-sending outgoing packets.
+If \field{src_cid} is missing in outgoing packets, the driver MUST assign
+one. If more than one device is present, the driver SHOULD use the default
+device's \field{guest_cid} configuration. Otherwise, the driver SHOULD use
+the \field{guest_cid} of the only available device.

 A VIRTIO_VSOCK_OP_RST reply MUST be sent if a packet is received with an
 unknown \field{type} value.
--
2.34.1
* Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices
From: Stefano Garzarella @ 2025-03-24 13:51 UTC
To: Xuewei Niu; +Cc: parav, mst, fupan.lfp, virtio-comment, Xuewei Niu

On Mon, Mar 24, 2025 at 02:43:35PM +0800, Xuewei Niu wrote:
>This patch brings a new feature, called "multi devices", to the virtio
>vsock. It introduces a "VIRTIO_VSOCK_F_MULTI_DEVICES" feature bit, and a
>"device_order" field to the config for the virtio vsock.
>
>== Motivation ==
>
>Vsock is a lightweight and widely used data exchange mechanism between host
>and guest. Currently, virtio-vsock supports only one device, which makes it
>impossible to enable more than one backend. For instance, two devices
>are required: one to transfer data to the VMM via virtio-vsock,

Come to think of it, AF_VSOCK defines CID 0 (VMADDR_CID_HYPERVISOR) to
communicate with the hypervisor, but virtio-vsock has never supported
it. Could this be the use case?

In this way, we could add a new feature for devices that communicate
only with the VMM, where the CID of the VM is of little use. So instead
of having multiple CIDs per VM, we could continue to have a single CID,
but the transport could support 2 devices: one to communicate with the
VMM (CID = 0) and one to communicate with the host apps (CID = 2).

Maybe this is orthogonal to this proposal, though, because it might
still make sense to have multiple vsock devices, even though it's not
very clear to me.

> and another to a user process via vhost-user-vsock.

So to recap, one device would be used only to communicate with the VMM,
and the other device to communicate with other external processes,
right?

Do you have any other use cases?
>
>Apart from that, a side gain is that theoretically the performance might be
>improved since each device has its own queue. But it varies depending on
>the implementation.

This, though, might be easier to address by supporting multi-queue in
the device, instead of adding n devices to the VM.

>
>== Typical Usages ==
>
>Assuming there are two virtio-vsock devices on the guest, with CIDs 3 and 4
>respectively. And the device with CID 3 is default.
>
>Connect to the host using the device with CID 3.
>
>```c
>// use default one (no bind)
>fd = socket(AF_VSOCK);
>connect(fd, 2, 1234);
>n = write(fd, buffer);
>
>// or bind explicitly
>fd = socket(AF_VSOCK);
>bind(fd, 3, -1);
>connect(fd, 2, 1234);
>n = write(fd, buffer);
>```
>
>Connect to the host using the device with CID 4.
>
>```c
>// must bind explicitly as the device with CID 4 is not default.
>fd = socket(AF_VSOCK);
>bind(fd, 4, -1);
>connect(fd, 2, 1234);
>n = write(fd, buffer);
>```
>
>The first version of multi-devices implementation is available at [1].
>
>[1] https://lore.kernel.org/virtualization/20240517144607.2595798-1-niuxuewei.nxw@antgroup.com
>
>Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
>---
> device-types/vsock/description.tex | 30 ++++++++++++++++++++++++++++--
> 1 file changed, 28 insertions(+), 2 deletions(-)
>
>diff --git a/device-types/vsock/description.tex b/device-types/vsock/description.tex
>index 7d91d15..7d0cfe4 100644
>--- a/device-types/vsock/description.tex
>+++ b/device-types/vsock/description.tex
>@@ -20,6 +20,7 @@ \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits}
> \item[VIRTIO_VSOCK_F_STREAM (0)] stream socket type is supported.
> \item[VIRTIO_VSOCK_F_SEQPACKET (1)] seqpacket socket type is supported.
> \item[VIRTIO_VSOCK_F_NO_IMPLIED_STREAM (2)] stream socket type is not implied.
>+\item[VIRTIO_VSOCK_F_MULTI_DEVICES (3)] multiple devices feature is supported.
> \end{description} > > \drivernormative{\subsubsection}{Feature bits}{Device Types / Socket Device / Feature bits} >@@ -34,6 +35,12 @@ \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits} > VIRTIO_VSOCK_F_NO_IMPLIED_STREAM, the driver MAY act as if > VIRTIO_VSOCK_F_STREAM has also been negotiated. > >+The driver SHOULD ignore devices that do not have >+VIRTIO_VSOCK_F_MULTI_DEVICES if the feature has been negotiated. >+ >+The driver SHOULD ignore all subsequent devices if a device without >+VIRTIO_VSOCK_F_MULTI_DEVICES is present. >+ > \devicenormative{\subsubsection}{Feature bits}{Device Types / Socket Device / Feature bits} > > The device SHOULD offer the VIRTIO_VSOCK_F_NO_IMPLIED_STREAM feature. >@@ -52,6 +59,7 @@ \subsection{Device configuration layout}\label{sec:Device Types / Socket Device > \begin{lstlisting} > struct virtio_vsock_config { > le64 guest_cid; >+ le16 device_order; > }; > \end{lstlisting} > >@@ -77,11 +85,27 @@ \subsection{Device configuration layout}\label{sec:Device Types / Socket Device > \hline > \end{tabular} > >+The \field{device_order} is used to identify the default device. Up to >+65,535 devices can be supported due to the size. >+ >+\devicenormative{\subsubsection}{Device configuration layout}{Device Types / Socket Device / Device configuration layout} >+ >+The device MUST provide a distinct \field{device_order} if >+VIRTIO_VSOCK_F_MULTI_DEVICES feature has been negotiated. >+ >+\drivernormative{\subsubsection}{Device configuration layout}{Device Types / Socket Device / Device configuration layout} >+ >+The driver MUST treat the device with the lowest \field{device_order} as >+the default device. >+ > \subsection{Device Initialization}\label{sec:Device Types / Socket Device / Device Initialization} > > \begin{enumerate} > \item The guest's cid is read from \field{guest_cid}. > >+\item If VIRTIO_VSOCK_F_MULTI_DEVICES has been negotiated, the device's >+order will be read from \field{device_order}. 
>+
> \item Buffers are added to the event virtqueue to receive events from the device.
>
> \item Buffers are added to the rx virtqueue to start receiving packets.
>@@ -233,8 +257,10 @@ \subsubsection{Receive and Transmit}\label{sec:Device Types / Socket Device / De
>
> \drivernormative{\paragraph}{Device Operation: Receive and Transmit}{Device Types / Socket Device / Device Operation / Receive and Transmit}
>
>-The \field{guest_cid} configuration field MUST be used as the source CID when
>-sending outgoing packets.
>+If \field{src_cid} is missing in outgoing packets, the driver MUST assign

I think here we have to define what the driver does: since the driver
has to populate that field, how can it be missing?

Maybe we are confusing this with the interaction with user space, so we
should say something like, "If the source socket is not bound to any
source CID, the driver MUST assign ..."

Thanks,
Stefano

>+one. If more than one device is present, the driver SHOULD use the default
>+device's \field{guest_cid} configuration. Otherwise, the driver SHOULD use
>+the \field{guest_cid} of the only available device.
>
> A VIRTIO_VSOCK_OP_RST reply MUST be sent if a packet is received with an
> unknown \field{type} value.
>--
>2.34.1
* Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices 2025-03-24 13:51 ` Stefano Garzarella @ 2025-03-25 3:19 ` Xuewei Niu 2025-03-26 8:50 ` Stefano Garzarella 2025-03-26 2:59 ` Xuewei Niu 1 sibling, 1 reply; 23+ messages in thread From: Xuewei Niu @ 2025-03-25 3:19 UTC (permalink / raw) To: sgarzare Cc: fupan.lfp, mst, niuxuewei.nxw, niuxuewei97, parav, virtio-comment > On Mon, Mar 24, 2025 at 02:43:35PM +0800, Xuewei Niu wrote: > >This patch brings a new feature, called "multi devices", to the virtio > >vsock. It introduces a "VIRTIO_VSOCK_F_MULTI_DEVICES" feature bit, and a > >"device_order" field to the config for the virtio vsock. > > > >== Motivition == > > > >Vsock is a lightweight and widely used data exchange mechanism between host > >and guest. Currently, the virtio-vsock only supports one device, resulting > >in the inability to enable more than one backend. For instance, two devices > >are required: one to transfer data to the VMM via virtio-vsock, > > Come to think of it, AF_VSOCK defines CID 0 (VMADDR_CID_HYPERVISOR) to > communicate with the hypervisor, but in virtio-vsock we never supported > it. Could this be the use case? > > We could in this way add a new feature for those devices that > communicate only with the VMM, where the CID of the VM is quite useless. > So instead of having multiple CIDs per VM, we could continue to have a > single CID, but the transport could support 2 devices, one to > communicate with the VMM (CID = 0) and one to communicate with the host > apps (CID = 2). > > Maybe this is orthogonal to this proposal, though, because it might > still make sense to have multiple vsock devices, even though it's not > very clear to me. In terms of the current situation, two devices are enough. We are the team of Kata Containers, so we are focusing on cloud-native computing. What I mentioned below might be beyond the scope of the virtio spec, just for your reference. 
The background is that the architecture of proxy meshes has evolved over
the past few years: from per-pod to per-host (e.g. Istio Ambient Mesh[1]).

Thanks to TSI[2] and the vhost-user protocol, network packets can bypass
both the host and guest network stacks. It is possible to establish a fast
path between the pod and the proxy.

When we have multiple networks, it is intuitive to have multiple NICs. The
same goes for vsock.

When multiple networks are available, it is possible to have multiple
proxies (i.e. user processes). In this case, two devices are not enough.
This feature makes vsock more flexible and scalable.

It looks like you don't like the design of multiple devices. May I ask
why? Is it too heavy for you?

> > and another to a user process via vhost-user-vsock.
>
> So to recap, one device would be used only to communicate with the VMM,
> and the other device to communicate with other external processes,
> right?
>
> Do you have any other use cases?
>
> >Apart from that, a side gain is that theoretically the performance might be
> >improved since each device has its own queue. But it varies depending on
> >the implementation.
>
> This, though, might be easier to address by supporting multi-queue in
> the device, instead of adding n devices to the VM.

I think multi-queue and multi-device are independent of each other, just
like network devices. A single vsock device can be considered a group of
queues (if multi-queue is supported), and it can be assigned a thread to
handle the traffic.

So I accepted Parav's suggestion and mentioned it as a side gain.

> >== Typical Usages ==
> >
> >Assuming there are two virtio-vsock devices on the guest, with CIDs 3 and 4
> >respectively. And the device with CID 3 is default.
> >
> >Connect to the host using the device with CID 3.
> > > >```c > >// use default one (no bind) > >fd = socket(AF_VSOCK); > >connect(fd, 2, 1234); > >n = write(fd, buffer); > > > >// or bind explicitly > >fd = socket(AF_VSOCK); > >bind(fd, 3, -1); > >connect(fd, 2, 1234); > >n = write(fd, buffer); > >``` > > > >Connect to the host using the device with CID 4. > > > >```c > >// must bind explicitly as the device with CID 4 is not default. > >fd = socket(AF_VSOCK); > >bind(fd, 4, -1); > >connect(fd, 2, 1234); > >n = write(fd, buffer); > >``` > > > >The first version of multi-devices implementation is available at [1]. > > > >[1] https://lore.kernel.org/virtualization/20240517144607.2595798-1-niuxuewei.nxw@antgroup.com > > > >Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com> > >--- > > device-types/vsock/description.tex | 30 ++++++++++++++++++++++++++++-- > > 1 file changed, 28 insertions(+), 2 deletions(-) > > > >diff --git a/device-types/vsock/description.tex b/device-types/vsock/description.tex > >index 7d91d15..7d0cfe4 100644 > >--- a/device-types/vsock/description.tex > >+++ b/device-types/vsock/description.tex > >@@ -20,6 +20,7 @@ \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits} > > \item[VIRTIO_VSOCK_F_STREAM (0)] stream socket type is supported. > > \item[VIRTIO_VSOCK_F_SEQPACKET (1)] seqpacket socket type is supported. > > \item[VIRTIO_VSOCK_F_NO_IMPLIED_STREAM (2)] stream socket type is not implied. > >+\item[VIRTIO_VSOCK_F_MULTI_DEVICES (3)] multiple devices feature is supported. > > \end{description} > > > > \drivernormative{\subsubsection}{Feature bits}{Device Types / Socket Device / Feature bits} > >@@ -34,6 +35,12 @@ \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits} > > VIRTIO_VSOCK_F_NO_IMPLIED_STREAM, the driver MAY act as if > > VIRTIO_VSOCK_F_STREAM has also been negotiated. > > > >+The driver SHOULD ignore devices that do not have > >+VIRTIO_VSOCK_F_MULTI_DEVICES if the feature has been negotiated. 
> >+ > >+The driver SHOULD ignore all subsequent devices if a device without > >+VIRTIO_VSOCK_F_MULTI_DEVICES is present. > >+ > > \devicenormative{\subsubsection}{Feature bits}{Device Types / Socket Device / Feature bits} > > > > The device SHOULD offer the VIRTIO_VSOCK_F_NO_IMPLIED_STREAM feature. > >@@ -52,6 +59,7 @@ \subsection{Device configuration layout}\label{sec:Device Types / Socket Device > > \begin{lstlisting} > > struct virtio_vsock_config { > > le64 guest_cid; > >+ le16 device_order; > > }; > > \end{lstlisting} > > > >@@ -77,11 +85,27 @@ \subsection{Device configuration layout}\label{sec:Device Types / Socket Device > > \hline > > \end{tabular} > > > >+The \field{device_order} is used to identify the default device. Up to > >+65,535 devices can be supported due to the size. > >+ > >+\devicenormative{\subsubsection}{Device configuration layout}{Device Types / Socket Device / Device configuration layout} > >+ > >+The device MUST provide a distinct \field{device_order} if > >+VIRTIO_VSOCK_F_MULTI_DEVICES feature has been negotiated. > >+ > >+\drivernormative{\subsubsection}{Device configuration layout}{Device Types / Socket Device / Device configuration layout} > >+ > >+The driver MUST treat the device with the lowest \field{device_order} as > >+the default device. > >+ > > \subsection{Device Initialization}\label{sec:Device Types / Socket Device / Device Initialization} > > > > \begin{enumerate} > > \item The guest's cid is read from \field{guest_cid}. > > > >+\item If VIRTIO_VSOCK_F_MULTI_DEVICES has been negotiated, the device's > >+order will be read from \field{device_order}. > >+ > > \item Buffers are added to the event virtqueue to receive events from the device. > > > > \item Buffers are added to the rx virtqueue to start receiving packets. 
> >@@ -233,8 +257,10 @@ \subsubsection{Receive and Transmit}\label{sec:Device Types / Socket Device / De > > > > \drivernormative{\paragraph}{Device Operation: Receive and Transmit}{Device Types / Socket Device / Device Operation / Receive and Transmit} > > > >-The \field{guest_cid} configuration field MUST be used as the source CID when > >-sending outgoing packets. > >+If \field{src_cid} is missing in outgoing packets, the driver MUST assign > > I think here we have to define what the driver does, since the driver > has to populate that field, how is it missing? > > Maybe we are confusing interaction with user space, so we should say > something like, “If the source socket is not bound to any source CID, > the driver MUST assign ...” Will do. > >+one. If more than one device is present, the driver SHOULD use the default > >+device's \field{guest_cid} configuration. Otherwise, the driver SHOULD use > >+the \field{guest_cid} of the only available device. > > > > A VIRTIO_VSOCK_OP_RST reply MUST be sent if a packet is received with an > > unknown \field{type} value. > >-- > >2.34.1 [1]: https://istio.io/latest/blog/2022/introducing-ambient-mesh/ [2]: https://github.com/containers/libkrun?tab=readme-ov-file#networking Thanks, Xuewei ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices 2025-03-25 3:19 ` Xuewei Niu @ 2025-03-26 8:50 ` Stefano Garzarella 2025-03-26 10:00 ` Xuewei Niu 0 siblings, 1 reply; 23+ messages in thread From: Stefano Garzarella @ 2025-03-26 8:50 UTC (permalink / raw) To: Xuewei Niu; +Cc: fupan.lfp, mst, niuxuewei.nxw, parav, virtio-comment On Tue, Mar 25, 2025 at 11:19:46AM +0800, Xuewei Niu wrote: >> On Mon, Mar 24, 2025 at 02:43:35PM +0800, Xuewei Niu wrote: >> >This patch brings a new feature, called "multi devices", to the virtio >> >vsock. It introduces a "VIRTIO_VSOCK_F_MULTI_DEVICES" feature bit, and a >> >"device_order" field to the config for the virtio vsock. >> > >> >== Motivition == >> > >> >Vsock is a lightweight and widely used data exchange mechanism between host >> >and guest. Currently, the virtio-vsock only supports one device, resulting >> >in the inability to enable more than one backend. For instance, two devices >> >are required: one to transfer data to the VMM via virtio-vsock, >> >> Come to think of it, AF_VSOCK defines CID 0 (VMADDR_CID_HYPERVISOR) to >> communicate with the hypervisor, but in virtio-vsock we never supported >> it. Could this be the use case? >> >> We could in this way add a new feature for those devices that >> communicate only with the VMM, where the CID of the VM is quite useless. >> So instead of having multiple CIDs per VM, we could continue to have a >> single CID, but the transport could support 2 devices, one to >> communicate with the VMM (CID = 0) and one to communicate with the host >> apps (CID = 2). >> >> Maybe this is orthogonal to this proposal, though, because it might >> still make sense to have multiple vsock devices, even though it's not >> very clear to me. > >In terms of the current situation, two devices are enough. > >We are the team of Kata Containers, so we are focusing on cloud-native >computing. What I mentioned below might be beyond the scope of the virtio >spec, just for your reference. 
>
>The background is that the architecture of proxy meshes has evolved over
>the past few years: from per-pod to per-host (e.g. Istio Ambient Mesh[1]).
>
>Thanks to TSI[2] and the vhost-user protocol, network packets can bypass
>both the host and guest network stacks. It is possible to establish a fast
>path between the pod and the proxy.
>
>When we have multiple networks, it is intuitive to have multiple NICs. The
>same goes for vsock.

Be careful though, we don't want to complicate vsock to the point that it
becomes like a NIC.

>
>When multiple networks are available, it is possible to have multiple
>proxies (i.e. user processes). In this case, two devices are not enough.
>This feature makes vsock more flexible and scalable.

This is a good point, but I really don't understand why a VM should have
multiple CIDs assigned.

>
>It looks like you don't like the design of multiple devices. May I ask
>why? Is it too heavy for you?

Yes, I am concerned that we are overcomplicating vsock.
Since AF_VSOCK already defines an address to communicate with the
hypervisor, why can't the device that ends up in the VMM (TSI) use that?

I believe that having multiple devices only introduces a complication for
the user. What source device/CID should the user use, and for what
reason?

>
>> > and another to a user process via vhost-user-vsock.
>>
>> So to recap, one device would be used only to communicate with the VMM,
>> and the other device to communicate with other external processes,
>> right?
>> >> Do you have any other use cases? >> >> >Apart from that, a side gain is that theoretically the performance might be >> >improved since each device has its own queue. But it varies depending on >> >the implementation. >> >> This though might be easier to implement supported multi-queue in the >> device, instead of adding n devices to the VM. > >I think multi-queue and multi-device are independent of each other, just >like what network devices do. A single vsock device can be considered as a >group of queues (if multi-queue is supported), and it can be assigned a >thread to handle the traffic. That's right multi-queue -> performance multi-device -> more addresses And that's where I worry, why complicate vsock to get more addresses for a VM? > >So I accepted Parav's sugguestion, mentioned it as a side gain. > >> >== Typical Usages == >> > >> >Assuming there are two virtio-vsock devices on the guest, with CIDs 3 and 4 >> >respectively. And the device with CID 3 is default. >> > >> >Connect to the host using the device with CID 3. >> > >> >```c >> >// use default one (no bind) >> >fd = socket(AF_VSOCK); >> >connect(fd, 2, 1234); >> >n = write(fd, buffer); >> > >> >// or bind explicitly >> >fd = socket(AF_VSOCK); >> >bind(fd, 3, -1); >> >connect(fd, 2, 1234); >> >n = write(fd, buffer); >> >``` >> > >> >Connect to the host using the device with CID 4. >> > >> >```c >> >// must bind explicitly as the device with CID 4 is not default. >> >fd = socket(AF_VSOCK); >> >bind(fd, 4, -1); >> >connect(fd, 2, 1234); >> >n = write(fd, buffer); >> >``` >> > >> >The first version of multi-devices implementation is available at [1]. 
>> > >> >[1] https://lore.kernel.org/virtualization/20240517144607.2595798-1-niuxuewei.nxw@antgroup.com >> > >> >Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com> >> >--- >> > device-types/vsock/description.tex | 30 ++++++++++++++++++++++++++++-- >> > 1 file changed, 28 insertions(+), 2 deletions(-) >> > >> >diff --git a/device-types/vsock/description.tex b/device-types/vsock/description.tex >> >index 7d91d15..7d0cfe4 100644 >> >--- a/device-types/vsock/description.tex >> >+++ b/device-types/vsock/description.tex >> >@@ -20,6 +20,7 @@ \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits} >> > \item[VIRTIO_VSOCK_F_STREAM (0)] stream socket type is supported. >> > \item[VIRTIO_VSOCK_F_SEQPACKET (1)] seqpacket socket type is supported. >> > \item[VIRTIO_VSOCK_F_NO_IMPLIED_STREAM (2)] stream socket type is not implied. >> >+\item[VIRTIO_VSOCK_F_MULTI_DEVICES (3)] multiple devices feature is supported. >> > \end{description} >> > >> > \drivernormative{\subsubsection}{Feature bits}{Device Types / Socket Device / Feature bits} >> >@@ -34,6 +35,12 @@ \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits} >> > VIRTIO_VSOCK_F_NO_IMPLIED_STREAM, the driver MAY act as if >> > VIRTIO_VSOCK_F_STREAM has also been negotiated. >> > >> >+The driver SHOULD ignore devices that do not have >> >+VIRTIO_VSOCK_F_MULTI_DEVICES if the feature has been negotiated. >> >+ >> >+The driver SHOULD ignore all subsequent devices if a device without >> >+VIRTIO_VSOCK_F_MULTI_DEVICES is present. >> >+ >> > \devicenormative{\subsubsection}{Feature bits}{Device Types / Socket Device / Feature bits} >> > >> > The device SHOULD offer the VIRTIO_VSOCK_F_NO_IMPLIED_STREAM feature. 
>> >@@ -52,6 +59,7 @@ \subsection{Device configuration layout}\label{sec:Device Types / Socket Device >> > \begin{lstlisting} >> > struct virtio_vsock_config { >> > le64 guest_cid; >> >+ le16 device_order; >> > }; >> > \end{lstlisting} >> > >> >@@ -77,11 +85,27 @@ \subsection{Device configuration layout}\label{sec:Device Types / Socket Device >> > \hline >> > \end{tabular} >> > >> >+The \field{device_order} is used to identify the default device. Up to >> >+65,535 devices can be supported due to the size. >> >+ >> >+\devicenormative{\subsubsection}{Device configuration layout}{Device Types / Socket Device / Device configuration layout} >> >+ >> >+The device MUST provide a distinct \field{device_order} if >> >+VIRTIO_VSOCK_F_MULTI_DEVICES feature has been negotiated. >> >+ >> >+\drivernormative{\subsubsection}{Device configuration layout}{Device Types / Socket Device / Device configuration layout} >> >+ >> >+The driver MUST treat the device with the lowest \field{device_order} as >> >+the default device. >> >+ >> > \subsection{Device Initialization}\label{sec:Device Types / Socket Device / Device Initialization} >> > >> > \begin{enumerate} >> > \item The guest's cid is read from \field{guest_cid}. >> > >> >+\item If VIRTIO_VSOCK_F_MULTI_DEVICES has been negotiated, the device's >> >+order will be read from \field{device_order}. >> >+ >> > \item Buffers are added to the event virtqueue to receive events from the device. >> > >> > \item Buffers are added to the rx virtqueue to start receiving packets. >> >@@ -233,8 +257,10 @@ \subsubsection{Receive and Transmit}\label{sec:Device Types / Socket Device / De >> > >> > \drivernormative{\paragraph}{Device Operation: Receive and Transmit}{Device Types / Socket Device / Device Operation / Receive and Transmit} >> > >> >-The \field{guest_cid} configuration field MUST be used as the source CID when >> >-sending outgoing packets. 
>> >+If \field{src_cid} is missing in outgoing packets, the driver MUST assign >> >> I think here we have to define what the driver does, since the driver >> has to populate that field, how is it missing? >> >> Maybe we are confusing interaction with user space, so we should say >> something like, “If the source socket is not bound to any source CID, >> the driver MUST assign ...” > >Will do. > >> >+one. If more than one device is present, the driver SHOULD use the default >> >+device's \field{guest_cid} configuration. Otherwise, the driver SHOULD use >> >+the \field{guest_cid} of the only available device. >> > >> > A VIRTIO_VSOCK_OP_RST reply MUST be sent if a packet is received with an >> > unknown \field{type} value. >> >-- >> >2.34.1 > >[1]: https://istio.io/latest/blog/2022/introducing-ambient-mesh/ >[2]: https://github.com/containers/libkrun?tab=readme-ov-file#networking > >Thanks, >Xuewei > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices 2025-03-26 8:50 ` Stefano Garzarella @ 2025-03-26 10:00 ` Xuewei Niu 2025-03-26 10:32 ` Stefano Garzarella 0 siblings, 1 reply; 23+ messages in thread From: Xuewei Niu @ 2025-03-26 10:00 UTC (permalink / raw) To: sgarzare Cc: fupan.lfp, mst, niuxuewei.nxw, niuxuewei97, parav, virtio-comment > On Tue, Mar 25, 2025 at 11:19:46AM +0800, Xuewei Niu wrote: > >> On Mon, Mar 24, 2025 at 02:43:35PM +0800, Xuewei Niu wrote: > >> >This patch brings a new feature, called "multi devices", to the virtio > >> >vsock. It introduces a "VIRTIO_VSOCK_F_MULTI_DEVICES" feature bit, and a > >> >"device_order" field to the config for the virtio vsock. > >> > > >> >== Motivition == > >> > > >> >Vsock is a lightweight and widely used data exchange mechanism between host > >> >and guest. Currently, the virtio-vsock only supports one device, resulting > >> >in the inability to enable more than one backend. For instance, two devices > >> >are required: one to transfer data to the VMM via virtio-vsock, > >> > >> Come to think of it, AF_VSOCK defines CID 0 (VMADDR_CID_HYPERVISOR) to > >> communicate with the hypervisor, but in virtio-vsock we never supported > >> it. Could this be the use case? > >> > >> We could in this way add a new feature for those devices that > >> communicate only with the VMM, where the CID of the VM is quite useless. > >> So instead of having multiple CIDs per VM, we could continue to have a > >> single CID, but the transport could support 2 devices, one to > >> communicate with the VMM (CID = 0) and one to communicate with the host > >> apps (CID = 2). > >> > >> Maybe this is orthogonal to this proposal, though, because it might > >> still make sense to have multiple vsock devices, even though it's not > >> very clear to me. > > > >In terms of the current situation, two devices are enough. > > > >We are the team of Kata Containers, so we are focusing on cloud-native > >computing. 
What I mentioned below might be beyond the scope of the virtio > >spec, just for your reference. > > > >The background is that the architecture of proxy mesh has been evolved over > >the past few years: from per-pod to per-host (e.g. Istio Ambient Mesh[1]). > > > >Thanks to the TSI[2] and vhost-user protocol, network packets can bypass > >both host and guest network stacks. It is possible to establish a fast path > >between the pod and the proxy. > > > >When we have multiple networks, it is intuitive to have multiple NICs. So > >does vsock. > > Be careful though, we don't want to complicate vsock to become like a > NIC. > > > > >When multiple networks are availble, it means that it is possible to have > >multiple proxies(i.e. user processes). In this case, two devices are not > >enough. This feature makes vsock more flexible and scalable. > > This is a good point, but I really don't understand why a VM should have > multiple CIDs assigned. I think priority is not the biggest issue here. So let us focus on how to route the connection to the right device among more than two devices. Our solution uses CID as device identification. From the users' perspective, they can direct the connection to the appropriate device by specifying a CID in either the `connect` or `bind` syscall. Assigning one CID to a VM looks good to me. But I am not sure how to distinguish the devices. For example, should we expose an ioctl or a sockopt? Thanks, Xuewei > >Looks like you don't like the design of multiple devices. May I ask why? Is > >it too heavy for you? > > Yes, I am concerned that we are over complicating vsock. > Since AF_VSOCK already defines an address to communicate with the > hypervisor, why can't the device that ends up in the VMM (TSI) use that? > > I believe that having multiple devices only introduces a complication in > the user. What source device/CID should the user use and for what > reason?
> > All this should be hidden and especially in your case, this is already > easily done by using VMADDR_CID_HYPERVISOR to communicate with the VMM > and VMADDR_CID_HOST to communicate with applications in the host. So > maybe we only need to handle 2 device types in the driver, without > adding this functionality but just introducing a new device type (via a > feature) that handles VMADDR_CID_HYPERVISOR, so that the driver knows > that at most it can have 2 devices, one for the VMM and one for the > host. > > > > >> > and another to a user process via vhost-user-vsock. > >> > >> So to recap, one device would be used only to communicate with the VMM, > >> and the other device to communicate with other external processes, > >> right? > >> > >> Do you have any other use cases? > >> > >> >Apart from that, a side gain is that theoretically the performance might be > >> >improved since each device has its own queue. But it varies depending on > >> >the implementation. > >> > >> This though might be easier to implement supported multi-queue in the > >> device, instead of adding n devices to the VM. > > > >I think multi-queue and multi-device are independent of each other, just > >like what network devices do. A single vsock device can be considered as a > >group of queues (if multi-queue is supported), and it can be assigned a > >thread to handle the traffic. > > That's right > multi-queue -> performance > multi-device -> more addresses > > And that's where I worry, why complicate vsock to get more addresses for > a VM? > > > > >So I accepted Parav's sugguestion, mentioned it as a side gain. > > > >> >== Typical Usages == > >> > > >> >Assuming there are two virtio-vsock devices on the guest, with CIDs 3 and 4 > >> >respectively. And the device with CID 3 is default. > >> > > >> >Connect to the host using the device with CID 3. 
> >> > > >> >```c > >> >// use default one (no bind) > >> >fd = socket(AF_VSOCK); > >> >connect(fd, 2, 1234); > >> >n = write(fd, buffer); > >> > > >> >// or bind explicitly > >> >fd = socket(AF_VSOCK); > >> >bind(fd, 3, -1); > >> >connect(fd, 2, 1234); > >> >n = write(fd, buffer); > >> >``` > >> > > >> >Connect to the host using the device with CID 4. > >> > > >> >```c > >> >// must bind explicitly as the device with CID 4 is not default. > >> >fd = socket(AF_VSOCK); > >> >bind(fd, 4, -1); > >> >connect(fd, 2, 1234); > >> >n = write(fd, buffer); > >> >``` > >> > > >> >The first version of multi-devices implementation is available at [1]. > >> > > >> >[1] https://lore.kernel.org/virtualization/20240517144607.2595798-1-niuxuewei.nxw@antgroup.com > >> > > >> >Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com> > >> >--- > >> > device-types/vsock/description.tex | 30 ++++++++++++++++++++++++++++-- > >> > 1 file changed, 28 insertions(+), 2 deletions(-) > >> > > >> >diff --git a/device-types/vsock/description.tex b/device-types/vsock/description.tex > >> >index 7d91d15..7d0cfe4 100644 > >> >--- a/device-types/vsock/description.tex > >> >+++ b/device-types/vsock/description.tex > >> >@@ -20,6 +20,7 @@ \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits} > >> > \item[VIRTIO_VSOCK_F_STREAM (0)] stream socket type is supported. > >> > \item[VIRTIO_VSOCK_F_SEQPACKET (1)] seqpacket socket type is supported. > >> > \item[VIRTIO_VSOCK_F_NO_IMPLIED_STREAM (2)] stream socket type is not implied. > >> >+\item[VIRTIO_VSOCK_F_MULTI_DEVICES (3)] multiple devices feature is supported. > >> > \end{description} > >> > > >> > \drivernormative{\subsubsection}{Feature bits}{Device Types / Socket Device / Feature bits} > >> >@@ -34,6 +35,12 @@ \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits} > >> > VIRTIO_VSOCK_F_NO_IMPLIED_STREAM, the driver MAY act as if > >> > VIRTIO_VSOCK_F_STREAM has also been negotiated. 
> >> > > >> >+The driver SHOULD ignore devices that do not have > >> >+VIRTIO_VSOCK_F_MULTI_DEVICES if the feature has been negotiated. > >> >+ > >> >+The driver SHOULD ignore all subsequent devices if a device without > >> >+VIRTIO_VSOCK_F_MULTI_DEVICES is present. > >> >+ > >> > \devicenormative{\subsubsection}{Feature bits}{Device Types / Socket Device / Feature bits} > >> > > >> > The device SHOULD offer the VIRTIO_VSOCK_F_NO_IMPLIED_STREAM feature. > >> >@@ -52,6 +59,7 @@ \subsection{Device configuration layout}\label{sec:Device Types / Socket Device > >> > \begin{lstlisting} > >> > struct virtio_vsock_config { > >> > le64 guest_cid; > >> >+ le16 device_order; > >> > }; > >> > \end{lstlisting} > >> > > >> >@@ -77,11 +85,27 @@ \subsection{Device configuration layout}\label{sec:Device Types / Socket Device > >> > \hline > >> > \end{tabular} > >> > > >> >+The \field{device_order} is used to identify the default device. Up to > >> >+65,535 devices can be supported due to the size. > >> >+ > >> >+\devicenormative{\subsubsection}{Device configuration layout}{Device Types / Socket Device / Device configuration layout} > >> >+ > >> >+The device MUST provide a distinct \field{device_order} if > >> >+VIRTIO_VSOCK_F_MULTI_DEVICES feature has been negotiated. > >> >+ > >> >+\drivernormative{\subsubsection}{Device configuration layout}{Device Types / Socket Device / Device configuration layout} > >> >+ > >> >+The driver MUST treat the device with the lowest \field{device_order} as > >> >+the default device. > >> >+ > >> > \subsection{Device Initialization}\label{sec:Device Types / Socket Device / Device Initialization} > >> > > >> > \begin{enumerate} > >> > \item The guest's cid is read from \field{guest_cid}. > >> > > >> >+\item If VIRTIO_VSOCK_F_MULTI_DEVICES has been negotiated, the device's > >> >+order will be read from \field{device_order}. > >> >+ > >> > \item Buffers are added to the event virtqueue to receive events from the device. 
> >> > > >> > \item Buffers are added to the rx virtqueue to start receiving packets. > >> >@@ -233,8 +257,10 @@ \subsubsection{Receive and Transmit}\label{sec:Device Types / Socket Device / De > >> > > >> > \drivernormative{\paragraph}{Device Operation: Receive and Transmit}{Device Types / Socket Device / Device Operation / Receive and Transmit} > >> > > >> >-The \field{guest_cid} configuration field MUST be used as the source CID when > >> >-sending outgoing packets. > >> >+If \field{src_cid} is missing in outgoing packets, the driver MUST assign > >> > >> I think here we have to define what the driver does, since the driver > >> has to populate that field, how is it missing? > >> > >> Maybe we are confusing interaction with user space, so we should say > >> something like, “If the source socket is not bound to any source CID, > >> the driver MUST assign ...” > > > >Will do. > > > >> >+one. If more than one device is present, the driver SHOULD use the default > >> >+device's \field{guest_cid} configuration. Otherwise, the driver SHOULD use > >> >+the \field{guest_cid} of the only available device. > >> > > >> > A VIRTIO_VSOCK_OP_RST reply MUST be sent if a packet is received with an > >> > unknown \field{type} value. > >> >-- > >> >2.34.1 > > > >[1]: https://istio.io/latest/blog/2022/introducing-ambient-mesh/ > >[2]: https://github.com/containers/libkrun?tab=readme-ov-file#networking > > > >Thanks, > >Xuewei ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices 2025-03-26 10:00 ` Xuewei Niu @ 2025-03-26 10:32 ` Stefano Garzarella 2025-03-26 10:36 ` Stefano Garzarella 0 siblings, 1 reply; 23+ messages in thread From: Stefano Garzarella @ 2025-03-26 10:32 UTC (permalink / raw) To: Xuewei Niu; +Cc: fupan.lfp, mst, niuxuewei.nxw, parav, virtio-comment On Wed, Mar 26, 2025 at 06:00:31PM +0800, Xuewei Niu wrote: >> On Tue, Mar 25, 2025 at 11:19:46AM +0800, Xuewei Niu wrote: >> >> On Mon, Mar 24, 2025 at 02:43:35PM +0800, Xuewei Niu wrote: >> >> >This patch brings a new feature, called "multi devices", to the virtio >> >> >vsock. It introduces a "VIRTIO_VSOCK_F_MULTI_DEVICES" feature bit, and a >> >> >"device_order" field to the config for the virtio vsock. >> >> > >> >> >== Motivition == >> >> > >> >> >Vsock is a lightweight and widely used data exchange mechanism between host >> >> >and guest. Currently, the virtio-vsock only supports one device, resulting >> >> >in the inability to enable more than one backend. For instance, two devices >> >> >are required: one to transfer data to the VMM via virtio-vsock, >> >> >> >> Come to think of it, AF_VSOCK defines CID 0 (VMADDR_CID_HYPERVISOR) to >> >> communicate with the hypervisor, but in virtio-vsock we never supported >> >> it. Could this be the use case? >> >> >> >> We could in this way add a new feature for those devices that >> >> communicate only with the VMM, where the CID of the VM is quite useless. >> >> So instead of having multiple CIDs per VM, we could continue to have a >> >> single CID, but the transport could support 2 devices, one to >> >> communicate with the VMM (CID = 0) and one to communicate with the host >> >> apps (CID = 2). >> >> >> >> Maybe this is orthogonal to this proposal, though, because it might >> >> still make sense to have multiple vsock devices, even though it's not >> >> very clear to me. >> > >> >In terms of the current situation, two devices are enough. 
>> > >> >We are the team of Kata Containers, so we are focusing on cloud-native >> >computing. What I mentioned below might be beyond the scope of the virtio >> >spec, just for your reference. >> > >> >The background is that the architecture of proxy mesh has been evolved over >> >the past few years: from per-pod to per-host (e.g. Istio Ambient Mesh[1]). >> > >> >Thanks to the TSI[2] and vhost-user protocol, network packets can bypass >> >both host and guest network stacks. It is possible to establish a fast path >> >between the pod and the proxy. >> > >> >When we have multiple networks, it is intuitive to have multiple NICs. So >> >does vsock. >> >> Be careful though, we don't want to complicate vsock to become like a >> NIC. >> >> > >> >When multiple networks are availble, it means that it is possible to have >> >multiple proxies(i.e. user processes). In this case, two devices are not >> >enough. This feature makes vsock more flexible and scalable. >> >> This is a good point, but I really don't understand why a VM should have >> multiple CIDs assigned. > >I think priority is not the biggest issue here. So let us focus on how to >route the connection to the right device among more than two devices. That's why I was recommending a different approach. IMO the user should not do this, but that should be transparent, hidden in the driver. By supporting VMADDR_CID_HYPERVISOR, we know very well whether a packet is to be sent to the VMM; in that case, we have to use the device that supports it. Whereas if the user connects to VMADDR_CID_HOST we have to use the other device. The user doesn't have to do anything, only use the right destination CID if it wants to talk to the VMM or another host process. > >Our solution uses CID as device identification. From the users' >perspective, they can direct the connection to the appropriate device by >specifying a CID in either the `connect` or `bind` syscall.
How does the user know which device/CID to bind if it wants to talk with the VMM or with the application? > >Assigning one CID to a VM looks good to me. But I am not sure how to >distinguish the devices. For example, should we expose a ioctl or a >sockopt? Nope, just simply use the right destination CID in the connect() (VMADDR_CID_HYPERVISOR or VMADDR_CID_HOST), without doing any bind(). For receiving, the user can check the source CID after connection and decide to discard connections from VMADDR_CID_HYPERVISOR or VMADDR_CID_HOST depending on the service. Thanks, Stefano ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices 2025-03-26 10:32 ` Stefano Garzarella @ 2025-03-26 10:36 ` Stefano Garzarella 0 siblings, 0 replies; 23+ messages in thread From: Stefano Garzarella @ 2025-03-26 10:36 UTC (permalink / raw) To: Xuewei Niu; +Cc: fupan.lfp, mst, niuxuewei.nxw, parav, virtio-comment On Wed, 26 Mar 2025 at 11:32, Stefano Garzarella <sgarzare@redhat.com> wrote: > On Wed, Mar 26, 2025 at 06:00:31PM +0800, Xuewei Niu wrote: > >> On Tue, Mar 25, 2025 at 11:19:46AM +0800, Xuewei Niu wrote: > >> >> On Mon, Mar 24, 2025 at 02:43:35PM +0800, Xuewei Niu wrote: > >> >> >This patch brings a new feature, called "multi devices", to the virtio > >> >> >vsock. It introduces a "VIRTIO_VSOCK_F_MULTI_DEVICES" feature bit, and a > >> >> >"device_order" field to the config for the virtio vsock. > >> >> > > >> >> >== Motivition == > >> >> > > >> >> >Vsock is a lightweight and widely used data exchange mechanism between host > >> >> >and guest. Currently, the virtio-vsock only supports one device, resulting > >> >> >in the inability to enable more than one backend. For instance, two devices > >> >> >are required: one to transfer data to the VMM via virtio-vsock, > >> >> > >> >> Come to think of it, AF_VSOCK defines CID 0 (VMADDR_CID_HYPERVISOR) to > >> >> communicate with the hypervisor, but in virtio-vsock we never supported > >> >> it. Could this be the use case? > >> >> > >> >> We could in this way add a new feature for those devices that > >> >> communicate only with the VMM, where the CID of the VM is quite useless. > >> >> So instead of having multiple CIDs per VM, we could continue to have a > >> >> single CID, but the transport could support 2 devices, one to > >> >> communicate with the VMM (CID = 0) and one to communicate with the host > >> >> apps (CID = 2). 
> >> >> > >> >> Maybe this is orthogonal to this proposal, though, because it might > >> >> still make sense to have multiple vsock devices, even though it's not > >> >> very clear to me. > >> > > >> >In terms of the current situation, two devices are enough. > >> > > >> >We are the team of Kata Containers, so we are focusing on cloud-native > >> >computing. What I mentioned below might be beyond the scope of the virtio > >> >spec, just for your reference. > >> > > >> >The background is that the architecture of proxy mesh has been evolved over > >> >the past few years: from per-pod to per-host (e.g. Istio Ambient Mesh[1]). > >> > > >> >Thanks to the TSI[2] and vhost-user protocol, network packets can bypass > >> >both host and guest network stacks. It is possible to establish a fast path > >> >between the pod and the proxy. > >> > > >> >When we have multiple networks, it is intuitive to have multiple NICs. So > >> >does vsock. > >> > >> Be careful though, we don't want to complicate vsock to become like a > >> NIC. > >> > >> > > >> >When multiple networks are availble, it means that it is possible to have > >> >multiple proxies(i.e. user processes). In this case, two devices are not > >> >enough. This feature makes vsock more flexible and scalable. > >> > >> This is a good point, but I really don't understand why a VM should have > >> multiple CIDs assigned. > > > >I think priority is not the biggest issue here. So let us focus on how to > >route the connection to the right device among more than two devices. > > That's why I was recommending a different approach. IMO the user should > not do this, but that should be transparent, hidden in the driver. > > By supporting VMADDR_CID_HYPERVISOR, we know very well if a packet is to > be sent to the VMM, then we have to use the device that supports it. > Whereas if the user connects to VMADDR_CID_HOST we have to use the other > device. 
> > The user doesn't have to do anything, only use the right destination CID > > if it wants to talk to the VMM or another host process. Obviously, if we want to support more than 2 devices, we need something like what you are proposing. But IMO we also need to support VMADDR_CID_HYPERVISOR, and we should prevent the user from doing bind() on a random CID if one of the two devices only talks to the VMM. Because, again, how does the user know which CID to bind? > > > > > >Our solution uses CID as device identification. From the users' > > >perspective, they can direct the connection to the appropriate device by > > >specifying a CID in either the `connect` or `bind` syscall. > > > > How does the user know which device/CID to bind if it wants to talk with > > the VMM or with the application? > > > > > > > >Assigning one CID to a VM looks good to me. But I am not sure how to > > >distinguish the devices. For example, should we expose a ioctl or a > > >sockopt? > > > > Nope, just simply use the right destination CID in the connect() > > (VMADDR_CID_HYPERVISOR or VMADDR_CID_HOST), without doing any bind(). > > > > For receiving, the user can check the source CID after connection and > > decide to discard connections from VMADDR_CID_HYPERVISOR or > > VMADDR_CID_HOST depending of the service. > > > > Thanks, > > Stefano > > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices 2025-03-24 13:51 ` Stefano Garzarella 2025-03-25 3:19 ` Xuewei Niu @ 2025-03-26 2:59 ` Xuewei Niu 2025-03-26 9:03 ` Stefano Garzarella 1 sibling, 1 reply; 23+ messages in thread From: Xuewei Niu @ 2025-03-26 2:59 UTC (permalink / raw) To: sgarzare Cc: fupan.lfp, mst, niuxuewei.nxw, niuxuewei97, parav, virtio-comment > On Mon, Mar 24, 2025 at 02:43:35PM +0800, Xuewei Niu wrote: > >This patch brings a new feature, called "multi devices", to the virtio > >vsock. It introduces a "VIRTIO_VSOCK_F_MULTI_DEVICES" feature bit, and a > >"device_order" field to the config for the virtio vsock. > > > >== Motivition == > > > >Vsock is a lightweight and widely used data exchange mechanism between host > >and guest. Currently, the virtio-vsock only supports one device, resulting > >in the inability to enable more than one backend. For instance, two devices > >are required: one to transfer data to the VMM via virtio-vsock, > > Come to think of it, AF_VSOCK defines CID 0 (VMADDR_CID_HYPERVISOR) to > communicate with the hypervisor, but in virtio-vsock we never supported > it. Could this be the use case? > > We could in this way add a new feature for those devices that > communicate only with the VMM, where the CID of the VM is quite useless. > So instead of having multiple CIDs per VM, we could continue to have a > single CID, but the transport could support 2 devices, one to > communicate with the VMM (CID = 0) and one to communicate with the host > apps (CID = 2). > > Maybe this is orthogonal to this proposal, though, because it might > still make sense to have multiple vsock devices, even though it's not > very clear to me. > > > and another to a user process via vhost-user-vsock. > > So to recap, one device would be used only to communicate with the VMM, > and the other device to communicate with other external processes, > right? > > Do you have any other use cases? 
> > > > >Apart from that, a side gain is that theoretically the performance might be > >improved since each device has its own queue. But it varies depending on > >the implementation. > > This though might be easier to implement supported multi-queue in the > device, instead of adding n devices to the VM. How would we specify a virtqueue for a connection? For example, suppose I have one higher-priority connection and several lower-priority ones. How would we route the higher-priority one to the vq0 group (including tx and rx), and the lower-priority ones to vq1? I think this could be achieved more easily with the multi-device feature. Thanks, Xuewei > >== Typical Usages == > > > >Assuming there are two virtio-vsock devices on the guest, with CIDs 3 and 4 > >respectively. And the device with CID 3 is default. > > > >Connect to the host using the device with CID 3. > > > >```c > >// use default one (no bind) > >fd = socket(AF_VSOCK); > >connect(fd, 2, 1234); > >n = write(fd, buffer); > > > >// or bind explicitly > >fd = socket(AF_VSOCK); > >bind(fd, 3, -1); > >connect(fd, 2, 1234); > >n = write(fd, buffer); > >``` > > > >Connect to the host using the device with CID 4. > > > >```c > >// must bind explicitly as the device with CID 4 is not default. > >fd = socket(AF_VSOCK); > >bind(fd, 4, -1); > >connect(fd, 2, 1234); > >n = write(fd, buffer); > >``` > > > >The first version of multi-devices implementation is available at [1].
> > > >[1] https://lore.kernel.org/virtualization/20240517144607.2595798-1-niuxuewei.nxw@antgroup.com > > > >Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com> > >--- > > device-types/vsock/description.tex | 30 ++++++++++++++++++++++++++++-- > > 1 file changed, 28 insertions(+), 2 deletions(-) > > > >diff --git a/device-types/vsock/description.tex b/device-types/vsock/description.tex > >index 7d91d15..7d0cfe4 100644 > >--- a/device-types/vsock/description.tex > >+++ b/device-types/vsock/description.tex > >@@ -20,6 +20,7 @@ \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits} > > \item[VIRTIO_VSOCK_F_STREAM (0)] stream socket type is supported. > > \item[VIRTIO_VSOCK_F_SEQPACKET (1)] seqpacket socket type is supported. > > \item[VIRTIO_VSOCK_F_NO_IMPLIED_STREAM (2)] stream socket type is not implied. > >+\item[VIRTIO_VSOCK_F_MULTI_DEVICES (3)] multiple devices feature is supported. > > \end{description} > > > > \drivernormative{\subsubsection}{Feature bits}{Device Types / Socket Device / Feature bits} > >@@ -34,6 +35,12 @@ \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits} > > VIRTIO_VSOCK_F_NO_IMPLIED_STREAM, the driver MAY act as if > > VIRTIO_VSOCK_F_STREAM has also been negotiated. > > > >+The driver SHOULD ignore devices that do not have > >+VIRTIO_VSOCK_F_MULTI_DEVICES if the feature has been negotiated. > >+ > >+The driver SHOULD ignore all subsequent devices if a device without > >+VIRTIO_VSOCK_F_MULTI_DEVICES is present. > >+ > > \devicenormative{\subsubsection}{Feature bits}{Device Types / Socket Device / Feature bits} > > > > The device SHOULD offer the VIRTIO_VSOCK_F_NO_IMPLIED_STREAM feature. 
> >@@ -52,6 +59,7 @@ \subsection{Device configuration layout}\label{sec:Device Types / Socket Device > > \begin{lstlisting} > > struct virtio_vsock_config { > > le64 guest_cid; > >+ le16 device_order; > > }; > > \end{lstlisting} > > > >@@ -77,11 +85,27 @@ \subsection{Device configuration layout}\label{sec:Device Types / Socket Device > > \hline > > \end{tabular} > > > >+The \field{device_order} is used to identify the default device. Up to > >+65,535 devices can be supported due to the size. > >+ > >+\devicenormative{\subsubsection}{Device configuration layout}{Device Types / Socket Device / Device configuration layout} > >+ > >+The device MUST provide a distinct \field{device_order} if > >+VIRTIO_VSOCK_F_MULTI_DEVICES feature has been negotiated. > >+ > >+\drivernormative{\subsubsection}{Device configuration layout}{Device Types / Socket Device / Device configuration layout} > >+ > >+The driver MUST treat the device with the lowest \field{device_order} as > >+the default device. > >+ > > \subsection{Device Initialization}\label{sec:Device Types / Socket Device / Device Initialization} > > > > \begin{enumerate} > > \item The guest's cid is read from \field{guest_cid}. > > > >+\item If VIRTIO_VSOCK_F_MULTI_DEVICES has been negotiated, the device's > >+order will be read from \field{device_order}. > >+ > > \item Buffers are added to the event virtqueue to receive events from the device. > > > > \item Buffers are added to the rx virtqueue to start receiving packets. > >@@ -233,8 +257,10 @@ \subsubsection{Receive and Transmit}\label{sec:Device Types / Socket Device / De > > > > \drivernormative{\paragraph}{Device Operation: Receive and Transmit}{Device Types / Socket Device / Device Operation / Receive and Transmit} > > > >-The \field{guest_cid} configuration field MUST be used as the source CID when > >-sending outgoing packets. 
> >+If \field{src_cid} is missing in outgoing packets, the driver MUST assign > > I think here we have to define what the driver does, since the driver > has to populate that field, how is it missing? > > Maybe we are confusing interaction with user space, so we should say > something like, “If the source socket is not bound to any source CID, > the driver MUST assign ...” > > Thanks, > Stefano > > >+one. If more than one device is present, the driver SHOULD use the default > >+device's \field{guest_cid} configuration. Otherwise, the driver SHOULD use > >+the \field{guest_cid} of the only available device. > > > > A VIRTIO_VSOCK_OP_RST reply MUST be sent if a packet is received with an > > unknown \field{type} value. > >-- > >2.34.1 ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices 2025-03-26 2:59 ` Xuewei Niu @ 2025-03-26 9:03 ` Stefano Garzarella 2025-03-27 8:18 ` Xuewei Niu 0 siblings, 1 reply; 23+ messages in thread From: Stefano Garzarella @ 2025-03-26 9:03 UTC (permalink / raw) To: Xuewei Niu, mst, stefanha Cc: fupan.lfp, mst, niuxuewei.nxw, parav, virtio-comment On Wed, Mar 26, 2025 at 10:59:52AM +0800, Xuewei Niu wrote: >> On Mon, Mar 24, 2025 at 02:43:35PM +0800, Xuewei Niu wrote: >> >This patch brings a new feature, called "multi devices", to the virtio >> >vsock. It introduces a "VIRTIO_VSOCK_F_MULTI_DEVICES" feature bit, and a >> >"device_order" field to the config for the virtio vsock. >> > >> >== Motivition == >> > >> >Vsock is a lightweight and widely used data exchange mechanism between host >> >and guest. Currently, the virtio-vsock only supports one device, resulting >> >in the inability to enable more than one backend. For instance, two devices >> >are required: one to transfer data to the VMM via virtio-vsock, >> >> Come to think of it, AF_VSOCK defines CID 0 (VMADDR_CID_HYPERVISOR) to >> communicate with the hypervisor, but in virtio-vsock we never supported >> it. Could this be the use case? >> >> We could in this way add a new feature for those devices that >> communicate only with the VMM, where the CID of the VM is quite useless. >> So instead of having multiple CIDs per VM, we could continue to have a >> single CID, but the transport could support 2 devices, one to >> communicate with the VMM (CID = 0) and one to communicate with the host >> apps (CID = 2). >> >> Maybe this is orthogonal to this proposal, though, because it might >> still make sense to have multiple vsock devices, even though it's not >> very clear to me. >> >> > and another to a user process via vhost-user-vsock. >> >> So to recap, one device would be used only to communicate with the VMM, >> and the other device to communicate with other external processes, >> right? 
>> >> Do you have any other use cases? >> >> > >> >Apart from that, a side gain is that theoretically the performance might be >> >improved since each device has its own queue. But it varies depending on >> >the implementation. >> >> This though might be easier to implement supported multi-queue in the >> device, instead of adding n devices to the VM. > >How to specify virtqueue for a connection? This should not be done by the user, and that's exactly the problem with having multiple devices! How do you decide which application should use one device and which should use another device? You have to add some kind of scheduler or static assignment that really makes it super complicated. This has to be transparent to the user, so if you have multiple queues (usually one per vCPU), the driver automatically uses the right queue according to the vCPU it's running on, avoiding the bottleneck with the other vCPUs. > >For example, I have a higher priority connection, and some of lower >connections. How to route the higher one to vq0 group (including tx and >rx), and route the lower to vq1? That's yet another problem, a kind of QoS, and again, this shouldn't be solved by inventing logic where the user has to use device X instead of device Y because X is higher priority. We should just expose a sockopt like SO_PRIORITY so the user can set the priority, and the driver decides which queue to use according to it (e.g. we could have an extra queue only for those at higher priority, etc.). > >I think it could be easier to be achieved with multi-device feature. No, I don't think QoS and performance should be solved by adding devices, but by adding queues. More devices are only needed to use addresses differently, or because virtqueues need to be consumed by different entities; but again, we should not complicate the user's life by making them choose one device or another for QoS or performance.
That said, I'm not totally against this change; it might make sense to attach a virtio-vsock device consumed by the VMM, one consumed by vhost-vsock, and one consumed by vhost-user-vsock to the VM. I simply would not use performance or QoS to justify this, but your original problem might be a justification. @Michael, @Stefan, any thoughts on this? Thanks, Stefano ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices 2025-03-26 9:03 ` Stefano Garzarella @ 2025-03-27 8:18 ` Xuewei Niu 2025-03-31 6:18 ` Xuewei Niu 0 siblings, 1 reply; 23+ messages in thread From: Xuewei Niu @ 2025-03-27 8:18 UTC (permalink / raw) To: sgarzare Cc: fupan.lfp, mst, niuxuewei.nxw, niuxuewei97, parav, stefanha, virtio-comment > On Wed, 26 Mar 2025 at 11:32, Stefano Garzarella <sgarzare@redhat.com> wrote: > > On Wed, Mar 26, 2025 at 06:00:31PM +0800, Xuewei Niu wrote: > > >> On Tue, Mar 25, 2025 at 11:19:46AM +0800, Xuewei Niu wrote: > > >> >> On Mon, Mar 24, 2025 at 02:43:35PM +0800, Xuewei Niu wrote: > > >> >> >This patch brings a new feature, called "multi devices", to the virtio > > >> >> >vsock. It introduces a "VIRTIO_VSOCK_F_MULTI_DEVICES" feature bit, and a > > >> >> >"device_order" field to the config for the virtio vsock. > > >> >> > > > >> >> >== Motivition == > > >> >> > > > >> >> >Vsock is a lightweight and widely used data exchange mechanism between host > > >> >> >and guest. Currently, the virtio-vsock only supports one device, resulting > > >> >> >in the inability to enable more than one backend. For instance, two devices > > >> >> >are required: one to transfer data to the VMM via virtio-vsock, > > >> >> > > >> >> Come to think of it, AF_VSOCK defines CID 0 (VMADDR_CID_HYPERVISOR) to > > >> >> communicate with the hypervisor, but in virtio-vsock we never supported > > >> >> it. Could this be the use case? > > >> >> > > >> >> We could in this way add a new feature for those devices that > > >> >> communicate only with the VMM, where the CID of the VM is quite useless. > > >> >> So instead of having multiple CIDs per VM, we could continue to have a > > >> >> single CID, but the transport could support 2 devices, one to > > >> >> communicate with the VMM (CID = 0) and one to communicate with the host > > >> >> apps (CID = 2). 
> > >> >> > > >> >> Maybe this is orthogonal to this proposal, though, because it might > > >> >> still make sense to have multiple vsock devices, even though it's not > > >> >> very clear to me. > > >> > > > >> >In terms of the current situation, two devices are enough. > > >> > > > >> >We are the team of Kata Containers, so we are focusing on cloud-native > > >> >computing. What I mentioned below might be beyond the scope of the virtio > > >> >spec, just for your reference. > > >> > > > >> >The background is that the architecture of proxy mesh has been evolved over > > >> >the past few years: from per-pod to per-host (e.g. Istio Ambient Mesh[1]). > > >> > > > >> >Thanks to the TSI[2] and vhost-user protocol, network packets can bypass > > >> >both host and guest network stacks. It is possible to establish a fast path > > >> >between the pod and the proxy. > > >> > > > >> >When we have multiple networks, it is intuitive to have multiple NICs. So > > >> >does vsock. > > >> > > >> Be careful though, we don't want to complicate vsock to become like a > > >> NIC. > > >> > > >> > > > >> >When multiple networks are availble, it means that it is possible to have > > >> >multiple proxies(i.e. user processes). In this case, two devices are not > > >> >enough. This feature makes vsock more flexible and scalable. > > >> > > >> This is a good point, but I really don't understand why a VM should have > > >> multiple CIDs assigned. > > > > > >I think priority is not the biggest issue here. So let us focus on how to > > >route the connection to the right device among more than two devices. > > > > That's why I was recommending a different approach. IMO the user should > > not do this, but that should be transparent, hidden in the driver. > > > > By supporting VMADDR_CID_HYPERVISOR, we know very well if a packet is to > > be sent to the VMM, then we have to use the device that supports it. > > Whereas if the user connects to VMADDR_CID_HOST we have to use the other > > device. 
> > > > The user doesn't have to do anything, only use the right destination CID > > if it wants to talk to the VMM or another host process. > > Obviously, if we want to support more than 2 devices, we need this > that you are proposing. But IMO we need also to support > VMADDR_CID_HYPERVISOR, and we should prevent the user from doing > bind() on a random CID if one of the two devices only talks to the > VMM.

I agree with supporting `VMADDR_CID_HYPERVISOR` for virtio-vsock. I can work on this later.

> Because, again, how does the user know which CID to bind?

Nice catch! Let me propose a solution for this issue in the scenario of more than two devices.

Let users access the `device_order` and `guest_cid` fields. The host user program and the guest user program can make an agreement in advance. For example, the first device (the one whose `device_order` is smallest) is used to communicate with host process 1, the second device with host process 2, and so on.

If the guest user program wants to direct a message to host process 2, the steps would be:

1. The guest user program gets the second device's `guest_cid`.
2. The guest user program binds to that CID.

This works because the `device_order` is a VM-level configuration. (The `guest_cid`, on the contrary, is a host-level configuration.)

If people don't need this feature (i.e. they use only 1 or 2 devices), they can keep using vsock the simple way. Otherwise, they have to accept the more complicated way.

WDYT?

Thanks, Xuewei

> > > > > > > >Our solution uses CID as device identification. From the users' > > >perspective, they can direct the connection to the appropriate device by > > >specifying a CID in either the `connect` or `bind` syscall. > > > > How does the user know which device/CID to bind if it wants to talk with > > the VMM or with the application? > > > > > > > >Assigning one CID to a VM looks good to me. But I am not sure how to > > >distinguish the devices. For example, should we expose a ioctl or a > > >sockopt?
> > > > Nope, just simply use the right destination CID in the connect() > > (VMADDR_CID_HYPERVISOR or VMADDR_CID_HOST), without doing any bind(). > > > > For receiving, the user can check the source CID after connection and > > decide to discard connections from VMADDR_CID_HYPERVISOR or > > VMADDR_CID_HOST depending of the service. > > > > Thanks, > > Stefano
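(Purely as an illustration of the "advance agreement" described earlier in this message: all names below — `struct vsock_dev`, `cid_for_host_process` — are made up for the sketch, and in practice the table would be populated from each device's config space rather than hard-coded. The idea is that host process N talks to the device with the N-th smallest `device_order`.)

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guest-side view of the per-device config fields from the
 * proposal: the host-level guest_cid and the VM-level device_order. */
struct vsock_dev {
    uint64_t guest_cid;
    uint16_t device_order;
};

/* Return the guest_cid of the device with the n-th smallest device_order
 * (n is zero-based), or 0 if there is no such device. This encodes the
 * agreement: the guest app binds to the returned CID before connect()ing,
 * so the connection is routed through the agreed device. */
static uint64_t cid_for_host_process(const struct vsock_dev *devs,
                                     size_t ndevs, size_t n)
{
    for (size_t i = 0; i < ndevs; i++) {
        /* rank of device i = number of devices ordered before it */
        size_t rank = 0;
        for (size_t j = 0; j < ndevs; j++)
            if (devs[j].device_order < devs[i].device_order)
                rank++;
        if (rank == n)
            return devs[i].guest_cid;
    }
    return 0; /* no device at this rank */
}
```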
* Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices 2025-03-27 8:18 ` Xuewei Niu @ 2025-03-31 6:18 ` Xuewei Niu 2025-04-01 11:15 ` Stefano Garzarella 0 siblings, 1 reply; 23+ messages in thread From: Xuewei Niu @ 2025-03-31 6:18 UTC (permalink / raw) To: niuxuewei97 Cc: fupan.lfp, mst, niuxuewei.nxw, parav, sgarzare, stefanha, virtio-comment > > On Wed, 26 Mar 2025 at 11:32, Stefano Garzarella <sgarzare@redhat.com> wrote: > > > On Wed, Mar 26, 2025 at 06:00:31PM +0800, Xuewei Niu wrote: > > > >> On Tue, Mar 25, 2025 at 11:19:46AM +0800, Xuewei Niu wrote: > > > >> >> On Mon, Mar 24, 2025 at 02:43:35PM +0800, Xuewei Niu wrote: > > > >> >> >This patch brings a new feature, called "multi devices", to the virtio > > > >> >> >vsock. It introduces a "VIRTIO_VSOCK_F_MULTI_DEVICES" feature bit, and a > > > >> >> >"device_order" field to the config for the virtio vsock. > > > >> >> > > > > >> >> >== Motivition == > > > >> >> > > > > >> >> >Vsock is a lightweight and widely used data exchange mechanism between host > > > >> >> >and guest. Currently, the virtio-vsock only supports one device, resulting > > > >> >> >in the inability to enable more than one backend. For instance, two devices > > > >> >> >are required: one to transfer data to the VMM via virtio-vsock, > > > >> >> > > > >> >> Come to think of it, AF_VSOCK defines CID 0 (VMADDR_CID_HYPERVISOR) to > > > >> >> communicate with the hypervisor, but in virtio-vsock we never supported > > > >> >> it. Could this be the use case? > > > >> >> > > > >> >> We could in this way add a new feature for those devices that > > > >> >> communicate only with the VMM, where the CID of the VM is quite useless. > > > >> >> So instead of having multiple CIDs per VM, we could continue to have a > > > >> >> single CID, but the transport could support 2 devices, one to > > > >> >> communicate with the VMM (CID = 0) and one to communicate with the host > > > >> >> apps (CID = 2). 
> > > >> >> > > > >> >> Maybe this is orthogonal to this proposal, though, because it might > > > >> >> still make sense to have multiple vsock devices, even though it's not > > > >> >> very clear to me. > > > >> > > > > >> >In terms of the current situation, two devices are enough. > > > >> > > > > >> >We are the team of Kata Containers, so we are focusing on cloud-native > > > >> >computing. What I mentioned below might be beyond the scope of the virtio > > > >> >spec, just for your reference. > > > >> > > > > >> >The background is that the architecture of proxy mesh has been evolved over > > > >> >the past few years: from per-pod to per-host (e.g. Istio Ambient Mesh[1]). > > > >> > > > > >> >Thanks to the TSI[2] and vhost-user protocol, network packets can bypass > > > >> >both host and guest network stacks. It is possible to establish a fast path > > > >> >between the pod and the proxy. > > > >> > > > > >> >When we have multiple networks, it is intuitive to have multiple NICs. So > > > >> >does vsock. > > > >> > > > >> Be careful though, we don't want to complicate vsock to become like a > > > >> NIC. > > > >> > > > >> > > > > >> >When multiple networks are availble, it means that it is possible to have > > > >> >multiple proxies(i.e. user processes). In this case, two devices are not > > > >> >enough. This feature makes vsock more flexible and scalable. > > > >> > > > >> This is a good point, but I really don't understand why a VM should have > > > >> multiple CIDs assigned. > > > > > > > >I think priority is not the biggest issue here. So let us focus on how to > > > >route the connection to the right device among more than two devices. > > > > > > That's why I was recommending a different approach. IMO the user should > > > not do this, but that should be transparent, hidden in the driver. > > > > > > By supporting VMADDR_CID_HYPERVISOR, we know very well if a packet is to > > > be sent to the VMM, then we have to use the device that supports it. 
> > > Whereas if the user connects to VMADDR_CID_HOST we have to use the other > > > device. > > > > > > The user doesn't have to do anything, only use the right destination CID > > > if it wants to talk to the VMM or another host process. > > > > Obviously, if we want to support more than 2 devices, we need this > > that you are proposing. But IMO we need also to support > > VMADDR_CID_HYPERVISOR, and we should prevent the user from doing > > bind() on a random CID if one of the two devices only talks to the > > VMM. > > I agree with supporting `VMADDR_CID_HYPERVISOR` for virtio-vsock. I can > work on this later. > > > Because, again, how does the user know which CID to bind? > > Nice catch! I am trying to give a solution for this issue regarding the > scenario of more than two devices. > > Let users access the `device_order` and the `guest_cid` field. Host user > program and guest user program can make an advance agreement. For example, > the first device (whose `device_order` is smallest) is used to communicate > with host process 1, the second device is used to host process 2, and so > on. > > The guest user program want to direct the message to host process 2, then > the things would be: > > 1. Guest user program gets the second device's `guest_cid`. > 2. Guest user program binds to the CID. > > This could be worked because the `device_order` is a VM-level > configuration. (On the contrary, the `guest_cid` is a host-level > configuration). > > If people don't need this feature (use 1 or 2 devices only), they can use > vsock as the simple way. Otherwise, people should accept the more > complicated way. > > WDYT? Or we can replace the device_order with the guest_lid (aka local id). The guest_lid is a VM-level address space, while the guest_cid is a host-level address space. 
```c
struct virtio_vsock_config {
    __le64 guest_cid;
    __le16 guest_lid; /* previous device_order */
};
```

With this design, the relationship between the device and the guest_lid should be set properly before building the guest app and launching the VM.

For example, host process 0's guest_lid is 1000, and host process 1's is 2000. Their guest_cid will be determined when the VM starts. The device table would look like this:

* device0: process=VM guest_lid=0 guest_cid=0 <default device>
* device1: process=0 guest_lid=1000 guest_cid=x
* device2: process=1 guest_lid=2000 guest_cid=y

The driver should expose an interface, such as an ioctl, that receives a local_cid. Guest apps can use it to obtain the actual guest_cid.

It is expected that there will not be too many virtio-vsock devices (fewer than 16), so conflicts between guest_lid values are not a big issue.

Thanks, Xuewei

> > > > > > > > > > >Our solution uses CID as device identification. From the users' > > > >perspective, they can direct the connection to the appropriate device by > > > >specifying a CID in either the `connect` or `bind` syscall. > > > > > > How does the user know which device/CID to bind if it wants to talk with > > > the VMM or with the application? > > > > > > > > > > >Assigning one CID to a VM looks good to me. But I am not sure how to > > > >distinguish the devices. For example, should we expose a ioctl or a > > > >sockopt? > > > > > > Nope, just simply use the right destination CID in the connect() > > > (VMADDR_CID_HYPERVISOR or VMADDR_CID_HOST), without doing any bind(). > > > > > > For receiving, the user can check the source CID after connection and > > > decide to discard connections from VMADDR_CID_HYPERVISOR or > > > VMADDR_CID_HOST depending of the service. > > > > > > Thanks, > > > Stefano
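(A minimal sketch of what the proposed lookup ioctl would compute, with made-up names — `struct lid_map`, `resolve_lid` — and hard-coded table entries standing in for the real config-space values. It maps the VM-level guest_lid, agreed on before the VM is launched, to the host-assigned guest_cid known only at VM start.)

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device table entry: one VM-level guest_lid paired with
 * the guest_cid the host assigned to that device. */
struct lid_map {
    uint16_t guest_lid;
    uint64_t guest_cid;
};

/* Resolve a guest_lid to its guest_cid; returns 0 when the lid is
 * unknown (0 is also the default device's CID in the example table,
 * so a real interface would report "not found" out of band). */
static uint64_t resolve_lid(const struct lid_map *map, size_t n,
                            uint16_t lid)
{
    for (size_t i = 0; i < n; i++)
        if (map[i].guest_lid == lid)
            return map[i].guest_cid;
    return 0;
}
```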
* Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices 2025-03-31 6:18 ` Xuewei Niu @ 2025-04-01 11:15 ` Stefano Garzarella 2025-04-07 2:17 ` Xuewei Niu 0 siblings, 1 reply; 23+ messages in thread From: Stefano Garzarella @ 2025-04-01 11:15 UTC (permalink / raw) To: Xuewei Niu; +Cc: fupan.lfp, mst, niuxuewei.nxw, parav, stefanha, virtio-comment On Mon, Mar 31, 2025 at 02:18:27PM +0800, Xuewei Niu wrote: >> > On Wed, 26 Mar 2025 at 11:32, Stefano Garzarella <sgarzare@redhat.com> wrote: >> > > On Wed, Mar 26, 2025 at 06:00:31PM +0800, Xuewei Niu wrote: >> > > >> On Tue, Mar 25, 2025 at 11:19:46AM +0800, Xuewei Niu wrote: >> > > >> >> On Mon, Mar 24, 2025 at 02:43:35PM +0800, Xuewei Niu wrote: >> > > >> >> >This patch brings a new feature, called "multi devices", to the virtio >> > > >> >> >vsock. It introduces a "VIRTIO_VSOCK_F_MULTI_DEVICES" feature bit, and a >> > > >> >> >"device_order" field to the config for the virtio vsock. >> > > >> >> > >> > > >> >> >== Motivition == >> > > >> >> > >> > > >> >> >Vsock is a lightweight and widely used data exchange mechanism between host >> > > >> >> >and guest. Currently, the virtio-vsock only supports one device, resulting >> > > >> >> >in the inability to enable more than one backend. For instance, two devices >> > > >> >> >are required: one to transfer data to the VMM via virtio-vsock, >> > > >> >> >> > > >> >> Come to think of it, AF_VSOCK defines CID 0 (VMADDR_CID_HYPERVISOR) to >> > > >> >> communicate with the hypervisor, but in virtio-vsock we never supported >> > > >> >> it. Could this be the use case? >> > > >> >> >> > > >> >> We could in this way add a new feature for those devices that >> > > >> >> communicate only with the VMM, where the CID of the VM is quite useless. 
>> > > >> >> So instead of having multiple CIDs per VM, we could continue to have a >> > > >> >> single CID, but the transport could support 2 devices, one to >> > > >> >> communicate with the VMM (CID = 0) and one to communicate with the host >> > > >> >> apps (CID = 2). >> > > >> >> >> > > >> >> Maybe this is orthogonal to this proposal, though, because it might >> > > >> >> still make sense to have multiple vsock devices, even though it's not >> > > >> >> very clear to me. >> > > >> > >> > > >> >In terms of the current situation, two devices are enough. >> > > >> > >> > > >> >We are the team of Kata Containers, so we are focusing on cloud-native >> > > >> >computing. What I mentioned below might be beyond the scope of the virtio >> > > >> >spec, just for your reference. >> > > >> > >> > > >> >The background is that the architecture of proxy mesh has been evolved over >> > > >> >the past few years: from per-pod to per-host (e.g. Istio Ambient Mesh[1]). >> > > >> > >> > > >> >Thanks to the TSI[2] and vhost-user protocol, network packets can bypass >> > > >> >both host and guest network stacks. It is possible to establish a fast path >> > > >> >between the pod and the proxy. >> > > >> > >> > > >> >When we have multiple networks, it is intuitive to have multiple NICs. So >> > > >> >does vsock. >> > > >> >> > > >> Be careful though, we don't want to complicate vsock to become like a >> > > >> NIC. >> > > >> >> > > >> > >> > > >> >When multiple networks are availble, it means that it is possible to have >> > > >> >multiple proxies(i.e. user processes). In this case, two devices are not >> > > >> >enough. This feature makes vsock more flexible and scalable. >> > > >> >> > > >> This is a good point, but I really don't understand why a VM should have >> > > >> multiple CIDs assigned. >> > > > >> > > >I think priority is not the biggest issue here. So let us focus on how to >> > > >route the connection to the right device among more than two devices. 
>> > > >> > > That's why I was recommending a different approach. IMO the user should >> > > not do this, but that should be transparent, hidden in the driver. >> > > >> > > By supporting VMADDR_CID_HYPERVISOR, we know very well if a packet is to >> > > be sent to the VMM, then we have to use the device that supports it. >> > > Whereas if the user connects to VMADDR_CID_HOST we have to use the other >> > > device. >> > > >> > > The user doesn't have to do anything, only use the right destination CID >> > > if it wants to talk to the VMM or another host process. >> > >> > Obviously, if we want to support more than 2 devices, we need this >> > that you are proposing. But IMO we need also to support >> > VMADDR_CID_HYPERVISOR, and we should prevent the user from doing >> > bind() on a random CID if one of the two devices only talks to the >> > VMM. >> >> I agree with supporting `VMADDR_CID_HYPERVISOR` for virtio-vsock. I can >> work on this later. Would be nice to have both together, but I'm fine if you want to postpone it. >> >> > Because, again, how does the user know which CID to bind? >> >> Nice catch! I am trying to give a solution for this issue regarding the >> scenario of more than two devices. >> >> Let users access the `device_order` and the `guest_cid` field. Host user >> program and guest user program can make an advance agreement. For example, >> the first device (whose `device_order` is smallest) is used to communicate >> with host process 1, the second device is used to host process 2, and so >> on. >> >> The guest user program want to direct the message to host process 2, then >> the things would be: >> >> 1. Guest user program gets the second device's `guest_cid`. >> 2. Guest user program binds to the CID. >> >> This could be worked because the `device_order` is a VM-level >> configuration. (On the contrary, the `guest_cid` is a host-level >> configuration). 
>> >> If people don't need this feature (use 1 or 2 devices only), they can use >> vsock as the simple way. Otherwise, people should accept the more >> complicated way. >> >> WDYT? > >Or we can replace the device_order with the guest_lid (aka local id). The >guest_lid is a VM-level address space, while the guest_cid is a host-level >address space. > >```c >struct virtio_vsock_config { > __le64 guest_cid; > __le16 guest_lid; /* previous device_order */ >}; >``` > >With this design, the relationship between the device and the guest_lid >should be set properly before building the guest app and launching the >VM. > >For example, host process 0's guest_lid is 1000, and host process 1's is >2000. Their guest_cid will be determined when the VM started. The device >table should be like this: > >* device0: process=VM guest_lid=0 guest_cid=0 <default device> >* device1: process=0 guest_lid=1000 guest_cid=x >* device2: process=1 guest_lid=2000 guest_cid=y > >The driver should expose an interface, such as ioctl, receiving a >local_cid. Guest apps can use it to obtain the actual guest_cid. No, please, I don't think adding virtio-specific behaviour in AF_VSOCK is what we want. Let's continue with device_order and see what others say. I think we need to try to get a better understanding of what to do, depending on the direction: - host -> guest: it might make sense multiple devices with different CIDs, and the host will know which one to use depending on the CID assigned to the device (e.g. vhost, vhost-user, device in VMM) - guest -> host: again I think we should differentiate the device to use depending on the destination CID which can be VMADDR_CID_HOST, VMADDR_CID_HYPERVISOR, or in the case where sibling communication is supported a CID >= 3, so maybe we should have some features or flags in the config space to describe destination CID supported for each device so that the guest knows which device to use depending on the destination CID. 
I don't want to stop this patch, but I would like to make it easy for the user to use. Thanks, Stefano
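(A sketch of the guest -> host direction described above, under stated assumptions: the capability flags `VSOCK_DEV_TO_*` and the function `route_by_dst_cid` are invented here to illustrate "features or flags in the config space to describe destination CID supported for each device"; none of them exist in the spec. The point is that the driver, not the user, picks the device from the destination CID, so no explicit bind() is needed.)

```c
#include <stddef.h>
#include <stdint.h>

#define VMADDR_CID_HYPERVISOR 0
#define VMADDR_CID_HOST       2

/* Hypothetical per-device capability flags: which destinations a
 * device can reach. */
#define VSOCK_DEV_TO_HYPERVISOR (1u << 0) /* the VMM itself */
#define VSOCK_DEV_TO_HOST       (1u << 1) /* host applications */
#define VSOCK_DEV_TO_SIBLING    (1u << 2) /* sibling VMs, CID >= 3 */

struct vsock_route_dev {
    int id;
    uint32_t dst_caps;
};

/* Pick the device for an outgoing connection from the destination CID
 * alone; returns the device id, or -1 if no device reaches it. */
static int route_by_dst_cid(const struct vsock_route_dev *devs, size_t n,
                            uint64_t dst_cid)
{
    uint32_t need = dst_cid == VMADDR_CID_HYPERVISOR ? VSOCK_DEV_TO_HYPERVISOR
                  : dst_cid == VMADDR_CID_HOST       ? VSOCK_DEV_TO_HOST
                  : VSOCK_DEV_TO_SIBLING; /* CID >= 3 */
    for (size_t i = 0; i < n; i++)
        if (devs[i].dst_caps & need)
            return devs[i].id;
    return -1;
}
```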
* Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices 2025-04-01 11:15 ` Stefano Garzarella @ 2025-04-07 2:17 ` Xuewei Niu 2025-04-08 13:34 ` Stefano Garzarella 0 siblings, 1 reply; 23+ messages in thread From: Xuewei Niu @ 2025-04-07 2:17 UTC (permalink / raw) To: sgarzare Cc: fupan.lfp, mst, niuxuewei.nxw, niuxuewei97, parav, stefanha, virtio-comment > On Mon, Mar 31, 2025 at 02:18:27PM +0800, Xuewei Niu wrote: > >> > On Wed, 26 Mar 2025 at 11:32, Stefano Garzarella <sgarzare@redhat.com> wrote: > >> > > On Wed, Mar 26, 2025 at 06:00:31PM +0800, Xuewei Niu wrote: > >> > > >> On Tue, Mar 25, 2025 at 11:19:46AM +0800, Xuewei Niu wrote: > >> > > >> >> On Mon, Mar 24, 2025 at 02:43:35PM +0800, Xuewei Niu wrote: > >> > > >> >> >This patch brings a new feature, called "multi devices", to the virtio > >> > > >> >> >vsock. It introduces a "VIRTIO_VSOCK_F_MULTI_DEVICES" feature bit, and a > >> > > >> >> >"device_order" field to the config for the virtio vsock. > >> > > >> >> > > >> > > >> >> >== Motivition == > >> > > >> >> > > >> > > >> >> >Vsock is a lightweight and widely used data exchange mechanism between host > >> > > >> >> >and guest. Currently, the virtio-vsock only supports one device, resulting > >> > > >> >> >in the inability to enable more than one backend. For instance, two devices > >> > > >> >> >are required: one to transfer data to the VMM via virtio-vsock, > >> > > >> >> > >> > > >> >> Come to think of it, AF_VSOCK defines CID 0 (VMADDR_CID_HYPERVISOR) to > >> > > >> >> communicate with the hypervisor, but in virtio-vsock we never supported > >> > > >> >> it. Could this be the use case? > >> > > >> >> > >> > > >> >> We could in this way add a new feature for those devices that > >> > > >> >> communicate only with the VMM, where the CID of the VM is quite useless. 
> >> > > >> >> So instead of having multiple CIDs per VM, we could continue to have a > >> > > >> >> single CID, but the transport could support 2 devices, one to > >> > > >> >> communicate with the VMM (CID = 0) and one to communicate with the host > >> > > >> >> apps (CID = 2). > >> > > >> >> > >> > > >> >> Maybe this is orthogonal to this proposal, though, because it might > >> > > >> >> still make sense to have multiple vsock devices, even though it's not > >> > > >> >> very clear to me. > >> > > >> > > >> > > >> >In terms of the current situation, two devices are enough. > >> > > >> > > >> > > >> >We are the team of Kata Containers, so we are focusing on cloud-native > >> > > >> >computing. What I mentioned below might be beyond the scope of the virtio > >> > > >> >spec, just for your reference. > >> > > >> > > >> > > >> >The background is that the architecture of proxy mesh has been evolved over > >> > > >> >the past few years: from per-pod to per-host (e.g. Istio Ambient Mesh[1]). > >> > > >> > > >> > > >> >Thanks to the TSI[2] and vhost-user protocol, network packets can bypass > >> > > >> >both host and guest network stacks. It is possible to establish a fast path > >> > > >> >between the pod and the proxy. > >> > > >> > > >> > > >> >When we have multiple networks, it is intuitive to have multiple NICs. So > >> > > >> >does vsock. > >> > > >> > >> > > >> Be careful though, we don't want to complicate vsock to become like a > >> > > >> NIC. > >> > > >> > >> > > >> > > >> > > >> >When multiple networks are availble, it means that it is possible to have > >> > > >> >multiple proxies(i.e. user processes). In this case, two devices are not > >> > > >> >enough. This feature makes vsock more flexible and scalable. > >> > > >> > >> > > >> This is a good point, but I really don't understand why a VM should have > >> > > >> multiple CIDs assigned. > >> > > > > >> > > >I think priority is not the biggest issue here. 
So let us focus on how to > >> > > >route the connection to the right device among more than two devices. > >> > > > >> > > That's why I was recommending a different approach. IMO the user should > >> > > not do this, but that should be transparent, hidden in the driver. > >> > > > >> > > By supporting VMADDR_CID_HYPERVISOR, we know very well if a packet is to > >> > > be sent to the VMM, then we have to use the device that supports it. > >> > > Whereas if the user connects to VMADDR_CID_HOST we have to use the other > >> > > device. > >> > > > >> > > The user doesn't have to do anything, only use the right destination CID > >> > > if it wants to talk to the VMM or another host process. > >> > > >> > Obviously, if we want to support more than 2 devices, we need this > >> > that you are proposing. But IMO we need also to support > >> > VMADDR_CID_HYPERVISOR, and we should prevent the user from doing > >> > bind() on a random CID if one of the two devices only talks to the > >> > VMM. > >> > >> I agree with supporting `VMADDR_CID_HYPERVISOR` for virtio-vsock. I can > >> work on this later. > > Would be nice to have both together, but I'm fine if you want to > postpone it. > > >> > >> > Because, again, how does the user know which CID to bind? > >> > >> Nice catch! I am trying to give a solution for this issue regarding the > >> scenario of more than two devices. > >> > >> Let users access the `device_order` and the `guest_cid` field. Host user > >> program and guest user program can make an advance agreement. For example, > >> the first device (whose `device_order` is smallest) is used to communicate > >> with host process 1, the second device is used to host process 2, and so > >> on. > >> > >> The guest user program want to direct the message to host process 2, then > >> the things would be: > >> > >> 1. Guest user program gets the second device's `guest_cid`. > >> 2. Guest user program binds to the CID. 
> >> > >> This could be worked because the `device_order` is a VM-level > >> configuration. (On the contrary, the `guest_cid` is a host-level > >> configuration). > >> > >> If people don't need this feature (use 1 or 2 devices only), they can use > >> vsock as the simple way. Otherwise, people should accept the more > >> complicated way. > >> > >> WDYT? > > > >Or we can replace the device_order with the guest_lid (aka local id). The > >guest_lid is a VM-level address space, while the guest_cid is a host-level > >address space. > > > >```c > >struct virtio_vsock_config { > > __le64 guest_cid; > > __le16 guest_lid; /* previous device_order */ > >}; > >``` > > > >With this design, the relationship between the device and the guest_lid > >should be set properly before building the guest app and launching the > >VM. > > > >For example, host process 0's guest_lid is 1000, and host process 1's is > >2000. Their guest_cid will be determined when the VM started. The device > >table should be like this: > > > >* device0: process=VM guest_lid=0 guest_cid=0 <default device> > >* device1: process=0 guest_lid=1000 guest_cid=x > >* device2: process=1 guest_lid=2000 guest_cid=y > > > >The driver should expose an interface, such as ioctl, receiving a > >local_cid. Guest apps can use it to obtain the actual guest_cid. > > No, please, I don't think adding virtio-specific behaviour in AF_VSOCK > is what we want. > > Let's continue with device_order and see what others say. > > I think we need to try to get a better understanding of what to do, > depending on the direction: > > - host -> guest: it might make sense multiple devices with different > CIDs, and the host will know which one to use depending on the CID > assigned to the device (e.g. 
vhost, vhost-user, device in VMM) > > - guest -> host: again I think we should differentiate the device to use > depending on the destination CID which can be VMADDR_CID_HOST, > VMADDR_CID_HYPERVISOR, or in the case where sibling communication is > supported a CID >= 3, so maybe we should have some features or flags > in the config space to describe destination CID supported for each > device

I don't understand the point of adding new features/flags. Could you explain a bit more?

We already have the guest_cid field in the config space. The guest knows all the devices present in the VM. If the app tries to bind to a random CID, the bind will fail since the driver can't find a device with that CID.

> so that the guest knows which device to use depending on the destination > CID.

Yes, this is what I was describing in the previous comment. The message will be directed to the right device by the destination CID.

Thanks, Xuewei

> I don't want to stop this patch, but I would like to make it easy for > the user to use. > > Thanks, > Stefano
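(The bind-time check described above could look roughly like this driver-side sketch; `vsock_bind_check` and the flat CID array are invented for illustration, not actual kernel code. A socket may only bind to a CID that belongs to one of the present devices, since each guest_cid in the config space identifies exactly one device.)

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check behind bind(): succeed only if some device owns
 * the requested CID; otherwise fail as for an unusable local address. */
static int vsock_bind_check(const uint64_t *dev_cids, size_t ndevs,
                            uint64_t requested_cid)
{
    for (size_t i = 0; i < ndevs; i++)
        if (dev_cids[i] == requested_cid)
            return 0;          /* CID backed by a device: bind succeeds */
    return -EADDRNOTAVAIL;     /* "random" CID: no device found */
}
```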
* Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices 2025-04-07 2:17 ` Xuewei Niu @ 2025-04-08 13:34 ` Stefano Garzarella 2025-04-09 6:55 ` Xuewei Niu 0 siblings, 1 reply; 23+ messages in thread From: Stefano Garzarella @ 2025-04-08 13:34 UTC (permalink / raw) To: Xuewei Niu; +Cc: fupan.lfp, mst, niuxuewei.nxw, parav, stefanha, virtio-comment On Mon, 7 Apr 2025 at 04:17, Xuewei Niu <niuxuewei97@gmail.com> wrote: > > > On Mon, Mar 31, 2025 at 02:18:27PM +0800, Xuewei Niu wrote: > > >> > On Wed, 26 Mar 2025 at 11:32, Stefano Garzarella <sgarzare@redhat.com> wrote: > > >> > > On Wed, Mar 26, 2025 at 06:00:31PM +0800, Xuewei Niu wrote: > > >> > > >> On Tue, Mar 25, 2025 at 11:19:46AM +0800, Xuewei Niu wrote: > > >> > > >> >> On Mon, Mar 24, 2025 at 02:43:35PM +0800, Xuewei Niu wrote: > > >> > > >> >> >This patch brings a new feature, called "multi devices", to the virtio > > >> > > >> >> >vsock. It introduces a "VIRTIO_VSOCK_F_MULTI_DEVICES" feature bit, and a > > >> > > >> >> >"device_order" field to the config for the virtio vsock. > > >> > > >> >> > > > >> > > >> >> >== Motivition == > > >> > > >> >> > > > >> > > >> >> >Vsock is a lightweight and widely used data exchange mechanism between host > > >> > > >> >> >and guest. Currently, the virtio-vsock only supports one device, resulting > > >> > > >> >> >in the inability to enable more than one backend. For instance, two devices > > >> > > >> >> >are required: one to transfer data to the VMM via virtio-vsock, > > >> > > >> >> > > >> > > >> >> Come to think of it, AF_VSOCK defines CID 0 (VMADDR_CID_HYPERVISOR) to > > >> > > >> >> communicate with the hypervisor, but in virtio-vsock we never supported > > >> > > >> >> it. Could this be the use case? > > >> > > >> >> > > >> > > >> >> We could in this way add a new feature for those devices that > > >> > > >> >> communicate only with the VMM, where the CID of the VM is quite useless. 
> > >> > > >> >> So instead of having multiple CIDs per VM, we could continue to have a > > >> > > >> >> single CID, but the transport could support 2 devices, one to > > >> > > >> >> communicate with the VMM (CID = 0) and one to communicate with the host > > >> > > >> >> apps (CID = 2). > > >> > > >> >> > > >> > > >> >> Maybe this is orthogonal to this proposal, though, because it might > > >> > > >> >> still make sense to have multiple vsock devices, even though it's not > > >> > > >> >> very clear to me. > > >> > > >> > > > >> > > >> >In terms of the current situation, two devices are enough. > > >> > > >> > > > >> > > >> >We are the team of Kata Containers, so we are focusing on cloud-native > > >> > > >> >computing. What I mentioned below might be beyond the scope of the virtio > > >> > > >> >spec, just for your reference. > > >> > > >> > > > >> > > >> >The background is that the architecture of proxy mesh has been evolved over > > >> > > >> >the past few years: from per-pod to per-host (e.g. Istio Ambient Mesh[1]). > > >> > > >> > > > >> > > >> >Thanks to the TSI[2] and vhost-user protocol, network packets can bypass > > >> > > >> >both host and guest network stacks. It is possible to establish a fast path > > >> > > >> >between the pod and the proxy. > > >> > > >> > > > >> > > >> >When we have multiple networks, it is intuitive to have multiple NICs. So > > >> > > >> >does vsock. > > >> > > >> > > >> > > >> Be careful though, we don't want to complicate vsock to become like a > > >> > > >> NIC. > > >> > > >> > > >> > > >> > > > >> > > >> >When multiple networks are availble, it means that it is possible to have > > >> > > >> >multiple proxies(i.e. user processes). In this case, two devices are not > > >> > > >> >enough. This feature makes vsock more flexible and scalable. > > >> > > >> > > >> > > >> This is a good point, but I really don't understand why a VM should have > > >> > > >> multiple CIDs assigned. 
> > >> > > > > > >> > > >I think priority is not the biggest issue here. So let us focus on how to > > >> > > >route the connection to the right device among more than two devices. > > >> > > > > >> > > That's why I was recommending a different approach. IMO the user should > > >> > > not do this, but that should be transparent, hidden in the driver. > > >> > > > > >> > > By supporting VMADDR_CID_HYPERVISOR, we know very well if a packet is to > > >> > > be sent to the VMM, then we have to use the device that supports it. > > >> > > Whereas if the user connects to VMADDR_CID_HOST we have to use the other > > >> > > device. > > >> > > > > >> > > The user doesn't have to do anything, only use the right destination CID > > >> > > if it wants to talk to the VMM or another host process. > > >> > > > >> > Obviously, if we want to support more than 2 devices, we need this > > >> > that you are proposing. But IMO we need also to support > > >> > VMADDR_CID_HYPERVISOR, and we should prevent the user from doing > > >> > bind() on a random CID if one of the two devices only talks to the > > >> > VMM. > > >> > > >> I agree with supporting `VMADDR_CID_HYPERVISOR` for virtio-vsock. I can > > >> work on this later. > > > > Would be nice to have both together, but I'm fine if you want to > > postpone it. > > > > >> > > >> > Because, again, how does the user know which CID to bind? > > >> > > >> Nice catch! I am trying to give a solution for this issue regarding the > > >> scenario of more than two devices. > > >> > > >> Let users access the `device_order` and the `guest_cid` field. Host user > > >> program and guest user program can make an advance agreement. For example, > > >> the first device (whose `device_order` is smallest) is used to communicate > > >> with host process 1, the second device is used to host process 2, and so > > >> on. > > >> > > >> The guest user program want to direct the message to host process 2, then > > >> the things would be: > > >> > > >> 1. 
Guest user program gets the second device's `guest_cid`. > > >> 2. Guest user program binds to the CID. > > >> > > >> This could be worked because the `device_order` is a VM-level > > >> configuration. (On the contrary, the `guest_cid` is a host-level > > >> configuration). > > >> > > >> If people don't need this feature (use 1 or 2 devices only), they can use > > >> vsock as the simple way. Otherwise, people should accept the more > > >> complicated way. > > >> > > >> WDYT? > > > > > >Or we can replace the device_order with the guest_lid (aka local id). The > > >guest_lid is a VM-level address space, while the guest_cid is a host-level > > >address space. > > > > > >```c > > >struct virtio_vsock_config { > > > __le64 guest_cid; > > > __le16 guest_lid; /* previous device_order */ > > >}; > > >``` > > > > > >With this design, the relationship between the device and the guest_lid > > >should be set properly before building the guest app and launching the > > >VM. > > > > > >For example, host process 0's guest_lid is 1000, and host process 1's is > > >2000. Their guest_cid will be determined when the VM started. The device > > >table should be like this: > > > > > >* device0: process=VM guest_lid=0 guest_cid=0 <default device> > > >* device1: process=0 guest_lid=1000 guest_cid=x > > >* device2: process=1 guest_lid=2000 guest_cid=y > > > > > >The driver should expose an interface, such as ioctl, receiving a > > >local_cid. Guest apps can use it to obtain the actual guest_cid. > > > > No, please, I don't think adding virtio-specific behaviour in AF_VSOCK > > is what we want. > > > > Let's continue with device_order and see what others say. > > > > I think we need to try to get a better understanding of what to do, > > depending on the direction: > > > > - host -> guest: it might make sense multiple devices with different > > CIDs, and the host will know which one to use depending on the CID > > assigned to the device (e.g. 
vhost, vhost-user, device in VMM) > > > > - guest -> host: again I think we should differentiate the device to use > > depending on the destination CID which can be VMADDR_CID_HOST, > > VMADDR_CID_HYPERVISOR, or in the case where sibling communication is > > supported a CID >= 3, so maybe we should have some features or flags > > in the config space to describe destination CID supported for each > > device > > I don't understand the point of adding a new features/flags. Could you > explain a bit more? The idea is to inform the guest which addresses are reachable by the device, so the guest can easily decide which device to use. I'm talking about the destination, so CID_HOST(2), CID_HYPERVISOR(0) or a sibling VM (CID >=3). > > We have had the guest_cid field in the config space. The guest knows all > devices present in the VM. Okay, but how can the guest figure out from this information which device to use to talk to the hypervisor or an application in the host? > > If the app tries to bind a random CID, it will fail since the driver can't > find the device by the CID. I'm not talking about the source CID on which to do bind() (which I honestly don't like), but I'm talking about the destination CID on which to do connect(). > > > so that the guest knows which device to use depending on the destination > > CID. > > Yes, this is what I was describing in the previous comment. The message > will be directed to the device by the destination CID. Sorry, I don't understand how you do this without having information from the device about what addresses it supports. Can you elaborate a bit? Thanks, Stefano ^ permalink raw reply [flat|nested] 23+ messages in thread
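The routing Stefano describes (the driver, not the app, picks the device from the destination CID of connect()) could be sketched roughly as below. This is only an illustration of the idea under discussion; the struct, its fields, and `pick_device()` are hypothetical names invented here, not part of the spec or any driver:

```c
#include <stdint.h>
#include <stddef.h>

#define VMADDR_CID_HYPERVISOR 0
#define VMADDR_CID_HOST       2

/* Hypothetical per-device record: which destination CIDs this
 * device can reach, as the guest driver might track them. */
struct vsock_dev {
    uint64_t guest_cid;
    int reaches_hypervisor; /* can reach CID 0 (the VMM) */
    int reaches_host;       /* can reach CID 2 (host apps) */
    int reaches_siblings;   /* can reach CIDs >= 3 (sibling VMs) */
};

/* Pick the transport for connect() from the destination CID alone,
 * so the guest app never has to bind() to a specific source CID. */
static struct vsock_dev *pick_device(struct vsock_dev *devs, size_t n,
                                     uint64_t dst_cid)
{
    for (size_t i = 0; i < n; i++) {
        if (dst_cid == VMADDR_CID_HYPERVISOR && devs[i].reaches_hypervisor)
            return &devs[i];
        if (dst_cid == VMADDR_CID_HOST && devs[i].reaches_host)
            return &devs[i];
        if (dst_cid >= 3 && devs[i].reaches_siblings)
            return &devs[i];
    }
    return NULL; /* no device can reach this destination */
}
```

With one device that only talks to the VMM and one that talks to host applications, connect() to CID 0 and connect() to CID 2 each resolve to a unique device, which is why no explicit bind() would be needed in this scheme.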
* Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices 2025-04-08 13:34 ` Stefano Garzarella @ 2025-04-09 6:55 ` Xuewei Niu 2025-04-09 9:34 ` Stefano Garzarella 0 siblings, 1 reply; 23+ messages in thread From: Xuewei Niu @ 2025-04-09 6:55 UTC (permalink / raw) To: sgarzare Cc: fupan.lfp, mst, niuxuewei.nxw, niuxuewei97, parav, stefanha, virtio-comment > On Mon, 7 Apr 2025 at 04:17, Xuewei Niu <niuxuewei97@gmail.com> wrote: > > > > > On Mon, Mar 31, 2025 at 02:18:27PM +0800, Xuewei Niu wrote: > > > >> > On Wed, 26 Mar 2025 at 11:32, Stefano Garzarella <sgarzare@redhat.com> wrote: > > > >> > > On Wed, Mar 26, 2025 at 06:00:31PM +0800, Xuewei Niu wrote: > > > >> > > >> On Tue, Mar 25, 2025 at 11:19:46AM +0800, Xuewei Niu wrote: > > > >> > > >> >> On Mon, Mar 24, 2025 at 02:43:35PM +0800, Xuewei Niu wrote: > > > >> > > >> >> >This patch brings a new feature, called "multi devices", to the virtio > > > >> > > >> >> >vsock. It introduces a "VIRTIO_VSOCK_F_MULTI_DEVICES" feature bit, and a > > > >> > > >> >> >"device_order" field to the config for the virtio vsock. > > > >> > > >> >> > > > > >> > > >> >> >== Motivition == > > > >> > > >> >> > > > > >> > > >> >> >Vsock is a lightweight and widely used data exchange mechanism between host > > > >> > > >> >> >and guest. Currently, the virtio-vsock only supports one device, resulting > > > >> > > >> >> >in the inability to enable more than one backend. For instance, two devices > > > >> > > >> >> >are required: one to transfer data to the VMM via virtio-vsock, > > > >> > > >> >> > > > >> > > >> >> Come to think of it, AF_VSOCK defines CID 0 (VMADDR_CID_HYPERVISOR) to > > > >> > > >> >> communicate with the hypervisor, but in virtio-vsock we never supported > > > >> > > >> >> it. Could this be the use case? > > > >> > > >> >> > > > >> > > >> >> We could in this way add a new feature for those devices that > > > >> > > >> >> communicate only with the VMM, where the CID of the VM is quite useless. 
> > > >> > > >> >> So instead of having multiple CIDs per VM, we could continue to have a > > > >> > > >> >> single CID, but the transport could support 2 devices, one to > > > >> > > >> >> communicate with the VMM (CID = 0) and one to communicate with the host > > > >> > > >> >> apps (CID = 2). > > > >> > > >> >> > > > >> > > >> >> Maybe this is orthogonal to this proposal, though, because it might > > > >> > > >> >> still make sense to have multiple vsock devices, even though it's not > > > >> > > >> >> very clear to me. > > > >> > > >> > > > > >> > > >> >In terms of the current situation, two devices are enough. > > > >> > > >> > > > > >> > > >> >We are the team of Kata Containers, so we are focusing on cloud-native > > > >> > > >> >computing. What I mentioned below might be beyond the scope of the virtio > > > >> > > >> >spec, just for your reference. > > > >> > > >> > > > > >> > > >> >The background is that the architecture of proxy mesh has been evolved over > > > >> > > >> >the past few years: from per-pod to per-host (e.g. Istio Ambient Mesh[1]). > > > >> > > >> > > > > >> > > >> >Thanks to the TSI[2] and vhost-user protocol, network packets can bypass > > > >> > > >> >both host and guest network stacks. It is possible to establish a fast path > > > >> > > >> >between the pod and the proxy. > > > >> > > >> > > > > >> > > >> >When we have multiple networks, it is intuitive to have multiple NICs. So > > > >> > > >> >does vsock. > > > >> > > >> > > > >> > > >> Be careful though, we don't want to complicate vsock to become like a > > > >> > > >> NIC. > > > >> > > >> > > > >> > > >> > > > > >> > > >> >When multiple networks are availble, it means that it is possible to have > > > >> > > >> >multiple proxies(i.e. user processes). In this case, two devices are not > > > >> > > >> >enough. This feature makes vsock more flexible and scalable. 
> > > >> > > >> > > > >> > > >> This is a good point, but I really don't understand why a VM should have > > > >> > > >> multiple CIDs assigned. > > > >> > > > > > > >> > > >I think priority is not the biggest issue here. So let us focus on how to > > > >> > > >route the connection to the right device among more than two devices. > > > >> > > > > > >> > > That's why I was recommending a different approach. IMO the user should > > > >> > > not do this, but that should be transparent, hidden in the driver. > > > >> > > > > > >> > > By supporting VMADDR_CID_HYPERVISOR, we know very well if a packet is to > > > >> > > be sent to the VMM, then we have to use the device that supports it. > > > >> > > Whereas if the user connects to VMADDR_CID_HOST we have to use the other > > > >> > > device. > > > >> > > > > > >> > > The user doesn't have to do anything, only use the right destination CID > > > >> > > if it wants to talk to the VMM or another host process. > > > >> > > > > >> > Obviously, if we want to support more than 2 devices, we need this > > > >> > that you are proposing. But IMO we need also to support > > > >> > VMADDR_CID_HYPERVISOR, and we should prevent the user from doing > > > >> > bind() on a random CID if one of the two devices only talks to the > > > >> > VMM. > > > >> > > > >> I agree with supporting `VMADDR_CID_HYPERVISOR` for virtio-vsock. I can > > > >> work on this later. > > > > > > Would be nice to have both together, but I'm fine if you want to > > > postpone it. > > > > > > >> > > > >> > Because, again, how does the user know which CID to bind? > > > >> > > > >> Nice catch! I am trying to give a solution for this issue regarding the > > > >> scenario of more than two devices. > > > >> > > > >> Let users access the `device_order` and the `guest_cid` field. Host user > > > >> program and guest user program can make an advance agreement. 
For example, > > > >> the first device (whose `device_order` is smallest) is used to communicate > > > >> with host process 1, the second device is used to host process 2, and so > > > >> on. > > > >> > > > >> The guest user program want to direct the message to host process 2, then > > > >> the things would be: > > > >> > > > >> 1. Guest user program gets the second device's `guest_cid`. > > > >> 2. Guest user program binds to the CID. > > > >> > > > >> This could be worked because the `device_order` is a VM-level > > > >> configuration. (On the contrary, the `guest_cid` is a host-level > > > >> configuration). > > > >> > > > >> If people don't need this feature (use 1 or 2 devices only), they can use > > > >> vsock as the simple way. Otherwise, people should accept the more > > > >> complicated way. > > > >> > > > >> WDYT? > > > > > > > >Or we can replace the device_order with the guest_lid (aka local id). The > > > >guest_lid is a VM-level address space, while the guest_cid is a host-level > > > >address space. > > > > > > > >```c > > > >struct virtio_vsock_config { > > > > __le64 guest_cid; > > > > __le16 guest_lid; /* previous device_order */ > > > >}; > > > >``` > > > > > > > >With this design, the relationship between the device and the guest_lid > > > >should be set properly before building the guest app and launching the > > > >VM. > > > > > > > >For example, host process 0's guest_lid is 1000, and host process 1's is > > > >2000. Their guest_cid will be determined when the VM started. The device > > > >table should be like this: > > > > > > > >* device0: process=VM guest_lid=0 guest_cid=0 <default device> > > > >* device1: process=0 guest_lid=1000 guest_cid=x > > > >* device2: process=1 guest_lid=2000 guest_cid=y > > > > > > > >The driver should expose an interface, such as ioctl, receiving a > > > >local_cid. Guest apps can use it to obtain the actual guest_cid. 
> > > > > > No, please, I don't think adding virtio-specific behaviour in AF_VSOCK > > > is what we want. > > > > > > Let's continue with device_order and see what others say. > > > > > > I think we need to try to get a better understanding of what to do, > > > depending on the direction: > > > > > > - host -> guest: it might make sense multiple devices with different > > > CIDs, and the host will know which one to use depending on the CID > > > assigned to the device (e.g. vhost, vhost-user, device in VMM) > > > > > > - guest -> host: again I think we should differentiate the device to use > > > depending on the destination CID which can be VMADDR_CID_HOST, > > > VMADDR_CID_HYPERVISOR, or in the case where sibling communication is > > > supported a CID >= 3, so maybe we should have some features or flags > > > in the config space to describe destination CID supported for each > > > device > > > > I don't understand the point of adding a new features/flags. Could you > > explain a bit more? > > The idea is to inform the guest which addresses are reachable by the > device, so the guest can easily decide which device to use. I'm > talking about the destination, so CID_HOST(2), CID_HYPERVISOS(0) or a > sibling VM (CID >=3). > > > > > We have had the guest_cid field in the config space. The guest knows all > > devices present in the VM. > > Okay, but how can the guest figure out from this information which > device to use to talk to the hypervisor or an application in the host? > > > > > If the app tries to bind a random CID, it will fail since the driver can't > > find the device by the CID. > > I'm not talking about the source CID on which to do bind() (which I > honestly don't like), but I'm talking about the destination CID on > which to do connect(). > > > > so that the guest knows which device to use depending on the destination > > > CID. > > > > Yes, this is what I was describing in the previous comment. 
The message > > will be directed to the device by the destination CID. > > Sorry, I don't understand how you do this without having an > information from the device about what addresses it supports. Can you > elaborate a bit? Thanks for your explanation. So the things you were talking about are as follows: 1) guest app as a server: the app MUST do `bind()` to a CID that is available in the current VM. 2) guest app as a client: the guest driver picks a device and uses the device's CID as the src CID, so that the guest app doesn't need to do `bind()`, only `connect()`. The key point is who takes responsibility for picking a device: 1) I prefer the guest app to do this: do `bind()` to pick one, then do `connect()`. 2) You prefer the guest driver: only do `connect()`, and the guest driver picks one according to the dst CID. Am I right? I'm open to both ideas, but I have some concerns: 1) The two devices may be in different namespaces, e.g. host kernel (vhost) and hybrid vsock (vhost-user), which might result in two identical CIDs (e.g. VMADDR_CID_HOST). If that happens, the driver can't distinguish them. Instead, we can avoid this by letting the guest app pick a device. 2) What if the number of VMs is too large? For instance, 1,000 VMs (1,000 CIDs) will need at least 8000B of config space. (Hmm, it looks like an extreme example, I don't know if it will happen in the real world.) Thanks, Xuewei
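Xuewei's first concern (two backends in different host namespaces both exposing VMADDR_CID_HOST) can be made concrete with a small sketch. Everything here is a hypothetical illustration, not spec text: routing by destination CID alone cannot tell the two backends apart, while selecting by the device's own `guest_cid` (which an explicit `bind()` would supply) is unambiguous because guest CIDs are unique within the VM:

```c
#include <stdint.h>
#include <stddef.h>

#define VMADDR_CID_HOST 2

/* Hypothetical device table: each backend has its own guest_cid,
 * but both claim VMADDR_CID_HOST as the reachable destination. */
struct vsock_dev {
    uint64_t guest_cid;   /* unique within the VM */
    uint64_t reaches_cid; /* destination CID this backend serves */
};

/* Routing by destination CID: count matching devices, so the caller
 * can detect the ambiguous case Xuewei describes. */
static size_t count_by_dst(const struct vsock_dev *devs, size_t n,
                           uint64_t dst)
{
    size_t matches = 0;
    for (size_t i = 0; i < n; i++)
        if (devs[i].reaches_cid == dst)
            matches++;
    return matches;
}

/* Selecting by source CID (what an explicit bind() gives the driver):
 * guest_cid is unique, so at most one device can match. */
static const struct vsock_dev *find_by_src(const struct vsock_dev *devs,
                                           size_t n, uint64_t src)
{
    for (size_t i = 0; i < n; i++)
        if (devs[i].guest_cid == src)
            return &devs[i];
    return NULL;
}
```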
* Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices 2025-04-09 6:55 ` Xuewei Niu @ 2025-04-09 9:34 ` Stefano Garzarella 2025-04-10 3:05 ` Xuewei Niu 0 siblings, 1 reply; 23+ messages in thread From: Stefano Garzarella @ 2025-04-09 9:34 UTC (permalink / raw) To: Xuewei Niu; +Cc: fupan.lfp, mst, niuxuewei.nxw, parav, stefanha, virtio-comment On Wed, 9 Apr 2025 at 08:56, Xuewei Niu <niuxuewei97@gmail.com> wrote: [...] > > > > The idea is to inform the guest which addresses are reachable by the > > device, so the guest can easily decide which device to use. I'm > > talking about the destination, so CID_HOST(2), CID_HYPERVISOS(0) or a > > sibling VM (CID >=3). > > > > > > > > We have had the guest_cid field in the config space. The guest knows all > > > devices present in the VM. > > > > Okay, but how can the guest figure out from this information which > > device to use to talk to the hypervisor or an application in the host? > > > > > > > > If the app tries to bind a random CID, it will fail since the driver can't > > > find the device by the CID. > > > > I'm not talking about the source CID on which to do bind() (which I > > honestly don't like), but I'm talking about the destination CID on > > which to do connect(). > > > > > > so that the guest knows which device to use depending on the destination > > > > CID. > > > > > > Yes, this is what I was describing in the previous comment. The message > > > will be directed to the device by the destination CID. > > > > Sorry, I don't understand how you do this without having an > > information from the device about what addresses it supports. Can you > > elaborate a bit? > > Thanks for your explanation. So things you were talking about are as > follows: > > 1) guest app as a server: the app MUST do `bind()` to a CID that is > available in current VM. 
> 2) guest app as a client: the guest driver picks a device and uses the > device's CID as src CID, so that the guest app don't need to do `bind()`, > but only do `connect()`. > > The key point is who takes responsibility for picking a device: > > 1) I prefer the guest app to do such thing: do `bind()` to pick one, then > do `connect()`. Why? This implies that users must be informed in some way that they must use a certain CID to communicate with the VMM and another to communicate with the host application, when they could just as well use CID_HYPERVISOR(0) or CID_HOST(2) for that and everything would be transparent. > 2) You prefer the guest driver: only do `connect()`, and the guest driver > picks one according to the dst CID. > > Am I right? Yes, I prefer to keep things simple. Obviously what I said works if we only have 2 devices (or at most 3 where one supports only CID_HYPERVISOR, the other only CID_HOST, and the third CID >=3). As I anticipated, if we want to support more, then we necessarily have to go with the original option of this proposal with the default device, etc. That's why IMO we should support both. > > I'm open to both ideas, but I have some concerns: > > 1) Two devices are in the different namespaces, e.g. host kernel(vhost) and > hybrid vsock(vhost-user), which might cause two same CIDs (e.g. > VMADDR_CID_HOST). If that happened, the driver can't distinguish them. > Instead, we can avoid this by letting the guest app pick a device. Yep, in this case you need that, but how does the user know which source CID to use? Again I'm not saying we should not implement this proposal, I'm saying that we should also add the other option in order to keep vsock simple when you have 2 devices, one used to talk with the VMM and the other to talk with applications. > 2) What if the number of VMs is too large? For instance, 1,000 VMs (1,000 > CIDs) will need at least 8000B of config space. 
(Hmm, it looks like an > extreme example, I don't know if it will happen in real world.) Sorry, I didn't get it, can you elaborate a bit? Anyway, for me we can also go ahead with this proposal and add the other one later. I would like to keep the guest simple, and IMO hiding the complexity in the driver is not bad for your use case where one device talks to the VMM and the other talks to applications in the host. Thanks, Stefano
* Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices 2025-04-09 9:34 ` Stefano Garzarella @ 2025-04-10 3:05 ` Xuewei Niu 2025-04-10 7:21 ` Stefano Garzarella 0 siblings, 1 reply; 23+ messages in thread From: Xuewei Niu @ 2025-04-10 3:05 UTC (permalink / raw) To: sgarzare Cc: fupan.lfp, mst, niuxuewei.nxw, niuxuewei97, parav, stefanha, virtio-comment > On Wed, 9 Apr 2025 at 08:56, Xuewei Niu <niuxuewei97@gmail.com> wrote: > [...] > > > > > > The idea is to inform the guest which addresses are reachable by the > > > device, so the guest can easily decide which device to use. I'm > > > talking about the destination, so CID_HOST(2), CID_HYPERVISOS(0) or a > > > sibling VM (CID >=3). > > > > > > > > > > > We have had the guest_cid field in the config space. The guest knows all > > > > devices present in the VM. > > > > > > Okay, but how can the guest figure out from this information which > > > device to use to talk to the hypervisor or an application in the host? > > > > > > > > > > > If the app tries to bind a random CID, it will fail since the driver can't > > > > find the device by the CID. > > > > > > I'm not talking about the source CID on which to do bind() (which I > > > honestly don't like), but I'm talking about the destination CID on > > > which to do connect(). > > > > > > > > so that the guest knows which device to use depending on the destination > > > > > CID. > > > > > > > > Yes, this is what I was describing in the previous comment. The message > > > > will be directed to the device by the destination CID. > > > > > > Sorry, I don't understand how you do this without having an > > > information from the device about what addresses it supports. Can you > > > elaborate a bit? > > > > Thanks for your explanation. So things you were talking about are as > > follows: > > > > 1) guest app as a server: the app MUST do `bind()` to a CID that is > > available in current VM. 
> > 2) guest app as a client: the guest driver picks a device and uses the > > device's CID as src CID, so that the guest app don't need to do `bind()`, > > but only do `connect()`. > > > > The key point is who takes responsibility for picking a device: > > > > 1) I prefer the guest app to do such thing: do `bind()` to pick one, then > > do `connect()`. > > Why? > > This implies that users must be informed in some way that they must > use a certain CID to communicate with the VMM and another to > communicate with the host application, when they could just as well > use CID_HYPERVISOS(0) or CID_HOST(2) for that and everything would be > transparent. Sorry, I didn't express it clearly. I just listed my original idea and your idea. What types of services can be considered as hypervisor services? In our case, the kata-agent is a control service for Kata, so could we consider it as a hypervisor service? How does the driver know which device is for hypervisor? A possible way is to add a feature, like `VIRTIO_VSOCK_F_HYPERVISOR`? If it is possible, I think the CID_HYPERVISOR can still be used in the case of 3+ devices to maintain the same behavior. > > 2) You prefer the guest driver: only do `connect()`, and the guest driver > > picks one according to the dst CID. > > > > Am I right? > > Yes, I prefer to keep things simple. > > Obviously what I said works if we only have 2 devices (or at most 3 > where one supports only CID_HYPERVISOR, the other only CID_HOST, and > the third CID >=3). > As I anticipated, if we want to support more, then we have to > necessarily go with the original option of this proposal with the > default device, etc. > > That's why IMO we should support both. > > > > > I'm open to both ideas, but I have some concerns: > > > > 1) Two devices are in the different namespaces, e.g. host kernel(vhost) and > > hybrid vsock(vhost-user), which might cause two same CIDs (e.g. > > VMADDR_CID_HOST). If that happened, the driver can't distinguish them. 
> > Instead, we can avoid this by letting the guest app pick a device. > > Yep, in this case you need that, but how the user knows which source CID > to use? The CID should be confirmed in advance, and users should ensure that there are no conflicts with CIDs. > Again I'm not saying we should not implement this proposal, I'm saying > that we should also add the other option in order to keep vsock simple > when you have 2 devices, one used to talk with the VMM and the other > to talk with applications. Got it. Thanks for clarifying this. > > 2) What if the number of VMs is too large? For instance, 1,000 VMs (1,000 > > CIDs) will need at least 8000B of config space. (Hmm, it looks like an > > extreme example, I don't know if it will happen in real world.) > > I didn't get sorry, can you elaborate a bit? I misunderstood. I thought you were trying to have an array to store all supported dst CIDs in the case of 3+ devices. Please just ignore this. > Anyway, for me we can also go ahead with this proposal and add the > other one later. > I would like to keep the guest simple, and IMO hiding the complexity > in the driver is not bad for your use case where one device talks to > the VMM and the other talks to applications in the host. Fair enough. I am very glad to see that we can move forward with this feature ;) Big thanks to you! Xuewei > Thanks, > Stefano
* Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices 2025-04-10 3:05 ` Xuewei Niu @ 2025-04-10 7:21 ` Stefano Garzarella 2025-04-10 8:58 ` Xuewei Niu 0 siblings, 1 reply; 23+ messages in thread From: Stefano Garzarella @ 2025-04-10 7:21 UTC (permalink / raw) To: Xuewei Niu; +Cc: fupan.lfp, mst, niuxuewei.nxw, parav, stefanha, virtio-comment On Thu, 10 Apr 2025 at 05:06, Xuewei Niu <niuxuewei97@gmail.com> wrote: > > > On Wed, 9 Apr 2025 at 08:56, Xuewei Niu <niuxuewei97@gmail.com> wrote: > > [...] > > > > > > > > The idea is to inform the guest which addresses are reachable by the > > > > device, so the guest can easily decide which device to use. I'm > > > > talking about the destination, so CID_HOST(2), CID_HYPERVISOS(0) or a > > > > sibling VM (CID >=3). > > > > > > > > > > > > > > We have had the guest_cid field in the config space. The guest knows all > > > > > devices present in the VM. > > > > > > > > Okay, but how can the guest figure out from this information which > > > > device to use to talk to the hypervisor or an application in the host? > > > > > > > > > > > > > > If the app tries to bind a random CID, it will fail since the driver can't > > > > > find the device by the CID. > > > > > > > > I'm not talking about the source CID on which to do bind() (which I > > > > honestly don't like), but I'm talking about the destination CID on > > > > which to do connect(). > > > > > > > > > > so that the guest knows which device to use depending on the destination > > > > > > CID. > > > > > > > > > > Yes, this is what I was describing in the previous comment. The message > > > > > will be directed to the device by the destination CID. > > > > > > > > Sorry, I don't understand how you do this without having an > > > > information from the device about what addresses it supports. Can you > > > > elaborate a bit? > > > > > > Thanks for your explanation. 
So things you were talking about are as > > > follows: > > > > > > 1) guest app as a server: the app MUST do `bind()` to a CID that is > > > available in current VM. > > > 2) guest app as a client: the guest driver picks a device and uses the > > > device's CID as src CID, so that the guest app don't need to do `bind()`, > > > but only do `connect()`. > > > > > > The key point is who takes responsibility for picking a device: > > > > > > 1) I prefer the guest app to do such thing: do `bind()` to pick one, then > > > do `connect()`. > > > > Why? > > > > This implies that users must be informed in some way that they must > > use a certain CID to communicate with the VMM and another to > > communicate with the host application, when they could just as well > > use CID_HYPERVISOS(0) or CID_HOST(2) for that and everything would be > > transparent. > > Sorry, I didn't express it clearly. I just listed my original idea and your > idea. > > What types of services can be considered as hypervisor services? TSI. The vsock used to implement it ends up in the VMM (i.e. libkrun), right? > In our > case, the kata-agent is a control service for Kata, so could we consider it > as a hypervisor service? Nope, kata-agent is clearly a host application. > > How does the driver know which device is for hypervisor? A possible way is > to add a feature, like `VIRTIO_VSOCK_F_HYPERVISOR`? If it is possible, I > think the CID_HYPERVISOR can still be used in the case of 3+ devices to > maintain the same behavior. Or adding a bitfield in the config space, where the device sets several flags depending on what kinds of CIDs it is able to handle. They don't need to be negotiated, IMHO. Just advertised by the device. > > > > 2) You prefer the guest driver: only do `connect()`, and the guest driver > > > picks one according to the dst CID. > > > > > > Am I right? > > > > Yes, I prefer to keep things simple. 
> > > > Obviously what I said works if we only have 2 devices (or at most 3 > > where one supports only CID_HYPERVISOR, the other only CID_HOST, and > > the third CID >=3). > > As I anticipated, if we want to support more, then we have to > > necessarily go with the original option of this proposal with the > > default device, etc. > > > > That's why IMO we should support both. > > > > > > > > I'm open to both ideas, but I have some concerns: > > > > > > 1) Two devices are in the different namespaces, e.g. host kernel(vhost) and > > > hybrid vsock(vhost-user), which might cause two same CIDs (e.g. > > > VMADDR_CID_HOST). If that happened, the driver can't distinguish them. > > > Instead, we can avoid this by letting the guest app pick a device. > > > > Yep, in this case you need that, but how the user knows which source CID > > to use? > > The CID should be confirmed in advance, and users should ensure that there > are no conflicts with CIDs. Exactly, this is kind of against what we call zero-config for vsock. We could look at it the same way as ports, though, where the user has to know which one to use to reach a service, so I'm not completely against that. > > > Again I'm not saying we should not implement this proposal, I'm saying > > that we should also add the other option in order to keep vsock simple > > when you have 2 devices, one used to talk with the VMM and the other > > to talk with applications. > > Got it. Thanks for clarifying this. > > > > 2) What if the number of VMs is too large? For instance, 1,000 VMs (1,000 > > > CIDs) will need at least 8000B of config space. (Hmm, it looks like an > > > extreme example, I don't know if it will happen in real world.) > > > > I didn't get sorry, can you elaborate a bit? > > I misunderstood. I thought you are trying to have an array to save all > supported dst CIDs in the case of 3+ devices. Please just ignore this. > > > Anyway, for me we can also go ahead with this proposal and add the > > other one later. 
> > I would like to keep the guest simple, and IMO hiding the complexity > > in the driver is not bad for your use case where one device talks to > > the VMM and the other talks to applications in the host. > > Fair enough. I am very glad to see that we can move forward with this > feature ;) :-) Thanks, Stefano ^ permalink raw reply [flat|nested] 23+ messages in thread
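The two options debated above (app picks a device via `bind()` vs. driver picks by destination CID plus a default device) can be sketched as driver-side selection logic. This is a hypothetical illustration, not code from any driver or the spec: the struct, field names, and `pick_dev()` helper are invented, with `is_default` standing in for "lowest `device_order`".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VMADDR_CID_HOST 2u

/* Illustrative per-device state the guest driver might keep. */
struct vsock_dev {
    uint32_t guest_cid;  /* from struct virtio_vsock_config */
    int      is_default; /* e.g. the device with the lowest device_order */
};

/* Pick a device for an outgoing connect().  If the app bound explicitly,
 * the bound source CID names the device; otherwise fall back to the
 * default device.  bound_src_cid == (uint32_t)-1 means "no bind()". */
static struct vsock_dev *pick_dev(struct vsock_dev *devs, size_t n,
                                  uint32_t bound_src_cid, uint32_t dst_cid)
{
    (void)dst_cid; /* unused in this minimal default-device variant */
    if (bound_src_cid != (uint32_t)-1) {
        for (size_t i = 0; i < n; i++)
            if (devs[i].guest_cid == bound_src_cid)
                return &devs[i]; /* app already picked via bind() */
        return NULL; /* bound to a CID no device owns: connect() fails */
    }
    for (size_t i = 0; i < n; i++)
        if (devs[i].is_default)
            return &devs[i]; /* no bind(): route via the default device */
    return NULL;
}
```

With the cover letter's example of two devices (CIDs 3 and 4, CID 3 default), an unbound `connect()` goes out via CID 3, while binding to CID 4 selects the second device.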
* Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices 2025-04-10 7:21 ` Stefano Garzarella @ 2025-04-10 8:58 ` Xuewei Niu 2025-04-10 10:38 ` Stefano Garzarella 0 siblings, 1 reply; 23+ messages in thread From: Xuewei Niu @ 2025-04-10 8:58 UTC (permalink / raw) To: sgarzare Cc: fupan.lfp, mst, niuxuewei.nxw, niuxuewei97, parav, stefanha, virtio-comment > On Thu, 10 Apr 2025 at 05:06, Xuewei Niu <niuxuewei97@gmail.com> wrote: > > > > > On Wed, 9 Apr 2025 at 08:56, Xuewei Niu <niuxuewei97@gmail.com> wrote: > > > [...] > > > > > > > > > > The idea is to inform the guest which addresses are reachable by the > > > > > device, so the guest can easily decide which device to use. I'm > > > > > talking about the destination, so CID_HOST(2), CID_HYPERVISOS(0) or a > > > > > sibling VM (CID >=3). > > > > > > > > > > > > > > > > > We have had the guest_cid field in the config space. The guest knows all > > > > > > devices present in the VM. > > > > > > > > > > Okay, but how can the guest figure out from this information which > > > > > device to use to talk to the hypervisor or an application in the host? > > > > > > > > > > > > > > > > > If the app tries to bind a random CID, it will fail since the driver can't > > > > > > find the device by the CID. > > > > > > > > > > I'm not talking about the source CID on which to do bind() (which I > > > > > honestly don't like), but I'm talking about the destination CID on > > > > > which to do connect(). > > > > > > > > > > > > so that the guest knows which device to use depending on the destination > > > > > > > CID. > > > > > > > > > > > > Yes, this is what I was describing in the previous comment. The message > > > > > > will be directed to the device by the destination CID. > > > > > > > > > > Sorry, I don't understand how you do this without having an > > > > > information from the device about what addresses it supports. Can you > > > > > elaborate a bit? > > > > > > > > Thanks for your explanation. 
So things you were talking about are as > > > > follows: > > > > > > > > 1) guest app as a server: the app MUST do `bind()` to a CID that is > > > > available in current VM. > > > > 2) guest app as a client: the guest driver picks a device and uses the > > > > device's CID as src CID, so that the guest app don't need to do `bind()`, > > > > but only do `connect()`. > > > > > > > > The key point is who takes responsibility for picking a device: > > > > > > > > 1) I prefer the guest app to do such thing: do `bind()` to pick one, then > > > > do `connect()`. > > > > > > Why? > > > > > > This implies that users must be informed in some way that they must > > > use a certain CID to communicate with the VMM and another to > > > communicate with the host application, when they could just as well > > > use CID_HYPERVISOS(0) or CID_HOST(2) for that and everything would be > > > transparent. > > > > Sorry, I didn't express it clearly. I just listed my original idea and your > > idea. > > > > What types of services can be considered as hypervisor services? > > TSI. > The vsock used to implement it end up in the VMM (i.e. libkrun), right? > > > In our > > case, the kata-agent is a control service for Kata, so could we consider it > > as a hypervisor service? > > Nope, kata-agent is clearly a host application. Make sense for me. > > How does the driver know which device is for hypervisor? A possible way is > > to add a feature, like `VIRTIO_VSOCK_F_HYPERVISOR`? If it is possible, I > > think the CID_HYPERVISOR can still be used in the case of 3+ devices to > > maintain the same behavior. > > Or adding a bitfield in the config space, where the device set several > flags depending on what kind of CIDs is able to handle. > They don't need to be negotiated, IMHO. Just advertised by the device. Will do in the v7. Thanks, Xuewei > > > > 2) You prefer the guest driver: only do `connect()`, and the guest driver > > > > picks one according to the dst CID. > > > > > > > > Am I right? 
> > > > > > Yes, I prefer to keep things simple. > > > > > > Obviously what I said works if we only have 2 devices (or at most 3 > > > where one supports only CID_HYPERVISOR, the other only CID_HOST, and > > > the third CID >=3). > > > As I anticipated, if we want to support more, then we have to > > > necessarily go with the original option of this proposal with the > > > default device, etc. > > > > > > That's why IMO we should support both. > > > > > > > > > > > I'm open to both ideas, but I have some concerns: > > > > > > > > 1) Two devices are in the different namespaces, e.g. host kernel(vhost) and > > > > hybrid vsock(vhost-user), which might cause two same CIDs (e.g. > > > > VMADDR_CID_HOST). If that happened, the driver can't distinguish them. > > > > Instead, we can avoid this by letting the guest app pick a device. > > > > > > Yep, in this case you need that, but how the user knows which source CID > > > to use? > > > > The CID should be confirmed in advance, and users should ensure that there > > are no conflicts with CIDs. > > Exactly, this is kind of against what we call zero-config for vsock. > > We could look at it the same way as ports, though, where the user has > to know which one to use to reach a service, so I'm not completely > against that. > > > > > > Again I'm not saying we should not implement this proposal, I'm saying > > > that we should also add the other option in order to keep vsock simple > > > when you have 2 devices, one used to talk with the VMM and the other > > > to talk with applications. > > > > Got it. Thanks for clarifying this. > > > > > > 2) What if the number of VMs is too large? For instance, 1,000 VMs (1,000 > > > > CIDs) will need at least 8000B of config space. (Hmm, it looks like an > > > > extreme example, I don't know if it will happen in real world.) > > > > > > I didn't get sorry, can you elaborate a bit? > > > > I misunderstood. 
I thought you are trying to have an array to save all > > supported dst CIDs in the case of 3+ devices. Please just ignore this. > > > > > Anyway, for me we can also go ahead with this proposal and add the > > > other one later. > > > I would like to keep the guest simple, and IMO hiding the complexity > > > in the driver is not bad for your use case where one device talks to > > > the VMM and the other talks to applications in the host. > > > > Fair enough. I am very glad to see that we can move forward with this > > feature ;) > > :-) > > Thanks, > Stefano ^ permalink raw reply [flat|nested] 23+ messages in thread
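For concreteness, the "bind explicitly to pick a device" pattern from this exchange maps onto the standard Linux AF_VSOCK address structure roughly as below. This is only a sketch of building the addresses (no device or socket is assumed to exist); the CID values follow the cover letter's example, and the `vsock_addr()` helper is invented for illustration.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/vm_sockets.h>

/* Build a struct sockaddr_vm for AF_VSOCK.  Binding to a specific CID
 * is how the app would pick one of several vsock devices; binding with
 * VMADDR_PORT_ANY lets the kernel choose a local port. */
static struct sockaddr_vm vsock_addr(unsigned int cid, unsigned int port)
{
    struct sockaddr_vm sa;
    memset(&sa, 0, sizeof(sa));
    sa.svm_family = AF_VSOCK;
    sa.svm_cid = cid;
    sa.svm_port = port;
    return sa;
}
```

The pseudo-code `bind(fd, 4, -1); connect(fd, 2, 1234);` from the cover letter would then correspond to binding `vsock_addr(4, VMADDR_PORT_ANY)` and connecting to `vsock_addr(2, 1234)`.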
* Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices 2025-04-10 8:58 ` Xuewei Niu @ 2025-04-10 10:38 ` Stefano Garzarella 2025-04-10 10:47 ` Xuewei Niu 0 siblings, 1 reply; 23+ messages in thread From: Stefano Garzarella @ 2025-04-10 10:38 UTC (permalink / raw) To: Xuewei Niu; +Cc: fupan.lfp, mst, niuxuewei.nxw, parav, stefanha, virtio-comment On Thu, 10 Apr 2025 at 10:59, Xuewei Niu <niuxuewei97@gmail.com> wrote: > > > On Thu, 10 Apr 2025 at 05:06, Xuewei Niu <niuxuewei97@gmail.com> wrote: > > > > > > > On Wed, 9 Apr 2025 at 08:56, Xuewei Niu <niuxuewei97@gmail.com> wrote: > > > > [...] > > > > > > > > > > > > The idea is to inform the guest which addresses are reachable by the > > > > > > device, so the guest can easily decide which device to use. I'm > > > > > > talking about the destination, so CID_HOST(2), CID_HYPERVISOS(0) or a > > > > > > sibling VM (CID >=3). > > > > > > > > > > > > > > > > > > > > We have had the guest_cid field in the config space. The guest knows all > > > > > > > devices present in the VM. > > > > > > > > > > > > Okay, but how can the guest figure out from this information which > > > > > > device to use to talk to the hypervisor or an application in the host? > > > > > > > > > > > > > > > > > > > > If the app tries to bind a random CID, it will fail since the driver can't > > > > > > > find the device by the CID. > > > > > > > > > > > > I'm not talking about the source CID on which to do bind() (which I > > > > > > honestly don't like), but I'm talking about the destination CID on > > > > > > which to do connect(). > > > > > > > > > > > > > > so that the guest knows which device to use depending on the destination > > > > > > > > CID. > > > > > > > > > > > > > > Yes, this is what I was describing in the previous comment. The message > > > > > > > will be directed to the device by the destination CID. 
> > > > > > > > > > > > Sorry, I don't understand how you do this without having an > > > > > > information from the device about what addresses it supports. Can you > > > > > > elaborate a bit? > > > > > > > > > > Thanks for your explanation. So things you were talking about are as > > > > > follows: > > > > > > > > > > 1) guest app as a server: the app MUST do `bind()` to a CID that is > > > > > available in current VM. > > > > > 2) guest app as a client: the guest driver picks a device and uses the > > > > > device's CID as src CID, so that the guest app don't need to do `bind()`, > > > > > but only do `connect()`. > > > > > > > > > > The key point is who takes responsibility for picking a device: > > > > > > > > > > 1) I prefer the guest app to do such thing: do `bind()` to pick one, then > > > > > do `connect()`. > > > > > > > > Why? > > > > > > > > This implies that users must be informed in some way that they must > > > > use a certain CID to communicate with the VMM and another to > > > > communicate with the host application, when they could just as well > > > > use CID_HYPERVISOS(0) or CID_HOST(2) for that and everything would be > > > > transparent. > > > > > > Sorry, I didn't express it clearly. I just listed my original idea and your > > > idea. > > > > > > What types of services can be considered as hypervisor services? > > > > TSI. > > The vsock used to implement it end up in the VMM (i.e. libkrun), right? > > > > > In our > > > case, the kata-agent is a control service for Kata, so could we consider it > > > as a hypervisor service? > > > > Nope, kata-agent is clearly a host application. > > Make sense for me. > > > > How does the driver know which device is for hypervisor? A possible way is > > > to add a feature, like `VIRTIO_VSOCK_F_HYPERVISOR`? If it is possible, I > > > think the CID_HYPERVISOR can still be used in the case of 3+ devices to > > > maintain the same behavior. 
> > Or adding a bitfield in the config space, where the device sets several
> > flags depending on what kind of CIDs it is able to handle.
> > They don't need to be negotiated, IMHO. Just advertised by the device.
>
> Will do in the v7.

BTW this could be part of another proposal if you don't want to slow this down.
I think we are talking about 2 features here: multiple devices and CIDs
supported per device.

Stefano
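The config-space bitfield Stefano suggests could look roughly like the sketch below. All names here are assumptions for illustration (the flag macros, the `cid_caps` field, and `dev_can_reach()` are not part of the virtio spec or this patch): the device advertises which destination CID classes it can reach, with no feature negotiation needed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical capability flags advertised in config space. */
#define VSOCK_CAP_HYPERVISOR (1u << 0) /* can reach CID 0 */
#define VSOCK_CAP_HOST       (1u << 1) /* can reach CID 2 */
#define VSOCK_CAP_SIBLING    (1u << 2) /* can reach CIDs >= 3 */

/* Sketch of an extended config layout (device_order is from this
 * patch; cid_caps is the new, invented field). */
struct virtio_vsock_config_sketch {
    uint64_t guest_cid;
    uint16_t device_order;
    uint16_t cid_caps; /* advertised by the device, not negotiated */
};

/* Driver-side check: can this device deliver to dst_cid? */
static int dev_can_reach(const struct virtio_vsock_config_sketch *cfg,
                         uint32_t dst_cid)
{
    if (dst_cid == 0)
        return !!(cfg->cid_caps & VSOCK_CAP_HYPERVISOR);
    if (dst_cid == 2)
        return !!(cfg->cid_caps & VSOCK_CAP_HOST);
    if (dst_cid >= 3)
        return !!(cfg->cid_caps & VSOCK_CAP_SIBLING);
    return 0; /* CID 1 is reserved */
}
```

This covers the two-device use case discussed above (one device flagged for the VMM/hypervisor, the other for host applications and sibling VMs) while leaving room for more devices.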
* Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices 2025-04-10 10:38 ` Stefano Garzarella @ 2025-04-10 10:47 ` Xuewei Niu 2025-04-10 10:49 ` Stefano Garzarella 2025-04-10 13:47 ` Michael S. Tsirkin 0 siblings, 2 replies; 23+ messages in thread From: Xuewei Niu @ 2025-04-10 10:47 UTC (permalink / raw) To: sgarzare Cc: fupan.lfp, mst, niuxuewei.nxw, niuxuewei97, parav, stefanha, virtio-comment > On Thu, 10 Apr 2025 at 10:59, Xuewei Niu <niuxuewei97@gmail.com> wrote: > > > > > On Thu, 10 Apr 2025 at 05:06, Xuewei Niu <niuxuewei97@gmail.com> wrote: > > > > > > > > > On Wed, 9 Apr 2025 at 08:56, Xuewei Niu <niuxuewei97@gmail.com> wrote: > > > > > [...] > > > > > > > > > > > > > > The idea is to inform the guest which addresses are reachable by the > > > > > > > device, so the guest can easily decide which device to use. I'm > > > > > > > talking about the destination, so CID_HOST(2), CID_HYPERVISOS(0) or a > > > > > > > sibling VM (CID >=3). > > > > > > > > > > > > > > > > > > > > > > > We have had the guest_cid field in the config space. The guest knows all > > > > > > > > devices present in the VM. > > > > > > > > > > > > > > Okay, but how can the guest figure out from this information which > > > > > > > device to use to talk to the hypervisor or an application in the host? > > > > > > > > > > > > > > > > > > > > > > > If the app tries to bind a random CID, it will fail since the driver can't > > > > > > > > find the device by the CID. > > > > > > > > > > > > > > I'm not talking about the source CID on which to do bind() (which I > > > > > > > honestly don't like), but I'm talking about the destination CID on > > > > > > > which to do connect(). > > > > > > > > > > > > > > > > so that the guest knows which device to use depending on the destination > > > > > > > > > CID. > > > > > > > > > > > > > > > > Yes, this is what I was describing in the previous comment. The message > > > > > > > > will be directed to the device by the destination CID. 
> > > > > > > > > > > > > > Sorry, I don't understand how you do this without having an > > > > > > > information from the device about what addresses it supports. Can you > > > > > > > elaborate a bit? > > > > > > > > > > > > Thanks for your explanation. So things you were talking about are as > > > > > > follows: > > > > > > > > > > > > 1) guest app as a server: the app MUST do `bind()` to a CID that is > > > > > > available in current VM. > > > > > > 2) guest app as a client: the guest driver picks a device and uses the > > > > > > device's CID as src CID, so that the guest app don't need to do `bind()`, > > > > > > but only do `connect()`. > > > > > > > > > > > > The key point is who takes responsibility for picking a device: > > > > > > > > > > > > 1) I prefer the guest app to do such thing: do `bind()` to pick one, then > > > > > > do `connect()`. > > > > > > > > > > Why? > > > > > > > > > > This implies that users must be informed in some way that they must > > > > > use a certain CID to communicate with the VMM and another to > > > > > communicate with the host application, when they could just as well > > > > > use CID_HYPERVISOS(0) or CID_HOST(2) for that and everything would be > > > > > transparent. > > > > > > > > Sorry, I didn't express it clearly. I just listed my original idea and your > > > > idea. > > > > > > > > What types of services can be considered as hypervisor services? > > > > > > TSI. > > > The vsock used to implement it end up in the VMM (i.e. libkrun), right? > > > > > > > In our > > > > case, the kata-agent is a control service for Kata, so could we consider it > > > > as a hypervisor service? > > > > > > Nope, kata-agent is clearly a host application. > > > > Make sense for me. > > > > > > How does the driver know which device is for hypervisor? A possible way is > > > > to add a feature, like `VIRTIO_VSOCK_F_HYPERVISOR`? 
If it is possible, I
> > > > think the CID_HYPERVISOR can still be used in the case of 3+ devices to
> > > > maintain the same behavior.
> > >
> > > Or adding a bitfield in the config space, where the device sets several
> > > flags depending on what kind of CIDs it is able to handle.
> > > They don't need to be negotiated, IMHO. Just advertised by the device.
> >
> > Will do in the v7.
>
> BTW this could be part of another proposal if you don't want to slow this down.
> I think we are talking about 2 features here: multiple devices and CIDs
> supported per device.

IMO, it could be considered as one proposal but two features: CID_HYPERVISOR
support and per-device CID support. My idea is to split these two features
into two patches.

WDYT?

Thanks,
Xuewei
* Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices 2025-04-10 10:47 ` Xuewei Niu @ 2025-04-10 10:49 ` Stefano Garzarella 2025-04-10 13:47 ` Michael S. Tsirkin 1 sibling, 0 replies; 23+ messages in thread From: Stefano Garzarella @ 2025-04-10 10:49 UTC (permalink / raw) To: Xuewei Niu; +Cc: fupan.lfp, mst, niuxuewei.nxw, parav, stefanha, virtio-comment On Thu, 10 Apr 2025 at 12:47, Xuewei Niu <niuxuewei97@gmail.com> wrote: > > > On Thu, 10 Apr 2025 at 10:59, Xuewei Niu <niuxuewei97@gmail.com> wrote: > > > > > > > On Thu, 10 Apr 2025 at 05:06, Xuewei Niu <niuxuewei97@gmail.com> wrote: > > > > > > > > > > > On Wed, 9 Apr 2025 at 08:56, Xuewei Niu <niuxuewei97@gmail.com> wrote: > > > > > > [...] > > > > > > > > > > > > > > > > The idea is to inform the guest which addresses are reachable by the > > > > > > > > device, so the guest can easily decide which device to use. I'm > > > > > > > > talking about the destination, so CID_HOST(2), CID_HYPERVISOS(0) or a > > > > > > > > sibling VM (CID >=3). > > > > > > > > > > > > > > > > > > > > > > > > > > We have had the guest_cid field in the config space. The guest knows all > > > > > > > > > devices present in the VM. > > > > > > > > > > > > > > > > Okay, but how can the guest figure out from this information which > > > > > > > > device to use to talk to the hypervisor or an application in the host? > > > > > > > > > > > > > > > > > > > > > > > > > > If the app tries to bind a random CID, it will fail since the driver can't > > > > > > > > > find the device by the CID. > > > > > > > > > > > > > > > > I'm not talking about the source CID on which to do bind() (which I > > > > > > > > honestly don't like), but I'm talking about the destination CID on > > > > > > > > which to do connect(). > > > > > > > > > > > > > > > > > > so that the guest knows which device to use depending on the destination > > > > > > > > > > CID. 
> > > > > > > > > > > > > > > > > > Yes, this is what I was describing in the previous comment. The message > > > > > > > > > will be directed to the device by the destination CID. > > > > > > > > > > > > > > > > Sorry, I don't understand how you do this without having an > > > > > > > > information from the device about what addresses it supports. Can you > > > > > > > > elaborate a bit? > > > > > > > > > > > > > > Thanks for your explanation. So things you were talking about are as > > > > > > > follows: > > > > > > > > > > > > > > 1) guest app as a server: the app MUST do `bind()` to a CID that is > > > > > > > available in current VM. > > > > > > > 2) guest app as a client: the guest driver picks a device and uses the > > > > > > > device's CID as src CID, so that the guest app don't need to do `bind()`, > > > > > > > but only do `connect()`. > > > > > > > > > > > > > > The key point is who takes responsibility for picking a device: > > > > > > > > > > > > > > 1) I prefer the guest app to do such thing: do `bind()` to pick one, then > > > > > > > do `connect()`. > > > > > > > > > > > > Why? > > > > > > > > > > > > This implies that users must be informed in some way that they must > > > > > > use a certain CID to communicate with the VMM and another to > > > > > > communicate with the host application, when they could just as well > > > > > > use CID_HYPERVISOS(0) or CID_HOST(2) for that and everything would be > > > > > > transparent. > > > > > > > > > > Sorry, I didn't express it clearly. I just listed my original idea and your > > > > > idea. > > > > > > > > > > What types of services can be considered as hypervisor services? > > > > > > > > TSI. > > > > The vsock used to implement it end up in the VMM (i.e. libkrun), right? > > > > > > > > > In our > > > > > case, the kata-agent is a control service for Kata, so could we consider it > > > > > as a hypervisor service? > > > > > > > > Nope, kata-agent is clearly a host application. 
> > > > > > Make sense for me. > > > > > > > > How does the driver know which device is for hypervisor? A possible way is > > > > > to add a feature, like `VIRTIO_VSOCK_F_HYPERVISOR`? If it is possible, I > > > > > think the CID_HYPERVISOR can still be used in the case of 3+ devices to > > > > > maintain the same behavior. > > > > > > > > Or adding a bitfield in the config space, where the device set several > > > > flags depending on what kind of CIDs is able to handle. > > > > They don't need to be negotiated, IMHO. Just advertised by the device. > > > > > > Will do in the v7. > > > > BTW this could be part of another proposal if you don't want to speed this down. > > I think we are talking about 2 features here: multiple devices and CID > > supported per device. > > IMO, it could be considered as one proposal but two features: > CID_HYPERVISOR support and per device CID support. My idea is to split > these two features into two patches. > > WDYT? Yeah, LGTM! Stefano ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices 2025-04-10 10:47 ` Xuewei Niu 2025-04-10 10:49 ` Stefano Garzarella @ 2025-04-10 13:47 ` Michael S. Tsirkin 1 sibling, 0 replies; 23+ messages in thread From: Michael S. Tsirkin @ 2025-04-10 13:47 UTC (permalink / raw) To: Xuewei Niu Cc: sgarzare, fupan.lfp, niuxuewei.nxw, parav, stefanha, virtio-comment On Thu, Apr 10, 2025 at 06:47:27PM +0800, Xuewei Niu wrote: > > On Thu, 10 Apr 2025 at 10:59, Xuewei Niu <niuxuewei97@gmail.com> wrote: > > > > > > > On Thu, 10 Apr 2025 at 05:06, Xuewei Niu <niuxuewei97@gmail.com> wrote: > > > > > > > > > > > On Wed, 9 Apr 2025 at 08:56, Xuewei Niu <niuxuewei97@gmail.com> wrote: > > > > > > [...] > > > > > > > > > > > > > > > > The idea is to inform the guest which addresses are reachable by the > > > > > > > > device, so the guest can easily decide which device to use. I'm > > > > > > > > talking about the destination, so CID_HOST(2), CID_HYPERVISOS(0) or a > > > > > > > > sibling VM (CID >=3). > > > > > > > > > > > > > > > > > > > > > > > > > > We have had the guest_cid field in the config space. The guest knows all > > > > > > > > > devices present in the VM. > > > > > > > > > > > > > > > > Okay, but how can the guest figure out from this information which > > > > > > > > device to use to talk to the hypervisor or an application in the host? > > > > > > > > > > > > > > > > > > > > > > > > > > If the app tries to bind a random CID, it will fail since the driver can't > > > > > > > > > find the device by the CID. > > > > > > > > > > > > > > > > I'm not talking about the source CID on which to do bind() (which I > > > > > > > > honestly don't like), but I'm talking about the destination CID on > > > > > > > > which to do connect(). > > > > > > > > > > > > > > > > > > so that the guest knows which device to use depending on the destination > > > > > > > > > > CID. > > > > > > > > > > > > > > > > > > Yes, this is what I was describing in the previous comment. 
The message > > > > > > > > > will be directed to the device by the destination CID. > > > > > > > > > > > > > > > > Sorry, I don't understand how you do this without having an > > > > > > > > information from the device about what addresses it supports. Can you > > > > > > > > elaborate a bit? > > > > > > > > > > > > > > Thanks for your explanation. So things you were talking about are as > > > > > > > follows: > > > > > > > > > > > > > > 1) guest app as a server: the app MUST do `bind()` to a CID that is > > > > > > > available in current VM. > > > > > > > 2) guest app as a client: the guest driver picks a device and uses the > > > > > > > device's CID as src CID, so that the guest app don't need to do `bind()`, > > > > > > > but only do `connect()`. > > > > > > > > > > > > > > The key point is who takes responsibility for picking a device: > > > > > > > > > > > > > > 1) I prefer the guest app to do such thing: do `bind()` to pick one, then > > > > > > > do `connect()`. > > > > > > > > > > > > Why? > > > > > > > > > > > > This implies that users must be informed in some way that they must > > > > > > use a certain CID to communicate with the VMM and another to > > > > > > communicate with the host application, when they could just as well > > > > > > use CID_HYPERVISOS(0) or CID_HOST(2) for that and everything would be > > > > > > transparent. > > > > > > > > > > Sorry, I didn't express it clearly. I just listed my original idea and your > > > > > idea. > > > > > > > > > > What types of services can be considered as hypervisor services? > > > > > > > > TSI. > > > > The vsock used to implement it end up in the VMM (i.e. libkrun), right? > > > > > > > > > In our > > > > > case, the kata-agent is a control service for Kata, so could we consider it > > > > > as a hypervisor service? > > > > > > > > Nope, kata-agent is clearly a host application. > > > > > > Make sense for me. > > > > > > > > How does the driver know which device is for hypervisor? 
A possible way is > > > > > to add a feature, like `VIRTIO_VSOCK_F_HYPERVISOR`? If it is possible, I > > > > > think the CID_HYPERVISOR can still be used in the case of 3+ devices to > > > > > maintain the same behavior. > > > > > > > > Or adding a bitfield in the config space, where the device set several > > > > flags depending on what kind of CIDs is able to handle. > > > > They don't need to be negotiated, IMHO. Just advertised by the device. > > > > > > Will do in the v7. > > > > BTW this could be part of another proposal if you don't want to speed this down. > > I think we are talking about 2 features here: multiple devices and CID > > supported per device. > > IMO, it could be considered as one proposal but two features: > CID_HYPERVISOR support and per device CID support. My idea is to split > these two features into two patches. > > WDYT? > > Thanks, > Xuewei Split is always good. ^ permalink raw reply [flat|nested] 23+ messages in thread
End of thread [~2025-04-10 13:47 UTC | newest]

Thread overview: 23+ messages
2025-03-24  6:43 [PATCH v6 RESEND] virtio-vsock: Add support for multi devices Xuewei Niu
2025-03-24 13:51 ` Stefano Garzarella
2025-03-25  3:19 ` Xuewei Niu
2025-03-26  8:50 ` Stefano Garzarella
2025-03-26 10:00 ` Xuewei Niu
2025-03-26 10:32 ` Stefano Garzarella
2025-03-26 10:36 ` Stefano Garzarella
2025-03-26  2:59 ` Xuewei Niu
2025-03-26  9:03 ` Stefano Garzarella
2025-03-27  8:18 ` Xuewei Niu
2025-03-31  6:18 ` Xuewei Niu
2025-04-01 11:15 ` Stefano Garzarella
2025-04-07  2:17 ` Xuewei Niu
2025-04-08 13:34 ` Stefano Garzarella
2025-04-09  6:55 ` Xuewei Niu
2025-04-09  9:34 ` Stefano Garzarella
2025-04-10  3:05 ` Xuewei Niu
2025-04-10  7:21 ` Stefano Garzarella
2025-04-10  8:58 ` Xuewei Niu
2025-04-10 10:38 ` Stefano Garzarella
2025-04-10 10:47 ` Xuewei Niu
2025-04-10 10:49 ` Stefano Garzarella
2025-04-10 13:47 ` Michael S. Tsirkin