Discussion of the VIRTIO specification
 help / color / mirror / Atom feed
From: beshleman.devbox@gmail.com
To: mst@redhat.com, cohuck@redhat.com, virtio-comment@lists.oasis-open.org
Cc: cong.wang@bytedance.com, duanxiongchun@bytedance.com,
	jiang.wang@bytedance.com,
	virtualization@lists.linux-foundation.org,
	xieyongji@bytedance.com, chaiwen.cc@bytedance.com,
	stefanha@redhat.com, asias@redhat.com,
	arseny.krasnov@kaspersky.com, jhansen@vmware.com,
	bobby.eshleman@bytedance.com
Subject: [PATCH v5 2/2] virtio-vsock: add mergeable buffer feature bit
Date: Thu, 24 Feb 2022 22:15:47 +0000	[thread overview]
Message-ID: <20220224221547.2436395-3-beshleman.devbox@gmail.com> (raw)
In-Reply-To: <20220224221547.2436395-1-beshleman.devbox@gmail.com>

From: Jiang Wang <jiang.wang@bytedance.com>

Add support for mergeable buffers for virtio-vsock. Mergeable buffers
allow individual large packets to be spread across multiple buffers
while still using only a single packet header. This avoids
artificially restraining packet size to the size of a single
buffer and offers a performant fragmentation/defragmentation
scheme.

Signed-off-by: Jiang Wang <jiang.wang@bytedance.com>
Signed-off-by: Bobby Eshleman <bobby.eshleman@bytedance.com>
---
 virtio-vsock.tex | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)

diff --git a/virtio-vsock.tex b/virtio-vsock.tex
index 1a66a1b..bf44d5d 100644
--- a/virtio-vsock.tex
+++ b/virtio-vsock.tex
@@ -39,6 +39,7 @@ \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits}
 \item[VIRTIO_VSOCK_F_STREAM (0)] stream socket type is supported.
 \item[VIRTIO_VSOCK_F_SEQPACKET (1)] seqpacket socket type is supported.
 \item[VIRTIO_VSOCK_F_DGRAM (2)] datagram socket type is supported.
+\item[VIRTIO_VSOCK_F_MRG_RXBUF (3)] driver can merge receive buffers.
 \end{description}
 
 \subsection{Device configuration layout}\label{sec:Device Types / Socket Device / Device configuration layout}
@@ -87,6 +88,8 @@ \subsection{Device Operation}\label{sec:Device Types / Socket Device / Device Op
 
 Packets transmitted or received contain a header before the payload:
 
+If feature VIRTIO_VSOCK_F_MRG_RXBUF is not negotiated, use the following header.
+
 \begin{lstlisting}
 struct virtio_vsock_hdr {
 	le64 src_cid;
@@ -102,6 +105,15 @@ \subsection{Device Operation}\label{sec:Device Types / Socket Device / Device Op
 };
 \end{lstlisting}
 
+If feature VIRTIO_VSOCK_F_MRG_RXBUF is negotiated, use the following header.
+\begin{lstlisting}
+struct virtio_vsock_hdr_mrg_rxbuf {
+	struct virtio_vsock_hdr hdr;
+	le16 num_buffers;
+};
+\end{lstlisting}
+
+
 The upper 32 bits of src_cid and dst_cid are reserved and zeroed.
 
 Most packets simply transfer data but control packets are also used for
@@ -207,6 +219,25 @@ \subsubsection{Buffer Space Management}\label{sec:Device Types / Socket Device /
 virtqueue is full. For receivers, the packet is dropped if there is no space
 in the receive buffer.
 
+\drivernormative{\paragraph}{Device Operation: Buffer Space Management}{Device Types / Socket Device / Device Operation / Setting Up Receive Buffers}
+\begin{itemize}
+\item If VIRTIO_VSOCK_F_MRG_RXBUF is not negotiated, the driver SHOULD populate the datagram rx queue
+      with buffers of at least 4096 bytes.
+\item If VIRTIO_VSOCK_F_MRG_RXBUF is negotiated, each buffer MUST be at
+      least the size of the struct virtio_vsock_hdr_mgr_rxbuf.
+\end{itemize}
+
+\begin{note}
+Each buffer may be split across multiple descriptor elements.
+\end{note}
+
+\devicenormative{\paragraph}{Device Operation: Buffer Space Management}{Device Types / Socket Device / Device Operation / Setting Up Receive Buffers}
+The device MUST set \field{num_buffers} to the number of descriptors used when
+transmitting the packet.
+
+The device MUST use only a single descriptor if VIRTIO_VSOCK_F_MRG_RXBUF
+is not negotiated.
+
 \drivernormative{\paragraph}{Device Operation: Buffer Space Management}{Device Types / Socket Device / Device Operation / Buffer Space Management}
 For stream sockets, VIRTIO_VSOCK_OP_RW data packets MUST only be transmitted when the peer has
 sufficient free buffer space for the payload. For dgram sockets, VIRTIO_VSOCK_OP_RW data packets
@@ -229,6 +260,7 @@ \subsubsection{Receive and Transmit}\label{sec:Device Types / Socket Device / De
 The driver queues outgoing packets on the tx virtqueue and allocates incoming packet
 receive buffers on the rx virtqueue. Packets are of the following form:
 
+If VIRTIO_VSOCK_F_MRG_RXBUF is not negotiated, use the following.
 \begin{lstlisting}
 struct virtio_vsock_packet {
     struct virtio_vsock_hdr hdr;
@@ -236,11 +268,42 @@ \subsubsection{Receive and Transmit}\label{sec:Device Types / Socket Device / De
 };
 \end{lstlisting}
 
+If VIRTIO_VSOCK_F_MRG_RXBUF is negotiated, use the following form:
+\begin{lstlisting}
+struct virtio_vsock_packet_mrg_rxbuf {
+    struct virtio_vsock_hdr_mrg_rxbuf hdr;
+    u8 data[];
+};
+\end{lstlisting}
+
+
 Virtqueue buffers for outgoing packets are read-only. Virtqueue buffers for
 incoming packets are write-only.
 
 When transmitting packets to the device, \field{num_buffers} is not used.
 
+\begin{enumerate}
+\item \field{num_buffers} indicates how many descriptors
+  this packet is spread over (including this one).
+  This is valid only if VIRTIO_VSOCK_F_MRG_RXBUF is negotiated.
+  This allows receipt of large packets without having to allocate large
+  buffers: a packet that does not fit in a single buffer can flow
+  over to the next buffer, and so on. In this case, there will be
+  at least \field{num_buffers} used buffers in the virtqueue, and the device
+  chains them together to form a single packet in a way similar to
+  how it would store it in a single buffer spread over multiple
+  descriptors.
+  The other buffers will not begin with a struct virtio_vsock_hdr.
+
+  If VIRTIO_VSOCK_F_MRG_RXBUF is not negotiated, then only one
+  descriptor is used.
+
+\item If
+  \field{num_buffers} is one, then the entire packet will be
+  contained within this buffer, immediately following the struct
+  virtio_vsock_hdr.
+\end{enumerate}
+
 \drivernormative{\paragraph}{Device Operation: Receive and Transmit}{Device Types / Socket Device / Device Operation / Receive and Transmit}
 
 The \field{guest_cid} configuration field MUST be used as the source CID when
@@ -256,6 +319,19 @@ \subsubsection{Receive and Transmit}\label{sec:Device Types / Socket Device / De
 A VIRTIO_VSOCK_OP_RST reply MUST be sent if a packet is received with an
 unknown \field{type} value.
 
+If VIRTIO_VSOCK_F_MRG_RXBUF has been negotiated, the device MUST set
+\field{num_buffers} to indicate the number of buffers
+the packet (including the header) is spread over.
+
+If a receive packet is spread over multiple buffers, the device
+MUST use all buffers but the last (i.e. the first $\field{num_buffers} -
+1$ buffers) completely up to the full length of each buffer
+supplied by the driver.
+
+The device MUST use all buffers used by a single receive
+packet together, such that at least \field{num_buffers} are
+observed by driver as used.
+
 \subsubsection{Stream Sockets}\label{sec:Device Types / Socket Device / Device Operation / Stream Sockets}
 
 Connections are established by sending a VIRTIO_VSOCK_OP_REQUEST packet. If a
-- 
2.11.0


  parent reply	other threads:[~2022-02-24 22:15 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-28  4:01 [virtio-comment] [RFC v4] virtio-vsock: add description for datagram type Jiang Wang
2021-06-07 18:45 ` [virtio-comment] " Jiang Wang .
2021-06-08 13:46 ` Stefano Garzarella
2021-06-09  4:22   ` [virtio-comment] Re: [External] " Jiang Wang .
2021-06-09  7:17     ` Stefano Garzarella
2021-06-10  3:31       ` Jiang Wang .
2021-06-10  6:56         ` Stefano Garzarella
2022-02-24 21:57 ` [PATCH v5 0/2] Support vsock datagram and mergeable buffers beshleman.devbox
2022-02-24 21:57   ` [PATCH v5 1/2] virtio-vsock: add description for datagram type beshleman.devbox
2022-02-24 21:57   ` [PATCH v5 2/2] virtio-vsock: add mergeable buffer feature bit beshleman.devbox
2022-02-24 22:15 ` [PATCH v5 0/2] Support vsock datagram and mergeable buffers beshleman.devbox
2022-02-24 22:15   ` [PATCH v5 1/2] virtio-vsock: add description for datagram type beshleman.devbox
2022-03-02 16:09     ` [virtio-comment] " Stefano Garzarella
2022-03-03  3:29       ` Bobby Eshleman
2022-03-03  7:15         ` Michael S. Tsirkin
2022-03-05  1:25           ` Bobby Eshleman
2022-03-06 10:17             ` Michael S. Tsirkin
2022-03-03 10:23         ` Stefano Garzarella
2022-03-05  1:25           ` Bobby Eshleman
2022-03-03 11:41         ` Michael S. Tsirkin
2022-03-07 17:41           ` Bobby Eshleman
2022-02-24 22:15   ` beshleman.devbox [this message]
2022-03-02 16:19     ` [virtio-comment] [PATCH v5 2/2] virtio-vsock: add mergeable buffer feature bit Stefano Garzarella

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220224221547.2436395-3-beshleman.devbox@gmail.com \
    --to=beshleman.devbox@gmail.com \
    --cc=arseny.krasnov@kaspersky.com \
    --cc=asias@redhat.com \
    --cc=bobby.eshleman@bytedance.com \
    --cc=chaiwen.cc@bytedance.com \
    --cc=cohuck@redhat.com \
    --cc=cong.wang@bytedance.com \
    --cc=duanxiongchun@bytedance.com \
    --cc=jhansen@vmware.com \
    --cc=jiang.wang@bytedance.com \
    --cc=mst@redhat.com \
    --cc=stefanha@redhat.com \
    --cc=virtio-comment@lists.oasis-open.org \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=xieyongji@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox