From: Bobby Eshleman <bobbyeshleman@gmail.com>
To: Bobby Eshleman <bobby.eshleman@bytedance.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>,
Stefano Garzarella <sgarzare@redhat.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Jason Wang <jasowang@redhat.com>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
"K. Y. Srinivasan" <kys@microsoft.com>,
Haiyang Zhang <haiyangz@microsoft.com>,
Wei Liu <wei.liu@kernel.org>, Dexuan Cui <decui@microsoft.com>,
Bryan Tan <bryantan@vmware.com>, Vishnu Dasa <vdasa@vmware.com>,
VMware PV-Drivers Reviewers <pv-drivers@vmware.com>,
kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-hyperv@vger.kernel.org,
Jiang Wang <jiang.wang@bytedance.com>,
Cong Wang <cong.wang@bytedance.com>
Subject: Re: [PATCH RFC net-next v2 0/4] virtio/vsock: support datagrams
Date: Fri, 14 Apr 2023 11:18:40 +0000
Message-ID: <ZDk2kOVnUvyLMLKE@bullseye>
In-Reply-To: <20230413-b4-vsock-dgram-v2-0-079cc7cee62e@bytedance.com>
CC'ing Cong.
On Fri, Apr 14, 2023 at 12:25:56AM +0000, Bobby Eshleman wrote:
> Hey all!
>
> This series introduces support for datagrams to virtio/vsock.
>
> It is a spin-off (and smaller version) of this series from the summer:
> https://lore.kernel.org/all/cover.1660362668.git.bobby.eshleman@bytedance.com/
>
> Please note that this is an RFC and should not be merged until
> associated changes are made to the virtio specification, which will
> follow after discussion from this series.
>
> This series first supports datagrams in a basic form for virtio, and
> then optimizes the sendpath for all transports.
>
> The result is a very fast datagram communication protocol that
> outperforms even UDP on multi-queue virtio-net w/ vhost on a variety
> of multi-threaded workload samples.
>
> For those that are curious, some summary data comparing UDP and VSOCK
> DGRAM (N=5):
>
> vCPUs: 16
> virtio-net queues: 16
> payload size: 4KB
> Setup: bare metal + VM (non-nested)
>
> UDP: 287.59 MB/s
> VSOCK DGRAM: 509.2 MB/s
>
> Some notes about the implementation...
>
> This datagram implementation forces datagram senders to self-throttle
> according to the threshold set by sk_sndbuf. Its effect on throughput
> and memory consumption is similar to that of the credits used by
> streams, but unlike credits it is not influenced by the receiving socket.
>
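To make the self-throttling idea concrete, here is a toy user-space model, not the kernel code from this series. It assumes only what the paragraph above says: a sender may queue datagrams until the bytes it has in flight reach a threshold standing in for sk_sndbuf, and the receiver plays no part. The class and method names are hypothetical.

```python
from collections import deque


class ThrottledDgramSender:
    """Toy model of a datagram sender bounded by a send-buffer threshold."""

    def __init__(self, sndbuf):
        self.sndbuf = sndbuf      # stands in for sk_sndbuf
        self.in_flight = 0        # bytes queued but not yet released by the transport
        self.queue = deque()      # sizes of queued datagrams, oldest first

    def try_send(self, size):
        """Queue a datagram if it fits under the threshold, else refuse."""
        if self.in_flight + size > self.sndbuf:
            return False          # a real caller would block or see EAGAIN
        self.in_flight += size
        self.queue.append(size)
        return True

    def transport_done(self):
        """The transport consumed the oldest datagram; release its bytes."""
        size = self.queue.popleft()
        self.in_flight -= size
        return size


# Two 4KB datagrams fill an 8KB budget; the third is refused until
# the transport releases one. Nothing here depends on the receiver.
sender = ThrottledDgramSender(sndbuf=8192)
results = [sender.try_send(4096) for _ in range(3)]
sender.transport_done()
retry = sender.try_send(4096)
```

The point of the sketch is that the only back-pressure comes from the sender's own accounting, which is what distinguishes this from stream credits.
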
> The device drops packets silently. There is room for improvement by
> building some intelligence into the device and driver to reduce the
> frequency of virtqueue kicks when packet loss is high. I think there
> is a good discussion to be had on this.
>
> In this series I am also proposing that fairness be reexamined as an
> issue separate from datagrams, which differs from my previous series
> that coupled these issues. After further testing and reflection on the
> design, I do not believe that these need to be coupled and I do not
> believe this implementation introduces additional unfairness or
> exacerbates pre-existing unfairness.
>
> I attempted to characterize vsock fairness by using a pool of processes
> to stress test the shared resources while measuring the performance of a
> lone stream socket. If datagrams received unfair preference, we would
> expect a lone stream socket to degrade much more when a pool of
> datagram sockets is stressing the system than when a pool of stream
> sockets is stressing the system. The result, however, showed no
> significant difference in the degradation of throughput of the lone
> stream socket between using a pool of datagrams to stress the queue and
> using a pool of streams. The absolute difference in throughput actually
> favored datagrams as the less interfering workload: the mean difference
> was +16% relative to stressing with streams (N=7), but it was not
> statistically significant. Workloads were matched for payload size,
> buffer size (to approximate memory consumption), and process count, and
> stress workloads were configured to start before and last long after the
> lifetime of the "lone" stream socket flow to ensure that competing flows
> were continuously hot.
>
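The cover letter does not say which significance test was used, so the following is only one way such a comparison could be checked: an exact two-sided permutation test on the difference of mean throughput between the two stress conditions. The function name is hypothetical, and no measured data is included here. With N=7 runs per condition there are only 3432 ways to split the pooled runs, so full enumeration is cheap.

```python
from itertools import combinations


def permutation_p_value(a, b, tol=1e-12):
    """Exact two-sided permutation test on the difference of group means.

    a, b: per-run throughput of the lone stream socket under each
    stress condition (for example, datagram pool vs. stream pool).
    Returns the fraction of relabelings whose absolute mean difference
    is at least as large as the observed one.
    """
    pooled = list(a) + list(b)
    n_a, n_b = len(a), len(b)
    total = sum(pooled)

    def mean_diff(sum_a):
        # Mean of the group with sum_a minus mean of the remaining runs.
        return sum_a / n_a - (total - sum_a) / n_b

    observed = abs(mean_diff(sum(a)))
    hits = 0
    count = 0
    for idx in combinations(range(len(pooled)), n_a):
        count += 1
        if abs(mean_diff(sum(pooled[i] for i in idx))) >= observed - tol:
            hits += 1
    return hits / count
```

Calling it with the seven per-run throughputs from each condition would give a p-value to compare against whatever threshold is chosen; a large value would be consistent with the "no significant difference" reading above.
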
> Given the above data, I propose that vsock fairness be addressed
> independently of datagrams and that its implementation be deferred to a
> future series.
>
> Signed-off-by: Bobby Eshleman <bobby.eshleman@bytedance.com>
> ---
> Bobby Eshleman (3):
> virtio/vsock: support dgram
> virtio/vsock: add VIRTIO_VSOCK_F_DGRAM feature bit
> vsock: Add lockless sendmsg() support
>
> Jiang Wang (1):
> tests: add vsock dgram tests
>
> drivers/vhost/vsock.c | 17 +-
> include/net/af_vsock.h | 20 ++-
> include/uapi/linux/virtio_vsock.h | 2 +
> net/vmw_vsock/af_vsock.c | 287 ++++++++++++++++++++++++++++----
> net/vmw_vsock/diag.c | 10 +-
> net/vmw_vsock/hyperv_transport.c | 15 +-
> net/vmw_vsock/virtio_transport.c | 10 +-
> net/vmw_vsock/virtio_transport_common.c | 221 ++++++++++++++++++++----
> net/vmw_vsock/vmci_transport.c | 70 ++++++--
> tools/testing/vsock/util.c | 105 ++++++++++++
> tools/testing/vsock/util.h | 4 +
> tools/testing/vsock/vsock_test.c | 193 +++++++++++++++++++++
> 12 files changed, 859 insertions(+), 95 deletions(-)
> ---
> base-commit: ed72bd5a6790a0c3747cb32b0427f921bd03bb71
> change-id: 20230413-b4-vsock-dgram-3b6eba6a64e5
>
> Best regards,
> --
> Bobby Eshleman <bobby.eshleman@bytedance.com>
>