From: Simon Horman <simon.horman@corigine.com>
To: Bobby Eshleman <bobby.eshleman@bytedance.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>,
Stefano Garzarella <sgarzare@redhat.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Jason Wang <jasowang@redhat.com>,
Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
"K. Y. Srinivasan" <kys@microsoft.com>,
Haiyang Zhang <haiyangz@microsoft.com>,
Wei Liu <wei.liu@kernel.org>, Dexuan Cui <decui@microsoft.com>,
Bryan Tan <bryantan@vmware.com>, Vishnu Dasa <vdasa@vmware.com>,
VMware PV-Drivers Reviewers <pv-drivers@vmware.com>,
Dan Carpenter <dan.carpenter@linaro.org>,
Krasnov Arseniy <oxffffaa@gmail.com>,
kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-hyperv@vger.kernel.org, bpf@vger.kernel.org
Subject: Re: [PATCH RFC net-next v4 7/8] vsock: Add lockless sendmsg() support
Date: Mon, 12 Jun 2023 11:53:00 +0200
Message-ID: <ZIbq/CeWowEq+nvg@corigine.com>
In-Reply-To: <20230413-b4-vsock-dgram-v4-7-0cebbb2ae899@bytedance.com>

On Sat, Jun 10, 2023 at 12:58:34AM +0000, Bobby Eshleman wrote:
> Because the dgram sendmsg() path for AF_VSOCK acquires the socket lock
> it does not scale when many senders share a socket.
>
> Prior to this patch the socket lock is used to protect both reads and
> writes to the local_addr, remote_addr, transport, and buffer size
> variables of a vsock socket. What follows are the new protection schemes
> for these fields that ensure a race-free and usually lock-free
> multi-sender sendmsg() path for vsock dgrams.
>
> - local_addr
> local_addr changes as a result of binding a socket. The write path
> for local_addr is bind() and various vsock_auto_bind() call sites.
> After a socket has been bound via vsock_auto_bind() or bind(), subsequent
> calls to bind()/vsock_auto_bind() do not write to local_addr again. bind()
> rejects the user request and vsock_auto_bind() early exits.
> Therefore, the local addr can not change while a parallel thread is
> in sendmsg() and lock-free reads of local addr in sendmsg() are safe.
> Change: only acquire lock for auto-binding as-needed in sendmsg().
>
> - buffer size variables
> Not used by dgram, so they do not need protection. No change.
>
> - remote_addr and transport
> Because a remote_addr update may result in a changed transport, but we
> would like to be able to read these two fields lock-free but coherently
> in the vsock send path, this patch packages these two fields into a new
> struct vsock_remote_info that is referenced by an RCU-protected pointer.
>
> Writes are synchronized as usual by the socket lock. Reads only take
> place in RCU read-side critical sections. When remote_addr or transport
> is updated, a new remote info is allocated. Old readers still see the
> old coherent remote_addr/transport pair, and new readers will refer to
> the new coherent. The coherency between remote_addr and transport
> previously provided by the socket lock alone is now also preserved by
> RCU, except with the highly-scalable lock-free read-side.
>
> Helpers are introduced for accessing and updating the new pointer.
>
> The new structure is contains an rcu_head so that kfree_rcu() can be
> used. This removes the need of writers to use synchronize_rcu() after
> freeing old structures which is simply more efficient and reduces code
> churn where remote_addr/transport are already being updated inside RCU
> read-side sections.
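
For anyone following along, here is a minimal userspace sketch of the
snapshot-swap idea described above. It is not the patch's code: it uses
C11 atomics as a stand-in for rcu_dereference()/rcu_assign_pointer(),
and every name in it (sock_sketch, remote_info_get(), and so on) is
hypothetical. Reclamation is deferred by hand onto a retired list, where
the patch relies on kfree_rcu() and a grace period instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the kernel's definitions. */
struct addr {
	uint32_t cid;
	uint32_t port;
};

struct remote_info {
	struct addr remote_addr;
	int transport_id;                 /* stand-in for a transport pointer */
	struct remote_info *retired_next; /* stand-in for the rcu_head */
};

struct sock_sketch {
	_Atomic(struct remote_info *) remote_info; /* published snapshot */
	struct remote_info *retired;               /* touched by the writer only */
};

static void sock_sketch_init(struct sock_sketch *sk)
{
	atomic_init(&sk->remote_info, NULL);
	sk->retired = NULL;
}

/* Reader: a single acquire load returns an address/transport pair that
 * was published together, so the two fields can never be mismatched.
 */
static const struct remote_info *remote_info_get(struct sock_sketch *sk)
{
	return atomic_load_explicit(&sk->remote_info, memory_order_acquire);
}

/* Writer: the caller is assumed to hold the socket lock. A new snapshot
 * is filled in completely before it is published; the old one is only
 * retired, never modified, so earlier readers still see a coherent pair.
 */
static int remote_info_update(struct sock_sketch *sk, struct addr remote_addr,
			      int transport_id)
{
	struct remote_info *next, *old;

	assert(sk != NULL);

	next = malloc(sizeof(*next));
	if (!next)
		return -1;

	next->remote_addr = remote_addr;
	next->transport_id = transport_id;
	next->retired_next = NULL;

	old = atomic_exchange_explicit(&sk->remote_info, next,
				       memory_order_acq_rel);
	if (old) {
		old->retired_next = sk->retired;
		sk->retired = old;
	}
	return 0;
}

/* Free retired snapshots. Only valid once no reader can still hold one;
 * in the kernel that guarantee comes from the RCU grace period.
 */
static void remote_info_reclaim(struct sock_sketch *sk)
{
	while (sk->retired) {
		struct remote_info *next = sk->retired->retired_next;

		free(sk->retired);
		sk->retired = next;
	}
}
```

A reader that called remote_info_get() before an update keeps seeing the
old address together with the old transport until the retired list is
reclaimed, which is the coherency property the commit message is after.
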
>
> Only virtio has been tested, but updates were necessary to the VMCI and
> hyperv code. Unfortunately the author does not have access to
> VMCI/hyperv systems so those changes are untested.
>
> Perf Tests (results from patch v2)
> vCPUS: 16
> Threads: 16
> Payload: 4KB
> Test Runs: 5
> Type: SOCK_DGRAM
>
> Before: 245.2 MB/s
> After: 509.2 MB/s (+107%)
>
> Notably, on the same test system, vsock dgram even outperforms
> multi-threaded UDP over virtio-net with vhost and MQ support enabled.
>
> Throughput metrics for single-threaded SOCK_DGRAM and
> single/multi-threaded SOCK_STREAM showed no statistically signficant

Hi Bobby,

a minor nit from checkpatch --codespell: signficant -> significant

> throughput changes (lowest p-value reaching 0.27), with the range of the
> mean difference ranging between -5% to +1%.
>
> Signed-off-by: Bobby Eshleman <bobby.eshleman@bytedance.com>
...