Re: [PATCH v2 0/5] VSOCK: support mergeable rx buffer in vhost-vsock

From: "Michael S. Tsirkin" <mst@redhat.com>
To: jiangyiwen <jiangyiwen@huawei.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	netdev@vger.kernel.org, kvm@vger.kernel.org,
	virtualization@lists.linux-foundation.org
Subject: Re: [PATCH v2 0/5] VSOCK: support mergeable rx buffer in vhost-vsock
Date: Fri, 14 Dec 2018 08:22:45 -0500
Message-ID: <20181214082146-mutt-send-email-mst@kernel.org>
In-Reply-To: <5C1384E8.7040506@huawei.com>

On Fri, Dec 14, 2018 at 06:24:40PM +0800, jiangyiwen wrote:
> On 2018/12/12 23:09, Michael S. Tsirkin wrote:
> > On Wed, Dec 12, 2018 at 05:25:50PM +0800, jiangyiwen wrote:
> >> Currently vsock only supports sending and receiving small packets, so it
> >> can't achieve high performance. As previously discussed with Jason Wang, I
> >> revisited the vhost-net idea of mergeable rx buffers and implemented them
> >> in vhost-vsock. This allows a big packet to be scattered across different
> >> buffers and clearly improves performance.
> >>
> >> This series of patches mainly does three things:
> >> - mergeable buffer implementation
> >> - increase the max send pkt size
> >> - add used buffers and signal the guest in a batch
> >>
> >> I also wrote a tool to test vhost-vsock performance, mainly sending big
> >> packets (64K) in both directions, Guest->Host and Host->Guest. I tested
> >> performance independently for each case, and the results are as follows:
> >>
> >> Performance before:
> >>               Single socket            Multiple sockets(Max Bandwidth)
> >> Guest->Host   ~400MB/s                 ~480MB/s
> >> Host->Guest   ~1450MB/s                ~1600MB/s
> >>
> >> Performance with only the mergeable rx buffer implemented:
> >>               Single socket            Multiple sockets(Max Bandwidth)
> >> Guest->Host   ~400MB/s                 ~480MB/s
> >> Host->Guest   ~1280MB/s                ~1350MB/s
> >>
> >> In this case, the max send pkt size is still limited to 4K, so Host->Guest
> >> performance is worse than before.
> > 
> > It's concerning though, what if application sends small packets?
> > What is the source of the slowdown? Do you know?
> > 
> 
> Hi Michael,
> 
> For the two cases, I tested both small and big packets. The results are as
> follows:
> 
> 64K packets performance comparison:
>                                               Single socket    Multiple sockets
> Host->Guest(before)                           1352.60MB/s      1436.33MB/s
> Host->Guest(only use mergeable rx buffer)     1290.08MB/s      1212.67MB/s
> 
> 4K packets performance comparison:
>                                               Single socket    Multiple sockets
> Host->Guest(before)                           535.47MB/s       688.67MB/s
> Host->Guest(only use mergeable rx buffer)     522.33MB/s       599.00MB/s
> 
> 3K packets performance comparison:
>                                               Single socket    Multiple sockets
> Host->Guest(before)                           359.74MB/s       442.00MB/s
> Host->Guest(only use mergeable rx buffer)     374.47MB/s       452.33MB/s
> 
> One interesting observation: for 64K and 4K packets, the mergeable
> buffer performs worse, while for 3K packets both perform
> about the same.
> 
> I guess that in mergeable mode, when the host sends a 4k packet to the guest,
> we have to call vhost_get_vq_desc() twice in the host (hdr + 4k data),
> and in the guest we also have to call virtqueue_get_buf() twice. When
> a packet is smaller than (4k - hdr), it can be packed into a
> single page, so the performance is the same as before.
> 
> So in mergeable mode, the performance may be worse than before
> for packet sizes in ((4k - hdr), 4k].
> 
> Thanks,
> Yiwen.


The conclusion seems to be that mergeable buffers themselves
only hurt performance, but they allow batching which improves
performance. So let's add batching without mergeable buffers then?


> >> Performance after increasing the max send pkt size to 64K:
> >>               Single socket            Multiple sockets(Max Bandwidth)
> >> Guest->Host   ~1700MB/s                ~2900MB/s
> >> Host->Guest   ~1500MB/s                ~2440MB/s
> >>
> >> Performance with all patches applied:
> >>               Single socket            Multiple sockets(Max Bandwidth)
> >> Guest->Host   ~1700MB/s                ~2900MB/s
> >> Host->Guest   ~1700MB/s                ~2900MB/s
> >>
> >> From the test results, performance is clearly improved, and guest
> >> memory is not wasted.
> >>
> >> In addition, to support the mergeable rx buffer in virtio-vsock,
> >> we need a qemu patch that adds support for parsing the feature.
> >>
> >> ---
> >> v1 -> v2:
> >>  * Addressed comments from Jason Wang.
> >>  * Add independent performance test results.
> >>  * Use skb_page_frag_refill(), which can use high-order pages and reduce
> >>    the stress on the page allocator.
> >>  * Still use a fixed size (PAGE_SIZE) to fill the rx buffer, because too
> >>    small a size can't hold one full packet; we only have 128 vq num now.
> >>  * Use iovec to replace buf in struct virtio_vsock_pkt, to keep tx and rx
> >>    consistent.
> >>  * Add virtio_transport ops to get the max pkt len, in order to be
> >>    compatible with the old version.
> >> ---
> >>
> >> Yiwen Jiang (5):
> >>   VSOCK: support fill mergeable rx buffer in guest
> >>   VSOCK: support fill data to mergeable rx buffer in host
> >>   VSOCK: support receive mergeable rx buffer in guest
> >>   VSOCK: increase send pkt len in mergeable mode to improve performance
> >>   VSOCK: batch sending rx buffer to increase bandwidth
> >>
> >>  drivers/vhost/vsock.c                   | 183 ++++++++++++++++++++-----
> >>  include/linux/virtio_vsock.h            |  13 +-
> >>  include/uapi/linux/virtio_vsock.h       |   5 +
> >>  net/vmw_vsock/virtio_transport.c        | 229 +++++++++++++++++++++++++++-----
> >>  net/vmw_vsock/virtio_transport_common.c |  66 ++++++---
> >>  5 files changed, 411 insertions(+), 85 deletions(-)
> >>
> >> -- 
> >> 1.8.3.1
> > 
> > .
> > 
>

Thread overview: 7+ messages
2018-12-12  9:25 [PATCH v2 0/5] VSOCK: support mergeable rx buffer in vhost-vsock jiangyiwen
2018-12-12 15:09 ` Michael S. Tsirkin
2018-12-13  2:14   ` jiangyiwen
2018-12-14 10:24   ` jiangyiwen
2018-12-14 13:22     ` Michael S. Tsirkin [this message]
2018-12-13 16:34 ` Stefan Hajnoczi
2018-12-14  9:39   ` jiangyiwen
