From: "Michael S. Tsirkin"
To: jiangyiwen
Cc: Stefan Hajnoczi, Jason Wang, netdev@vger.kernel.org, kvm@vger.kernel.org,
	virtualization@lists.linux-foundation.org
Subject: Re: [PATCH v2 0/5] VSOCK: support mergeable rx buffer in vhost-vsock
Date: Wed, 12 Dec 2018 10:09:51 -0500
Message-ID: <20181212100835-mutt-send-email-mst@kernel.org>
References: <5C10D41E.9050002@huawei.com>
In-Reply-To: <5C10D41E.9050002@huawei.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Content-Disposition: inline

On Wed, Dec 12, 2018 at 05:25:50PM +0800, jiangyiwen wrote:
> Now vsock only supports sending/receiving small packets, so it can't
> achieve high performance. As previously discussed with Jason Wang, I
> revisited the mergeable rx buffer idea from vhost-net and implemented
> mergeable rx buffers in vhost-vsock; this allows a big packet to be
> scattered across different buffers and noticeably improves performance.
>
> This series of patches mainly does three things:
> - mergeable buffer implementation
> - increase the max send pkt size
> - add used and signal guest in a batch
>
> I also wrote a tool to test vhost-vsock performance, mainly sending big
> packets (64K) in both the Guest->Host and Host->Guest directions. I
> tested each case independently; the results are as follows:
>
> Performance before:
>              Single socket   Multiple sockets (Max Bandwidth)
> Guest->Host  ~400MB/s        ~480MB/s
> Host->Guest  ~1450MB/s       ~1600MB/s
>
> Performance with only the mergeable rx buffer implemented:
>              Single socket   Multiple sockets (Max Bandwidth)
> Guest->Host  ~400MB/s        ~480MB/s
> Host->Guest  ~1280MB/s       ~1350MB/s
>
> In this case the max send pkt size is still limited to 4K, so
> Host->Guest performance is worse than before.

It's concerning though: what if the application sends small packets?
What is the source of the slowdown? Do you know?

> Performance after increasing the max send pkt size to 64K:
>              Single socket   Multiple sockets (Max Bandwidth)
> Guest->Host  ~1700MB/s       ~2900MB/s
> Host->Guest  ~1500MB/s       ~2440MB/s
>
> Performance with all patches applied:
>              Single socket   Multiple sockets (Max Bandwidth)
> Guest->Host  ~1700MB/s       ~2900MB/s
> Host->Guest  ~1700MB/s       ~2900MB/s
>
> From the test results, performance is improved noticeably, and guest
> memory is not wasted.
>
> In addition, in order to support mergeable rx buffers in virtio-vsock,
> we need to add a QEMU patch to support parsing the feature.
>
> ---
> v1 -> v2:
>  * Addressed comments from Jason Wang.
>  * Added performance test results for each change independently.
>  * Use skb_page_frag_refill(), which can use high-order pages and
>    reduce the stress on the page allocator.
>  * Still use a fixed size (PAGE_SIZE) to fill rx buffers, because too
>    small a size can't hold one full packet, and we only have 128 vq
>    entries now.
>  * Use an iovec to replace buf in struct virtio_vsock_pkt, keeping tx
>    and rx consistent.
>  * Added a virtio_transport op to get the max pkt len, in order to be
>    compatible with old versions.
> ---
>
> Yiwen Jiang (5):
>   VSOCK: support fill mergeable rx buffer in guest
>   VSOCK: support fill data to mergeable rx buffer in host
>   VSOCK: support receive mergeable rx buffer in guest
>   VSOCK: increase send pkt len in mergeable mode to improve performance
>   VSOCK: batch sending rx buffer to increase bandwidth
>
>  drivers/vhost/vsock.c                   | 183 ++++++++++++++++++++-----
>  include/linux/virtio_vsock.h            |  13 +-
>  include/uapi/linux/virtio_vsock.h       |   5 +
>  net/vmw_vsock/virtio_transport.c        | 229 +++++++++++++++++++++++++++-----
>  net/vmw_vsock/virtio_transport_common.c |  66 ++++++---
>  5 files changed, 411 insertions(+), 85 deletions(-)
>
> --
> 1.8.3.1
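
For context, the mergeable rx buffer scheme this series borrows from
vhost-net lets the host scatter one large packet across several
fixed-size rx buffers and tell the guest how many buffers it consumed.
The following is only a minimal, illustrative sketch of that guest-side
reassembly, not the code from these patches: the mrg_rxbuf_hdr layout
(with a num_buffers field, analogous to virtio-net's
virtio_net_hdr_mrg_rxbuf) and the rx_buf_pop() helper are hypothetical
stand-ins for the real vsock header and virtqueue calls.

/*
 * Illustrative sketch only, with error handling omitted.
 * mrg_rxbuf_hdr and rx_buf_pop() are hypothetical stand-ins.
 */
#include <stdint.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct mrg_rxbuf_hdr {
	uint32_t len;          /* total payload length of the packet */
	uint16_t num_buffers;  /* how many rx buffers the host used */
};

struct rx_buf {
	void   *data;
	size_t  len;           /* bytes the host wrote into this buffer */
};

/* Hypothetical: take the next used rx buffer off the virtqueue. */
extern struct rx_buf *rx_buf_pop(void);

/*
 * Reassemble one packet that the host may have scattered across
 * several fixed-size (e.g. PAGE_SIZE) rx buffers.
 */
static void *receive_mergeable(size_t *out_len)
{
	struct rx_buf *buf = rx_buf_pop();
	struct mrg_rxbuf_hdr hdr;
	uint8_t *pkt, *p;
	size_t chunk;

	memcpy(&hdr, buf->data, sizeof(hdr));
	pkt = malloc(hdr.len);
	p = pkt;

	/* First buffer: payload follows the header. */
	chunk = buf->len - sizeof(hdr);
	memcpy(p, (uint8_t *)buf->data + sizeof(hdr), chunk);
	p += chunk;

	/* Remaining buffers carry payload only. */
	for (int i = 1; i < hdr.num_buffers; i++) {
		buf = rx_buf_pop();
		memcpy(p, buf->data, buf->len);
		p += buf->len;
	}

	*out_len = hdr.len;
	return pkt;
}

With fixed PAGE_SIZE buffers, a 64K packet would span roughly 16 rx
buffers in this scheme, which is why the series also batches the used
ring updates and the guest notification instead of signaling per buffer.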