public inbox for kvm@vger.kernel.org
From: Avi Kivity <avi@qumranet.com>
To: Mark McLoughlin <markmc@redhat.com>
Cc: kvm@vger.kernel.org, Herbert Xu <herbert@gondor.apana.org.au>,
	Rusty Russell <rusty@rustcorp.com.au>
Subject: Re: [PATCH 0/9][RFC] KVM virtio_net performance
Date: Sat, 26 Jul 2008 12:45:36 +0300	[thread overview]
Message-ID: <488AF240.2060208@qumranet.com> (raw)
In-Reply-To: <1216899979-32532-1-git-send-email-markmc@redhat.com>

Mark McLoughlin wrote:
> Hey,
>       Here's a bunch of patches attempting to improve the performance
> of virtio_net. This is more an RFC than a patch submission
> since, as can be seen below, not all patches actually improve the
> performance measurably.
>
>       I've tried hard to test each of these patches with as stable and
> informative a benchmark as I could find. The first benchmark is a
> netperf[1] based throughput benchmark and the second uses a flood
> ping[2] to measure latency differences.
>
>       Each set of figures is min/average/max/standard deviation. The
> first set is Gb/s and the second is milliseconds.
>
>       The network configuration used was very simple - the guest with
> a virtio_net interface and the host with a tap interface and static
> IP addresses assigned to both - i.e. there was no bridge involved on
> the host and iptables was disabled in both the host and guest.
>
>       I used:
>
>   1) kvm-71-26-g6152996 with the patches that follow
>
>   2) Linus's v2.6.26-5752-g93ded9b with Rusty's virtio patches from
>      219:bbd2611289c5 applied; these are the patches that have just
>      been submitted to Linus
>
>       The conclusions I draw are:
>
>   1) The length of the tx mitigation timer makes quite a difference to
>      throughput achieved; we probably need a good heuristic for
>      adjusting this on the fly.
>   

The tx mitigation timer is just one part of the equation; the other is 
the virtio ring window size, which is now fixed.

Using a maximum sized window is good when the guest and host are running 
flat out, doing nothing but networking.  When throughput drops (because 
the guest is spending cpu on processing, or simply because the other 
side is not keeping up), we need to drop the window size so as to 
retain acceptable latencies.

The tx timer can then be set to "a bit after the end of the window", 
acting as a safety belt in case the throughput changes.
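
To make the idea concrete, here's a rough sketch of such a policy.  The 
names, the constants and the doubling/halving rule are all illustrative; 
nothing like this exists in the patches:

```c
/* Illustrative sketch only: the names, constants and the
 * doubling/halving rule are made up, not from the patches. */

#define RING_SIZE      256
#define US_PER_PACKET  10   /* assumed per-packet service time */
#define TIMER_SLACK_US 50   /* "a bit after the end of the window" */

/* Grow the window while both sides keep up, shrink it when
 * throughput drops, and fall back to 1 (notify on every packet)
 * when the link goes idle, so latency stays acceptable. */
static int adjust_window(int window, int completed, int in_flight)
{
    if (completed >= window && in_flight > 0)
        window *= 2;                /* both sides keeping up */
    else
        window = (window + 1) / 2;  /* throughput dropped */

    if (window < 1)
        window = 1;
    if (window > RING_SIZE)
        window = RING_SIZE;
    return window;
}

/* Arm the tx timer just past the time the current window should
 * take to drain, so it only fires if throughput changes. */
static int tx_timer_us(int window)
{
    return window * US_PER_PACKET + TIMER_SLACK_US;
}
```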

>   4) Dropping the global mutex while reading GSO packets from the tap
>      interface gives a nice speedup. This highlights the global mutex
>      as a general performance issue.
>
>   

Not sure whether this is safe.  What's stopping the guest from accessing 
virtio and changing some state?
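
For reference, the pattern being discussed is roughly the following (a 
simplified sketch with made-up names, not the actual qemu code):

```c
/* Simplified sketch of dropping the global mutex around the tap
 * read; the names are made up, this is not the actual qemu code.
 * The hazard: between the unlock and the re-lock, a vcpu thread
 * can enter virtio and mutate ring state this thread has cached. */
#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t global_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Caller holds global_mutex on entry and on return. */
static ssize_t tap_read_packet(int tapfd, void *buf, size_t len)
{
    ssize_t n;

    pthread_mutex_unlock(&global_mutex);
    n = read(tapfd, buf, len);       /* guest state may change here */
    pthread_mutex_lock(&global_mutex);

    /* Any ring indices or avail-buffer pointers read before the
     * unlock must be revalidated before use. */
    return n;
}
```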

>   5) Eliminating an extra copy on the host->guest path only makes a
>      barely measurable difference.
>
>   

That's expected on a host->guest test.  Zero copy is mostly important 
for guest->external, and with zerocopy already enabled in the guest 
(sendfile or nfs server workloads).
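
The guest-side zero copy in question is e.g. sendfile(2): the kernel 
hands file pages to the destination fd without a userspace copy.  A 
minimal, self-contained illustration (file to file, which recent 
kernels allow, purely so it runs anywhere; in the workloads above the 
destination would be a TCP socket):

```c
/* Minimal sendfile(2) illustration of in-guest zero copy; for
 * reference only, not from the patches.  The data moves from
 * in_fd to out_fd without passing through a userspace buffer. */
#include <sys/sendfile.h>
#include <unistd.h>

static ssize_t send_zerocopy(int out_fd, int in_fd, size_t count)
{
    off_t off = 0;  /* read from the start of in_fd */
    return sendfile(out_fd, in_fd, &off, count);
}
```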

>         Anyway, the figures:
>
>   netperf, 10x20s runs (Gb/s)  |       guest->host          |       host->guest
>   -----------------------------+----------------------------+---------------------------
>   baseline                     | 1.520/ 1.573/ 1.610/ 0.034 | 1.160/ 1.357/ 1.630/ 0.165
>   50us tx timer + rearm        | 1.050/ 1.086/ 1.110/ 0.017 | 1.710/ 1.832/ 1.960/ 0.092
>   250us tx timer + rearm       | 1.700/ 1.764/ 1.880/ 0.064 | 0.900/ 1.203/ 1.580/ 0.205
>   150us tx timer + rearm       | 1.520/ 1.602/ 1.690/ 0.044 | 1.670/ 1.928/ 2.150/ 0.141
>   no ring-full heuristic       | 1.480/ 1.569/ 1.710/ 0.066 | 1.610/ 1.857/ 2.140/ 0.153
>   VIRTIO_F_NOTIFY_ON_EMPTY     | 1.470/ 1.554/ 1.650/ 0.054 | 1.770/ 1.960/ 2.170/ 0.119
>   recv NO_NOTIFY               | 1.530/ 1.604/ 1.680/ 0.047 | 1.780/ 1.944/ 2.190/ 0.129
>   GSO                          | 4.120/ 4.323/ 4.420/ 0.099 | 6.540/ 7.033/ 7.340/ 0.244
>   ring size == 256             | 4.050/ 4.406/ 4.560/ 0.143 | 6.280/ 7.236/ 8.280/ 0.613
>   ring size == 512             | 4.420/ 4.600/ 4.960/ 0.140 | 6.470/ 7.205/ 7.510/ 0.314
>   drop mutex during tapfd read | 4.320/ 4.578/ 4.790/ 0.161 | 8.370/ 8.589/ 8.730/ 0.120
>   aliguori zero-copy           | 4.510/ 4.694/ 4.960/ 0.148 | 8.430/ 8.614/ 8.840/ 0.142
>   

Very impressive numbers; much better than I expected.  The host->guest 
numbers are around 100x better than the original emulated card throughput 
we got from kvm.

>   ping -f -c 100000 (ms)       |       guest->host          |       host->guest
>   -----------------------------+----------------------------+---------------------------
>   baseline                     | 0.060/ 0.459/ 7.602/ 0.846 | 0.067/ 0.331/ 2.517/ 0.057
>   50us tx timer + rearm        | 0.081/ 0.143/ 7.436/ 0.374 | 0.093/ 0.133/ 1.883/ 0.026
>   250us tx timer + rearm       | 0.302/ 0.463/ 7.580/ 0.849 | 0.297/ 0.344/ 2.128/ 0.028
>   150us tx timer + rearm       | 0.197/ 0.323/ 7.671/ 0.740 | 0.199/ 0.245/ 7.836/ 0.037
>   no ring-full heuristic       | 0.182/ 0.324/ 7.688/ 0.753 | 0.199/ 0.243/ 2.197/ 0.030
>   VIRTIO_F_NOTIFY_ON_EMPTY     | 0.197/ 0.321/ 7.447/ 0.730 | 0.196/ 0.242/ 2.218/ 0.032
>   recv NO_NOTIFY               | 0.186/ 0.321/ 7.520/ 0.732 | 0.200/ 0.233/ 2.216/ 0.028
>   GSO                          | 0.178/ 0.324/ 7.667/ 0.736 | 0.147/ 0.246/ 1.361/ 0.024
>   ring size == 256             | 0.184/ 0.323/ 7.674/ 0.728 | 0.199/ 0.243/ 2.181/ 0.028
>   ring size == 512             |             (not measured) |             (not measured)
>   drop mutex during tapfd read | 0.183/ 0.323/ 7.820/ 0.733 | 0.202/ 0.242/ 2.219/ 0.027
>   aliguori zero-copy           | 0.185/ 0.325/ 7.863/ 0.736 | 0.202/ 0.245/ 7.844/ 0.036
>
>   

This isn't too good.  Low latency is important for nfs clients (or other 
request/response workloads).  I think we can keep these low by adjusting 
the virtio window (for example, on an idle system it should be 1), so 
that the tx mitigation timer only fires when the workload transitions 
from throughput to request/response.
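
The minima in the ping table above roughly track a fixed cost plus the 
tx timer (50us timer gives ~0.08ms, 150us ~0.20ms, 250us ~0.30ms).  A 
toy model of that, where the 30us fixed cost is an assumption of mine 
rather than a measurement:

```c
/* Toy latency model: an isolated request waits for the tx
 * mitigation timer unless the window is 1 (notify per packet).
 * The 30us fixed exit/irq cost is assumed, not measured. */
static double min_rtt_floor_ms(int timer_us, int window)
{
    double base_us = 30.0;

    if (window == 1)
        return base_us / 1000.0;  /* timer never gates the packet */
    return (base_us + timer_us) / 1000.0;
}
```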



-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

