From: Vincenzo Maffione <v.maffione@gmail.com>
To: qemu-devel@nongnu.org
Cc: mst@redhat.com, jasowang@redhat.com, armbru@redhat.com,
	Vincenzo Maffione <v.maffione@gmail.com>,
	pbonzini@redhat.com, g.lettieri@iet.unipi.it, rizzo@iet.unipi.it
Subject: [Qemu-devel] [PATCH v2 0/3] virtio: proposal to optimize accesses to VQs
Date: Tue, 15 Dec 2015 23:33:18 +0100	[thread overview]
Message-ID: <cover.1450218353.git.v.maffione@gmail.com> (raw)

Hi,
  I am running performance experiments to test how QEMU behaves when the
guest transmits (short) network packets at very high rates, say over
1 Mpps. I use a netmap application in the guest to generate the traffic,
but the choice of generator is not relevant to this discussion. The only
important fact is that the generator running in the guest is not the
bottleneck: its CPU utilization is low (20%).

Moreover, I am not using vhost-net to boost virtio-net hypervisor-side
processing, because I want to run performance unit tests on the QEMU
userspace virtio implementation (hw/virtio/virtio.c).

In the most common benchmarks - e.g. netperf TCP_STREAM, TCP_RR or
UDP_STREAM, with one end of the communication in the guest and the other
in the host, for instance with the simplest TAP networking setup - the
virtio-net adapter definitely outperforms the emulated e1000 adapter (and
all the other emulated devices). This was expected, given the well-known
benefits of I/O paravirtualization.

However, I was surprised to find out that the situation changes drastically
at very high packet rates.

My measurements show that the emulated e1000 adapter can transmit over
3.5 Mpps when the network backend is disconnected; I disconnect the
backend precisely to see at what packet rate the adapter emulation
itself becomes the bottleneck.

The same experiment, however, shows that virtio-net hits a bottleneck at
just 1 Mpps. After verifying that TX VQ kicks and TX VQ interrupts are
properly amortized/suppressed, I found that the bottleneck is partly due
to the way the code accesses the VQ in guest physical memory, since each
access involves an expensive address space translation. For each VQ
element processed I counted over 15 such accesses, while e1000 needs
just 2 accesses to its rings.
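
To make the cost concrete, below is a minimal standalone toy model of the
access pattern (the gpa_read16() helper, the counter and the ring layout
are mine, for illustration only - they are not the actual QEMU helpers):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

static uint8_t guest_mem[4096];     /* stand-in for guest physical memory */
static unsigned long translations;  /* counts simulated GPA translations */

/* Models one guest-physical load: address space translation + read. */
static uint16_t gpa_read16(uint64_t gpa)
{
    uint16_t v;
    translations++;
    memcpy(&v, &guest_mem[gpa], sizeof(v));
    return v;
}

int main(void)
{
    /* Toy avail ring: idx at offset 2, 16-bit entries from offset 4. */
    uint16_t idx = 8;
    memcpy(&guest_mem[2], &idx, sizeof(idx));

    /* Re-read avail->idx on every iteration: one extra translation per
     * element, similar to what the unmodified code does for several
     * ring fields. */
    for (uint16_t head = 0; gpa_read16(2) != head; head++) {
        (void)gpa_read16(4 + 2 * head);         /* read ring[head] */
    }
    printf("re-read idx: %lu translations\n", translations);

    translations = 0;

    /* Read avail->idx once and work on the cached copy. */
    uint16_t shadow_idx = gpa_read16(2);
    for (uint16_t head = 0; head != shadow_idx; head++) {
        (void)gpa_read16(4 + 2 * head);
    }
    printf("cached idx:  %lu translations\n", translations);
    return 0;
}

With 8 pending elements this prints 17 translations for the first loop
and 9 for the second; that difference is exactly the kind of saving the
series is after.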

This series slightly rewrites the code to reduce the number of accesses,
since many of them seem unnecessary to me. With this reduction, the
bottleneck jumps from 1 Mpps to 2 Mpps.
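
The main idea is to keep shadow copies of ring state inside the
VirtQueue structure. A minimal self-contained sketch of the used_idx
case follows (names and types are illustrative, not the actual patch):
since the device is the only writer of used->idx, a cached copy can
never go stale, and the load half of the load-modify-store can be
dropped.

#include <stdint.h>
#include <stdio.h>

static uint16_t guest_used_idx;     /* stand-in for used->idx in guest RAM */

typedef struct VirtQueue {
    uint16_t used_idx;              /* device-side cache of used->idx */
} VirtQueue;

/* Before: load-modify-store, i.e. two guest accesses per update. */
static void inc_used_idx_slow(void)
{
    uint16_t idx = guest_used_idx;  /* guest load  (translation #1) */
    guest_used_idx = idx + 1;       /* guest store (translation #2) */
}

/* After: bump the cached copy, then store once. */
static void inc_used_idx_fast(VirtQueue *vq)
{
    vq->used_idx++;                 /* no guest access at all */
    guest_used_idx = vq->used_idx;  /* guest store (translation #1) */
}

int main(void)
{
    VirtQueue vq;

    inc_used_idx_slow();
    vq.used_idx = guest_used_idx;   /* sync the cache once */
    inc_used_idx_fast(&vq);
    printf("used->idx = %u\n", guest_used_idx);   /* prints 2 */
    return 0;
}

The same reasoning applies to avail->idx, except in the other direction:
there the guest is the writer, so the device-side copy must be
refreshed, but only when the cached value says the ring looks empty.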

The series is not complete (e.g. it still does not properly handle
endianness, and it needs some cleanup). I just wanted to ask whether you
think the idea makes sense, and whether a proper patch in this direction
would be accepted.

Thanks,
  Vincenzo


CHANGELOG:
    - rebased on bonzini/dataplane branch
    - removed optimization (combined descriptor read) already present in
      the dataplane branch
    - the 3 separate optimizations split into 3 patches

Vincenzo Maffione (3):
  virtio: cache used_idx in a VirtQueue field
  virtio: read avail_idx from VQ only when necessary
  virtio: combine write of an entry into used ring

 hw/virtio/virtio.c | 62 ++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 42 insertions(+), 20 deletions(-)
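
As an illustration of the third patch, the used element's id and len can
be written to guest memory with a single combined access instead of two
separate stores. A standalone sketch (the gpa_write() helper is mine,
for illustration only):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef struct VRingUsedElem {
    uint32_t id;
    uint32_t len;
} VRingUsedElem;

static uint8_t guest_mem[64];       /* stand-in for the used ring */
static unsigned long accesses;      /* counts simulated guest accesses */

/* Models one guest-physical store: address space translation + write. */
static void gpa_write(uint64_t gpa, const void *buf, size_t len)
{
    accesses++;
    memcpy(&guest_mem[gpa], buf, len);
}

int main(void)
{
    VRingUsedElem elem = { .id = 3, .len = 1500 };

    /* Before: id and len stored separately, two translations. */
    gpa_write(0, &elem.id, sizeof(elem.id));
    gpa_write(4, &elem.len, sizeof(elem.len));
    printf("separate stores: %lu accesses\n", accesses);

    accesses = 0;

    /* After: the whole 8-byte element stored in one access. */
    gpa_write(8, &elem, sizeof(elem));
    printf("combined store:  %lu access\n", accesses);
    return 0;
}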

-- 
2.6.4

Thread overview: 14+ messages
2015-12-15 22:33 Vincenzo Maffione [this message]
2015-12-15 22:33 ` [Qemu-devel] [PATCH v2 1/3] virtio: cache used_idx in a VirtQueue field Vincenzo Maffione
2015-12-15 22:33 ` [Qemu-devel] [PATCH v2 2/3] virtio: read avail_idx from VQ only when necessary Vincenzo Maffione
2015-12-15 22:33 ` [Qemu-devel] [PATCH v2 3/3] virtio: combine write of an entry into used ring Vincenzo Maffione
2015-12-16  8:38 ` [Qemu-devel] [PATCH v2 0/3] virtio: proposal to optimize accesses to VQs Paolo Bonzini
2015-12-16  9:28   ` Vincenzo Maffione
2015-12-16  9:34     ` Paolo Bonzini
2015-12-16 10:39       ` Vincenzo Maffione
2015-12-16 11:02         ` Michael S. Tsirkin
2015-12-16 11:23           ` Vincenzo Maffione
2015-12-16 11:37         ` Paolo Bonzini
2015-12-16 14:25           ` Vincenzo Maffione
2015-12-16 15:46             ` Paolo Bonzini
2015-12-30 16:45               ` Vincenzo Maffione
