From: Paolo Bonzini
Date: Wed, 16 Dec 2015 16:46:05 +0100
Message-ID: <5671873D.6010302@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2 0/3] virtio: proposal to optimize accesses to VQs
To: Vincenzo Maffione
Cc: "Michael S. Tsirkin", Jason Wang, Markus Armbruster, qemu-devel,
 Giuseppe Lettieri, Luigi Rizzo

On 16/12/2015 15:25, Vincenzo Maffione wrote:
>> vhost-net actually had better performance, so virtio-net dataplane
>> was never committed. As Michael mentioned, in practice on Linux you
>> use vhost, and non-Linux hypervisors you do not use QEMU. :)
>
> Yes, I understand. However, another possible use-case would be using QEMU
> + virtio-net + netmap backend + Linux (e.g. for QEMU-sandboxed packet
> generators or packet processors, where very high packet rates are
> common), where it is not possible to use vhost.

Yes, of course. That was tongue in cheek. Another possibility for your
use case is to interface with netmap through vhost-user, but I'm happy
if you choose to improve virtio.c instead!

>> The main optimization that vring.c has is to cache the translation of
>> the rings. Using address_space_map/unmap for rings in virtio.c would be
>> a noticeable improvement, as your numbers for patch 3 show. However, by
>> caching translations you also conveniently "forget" to promptly mark the
>> pages as dirty. As you pointed out this is obviously an issue for
>> migration. You can then add a notifier for runstate changes. When
>> entering RUN_STATE_FINISH_MIGRATE or RUN_STATE_SAVE_VM the rings would
>> be unmapped, and then remapped the next time the VM starts running again.
>
> Ok so it seems feasible with a bit of care. The numbers we've been
> seeing in various experiments have always shown that this optimization
> could easily double the 2 Mpps packet rate bottleneck.

Cool. Bonus points for nicely abstracting it so that virtio.c is just a
user.

>> You also guessed right that there are consistency issues; for these you
>> can add a MemoryListener that invalidates all mappings.
>
> Yeah, but I don't know exactly what kind of inconsistencies there can
> be. Maybe the memory we are mapping may be hot-unplugged?

Yes. Just blow away all mappings in the MemoryListener commit callback.

>> That said, I'm wondering where the cost of address translation lies---is
>> it cache-unfriendly data structures, locked operations, or simply too
>> much code to execute? It was quite surprising to me that on virtio-blk
>> benchmarks we were spending 5% of the time doing memcpy! (I have just
>> extracted from my branch the patches to remove that, and sent them to
>> qemu-devel).
>
> I feel it's just too much code (but I may be wrong).

That is likely to be a good guess, but notice that the fast path doesn't
actually have _that much_ code, because a lot of the "if"s are almost
always false. Looking at a profile would be useful. Is it flat, or does
something (e.g. address_space_translate) actually stand out?

> I'm not sure whether you are thinking that 5% is too much or too little.
> To me it's too little, showing that most of the overhead is
> somewhere else (e.g. memory translation, or backend processing). In an
> ideal transmission system, most of the overhead should be spent on
> copying, because it means that you successfully managed to suppress
> notifications and translation overhead.

On copying data, though---not on copying virtio descriptors. 5% for
those is entirely wasted time. Also, note that I'm looking at disk I/O
rather than networking, where there should be no copies at all.

Paolo

>> Examples of missing optimizations in exec.c include:
>>
>> * caching enough information in RAM MemoryRegions to avoid the calls to
>> qemu_get_ram_block (e.g. replace mr->ram_addr with a RAMBlock pointer);
>>
>> * adding a MRU cache to address_space_lookup_region.
>>
>> In particular, the former should be easy if you want to give it a
>> try---easier than caching ring translations in virtio.c.
>
> Thank you so much for the insights :)
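
To make the ring-caching scheme discussed in the thread above concrete, here
is a minimal sketch of a cached vring mapping built on
address_space_map()/address_space_unmap(), dropped on migration-related
runstate changes and invalidated from a MemoryListener commit callback. All
names and the overall structure are hypothetical, not the actual virtio.c or
vring.c code; locking, error handling and the exact API details of the QEMU
version under discussion should be double-checked.

/* Rough, hypothetical sketch (not actual QEMU code): cache the host
 * mapping of a virtqueue ring via address_space_map(), and throw the
 * mapping away whenever it could become stale or would hide dirty
 * pages from migration.  Locking and error handling are omitted.
 */
#include "qemu/osdep.h"
#include "exec/memory.h"
#include "sysemu/sysemu.h"

typedef struct VRingCache {
    AddressSpace *as;
    hwaddr ring_pa;        /* guest-physical address of the ring */
    hwaddr ring_len;
    void *ring_hva;        /* cached host mapping, NULL when invalid */
    MemoryListener listener;
} VRingCache;

static void vring_cache_unmap(VRingCache *c)
{
    if (c->ring_hva) {
        /* Unmapping marks the pages dirty, so migration sees writes
         * that went through the cached pointer. */
        address_space_unmap(c->as, c->ring_hva, c->ring_len,
                            true, c->ring_len);
        c->ring_hva = NULL;
    }
}

static void *vring_cache_map(VRingCache *c)
{
    if (!c->ring_hva) {
        hwaddr len = c->ring_len;
        void *p = address_space_map(c->as, c->ring_pa, &len, true);
        if (!p) {
            return NULL;
        }
        if (len < c->ring_len) {
            /* Ring not contiguous in a single region: fall back to the
             * slow path instead of caching a partial mapping. */
            address_space_unmap(c->as, p, len, false, 0);
            return NULL;
        }
        c->ring_hva = p;
    }
    return c->ring_hva;
}

/* MemoryListener commit callback: the guest memory map changed (e.g.
 * memory hot-unplug), so blow away the cached mapping. */
static void vring_cache_commit(MemoryListener *listener)
{
    VRingCache *c = container_of(listener, VRingCache, listener);
    vring_cache_unmap(c);
}

/* Runstate notifier: unmap before RAM is serialized so that the dirty
 * bitmap is up to date for migration or savevm. */
static void vring_cache_vm_state_change(void *opaque, int running,
                                        RunState state)
{
    VRingCache *c = opaque;
    if (state == RUN_STATE_FINISH_MIGRATE || state == RUN_STATE_SAVE_VM) {
        vring_cache_unmap(c);
    }
}

static void vring_cache_init(VRingCache *c, AddressSpace *as,
                             hwaddr ring_pa, hwaddr ring_len)
{
    c->as = as;
    c->ring_pa = ring_pa;
    c->ring_len = ring_len;
    c->ring_hva = NULL;
    c->listener = (MemoryListener) {
        .commit = vring_cache_commit,
    };
    memory_listener_register(&c->listener, as);
    qemu_add_vm_change_state_handler(vring_cache_vm_state_change, c);
}

A user such as virtio.c would call vring_cache_map() on the fast path and
fall back to the existing address_space_rw-based accessors whenever it
returns NULL.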
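
Similarly, the MRU cache suggested for address_space_lookup_region() could
look roughly like the one-entry cache below, placed in front of the existing
radix-tree walk in exec.c. This is only an illustrative sketch against
exec.c-internal types (AddressSpaceDispatch, MemoryRegionSection); the helper
names are invented, subpage resolution is ignored, and the cached entry would
have to be cleared whenever the dispatch map is rebuilt.

/* Hypothetical sketch: a one-entry most-recently-used cache in front of
 * address_space_lookup_region().  In a real version the cache would live
 * in the AddressSpaceDispatch and be reset when the map is rebuilt. */
typedef struct MRUSectionCache {
    MemoryRegionSection *section;   /* last hit, or NULL */
} MRUSectionCache;

static inline bool section_covers(MemoryRegionSection *s, hwaddr addr)
{
    /* Assumes the section size fits in 64 bits. */
    return s && s->mr &&
           addr >= s->offset_within_address_space &&
           addr - s->offset_within_address_space < int128_get64(s->size);
}

static MemoryRegionSection *
lookup_region_mru(AddressSpaceDispatch *d, MRUSectionCache *cache,
                  hwaddr addr, bool resolve_subpage)
{
    if (!section_covers(cache->section, addr)) {
        /* Miss: fall back to the existing radix-tree walk. */
        cache->section = address_space_lookup_region(d, addr, resolve_subpage);
    }
    return cache->section;
}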