From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:41111)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <anthony@codemonkey.ws>) id 1V7nOm-0007WD-64
	for qemu-devel@nongnu.org; Fri, 09 Aug 2013 10:10:41 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <anthony@codemonkey.ws>) id 1V7nOg-0006ug-Uz
	for qemu-devel@nongnu.org; Fri, 09 Aug 2013 10:10:36 -0400
Received: from mail-ie0-f176.google.com ([209.85.223.176]:49977)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <anthony@codemonkey.ws>) id 1V7nOg-0006uY-Po
	for qemu-devel@nongnu.org; Fri, 09 Aug 2013 10:10:30 -0400
Received: by mail-ie0-f176.google.com with SMTP id 9so2961062iec.35
	for <qemu-devel@nongnu.org>; Fri, 09 Aug 2013 07:10:29 -0700 (PDT)
From: Anthony Liguori <anthony@codemonkey.ws>
In-Reply-To: <87ob97nz7x.fsf@rustcorp.com.au>
References: <1375938949-22622-1-git-send-email-rusty@rustcorp.com.au>
	<1375938949-22622-2-git-send-email-rusty@rustcorp.com.au>
	<87li4cgvh1.fsf@codemonkey.ws> <87ob97nz7x.fsf@rustcorp.com.au>
Date: Fri, 09 Aug 2013 09:10:26 -0500
Message-ID: <8761vflzu5.fsf@codemonkey.ws>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Subject: Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring
	and config access
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Rusty Russell <rusty@rustcorp.com.au>, qemu-devel@nongnu.org

Rusty Russell <rusty@rustcorp.com.au> writes:

> Anthony Liguori <anthony@codemonkey.ws> writes:
>> I suspect this is a premature optimization.  With a weak function called
>> directly in the accessors below, I suspect you would see no measurable
>> performance overhead compared to this approach.
>>
>> It's all very predictable so the CPU should do a decent job optimizing
>> the if () away.
>
> Perhaps.  I was leery of introducing performance regressions, but the
> actual I/O tends to dominate anyway.
>
> So I tested this, by adding the patch (below) and benchmarking
> qemu-system-i386 on my laptop before and after.
>
> Setup: Intel(R) Core(TM) i5 CPU       M 560  @ 2.67GHz
> (Performance cpu governer enabled)
> Guest: virtio user net, virtio block on raw file, 1 CPU, 512MB RAM.
> (Qemu run under eatmydata to eliminate syncs)

FYI, cache=unsafe is equivalent to using eatmydata.

> First test: ping -f -c 10000 -q 10.0.2.0 (100 times)
> (Ping chosen since packets stay in qemu's user net code)
>
> BEFORE:
>         MIN: 824ms
>         MAX: 914ms
>         AVG: 876.95ms
>         STDDEV: 16ms
>
> AFTER:
>         MIN: 872ms
>         MAX: 933ms
>         AVG: 904.35ms
>         STDDEV: 15ms

I can reproduce this although I also see a larger standard deviation.

BEFORE:
        MIN: 496
        MAX: 1055
        AVG: 873.22
        STDEV: 136.88

AFTER:
        MIN: 494
        MAX: 1456
        AVG: 947.77
        STDEV: 150.89

In my data, the stdev is higher in the AFTER case, implying more
run-to-run variation.  Indeed, the MIN is pretty much the same (496 vs. 494).
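
To put the two sets of averages on the same scale, here is a minimal
sketch of the relative-change arithmetic.  The helper name
percent_change() is made up for illustration; it is not part of QEMU or
of the patch.

```c
/*
 * Relative change of `after` versus `before`, in percent.
 * Hypothetical helper for illustration only; `before` must be non-zero.
 */
static double percent_change(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

Fed the AVG values quoted above, this gives roughly 8.5% for my runs
(873.22 -> 947.77) and roughly 3.1% for yours (876.95 -> 904.35).  In my
runs the absolute difference (about 75) is well inside one stdev of
either data set.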

GCC is inlining the functions, so I'm still surprised that it's
measurable at all.
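
For reference, a minimal sketch of the kind of accessor under
discussion, assuming a weak hook that a target could override.  All
names here (virtio_needs_swap, swap16, vring_load16) are hypothetical
and chosen for illustration; they are not taken from the patch or from
the QEMU tree, and the weak attribute is the GCC/Clang spelling.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical hook: the weak default says "no swapping needed".
 * A target that needs byte swapping would supply a strong definition
 * with the same name, which the linker picks instead.
 */
__attribute__((weak)) bool virtio_needs_swap(void)
{
    return false;
}

/* Swap the two bytes of a 16-bit value. */
static inline uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/*
 * Hypothetical vring accessor: load a 16-bit field and swap it only
 * if the hook asks for it.  The branch goes the same way on every
 * call for a given guest, so it should be highly predictable.
 */
static inline uint16_t vring_load16(const uint16_t *p)
{
    uint16_t v = *p;

    if (virtio_needs_swap()) {
        v = swap16(v);
    }
    return v;
}
```

With only the weak default linked in, vring_load16() returns the value
unchanged; the cost under discussion is the extra call and test on each
access.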

At any rate, I think the advantage of not increasing the amount of
target-specific code outweighs the performance difference here.  As you
said, if there is real I/O, the difference isn't noticeable.

Regards,

Anthony Liguori