From: Stefan Hajnoczi
Date: Mon, 4 Oct 2010 15:30:20 +0100
Subject: [Qemu-devel] Re: [PATCH] virtio: Use ioeventfd for virtqueue notify
In-Reply-To: <4CA862A7.2080302@redhat.com>
References: <1285855312-11739-1-git-send-email-stefanha@linux.vnet.ibm.com> <4CA862A7.2080302@redhat.com>
To: Avi Kivity
Cc: Steve Dobbelstein, Anthony Liguori, Stefan Hajnoczi, kvm@vger.kernel.org, "Michael S. Tsirkin", qemu-devel@nongnu.org, Khoa Huynh, Sridhar Samudrala

On Sun, Oct 3, 2010 at 12:01 PM, Avi Kivity wrote:
> On 09/30/2010 04:01 PM, Stefan Hajnoczi wrote:
>>
>> Virtqueue notify is currently handled synchronously in userspace virtio.
>> This prevents the vcpu from executing guest code while hardware
>> emulation code handles the notify.
>>
>> On systems that support KVM, the ioeventfd mechanism can be used to make
>> virtqueue notify a lightweight exit by deferring hardware emulation to
>> the iothread and allowing the VM to continue execution.  This model is
>> similar to how vhost receives virtqueue notifies.
>
> Note that this is a tradeoff.  If an idle core is available and the
> scheduler places the iothread on that core, then the heavyweight exit is
> replaced by a lightweight exit + IPI.  If the iothread is co-located with
> the vcpu, then we'll take a heavyweight exit in any case.
>
> The first case is very likely if the host cpu is undercommitted and there
> is heavy I/O activity.  This is a typical subsystem benchmark scenario (as
> opposed to a system benchmark like SPECvirt).  My feeling is that total
> system throughput will be decreased unless the scheduler is clever enough
> to place the iothread and vcpu on the same host cpu when the system is
> overcommitted.
>
> We can't balance "feeling" against numbers, especially when we have a
> precedent in vhost-net, so I think this should go in.  But I think we
> should also try to understand the effects of the extra IPIs and cacheline
> bouncing that this creates.  While virtio was designed to minimize this,
> we know it has severe problems in this area.

Right, there is a danger of optimizing for subsystem benchmark cases
rather than real-world usage.  I have posted some results that we've
gathered, but more scrutiny is welcome.
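
For anyone following along who hasn't looked at the mechanism, here is a
rough sketch of what registering a virtqueue-notify ioeventfd looks like
at the raw KVM interface level.  This is not the patch itself (the patch
goes through QEMU's kvm helpers); the function name and the
notify_port_addr parameter are made up for illustration:

/*
 * Sketch: ask KVM to complete the guest's pio write of queue index
 * vq_idx to the virtio-pci QUEUE_NOTIFY port by signalling an eventfd
 * in the kernel, instead of exiting to userspace.  The iothread polls
 * the returned fd and runs the virtqueue handler when it fires.
 */
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

static int assign_vq_ioeventfd(int vm_fd, uint64_t notify_port_addr,
                               uint16_t vq_idx)
{
    int fd = eventfd(0, 0);
    if (fd < 0)
        return -1;

    struct kvm_ioeventfd kick;
    memset(&kick, 0, sizeof(kick));
    kick.addr      = notify_port_addr;  /* QUEUE_NOTIFY offset in I/O BAR */
    kick.len       = 2;                 /* guest writes a 16-bit queue index */
    kick.datamatch = vq_idx;            /* only this queue's notify fires fd */
    kick.fd        = fd;
    kick.flags     = KVM_IOEVENTFD_FLAG_PIO |
                     KVM_IOEVENTFD_FLAG_DATAMATCH;

    if (ioctl(vm_fd, KVM_IOEVENTFD, &kick) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

The point being that the vcpu never returns to userspace for the notify;
KVM signals the eventfd and resumes the guest, and emulation happens in
the iothread.
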
>> Khoa Huynh collected the following data for
>> virtio-blk with cache=none,aio=native:
>>
>> FFSB Test          Threads  Unmodified  Patched
>>                             (MB/s)      (MB/s)
>> Large file create  1        21.7        21.8
>>                    8        101.0       118.0
>>                    16       119.0       157.0
>>
>> Sequential reads   1        21.9        23.2
>>                    8        114.0       139.0
>>                    16       143.0       178.0
>>
>> Random reads       1        3.3         3.6
>>                    8        23.0        25.4
>>                    16       43.3        47.8
>>
>> Random writes      1        22.2        23.0
>>                    8        93.1        111.6
>>                    16       110.5       132.0
>
> Impressive numbers.  Can you also provide efficiency (bytes per host cpu
> second)?

Khoa, do you have the host CPU numbers for these benchmark runs?

> How many guest vcpus were used with this?  With enough vcpus, there is
> also a reduction in cacheline bouncing, since the virtio state in the
> host gets to stay on one cpu (especially with aio=native).

Guest: 2 vcpus, 4 GB RAM
Host: 16 cpus, 12 GB RAM

Khoa, is this correct?

Stefan