From: Stefan Hajnoczi
Date: Mon, 4 Oct 2010 15:30:20 +0100
Subject: [Qemu-devel] Re: [PATCH] virtio: Use ioeventfd for virtqueue notify
In-Reply-To: <4CA862A7.2080302@redhat.com>
References: <1285855312-11739-1-git-send-email-stefanha@linux.vnet.ibm.com> <4CA862A7.2080302@redhat.com>
To: Avi Kivity
Cc: Steve Dobbelstein, Anthony Liguori, Stefan Hajnoczi, kvm@vger.kernel.org, "Michael S. Tsirkin", qemu-devel@nongnu.org, Khoa Huynh, Sridhar Samudrala

On Sun, Oct 3, 2010 at 12:01 PM, Avi Kivity wrote:
> On 09/30/2010 04:01 PM, Stefan Hajnoczi wrote:
>>
>> Virtqueue notify is currently handled synchronously in userspace virtio.
>> This prevents the vcpu from executing guest code while hardware
>> emulation code handles the notify.
>>
>> On systems that support KVM, the ioeventfd mechanism can be used to make
>> virtqueue notify a lightweight exit by deferring hardware emulation to
>> the iothread and allowing the VM to continue execution.  This model is
>> similar to how vhost receives virtqueue notifies.
>
> Note that this is a tradeoff.  If an idle core is available and the
> scheduler places the iothread on that core, then the heavyweight exit is
> replaced by a lightweight exit + IPI.  If the iothread is co-located with
> the vcpu, then we'll take a heavyweight exit in any case.
>
> The first case is very likely if the host cpu is undercommitted and there
> is heavy I/O activity.  This is a typical subsystem benchmark scenario (as
> opposed to a system benchmark like SPECvirt).  My feeling is that total
> system throughput will be decreased unless the scheduler is clever enough
> to place the iothread and vcpu on the same host cpu when the system is
> overcommitted.
>
> We can't balance "feeling" against numbers, especially when we have a
> precedent in vhost-net, so I think this should go in.  But I think we
> should also try to understand the effects of the extra IPIs and cacheline
> bouncing that this creates.  While virtio was designed to minimize this,
> we know it has severe problems in this area.

Right, there is a danger of optimizing for subsystem benchmark cases
rather than real-world usage.  I have posted some results that we've
gathered, but more scrutiny is welcome.
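
For anyone following along who hasn't looked at the mechanism, here is a
rough sketch of what registering a virtqueue-notify ioeventfd looks like
at the raw KVM interface level.  This is not the patch itself (the patch
goes through QEMU's kvm helpers); the function name and the
notify_port_addr parameter are made up for illustration:

/*
 * Sketch: ask KVM to complete the guest's pio write of queue index
 * vq_idx to the virtio-pci QUEUE_NOTIFY port by signalling an eventfd
 * in the kernel, instead of exiting to userspace.  The iothread polls
 * the returned fd and runs the virtqueue handler when it fires.
 */
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

static int assign_vq_ioeventfd(int vm_fd, uint64_t notify_port_addr,
                               uint16_t vq_idx)
{
    int fd = eventfd(0, 0);
    if (fd < 0)
        return -1;

    struct kvm_ioeventfd kick;
    memset(&kick, 0, sizeof(kick));
    kick.addr      = notify_port_addr;  /* QUEUE_NOTIFY offset in I/O BAR */
    kick.len       = 2;                 /* guest writes a 16-bit queue index */
    kick.datamatch = vq_idx;            /* only this queue's notify fires fd */
    kick.fd        = fd;
    kick.flags     = KVM_IOEVENTFD_FLAG_PIO |
                     KVM_IOEVENTFD_FLAG_DATAMATCH;

    if (ioctl(vm_fd, KVM_IOEVENTFD, &kick) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

The point being that the vcpu never returns to userspace for the notify;
KVM signals the eventfd and resumes the guest, and emulation happens in
the iothread.
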
>> Khoa Huynh collected the following data for
>> virtio-blk with cache=none,aio=native:
>>
>> FFSB Test          Threads  Unmodified  Patched
>>                             (MB/s)      (MB/s)
>> Large file create  1        21.7        21.8
>>                    8        101.0       118.0
>>                    16       119.0       157.0
>>
>> Sequential reads   1        21.9        23.2
>>                    8        114.0       139.0
>>                    16       143.0       178.0
>>
>> Random reads       1        3.3         3.6
>>                    8        23.0        25.4
>>                    16       43.3        47.8
>>
>> Random writes      1        22.2        23.0
>>                    8        93.1        111.6
>>                    16       110.5       132.0
>
> Impressive numbers.  Can you also provide efficiency (bytes per host cpu
> second)?

Khoa, do you have the host CPU numbers for these benchmark runs?

> How many guest vcpus were used with this?  With enough vcpus, there is
> also a reduction in cacheline bouncing, since the virtio state in the
> host gets to stay on one cpu (especially with aio=native).

Guest: 2 vcpus, 4 GB RAM
Host: 16 cpus, 12 GB RAM

Khoa, is this correct?

Stefan