Message-ID: <4CA862A7.2080302@redhat.com>
Date: Sun, 03 Oct 2010 13:01:59 +0200
From: Avi Kivity
Subject: [Qemu-devel] Re: [PATCH] virtio: Use ioeventfd for virtqueue notify
To: Stefan Hajnoczi
Cc: Steve Dobbelstein, Anthony Liguori, kvm@vger.kernel.org,
 "Michael S. Tsirkin", qemu-devel@nongnu.org, Khoa Huynh,
 Sridhar Samudrala
In-Reply-To: <1285855312-11739-1-git-send-email-stefanha@linux.vnet.ibm.com>
References: <1285855312-11739-1-git-send-email-stefanha@linux.vnet.ibm.com>

On 09/30/2010 04:01 PM, Stefan Hajnoczi wrote:
> Virtqueue notify is currently handled synchronously in userspace virtio.
> This prevents the vcpu from executing guest code while hardware
> emulation code handles the notify.
>
> On systems that support KVM, the ioeventfd mechanism can be used to make
> virtqueue notify a lightweight exit by deferring hardware emulation to
> the iothread and allowing the VM to continue execution. This model is
> similar to how vhost receives virtqueue notifies.

Note that this is a tradeoff. If an idle core is available and the
scheduler places the iothread on that core, then the heavyweight exit is
replaced by a lightweight exit + IPI. If the iothread is co-located with
the vcpu, then we'll take a heavyweight exit in any case.

The first case is very likely if the host cpu is undercommitted and
there is heavy I/O activity. This is a typical subsystem benchmark
scenario (as opposed to a system benchmark like specvirt). My feeling
is that total system throughput will be decreased unless the scheduler
is clever enough to place the iothread and vcpu on the same host cpu
when the system is overcommitted.

We can't balance "feeling" against numbers, especially when we have a
precedent in vhost-net, so I think this should go in. But I think we
should also try to understand the effects of the extra IPIs and
cacheline bouncing that this creates. While virtio was designed to
minimize this, we know it has severe problems in this area.

> The result of this change is improved performance for userspace virtio
> devices. Virtio-blk throughput increases especially for multithreaded
> scenarios and virtio-net transmit throughput increases substantially.
> Full numbers are below.
>
> This patch employs ioeventfd virtqueue notify for all virtio devices.
> Linux kernels pre-2.6.34 only allow for 6 ioeventfds per VM and care
> must be taken so that vhost-net, the other ioeventfd user in QEMU, is
> able to function. On such kernels ioeventfd virtqueue notify will not
> be used.
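For anyone following along, the kick path being discussed amounts to
roughly the following at the KVM level. This is only a sketch, not the
patch itself: vm_fd, pio_addr, assign_notify_ioeventfd(),
iothread_handle_kick() and the process_virtqueue() placeholder are made
up for illustration; the actual patch goes through QEMU's EventNotifier
and kvm_set_ioeventfd_pio_word() helpers, as in the hunk quoted further
down.

#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Ask KVM to signal 'efd' when the guest writes 'queue_index' (16 bits)
 * to the device's VIRTIO_PCI_QUEUE_NOTIFY pio address, instead of
 * returning to userspace on the vcpu thread. */
static int assign_notify_ioeventfd(int vm_fd, uint64_t pio_addr,
                                   uint16_t queue_index)
{
    struct kvm_ioeventfd kick = {
        .datamatch = queue_index,
        .addr      = pio_addr,   /* VIRTIO_PCI_QUEUE_NOTIFY in the pio BAR */
        .len       = 2,          /* the notify is a 16-bit write */
        .flags     = KVM_IOEVENTFD_FLAG_PIO | KVM_IOEVENTFD_FLAG_DATAMATCH,
    };
    int efd = eventfd(0, 0);

    if (efd < 0) {
        return -1;
    }
    kick.fd = efd;
    if (ioctl(vm_fd, KVM_IOEVENTFD, &kick) < 0) {
        close(efd);
        return -1;
    }
    return efd;
}

/* Iothread side: the kick now arrives as a readable eventfd, i.e. a
 * lightweight exit plus at most an IPI to wake the iothread, rather
 * than a heavyweight exit on the vcpu. Multiple kicks coalesce into a
 * single read of the counter. */
static void iothread_handle_kick(int efd)
{
    uint64_t count;

    if (read(efd, &count, sizeof(count)) == sizeof(count)) {
        /* process_virtqueue(vq);  -- placeholder for the device handler */
    }
}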
> Khoa Huynh collected the following data for virtio-blk with
> cache=none,aio=native:
>
> FFSB Test          Threads  Unmodified  Patched
>                             (MB/s)      (MB/s)
> Large file create  1        21.7        21.8
>                    8        101.0       118.0
>                    16       119.0       157.0
>
> Sequential reads   1        21.9        23.2
>                    8        114.0       139.0
>                    16       143.0       178.0
>
> Random reads       1        3.3         3.6
>                    8        23.0        25.4
>                    16       43.3        47.8
>
> Random writes      1        22.2        23.0
>                    8        93.1        111.6
>                    16       110.5       132.0

Impressive numbers. Can you also provide efficiency (bytes per host cpu
second)? How many guest vcpus were used with this? With enough vcpus,
there is also a reduction in cacheline bouncing, since the virtio state
in the host gets to stay on one cpu (especially with aio=native).

> Sridhar Samudrala collected the following data for virtio-net with
> 2.6.36-rc1 on the host and 2.6.34 on the guest.
>
> Guest to Host TCP_STREAM throughput(Mb/sec)
> -------------------------------------------
> Msg Size  vhost-net  virtio-net  virtio-net/ioeventfd
> 65536     12755      6430        7590
> 16384     8499       3084        5764
> 4096      4723       1578        3659
> 1024      1827       981         2060

Even more impressive (expected, since the copying, which isn't present
for block, is now shunted off into an iothread). On the last test you
even exceeded vhost-net. Any theories as to how/why?

Again, efficiency numbers would be interesting.

> Host to Guest TCP_STREAM throughput(Mb/sec)
> -------------------------------------------
> Msg Size  vhost-net  virtio-net  virtio-net/ioeventfd
> 65536     11156      5790        5853
> 16384     10787      5575        5691
> 4096      10452      5556        4277
> 1024      4437       3671        5277

Here you exceed vhost-net, too.

> +static int kvm_check_many_iobus_devs(void)
> +{
> +    /* Older kernels have a 6 device limit on the KVM io bus. In that case
> +     * creating many ioeventfds must be avoided. This test checks for the
> +     * limitation.
> +     */
> +    EventNotifier notifiers[7];
> +    int i, ret = 0;
> +    for (i = 0; i < ARRAY_SIZE(notifiers); i++) {
> +        ret = event_notifier_init(&notifiers[i], 0);
> +        if (ret < 0) {
> +            break;
> +        }
> +        ret = kvm_set_ioeventfd_pio_word(event_notifier_get_fd(&notifiers[i]), 0, i, true);
> +        if (ret < 0) {
> +            event_notifier_cleanup(&notifiers[i]);
> +            break;
> +        }
> +    }
> +
> +    /* Decide whether many devices are supported or not */
> +    ret = i == ARRAY_SIZE(notifiers);
> +
> +    while (i-- > 0) {
> +        kvm_set_ioeventfd_pio_word(event_notifier_get_fd(&notifiers[i]), 0, i, false);
> +        event_notifier_cleanup(&notifiers[i]);
> +    }
> +    return ret;
> +}

Sorry about that. IIRC there was a problem (shared by vhost-net) with
interrupts remaining enabled in the window between the guest kicking the
queue and the host waking up and disabling interrupts. An even vaguer
IIRC: mst had an idea to fix this?

-- 
error compiling committee.c: too many arguments to function