From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 25 Oct 2010 14:25:05 +0100
From: Stefan Hajnoczi
Message-ID: <20101025132458.GA2886@stefan-thinkpad.transitives.com>
References: <1285855312-11739-1-git-send-email-stefanha@linux.vnet.ibm.com>
 <20101019133330.GB18341@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20101019133330.GB18341@redhat.com>
Subject: [Qemu-devel] Re: [PATCH] virtio: Use ioeventfd for virtqueue notify
List-Id: qemu-devel.nongnu.org
To: "Michael S. Tsirkin"
Cc: Steve Dobbelstein, Anthony Liguori, kvm@vger.kernel.org,
 qemu-devel@nongnu.org, Khoa Huynh, Sridhar Samudrala

On Tue, Oct 19, 2010 at 03:33:41PM +0200, Michael S.
Tsirkin wrote:
> My main concern is with the fact that we add more state
> in notifiers that can easily get out of sync with users.
> If we absolutely need this state, let's try to at least
> document the state machine, and make the API
> for state transitions more transparent.

I'll try to describe how it works.  If you're happy with the design in
principle then I can rework the code.  Otherwise we can think about a
different design.

The goal is to use ioeventfd instead of the synchronous pio emulation
path that userspace virtqueues use today.  Both virtio-blk and
virtio-net perform better with this approach because the vcpu is not
blocked from executing guest code while the I/O operation is initiated.

We want to automatically create an event notifier and set up ioeventfd
for each initialized virtqueue.

Vhost already uses ioeventfd, so it is important not to interfere with
devices that have enabled vhost.  If vhost is enabled, then the device's
virtqueues are off-limits and should not be tampered with.

Furthermore, older kernels limit you to 6 ioeventfds per guest.  On such
systems it is risky to automatically use ioeventfd for userspace
virtqueues, since that could take a precious ioeventfd away from another
virtio device using vhost.  Existing guest configurations would break,
so it is simplest to avoid using ioeventfd for userspace virtqueues on
such hosts.

The design adds logic into hw/virtio.c to automatically use ioeventfd
for userspace virtqueues.  Specific virtio devices like blk and net
require no modification.  The logic sits below the set_host_notifier()
function that vhost uses.

This design stays in sync because it hooks into two interfaces that
allow it to accurately track whether or not to use ioeventfd:

1. virtio_set_host_notifier() is used by vhost.  When vhost enables the
   host notifier we stay out of the way.

2. virtio_reset()/virtio_set_status()/virtio_load() define the device
   life-cycle and transition the state machine appropriately.
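As a rough illustration of the policy described above (not the actual
hw/virtio.c code), the decision of whether a userspace virtqueue may
take an ioeventfd automatically could be sketched as below.  The struct,
its fields, and the function name are hypothetical.  In particular, how
the host's 6 ioeventfd limit is detected is not covered here, so it is
modelled as a plain flag.

```c
#include <stdbool.h>

/* Hypothetical inputs to the policy; none of these names come from QEMU. */
struct vq_policy_input {
    bool driver_ok;              /* guest set VIRTIO_CONFIG_S_DRIVER_OK */
    bool vhost_owns_queue;       /* vhost enabled the host notifier */
    bool host_ioeventfd_limited; /* old kernel: only 6 ioeventfds per guest */
};

/* Returns true if an ioeventfd may be bound to this virtqueue automatically. */
static bool vq_may_auto_ioeventfd(const struct vq_policy_input *in)
{
    if (in->vhost_owns_queue) {
        return false;   /* off-limits: never touch vhost's virtqueues */
    }
    if (in->host_ioeventfd_limited) {
        return false;   /* leave the scarce ioeventfds to vhost devices */
    }
    return in->driver_ok; /* only once the driver has finished setup */
}
```

The two early returns mirror the two reasons given for staying out of
the way; everything else falls through to the device status check.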
Migration is supported.

Here is the state machine that tracks a virtqueue:

                     assigned
                   ^ /      \ ^
               e. / / c.  g. \ \ b.
                 / /          \ \
                / v     f.     v \      a.
       offlimits ---------------> deassigned <-- start
                 <---------------
                        d.

a. The virtqueue starts deassigned with no ioeventfd.

b. When the device status becomes VIRTIO_CONFIG_S_DRIVER_OK we try to
   assign an ioeventfd to each virtqueue, except if the 6 ioeventfd
   limitation is present.

c, d. The virtqueue becomes offlimits if vhost enables the host notifier.

e. The ioeventfd becomes assigned again when the host notifier is
   disabled by vhost.

f. The exception is when the 6 ioeventfd limitation is present: then the
   virtqueue becomes deassigned instead, because we want to avoid using
   ioeventfd.

g. When the device is reset its virtqueues become deassigned again.

Does this make sense?

Stefan
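For reference, the state machine above can be written down as a pure
transition function.  This is only a sketch under stated assumptions:
the enum and function names are hypothetical and do not come from
hw/virtio.c, and the 6 ioeventfd limitation is passed in as a flag.
Combinations the diagram does not show (for example a reset while
offlimits) leave the state unchanged, which is an assumption rather
than part of the design.

```c
#include <stdbool.h>

/* Hypothetical names; a sketch of the diagram, not code from QEMU. */
enum vq_state { VQ_DEASSIGNED, VQ_ASSIGNED, VQ_OFFLIMITS };

enum vq_event {
    EV_DRIVER_OK,     /* device status became VIRTIO_CONFIG_S_DRIVER_OK */
    EV_VHOST_ENABLE,  /* vhost enabled the host notifier */
    EV_VHOST_DISABLE, /* vhost disabled the host notifier */
    EV_RESET          /* device reset */
};

/* a. A virtqueue starts out in VQ_DEASSIGNED. */
static enum vq_state vq_next_state(enum vq_state s, enum vq_event ev,
                                   bool limited /* 6 ioeventfd limit */)
{
    switch (ev) {
    case EV_DRIVER_OK:      /* b. */
        if (s == VQ_DEASSIGNED && !limited) {
            return VQ_ASSIGNED;
        }
        return s;
    case EV_VHOST_ENABLE:   /* c, d. */
        return VQ_OFFLIMITS;
    case EV_VHOST_DISABLE:  /* e, f. */
        if (s == VQ_OFFLIMITS) {
            return limited ? VQ_DEASSIGNED : VQ_ASSIGNED;
        }
        return s;
    case EV_RESET:          /* g. */
        if (s == VQ_ASSIGNED) {
            return VQ_DEASSIGNED;
        }
        return s;           /* assumption: not shown in the diagram */
    }
    return s;
}
```

Keeping the transitions in one function with no side effects makes it
easy to check each labelled edge in isolation; the actual ioeventfd
assignment would hang off the state changes.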