Date: Mon, 5 Dec 2016 05:59:55 -0500 (EST)
From: Paolo Bonzini
Message-ID: <1548288582.1411113.1480935595505.JavaMail.zimbra@redhat.com>
In-Reply-To: <20161205101452.GB22443@stefanha-x1.localdomain>
References: <20161201192652.9509-1-stefanha@redhat.com>
 <20161201192652.9509-5-stefanha@redhat.com>
 <20161205005459.GA21702@lemon>
 <20161205101452.GB22443@stefanha-x1.localdomain>
Subject: Re: [Qemu-devel] [PATCH v4 04/13] virtio: poll virtqueues for new buffers
To: Stefan Hajnoczi
Cc: Fam Zheng, Stefan Hajnoczi, borntraeger@de.ibm.com, Karl Rister,
 qemu-devel@nongnu.org, v.maffione@gmail.com

----- Original Message -----
> From: "Stefan Hajnoczi"
> To: "Fam Zheng"
> Cc: "Stefan Hajnoczi", borntraeger@de.ibm.com, "Karl Rister",
>     qemu-devel@nongnu.org, "Paolo Bonzini"
> Sent: Monday, December 5, 2016 11:14:52 AM
> Subject: Re: [Qemu-devel] [PATCH v4 04/13] virtio: poll virtqueues for new buffers
>
> On Mon, Dec 05, 2016 at 08:54:59AM +0800, Fam Zheng wrote:
> > On Thu, 12/01 19:26, Stefan Hajnoczi wrote:
> > > Add an AioContext poll handler to detect new virtqueue buffers without
> > > waiting for a guest->host notification.
> > >
> > > Signed-off-by: Stefan Hajnoczi
> > > ---
> > >  hw/virtio/virtio.c | 16 +++++++++++++++-
> > >  1 file changed, 15 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > > index 6f8ca25..3870411 100644
> > > --- a/hw/virtio/virtio.c
> > > +++ b/hw/virtio/virtio.c
> > > @@ -2047,13 +2047,27 @@ static void virtio_queue_host_notifier_aio_read(EventNotifier *n)
> > >      }
> > >  }
> > >
> > > +static bool virtio_queue_host_notifier_aio_poll(void *opaque)
> > > +{
> > > +    EventNotifier *n = opaque;
> > > +    VirtQueue *vq = container_of(n, VirtQueue, host_notifier);
> > > +
> > > +    if (virtio_queue_empty(vq)) {
> >
> > I notice this call path gets very hot in the profile:
> >
> > #0  address_space_translate_internal (d=0x7f853c22e970, addr=2140788738,
> >     xlat=xlat@entry=0x7f8549db59e8, plen=plen@entry=0x7f8549db5a68,
> >     resolve_subpage=resolve_subpage@entry=true) at /root/qemu-nvme/exec.c:419
> > #1  0x00007f854d3e8449 in address_space_translate (as=, addr=2140788738,
> >     xlat=xlat@entry=0x7f8549db5a70, plen=plen@entry=0x7f8549db5a68,
> >     is_write=is_write@entry=false) at /root/qemu-nvme/exec.c:462
> > #2  0x00007f854d3ee2fd in address_space_lduw_internal (endian=DEVICE_LITTLE_ENDIAN,
> >     result=0x0, attrs=..., addr=, as=)
> >     at /root/qemu-nvme/exec.c:3299
> > #3  address_space_lduw_le (result=0x0, attrs=..., addr=, as=)
> >     at /root/qemu-nvme/exec.c:3351
> > #4  lduw_le_phys (as=, addr=) at /root/qemu-nvme/exec.c:3369
> > #5  0x00007f854d475978 in virtio_lduw_phys (vdev=, pa=)
> >     at /root/qemu-nvme/include/hw/virtio/virtio-access.h:46
> > #6  vring_avail_idx (vq=, vq=)
> >     at /root/qemu-nvme/hw/virtio/virtio.c:143
> > #7  virtio_queue_empty (vq=vq@entry=0x7f854fd0c9e0) at /root/qemu-nvme/hw/virtio/virtio.c:246
> > #8  0x00007f854d475d20 in virtio_queue_empty (vq=0x7f854fd0c9e0)
> >     at /root/qemu-nvme/hw/virtio/virtio.c:2074
> > #9  virtio_queue_host_notifier_aio_poll (opaque=0x7f854fd0ca48)
> >     at /root/qemu-nvme/hw/virtio/virtio.c:2068
> > #10 0x00007f854d69116e in run_poll_handlers_once (ctx=) at aio-posix.c:493
> > #11 0x00007f854d691b08 in run_poll_handlers (max_ns=, ctx=0x7f854e908850)
> >     at aio-posix.c:530
> > #12 try_poll_mode (blocking=true, ctx=0x7f854e908850) at aio-posix.c:558
> > #13 aio_poll (ctx=0x7f854e908850, blocking=blocking@entry=true) at aio-posix.c:601
> > #14 0x00007f854d4ff12e in iothread_run (opaque=0x7f854e9082f0) at iothread.c:53
> > #15 0x00007f854c249dc5 in start_thread () from /lib64/libpthread.so.0
> > #16 0x00007f854acd973d in clone () from /lib64/libc.so.6
> >
> > Is it too costly to do an address_space_translate per poll?
>
> Good insight.
>
> If we map guest RAM we should consider mapping the vring in
> hw/virtio/virtio.c.  That way not just polling benefits.
>
> The map/unmap API was not designed for repeated memory accesses.  In the
> bounce buffer case you have a stale copy - repeated reads will not
> update it.  Vrings should be in guest RAM so we don't really need to
> support bounce buffering, but it's still a hack to use the map/unmap API.
>
> Paolo: Thoughts about the memory API?

Yeah, I had discussed this recently with Fam, and prototyped (read:
compiled the code) an API like this:

    int64_t address_space_cache_init(MemoryRegionCache *cache,
                                     AddressSpace *as,
                                     hwaddr addr,
                                     hwaddr len,
                                     bool is_write);

    /* Only for is_write == true.  */
    void address_space_cache_invalidate(MemoryRegionCache *cache,
                                        hwaddr addr,
                                        hwaddr access_len);

    uint32_t address_space_lduw_cached(MemoryRegionCache *cache, hwaddr addr,
                                       MemTxAttrs attrs, MemTxResult *result);
    ...

It's basically the same API as usual, but with address_space_translate
reduced to

    *xlat = addr - cache->xlat;
    *plen = MIN(*plen, cache->len - *xlat);
    return cache->mr;

and likewise for other hot spots of the ld/st API.

Originally I thought of two ways we could use it:

1) update the cache with a MemoryListener

2) hoist the cache out of the hot loops, but still initialize it once per
   iteration.

This was before Fam tested your patches, and the polling case makes (1)
the only practical way.  (2) of course would have been less code, but if
the idea is used by other devices (Vincenzo and the netmap gang wanted
something similar in e1000, for example), perhaps we could just add a
method to PCIDeviceClass and VirtioDeviceClass.

Separate from this, and unrelated to polling, we can and should use the
map/unmap API for descriptors, both direct and indirect.  This is simple
and I hope to test it some time this week.

Paolo
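
P.S. To make the vring idea a bit more concrete, here is a rough, untested
sketch of how the poll path could read avail->idx through a cache that is
set up once, outside the hot loop.  It assumes the prototype signatures
above; the VRingAvailCache type and helper names are placeholders, and I'm
assuming the address passed to the *_cached accessors is an offset into
the cached region:

    #include "qemu/osdep.h"
    #include "exec/memory.h"

    /* Placeholder per-virtqueue state: one cache covering the avail ring,
     * initialized when the guest programs the ring address and invalidated
     * whenever that address (or the memory map) changes. */
    typedef struct VRingAvailCache {
        MemoryRegionCache cache;
        bool valid;
    } VRingAvailCache;

    /* Slow path, run once: for a split ring the avail ring occupies
     * 4 + 2 * num bytes. */
    static bool avail_cache_init(VRingAvailCache *c, AddressSpace *as,
                                 hwaddr avail_pa, hwaddr avail_size)
    {
        int64_t cached = address_space_cache_init(&c->cache, as, avail_pa,
                                                  avail_size, false);
        c->valid = cached >= (int64_t)avail_size;
        return c->valid;
    }

    /* Hot path, e.g. virtio_queue_empty() called from the poll handler:
     * no address_space_translate, only the reduced lookup shown above.
     * avail->idx sits at offset 2, right after the 16-bit flags field;
     * a real user would still go through the endian-aware virtio
     * helpers. */
    static uint16_t avail_cache_read_idx(VRingAvailCache *c)
    {
        return address_space_lduw_cached(&c->cache, 2,
                                         MEMTXATTRS_UNSPECIFIED, NULL);
    }

With a MemoryListener keeping the cache valid across memory map changes
(approach (1) above), nothing slow is left in the poll handler itself.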
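
P.P.S. For descriptors, a rough sketch of what using the map/unmap API
could look like (type and helper names are placeholders; the layout
mirrors the standard split-ring descriptor; error handling, endianness
conversion, and keeping the mapping alive for a whole chain are omitted):

    /* Map the descriptor table as ordinary memory, copy out one
     * descriptor, then unmap.  A real implementation would map once and
     * keep the mapping for the whole chain walk; if the table is not
     * directly mappable (bounce buffer case) the caller falls back to
     * the per-field ld*_phys accessors. */
    typedef struct VRingDescSketch {
        uint64_t addr;
        uint32_t len;
        uint16_t flags;
        uint16_t next;
    } VRingDescSketch;

    static bool read_desc_mapped(AddressSpace *as, hwaddr desc_table_pa,
                                 unsigned int num, unsigned int i,
                                 VRingDescSketch *out)
    {
        hwaddr len = num * sizeof(VRingDescSketch);
        VRingDescSketch *table = address_space_map(as, desc_table_pa,
                                                   &len, false);

        if (!table || len < num * sizeof(VRingDescSketch)) {
            if (table) {
                address_space_unmap(as, table, len, false, 0);
            }
            return false;
        }
        *out = table[i];
        address_space_unmap(as, table, len, false, 0);
        return true;
    }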