From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Thu, 20 Aug 2009 16:58:03 +0200
From: Christoph Hellwig
Message-ID: <20090820145803.GA23578@lst.de>
Subject: [Qemu-devel] [PATCH 0/2] native Linux AIO support revisited
To: qemu-devel@nongnu.org

This patchset introduces support for native Linux AIO.

The first patch just refactors the existing thread-pool-based AIO emulation to have a cleaner layering, which allows the native AIO support to be implemented more easily.

The second patch introduces real native Linux AIO support, although due to limitations in the kernel implementation we can only use it for cache=none. It is vaguely based on Anthony's earlier patches, but due to the refactoring in the first patch it is much simpler. Instead of trying to fit into the model of the POSIX AIO API we directly integrate into the raw-posix code with a very lean interface (see the first patch for a more detailed explanation). That also means we can register the AIO completion eventfd directly with the qemu poll handler instead of needing an additional indirection.

The AIO code performs slightly better than the thread pool on most workloads I've thrown at it, and uses a lot less CPU time for it:

iozone -s 1024m -r $num -I -f /dev/sdb, output is in kB/s:

            write 16k  read 16k  write 64k  read 64k  write 256k  read 256k
  native        39133     75462     100980    156169      133642     168343
  qemu          29998     48334      79870    116393      133090     161360
  qemu+aio      32151     52902      82513    123893      133767     164113

dd if=/dev/zero of=$dev bs=20M oflag=direct count=400
dd if=$dev of=/dev/zero bs=20M iflag=direct count=400

output is in MB/s:

            write  read
  native      116   123
  qemu        116   100
  qemu+aio    116   121

For all of this the AIO code used significantly less CPU time (no comparison to native due to VM startup overhead and other issues):

            real         user        sys
  qemu      25m45.885s   1m36.422s   1m49.394s
  qemu+aio  25m36.950s   1m14.178s   1m13.179s

Note that the results vary quite a bit from run to run, so qemu+aio being faster in one of the tests above shouldn't mean too much; it has also been minimally slower in some. From various runs I would say that for larger block sizes we meet native performance, a little bit sooner with AIO and a little bit later without.

All these results are on a raw host device and using virtio. With image files on a filesystem there are potential blocking points in the AIO implementation. Those are relatively small or non-existent on already allocated files (and at least for XFS that includes preallocated files), but for sparse files they include waiting for disk I/O during allocations and need to be avoided to not kill performance. All the results also already include the MSI support for virtio-blk, btw.
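To illustrate the completion path described above, here is a minimal standalone sketch of Linux native AIO with eventfd-based completion notification. This is not code from the patches: the device path, the single 4k read and the blocking read on the eventfd are illustrative assumptions (in qemu the eventfd would instead be registered with the poll handler). It needs libaio and links with -laio.

/* aio-eventfd-sketch.c: not from the patches, just the basic mechanism.
 * Build with: gcc -D_GNU_SOURCE aio-eventfd-sketch.c -laio */
#include <libaio.h>
#include <sys/eventfd.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    io_context_t ctx = 0;
    if (io_setup(128, &ctx) < 0) {
        fprintf(stderr, "io_setup failed\n");
        return 1;
    }

    int efd = eventfd(0, 0);                          /* completion notification fd */
    int fd = open("/dev/sdb", O_RDONLY | O_DIRECT);   /* cache=none-style access */
    if (efd < 0 || fd < 0) {
        perror("eventfd/open");
        return 1;
    }

    void *buf;
    if (posix_memalign(&buf, 512, 4096))              /* O_DIRECT needs aligned buffers */
        return 1;

    struct iocb cb;
    struct iocb *cbs[1] = { &cb };
    io_prep_pread(&cb, fd, buf, 4096, 0);             /* one 4k read at offset 0 */
    io_set_eventfd(&cb, efd);                         /* signal efd on completion */

    if (io_submit(ctx, 1, cbs) != 1) {
        fprintf(stderr, "io_submit failed\n");
        return 1;
    }

    /* qemu would register efd with its poll handler; here we just block on it. */
    uint64_t nready;
    if (read(efd, &nready, sizeof(nready)) != sizeof(nready))
        return 1;

    struct io_event events[1];
    io_getevents(ctx, 1, 1, events, NULL);            /* reap the completed request */
    printf("read returned %ld bytes\n", (long)events[0].res);

    io_destroy(ctx);
    return 0;
}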
Based on this I would recommend including this patch, but not using it by default for now. After some testing I would suggest enabling it by default for host devices and investigating a way to make it easily usable for files, possibly including some kernel support to tell us which files are "safe".

These patches require my patch to make pthreads mandatory to be applied first, which is already in Anthony's queue. If you want to use them with qemu-kvm you also need to back out the compatfd changes to raw-block.c first.