From mboxrd@z Thu Jan  1 00:00:00 1970
From: Dor Laor
Subject: Re: bad O_DIRECT read and write performance with small block sizes with virtio
Date: Tue, 03 Aug 2010 09:35:23 +0300
Message-ID: <4C57B8AB.5030703@redhat.com>
References: <1280769301.11871.10.camel@dogen> <4C570470.9000803@codemonkey.ws>
Reply-To: dlaor@redhat.com
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: John Leach, kvm@vger.kernel.org, Stefan Hajnoczi, Anthony Liguori
To: Stefan Hajnoczi
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:55606 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755210Ab0HCGfe
 (ORCPT ); Tue, 3 Aug 2010 02:35:34 -0400
In-Reply-To:
Sender: kvm-owner@vger.kernel.org
List-ID:

On 08/02/2010 11:50 PM, Stefan Hajnoczi wrote:
> On Mon, Aug 2, 2010 at 6:46 PM, Anthony Liguori wrote:
>> On 08/02/2010 12:15 PM, John Leach wrote:
>>> Hi,
>>>
>>> I've come across a problem with read and write disk IO performance
>>> when using O_DIRECT from within a kvm guest.  With O_DIRECT, reads
>>> and writes are much slower at smaller block sizes.  Depending on the
>>> block size used, I've seen them run 10 times slower.
>>>
>>> For example, with an 8k block size, reading directly from /dev/vdb
>>> without O_DIRECT I see 750 MB/s, but with O_DIRECT I see 79 MB/s.
>>>
>>> As a comparison, reading in O_DIRECT mode in 8k blocks directly from
>>> the backend device on the host gives 2.3 GB/s.  Reading in O_DIRECT
>>> mode from a Xen guest on the same hardware manages 263 MB/s.
>>
>> Stefan has a few fixes for this behavior that help a lot.  One of
>> them (avoiding memset) is already upstream, but not in 0.12.x.
>>
>> The other two are not done yet but should be on the ML in the next
>> couple of weeks.  They involve using ioeventfd for notification and
>> unlocking the block queue lock while doing a kick notification.
>
> Thanks for mentioning those patches.
> The ioeventfd patch will be sent this week; I'm checking that
> migration works correctly and then need to check that vhost-net still
> works.
>
>>> Writing is affected in the same way, and exhibits the same behaviour
>>> with O_SYNC too.
>>>
>>> Watching with vmstat on the host, I see the same number of blocks
>>> being read, but about 14 times the number of context switches in
>>> O_DIRECT mode (4500 cs vs. 63000 cs) and a little more cpu usage.
>>>
>>> The device I'm writing to is a device-mapper zero device that
>>> generates zeros on read and throws away writes.  You can set it up
>>> at /dev/mapper/zero like this:
>>>
>>> echo "0 21474836480 zero" | dmsetup create zero
>>>
>>> My libvirt config for the disk is:
>>>
>>>     function='0x0'/>
>>>
>>> which translates to the kvm args:
>>>
>>> -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
>>> -drive file=/dev/mapper/zero,if=none,id=drive-virtio-disk1,cache=none

Adding aio=native to the -drive options, and changing the io scheduler
on the host to deadline, should help as well.

>>> I'm testing with dd:
>>>
>>> dd if=/dev/vdb of=/dev/null bs=8k iflag=direct
>>>
>>> As a side note, as you increase the block size, read performance in
>>> O_DIRECT mode starts to overtake non-O_DIRECT reads (from about a
>>> 150k block size).  By a 550k block size I'm seeing 1 GB/s reads with
>>> O_DIRECT and 770 MB/s without.
>
> Can you take QEMU out of the picture and run the same test on the
> host, against the backing device:
>
> dd if=/dev/mapper/zero of=/dev/null bs=8k iflag=direct
> vs
> dd if=/dev/mapper/zero of=/dev/null bs=8k
>
> This isn't quite the same, because QEMU will use a helper thread doing
> preadv; I'm not sure which syscall dd will use.
>
> It should be close enough to determine whether QEMU and device
> emulation are involved at all, though, or whether these differences
> are due to the host kernel code path down to the device-mapper zero
> device being different for normal vs. O_DIRECT reads.
>
> Stefan
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
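The host-side A/B test Stefan suggests can be scripted so the two dd
runs also report the context switches they cause (the number John saw
jump 14x).  A rough sketch; the device path, block size, and count are
placeholders, and the ctxt counter in /proc/stat is system-wide, so run
it on an otherwise idle host:

```shell
#!/bin/sh
# Compare buffered vs O_DIRECT reads of the same device and report the
# system-wide context switches each run causes.
DEV="${1:-/dev/mapper/zero}"   # placeholder device path
BS="${2:-8k}"
COUNT="${3:-131072}"           # 8k * 131072 = 1 GiB per run

# Read the cumulative context-switch counter from /proc/stat.
ctxt() { awk '/^ctxt /{print $2}' /proc/stat; }

run() {  # $1 = label, $2 = extra dd flag (empty for buffered)
    before=$(ctxt)
    dd if="$DEV" of=/dev/null bs="$BS" count="$COUNT" $2 2>&1 | tail -1
    after=$(ctxt)
    echo "$1: $((after - before)) context switches"
}

run buffered
run O_DIRECT iflag=direct
```

Diffing the two "context switches" lines should show whether the extra
switching John measured with vmstat already happens on the bare host
path, or only appears once QEMU is in the loop.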