From: Dor Laor <dlaor@redhat.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: John Leach <john@brightbox.co.uk>,
kvm@vger.kernel.org,
Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>,
Anthony Liguori <anthony@codemonkey.ws>
Subject: Re: bad O_DIRECT read and write performance with small block sizes with virtio
Date: Tue, 03 Aug 2010 09:35:23 +0300 [thread overview]
Message-ID: <4C57B8AB.5030703@redhat.com> (raw)
In-Reply-To: <AANLkTin-Vfa_ZT00m+uUegK0WKMvk+oUiqPmVYwSOJxZ@mail.gmail.com>
On 08/02/2010 11:50 PM, Stefan Hajnoczi wrote:
> On Mon, Aug 2, 2010 at 6:46 PM, Anthony Liguori<anthony@codemonkey.ws> wrote:
>> On 08/02/2010 12:15 PM, John Leach wrote:
>>>
>>> Hi,
>>>
>>> I've come across a problem with read and write disk IO performance when
>>> using O_DIRECT from within a kvm guest. With O_DIRECT, reads and writes
>>> are much slower with smaller block sizes. Depending on the block size
>>> used, I've seen reads run up to 10 times slower.
>>>
>>> For example, with an 8k block size, reading directly from /dev/vdb
>>> without O_DIRECT I see 750 MB/s, but with O_DIRECT I see 79 MB/s.
>>>
>>> As a comparison, reading in O_DIRECT mode in 8k blocks directly from the
>>> backend device on the host gives 2.3 GB/s. Reading in O_DIRECT mode
>>> from a xen guest on the same hardware manages 263 MB/s.
>>>
>>
>> Stefan has a few fixes for this behavior that help a lot. One of them
>> (avoiding memset) is already upstream but not in 0.12.x.
>>
>> The other two are not done yet but should be on the ML in the next couple
>> weeks. They involve using ioeventfd for notification and unlocking the
>> block queue lock while doing a kick notification.
>
> Thanks for mentioning those patches. The ioeventfd patch will be sent
> this week, I'm checking that migration works correctly and then need
> to check that vhost-net still works.
>
>>> Writing is affected in the same way, and exhibits the same behaviour
>>> with O_SYNC too.
>>>
>>> Watching with vmstat on the host, I see the same number of blocks being
>>> read, but about 14 times the number of context switches in O_DIRECT mode
>>> (4500 cs vs. 63000 cs) and a little more cpu usage.
>>>
>>> The device I'm writing to is a device-mapper zero device that generates
>>> zeros on read and throws away writes, you can set it up
>>> at /dev/mapper/zero like this:
>>>
>>> echo "0 21474836480 zero" | dmsetup create zero
>>>
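For reference, the dm-zero target above can be created and sanity-checked like this (a sketch, not from the original report; it requires root and the dm_zero module, and the verification and cleanup lines are additions):

```shell
# Create the 10 TiB zero target from the report: 21474836480 sectors of
# 512 bytes that generate zeros on read and discard writes.
echo "0 21474836480 zero" | dmsetup create zero

# Sanity check: the first 4 KiB read back should be all zeros.
dd if=/dev/mapper/zero bs=4k count=1 2>/dev/null | cmp -n 4096 - /dev/zero \
    && echo "reads back zeros"

dmsetup remove zero   # clean up when done
```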
>>> My libvirt config for the disk is:
>>>
>>> <disk type='block' device='disk'>
>>> <driver cache='none'/>
>>> <source dev='/dev/mapper/zero'/>
>>> <target dev='vdb' bus='virtio'/>
>>> <address type='pci' domain='0x0000' bus='0x00' slot='0x06'
>>> function='0x0'/>
>>> </disk>
>>>
>>> which translates to the kvm arg:
>>>
>>> -device
>>> virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
>>> -drive file=/dev/mapper/zero,if=none,id=drive-virtio-disk1,cache=none
Using aio=native and switching the host's I/O scheduler to deadline should
help as well.
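Concretely, that tuning could look like the sketch below ("sda" and the qemu binary name are placeholders; note that aio=native only works together with cache=none):

```shell
# Switch the backing device's I/O scheduler to deadline (root required;
# "sda" is a placeholder for the real backing disk).
echo deadline > /sys/block/sda/queue/scheduler

# Add aio=native to the -drive option; it requires cache=none.
qemu-kvm \
  -drive file=/dev/mapper/zero,if=none,id=drive-virtio-disk1,cache=none,aio=native \
  -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk1,id=virtio-disk1
```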
>>>
>>> I'm testing with dd:
>>>
>>> dd if=/dev/vdb of=/dev/null bs=8k iflag=direct
>>>
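The same comparison can be scripted against an ordinary file, which is handy when no scratch block device is available (a sketch; the file size is arbitrary, and O_DIRECT needs a filesystem that supports it, so not tmpfs):

```shell
#!/bin/sh
# Compare buffered vs O_DIRECT read throughput with dd on a scratch file.
set -e
f=$(mktemp ./ddtest.XXXXXX)
dd if=/dev/zero of="$f" bs=1M count=64 2>/dev/null
sync

echo "buffered:"
dd if="$f" of=/dev/null bs=8k 2>&1 | tail -n 1

echo "O_DIRECT:"
dd if="$f" of=/dev/null bs=8k iflag=direct 2>&1 | tail -n 1

rm -f "$f"
```

dd prints its throughput summary on stderr, hence the `2>&1 | tail -n 1` to capture just that line.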
>>> As a side note, as you increase the block size read performance in
>>> O_DIRECT mode starts to overtake non O_DIRECT mode reads (from about
>>> 150k block size). By 550k block size I'm seeing 1 GB/s reads with
>>> O_DIRECT and 770 MB/s without.
>
> Can you take QEMU out of the picture and run the same test on the host:
>
> dd if=/dev/mapper/zero of=/dev/null bs=8k iflag=direct
> vs
> dd if=/dev/mapper/zero of=/dev/null bs=8k
>
> This isn't quite the same because QEMU will use a helper thread doing
> preadv. I'm not sure what syscall dd will use.
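One way to see which syscalls dd actually issues is to run it under strace (a sketch; assumes strace is installed, and reads /dev/zero so it runs unprivileged):

```shell
# Trace the open and read syscalls a small dd run makes.
strace -e trace=openat,read \
    dd if=/dev/zero of=/dev/null bs=8k count=4 2>&1 | grep -E 'openat|read\('
```

Adding iflag=direct to the dd invocation should make the O_DIRECT flag visible in the traced openat call.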
>
> It should be close enough to determine whether QEMU and device
> emulation are involved at all though, or whether these differences are
> due to the host kernel code path down to the device mapper zero device
> being different for normal vs O_DIRECT.
>
> Stefan
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
Thread overview: 10+ messages
2010-08-02 17:15 bad O_DIRECT read and write performance with small block sizes with virtio John Leach
2010-08-02 17:46 ` Anthony Liguori
2010-08-02 20:50 ` Stefan Hajnoczi
2010-08-03 6:35 ` Dor Laor [this message]
2010-08-03 14:52 ` John Leach
2010-08-03 14:40 ` John Leach
2010-08-03 14:44 ` Avi Kivity
2010-08-03 14:57 ` John Leach
2010-08-03 16:14 ` Avi Kivity
2010-08-04 12:38 ` Stefan Hajnoczi