From: Dor Laor <dlaor@redhat.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: John Leach <john@brightbox.co.uk>,
	kvm@vger.kernel.org,
	Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>,
	Anthony Liguori <anthony@codemonkey.ws>
Subject: Re: bad O_DIRECT read and write performance with small block sizes with virtio
Date: Tue, 03 Aug 2010 09:35:23 +0300
Message-ID: <4C57B8AB.5030703@redhat.com>
In-Reply-To: <AANLkTin-Vfa_ZT00m+uUegK0WKMvk+oUiqPmVYwSOJxZ@mail.gmail.com>

On 08/02/2010 11:50 PM, Stefan Hajnoczi wrote:
> On Mon, Aug 2, 2010 at 6:46 PM, Anthony Liguori <anthony@codemonkey.ws> wrote:
>> On 08/02/2010 12:15 PM, John Leach wrote:
>>>
>>> Hi,
>>>
>>> I've come across a problem with read and write disk IO performance when
>>> using O_DIRECT from within a kvm guest.  With O_DIRECT, reads and writes
>>> are much slower at smaller block sizes.  Depending on the block size
>>> used, I've seen them run 10 times slower.
>>>
>>> For example, with an 8k block size, reading directly from /dev/vdb
>>> without O_DIRECT I see 750 MB/s, but with O_DIRECT I see 79 MB/s.
>>>
>>> As a comparison, reading in O_DIRECT mode in 8k blocks directly from the
>>> backend device on the host gives 2.3 GB/s.  Reading in O_DIRECT mode
>>> from a xen guest on the same hardware manages 263 MB/s.
>>>
>>
>> Stefan has a few fixes for this behavior that help a lot.  One of them
>> (avoiding memset) is already upstream but not in 0.12.x.
>>
>> The other two are not done yet but should be on the ML in the next couple
>> of weeks.  They involve using ioeventfd for notification and unlocking the
>> block queue lock while doing a kick notification.
>
> Thanks for mentioning those patches.  The ioeventfd patch will be sent
> this week; I'm checking that migration works correctly and then need
> to check that vhost-net still works.
>
>>> Writing is affected in the same way, and exhibits the same behaviour
>>> with O_SYNC too.
>>>
>>> Watching with vmstat on the host, I see the same number of blocks being
>>> read, but about 14 times the number of context switches in O_DIRECT mode
>>> (4500 cs vs. 63000 cs) and a little more CPU usage.
>>>
>>> The device I'm writing to is a device-mapper zero device that generates
>>> zeros on read and throws away writes, you can set it up
>>> at /dev/mapper/zero like this:
>>>
>>> echo "0 21474836480 zero" | dmsetup create zero
>>>
>>> My libvirt config for the disk is:
>>>
>>> <disk type='block' device='disk'>
>>>    <driver cache='none'/>
>>>    <source dev='/dev/mapper/zero'/>
>>>    <target dev='vdb' bus='virtio'/>
>>>    <address type='pci' domain='0x0000' bus='0x00' slot='0x06'
>>> function='0x0'/>
>>> </disk>
>>>
>>> which translates to the kvm arg:
>>>
>>> -device
>>> virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
>>> -drive file=/dev/mapper/zero,if=none,id=drive-virtio-disk1,cache=none

Using aio=native and changing the host's I/O scheduler to deadline should
help as well.
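
For example (a sketch: the -drive line mirrors the one quoted above, and
sdX stands in for a real backing disk; /dev/mapper/zero itself has no
queue to schedule, so the deadline hint only matters for actual disks):

  -drive file=/dev/mapper/zero,if=none,id=drive-virtio-disk1,cache=none,aio=native

  # on the host, for the backing device
  echo deadline > /sys/block/sdX/queue/scheduler

Note that aio=native needs O_DIRECT on the host side, which is exactly
what cache=none provides.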

>>>
>>> I'm testing with dd:
>>>
>>> dd if=/dev/vdb of=/dev/null bs=8k iflag=direct
>>>
>>> As a side note, as you increase the block size, read performance in
>>> O_DIRECT mode starts to overtake non-O_DIRECT reads (from about a
>>> 150k block size). At a 550k block size I'm seeing 1 GB/s reads with
>>> O_DIRECT and 770 MB/s without.
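
A quick way to chart that crossover is to sweep block sizes, keeping the
total amount read roughly constant (a sketch; the sizes mirror the ones
mentioned above, and each pair of runs compares direct vs. buffered):

  for bs in 8 64 150 550; do
    echo "bs=${bs}k"
    dd if=/dev/vdb of=/dev/null bs=${bs}k count=$((2097152 / bs)) iflag=direct 2>&1 | tail -n1
    dd if=/dev/vdb of=/dev/null bs=${bs}k count=$((2097152 / bs)) 2>&1 | tail -n1
  done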
>
> Can you take QEMU out of the picture and run the same test on the host:
>
> dd if=/dev/mapper/zero of=/dev/null bs=8k iflag=direct
> vs
> dd if=/dev/mapper/zero of=/dev/null bs=8k
>
> This isn't quite the same because QEMU will use a helper thread doing
> preadv.  I'm not sure what syscall dd will use.
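
strace can settle that; a sketch, run against the host device with a
small count so the trace stays short:

  strace -e trace=read,preadv dd if=/dev/mapper/zero of=/dev/null bs=8k iflag=direct count=4

coreutils dd is expected to show plain read(2) calls here.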
>
> It should be close enough to determine whether QEMU and device
> emulation are involved at all though, or whether these differences are
> due to the host kernel code path down to the device mapper zero device
> being different for normal vs O_DIRECT.
>
> Stefan

