From: John Leach <john@brightbox.co.uk>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: kvm@vger.kernel.org, Anthony Liguori <anthony@codemonkey.ws>
Subject: Re: bad O_DIRECT read and write performance with small block sizes with virtio
Date: Tue, 03 Aug 2010 15:40:24 +0100
Message-ID: <1280846424.13790.26.camel@dogen>
In-Reply-To: <AANLkTin-Vfa_ZT00m+uUegK0WKMvk+oUiqPmVYwSOJxZ@mail.gmail.com>
On Mon, 2010-08-02 at 21:50 +0100, Stefan Hajnoczi wrote:
> On Mon, Aug 2, 2010 at 6:46 PM, Anthony Liguori <anthony@codemonkey.ws> wrote:
> > On 08/02/2010 12:15 PM, John Leach wrote:
> >>
> >> Hi,
> >>
> >> I've come across a problem with read and write disk IO performance when
> >> using O_DIRECT from within a kvm guest. With O_DIRECT, reads and writes
> >> are much slower with smaller block sizes. Depending on the block size
> >> used, I've seen them up to 10 times slower.
> >>
> >> For example, with an 8k block size, reading directly from /dev/vdb
> >> without O_DIRECT I see 750 MB/s, but with O_DIRECT I see 79 MB/s.
> >>
> >> As a comparison, reading in O_DIRECT mode in 8k blocks directly from the
> >> backend device on the host gives 2.3 GB/s. Reading in O_DIRECT mode
> >> from a xen guest on the same hardware manages 263 MB/s.
> >>
> >
> > Stefan has a few fixes for this behavior that help a lot. One of them
> > (avoiding memset) is already upstream but not in 0.12.x.
Anthony, that patch is already applied in the RHEL6 package I've been
testing with - I've just manually confirmed that. Thanks though.
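For reference, one quick way to check which fixes a distro package carries,
assuming the changelog entry actually mentions the patch, is something like:

  rpm -q --changelog qemu-kvm | grep -i memset

Failing that, inspecting the patches shipped in the SRPM works too.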
> >
> > The other two are not done yet but should be on the ML in the next couple
> > weeks. They involve using ioeventfd for notification and unlocking the
> > block queue lock while doing a kick notification.
>
> Thanks for mentioning those patches. The ioeventfd patch will be sent
> this week. I'm checking that migration works correctly and then need
> to check that vhost-net still works.
I'll give them a test as soon as I can get hold of them, thanks Stefan!
> >> Writing is affected in the same way, and exhibits the same behaviour
> >> with O_SYNC too.
> >>
> >> Watching with vmstat on the host, I see the same number of blocks being
> >> read, but about 14 times the number of context switches in O_DIRECT mode
> >> (4500 cs vs. 63000 cs) and a little more cpu usage.
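Those figures came from watching vmstat on the host while the guest dd was
running, roughly:

  vmstat 1

and reading the cs column. If sysstat is installed, something like
"pidstat -w -p <qemu pid> 1" gives a per-process view of the same thing.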
> >>
> >> The device I'm writing to is a device-mapper zero device that generates
> >> zeros on read and throws away writes, you can set it up
> >> at /dev/mapper/zero like this:
> >>
> >> echo "0 21474836480 zero" | dmsetup create zero
> >>
> >> My libvirt config for the disk is:
> >>
> >> <disk type='block' device='disk'>
> >> <driver cache='none'/>
> >> <source dev='/dev/mapper/zero'/>
> >> <target dev='vdb' bus='virtio'/>
> >> <address type='pci' domain='0x0000' bus='0x00' slot='0x06'
> >> function='0x0'/>
> >> </disk>
> >>
> >> which translates to the kvm arg:
> >>
> >> -device
> >> virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk1,id=virtio-disk1
> >> -drive file=/dev/mapper/zero,if=none,id=drive-virtio-disk1,cache=none
> >>
> >> I'm testing with dd:
> >>
> >> dd if=/dev/vdb of=/dev/null bs=8k iflag=direct
> >>
> >> As a side note, as you increase the block size, read performance in
> >> O_DIRECT mode starts to overtake non-O_DIRECT mode reads (from about
> >> a 150k block size upwards). By a 550k block size I'm seeing 1 GB/s
> >> reads with O_DIRECT and 770 MB/s without.
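The block size sweep was nothing fancy, just a dd loop along these lines
(the counts are only illustrative, picked to keep run times sensible):

  for bs in 8k 64k 150k 550k; do
    dd if=/dev/vdb of=/dev/null bs=$bs count=100000 iflag=direct
  done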
>
> Can you take QEMU out of the picture and run the same test on the host:
>
> dd if=/dev/vdb of=/dev/null bs=8k iflag=direct
> vs
> dd if=/dev/vdb of=/dev/null bs=8k
>
> This isn't quite the same because QEMU will use a helper thread doing
> preadv. I'm not sure what syscall dd will use.
>
> It should be close enough to determine whether QEMU and device
> emulation are involved at all though, or whether these differences are
> due to the host kernel code path down to the device mapper zero device
> being different for normal vs O_DIRECT.
dd if=/dev/mapper/zero of=/dev/null bs=8k count=1000000 iflag=direct
8192000000 bytes (8.2 GB) copied, 3.46529 s, 2.4 GB/s
dd if=/dev/mapper/zero of=/dev/null bs=8k count=1000000
8192000000 bytes (8.2 GB) copied, 5.5741 s, 1.5 GB/s
dd is just using read.
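For anyone wanting to verify that, assuming strace is available on the host,
something like this shows the syscalls dd issues:

  strace -e trace=read,pread64,preadv dd if=/dev/mapper/zero of=/dev/null \
      bs=8k count=10 iflag=direct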
Thanks,
John.