* [Qemu-devel] extremely low IOPS performance of QCOW2 image format on an SSD RAID1
@ 2014-06-23 2:06 lihuiba
2014-06-23 3:01 ` Fam Zheng
0 siblings, 1 reply; 8+ messages in thread
From: lihuiba @ 2014-06-23 2:06 UTC (permalink / raw)
To: qemu-devel; +Cc: qiujian, mengcong
Hi, all
I'm using a qcow2 image stored on an SSD RAID1 (2 x Intel S3500), and I'm benchmarking the
system with fio. Although the throughput in the VM (with KVM and virtio enabled) is acceptable (67%
of the throughput in the host), the IOPS performance is extremely low ---- only 2% of the IOPS in the host.
I was initially using qemu-1.1.2, and I also tried qemu-1.7.1 for comparison. There was no significant
difference.
In contrast, a raw image and LVM perform very well. They usually achieve 90%+ of the host throughput and
60%+ of the host IOPS. So the problem must lie in the QCOW2 image format.
And I observed that, when I run the 4KB IOPS benchmark in the VM with a QCOW2 image, fio in the VM reports
it is reading at 9.x MB/s, while iostat in the host reports the SSD is being read at 150+ MB/s. So QEMU or QCOW2
must have amplified the amount of reading by nearly 16 times.
So, how can I fix or tune the performance issue of qcow2?
Thanks!
PS:
1. qemu parameters:
-enable-kvm -cpu qemu64 -rtc base=utc,clock=host,driftfix=none -usb -device usb-tablet -nodefaults -nodefconfig -no-kvm-pit-reinjection -global kvm-pit.lost_tick_policy=discard -machine pc,accel=kvm -vga std -k en-us -smp 8 -m 4096 -boot order=cdn -vnc :1 -drive file=$1,if=none,id=drive_0,cache=none,aio=native -device virtio-blk-pci,drive=drive_0,bus=pci.0,addr=0x5 -drive file=$2,if=none,id=drive_2,cache=none,aio=native -device virtio-blk-pci,drive=drive_2,bus=pci.0,addr=0x7
2. fio parameters for IOPS:
fio --filename=/dev/vdb --direct=1 --ioengine=libaio --iodepth 32 --thread --numjobs=1 --rw=randread --bs=4k --size=100% --runtime=60s --group_reporting --name=test
3. fio parameters for throughput:
fio --filename=/dev/vdb --direct=1 --ioengine=psync --thread --numjobs=3 --rw=randread --bs=1024k --size=100% --runtime=60s --name=randread --group_reporting --name=test
* Re: [Qemu-devel] extremely low IOPS performance of QCOW2 image format on an SSD RAID1
2014-06-23 2:06 [Qemu-devel] extremely low IOPS performance of QCOW2 image format on an SSD RAID1 lihuiba
@ 2014-06-23 3:01 ` Fam Zheng
2014-06-23 3:14 ` lihuiba
0 siblings, 1 reply; 8+ messages in thread
From: Fam Zheng @ 2014-06-23 3:01 UTC (permalink / raw)
To: lihuiba; +Cc: qiujian, qemu-devel, mengcong
On Mon, 06/23 10:06, lihuiba wrote:
> Hi, all
>
>
> I'm using a qcow2 image stored on an SSD RAID1 (2 x Intel S3500), and I'm benchmarking the
> system with fio. Although the throughput in the VM (with KVM and virtio enabled) is acceptable (67%
> of the throughput in the host), the IOPS performance is extremely low ---- only 2% of the IOPS in the host.
>
>
> I was initially using qemu-1.1.2, and I also tried qemu-1.7.1 for comparison. There was no significant
> difference.
>
>
> In contrast, a raw image and LVM perform very well. They usually achieve 90%+ of the host throughput and
> 60%+ of the host IOPS. So the problem must lie in the QCOW2 image format.
>
>
> And I observed that, when I run the 4KB IOPS benchmark in the VM with a QCOW2 image, fio in the VM reports
> it is reading at 9.x MB/s, while iostat in the host reports the SSD is being read at 150+ MB/s. So QEMU or QCOW2
> must have amplified the amount of reading by nearly 16 times.
>
>
> So, how can I fix or tune the performance issue of qcow2?
Did you prefill the image? Amplification could come from cluster allocation.
Fam
>
>
> Thanks!
>
>
>
>
> PS:
> 1. qemu parameters:
> -enable-kvm -cpu qemu64 -rtc base=utc,clock=host,driftfix=none -usb -device usb-tablet -nodefaults -nodefconfig -no-kvm-pit-reinjection -global kvm-pit.lost_tick_policy=discard -machine pc,accel=kvm -vga std -k en-us -smp 8 -m 4096 -boot order=cdn -vnc :1 -drive file=$1,if=none,id=drive_0,cache=none,aio=native -device virtio-blk-pci,drive=drive_0,bus=pci.0,addr=0x5 -drive file=$2,if=none,id=drive_2,cache=none,aio=native -device virtio-blk-pci,drive=drive_2,bus=pci.0,addr=0x7
>
>
> 2. fio parameters for IOPS:
> fio --filename=/dev/vdb --direct=1 --ioengine=libaio --iodepth 32 --thread --numjobs=1 --rw=randread --bs=4k --size=100% --runtime=60s --group_reporting --name=test
>
>
> 3. fio parameters for throughput:
> fio --filename=/dev/vdb --direct=1 --ioengine=psync --thread --numjobs=3 --rw=randread --bs=1024k --size=100% --runtime=60s --name=randread --group_reporting --name=test
>
>
>
* Re: [Qemu-devel] extremely low IOPS performance of QCOW2 image format on an SSD RAID1
2014-06-23 3:01 ` Fam Zheng
@ 2014-06-23 3:14 ` lihuiba
2014-06-23 3:22 ` Fam Zheng
2014-06-23 7:30 ` Stefan Hajnoczi
0 siblings, 2 replies; 8+ messages in thread
From: lihuiba @ 2014-06-23 3:14 UTC (permalink / raw)
To: Fam Zheng; +Cc: qiujian, qemu-devel, mengcong
>Did you prefill the image? Amplification could come from cluster allocation.
Yes!
I forgot to mention that I created the qcow2 image with 'preallocation=metadata', and I have allocated
the data blocks with dd inside the VM.
Creating image in host:
qemu-img create -f qcow2 -o preallocation=metadata test.qcow2 100G
Allocating the blocks in VM:
dd if=/dev/zero of=/dev/vdb bs=1M
where vdb is the target image.
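(A minimal alternative sketch, assuming a qemu-img new enough to support full preallocation
for qcow2 -- which the 1.x versions used in this thread may not be:)
# allocates both metadata and data clusters at creation time, so the in-guest dd pass is not needed
qemu-img create -f qcow2 -o preallocation=full test.qcow2 100G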
At 2014-06-23 11:01:20, "Fam Zheng" <famz@redhat.com> wrote:
>On Mon, 06/23 10:06, lihuiba wrote:
>> Hi, all
>>
>>
>> I'm using a qcow2 image stored on an SSD RAID1 (2 x Intel S3500), and I'm benchmarking the
>> system with fio. Although the throughput in the VM (with KVM and virtio enabled) is acceptable (67%
>> of the throughput in the host), the IOPS performance is extremely low ---- only 2% of the IOPS in the host.
>>
>>
>> I was initially using qemu-1.1.2, and I also tried qemu-1.7.1 for comparison. There was no significant
>> difference.
>>
>>
>> In contrast, a raw image and LVM perform very well. They usually achieve 90%+ of the host throughput and
>> 60%+ of the host IOPS. So the problem must lie in the QCOW2 image format.
>>
>>
>> And I observed that, when I run the 4KB IOPS benchmark in the VM with a QCOW2 image, fio in the VM reports
>> it is reading at 9.x MB/s, while iostat in the host reports the SSD is being read at 150+ MB/s. So QEMU or QCOW2
>> must have amplified the amount of reading by nearly 16 times.
>>
>>
>> So, how can I fix or tune the performance issue of qcow2?
>
>Did you prefill the image? Amplification could come from cluster allocation.
>
>Fam
>
>>
>>
>> Thanks!
>>
>>
>>
>>
>> PS:
>> 1. qemu parameters:
>> -enable-kvm -cpu qemu64 -rtc base=utc,clock=host,driftfix=none -usb -device usb-tablet -nodefaults -nodefconfig -no-kvm-pit-reinjection -global kvm-pit.lost_tick_policy=discard -machine pc,accel=kvm -vga std -k en-us -smp 8 -m 4096 -boot order=cdn -vnc :1 -drive file=$1,if=none,id=drive_0,cache=none,aio=native -device virtio-blk-pci,drive=drive_0,bus=pci.0,addr=0x5 -drive file=$2,if=none,id=drive_2,cache=none,aio=native -device virtio-blk-pci,drive=drive_2,bus=pci.0,addr=0x7
>>
>>
>> 2. fio parameters for IOPS:
>> fio --filename=/dev/vdb --direct=1 --ioengine=libaio --iodepth 32 --thread --numjobs=1 --rw=randread --bs=4k --size=100% --runtime=60s --group_reporting --name=test
>>
>>
>> 3. fio parameters for throughput:
>> fio --filename=/dev/vdb --direct=1 --ioengine=psync --thread --numjobs=3 --rw=randread --bs=1024k --size=100% --runtime=60s --name=randread --group_reporting --name=test
>>
>>
>>
>
* Re: [Qemu-devel] extremely low IOPS performance of QCOW2 image format on an SSD RAID1
2014-06-23 3:14 ` lihuiba
@ 2014-06-23 3:22 ` Fam Zheng
2014-06-23 6:20 ` lihuiba
2014-06-23 7:30 ` Stefan Hajnoczi
1 sibling, 1 reply; 8+ messages in thread
From: Fam Zheng @ 2014-06-23 3:22 UTC (permalink / raw)
To: lihuiba; +Cc: kwolf, stefanha, qiujian, qemu-devel, mengcong
Cc'ing more qcow2 experts.
On Mon, 06/23 11:14, lihuiba wrote:
> >Did you prefill the image? Amplification could come from cluster allocation.
> Yes!
> I forgot to mention that I created the qcow2 image with 'preallocation=metadata', and I have allocated
> the data blocks with dd inside the VM.
>
>
> Creating image in host:
> qemu-img create -f qcow2 -o preallocation=metadata test.qcow2 100G
>
>
> Allocating the blocks in VM:
> dd if=/dev/zero of=/dev/vdb bs=1M
> where vdb is the target image.
>
>
>
>
>
>
>
> At 2014-06-23 11:01:20, "Fam Zheng" <famz@redhat.com> wrote:
> >On Mon, 06/23 10:06, lihuiba wrote:
> >> Hi, all
> >>
> >>
> >> I'm using a qcow2 image stored on an SSD RAID1 (2 x Intel S3500), and I'm benchmarking the
> >> system with fio. Although the throughput in the VM (with KVM and virtio enabled) is acceptable (67%
> >> of the throughput in the host), the IOPS performance is extremely low ---- only 2% of the IOPS in the host.
> >>
> >>
> >> I was initially using qemu-1.1.2, and I also tried qemu-1.7.1 for comparison. There was no significant
> >> difference.
> >>
> >>
> >> In contrast, a raw image and LVM perform very well. They usually achieve 90%+ of the host throughput and
> >> 60%+ of the host IOPS. So the problem must lie in the QCOW2 image format.
> >>
> >>
> >> And I observed that, when I run the 4KB IOPS benchmark in the VM with a QCOW2 image, fio in the VM reports
> >> it is reading at 9.x MB/s, while iostat in the host reports the SSD is being read at 150+ MB/s. So QEMU or QCOW2
> >> must have amplified the amount of reading by nearly 16 times.
> >>
> >>
> >> So, how can I fix or tune the performance issue of qcow2?
> >
> >Did you prefill the image? Amplification could come from cluster allocation.
> >
> >Fam
> >
> >>
> >>
> >> Thanks!
> >>
> >>
> >>
> >>
> >> PS:
> >> 1. qemu parameters:
> >> -enable-kvm -cpu qemu64 -rtc base=utc,clock=host,driftfix=none -usb -device usb-tablet -nodefaults -nodefconfig -no-kvm-pit-reinjection -global kvm-pit.lost_tick_policy=discard -machine pc,accel=kvm -vga std -k en-us -smp 8 -m 4096 -boot order=cdn -vnc :1 -drive file=$1,if=none,id=drive_0,cache=none,aio=native -device virtio-blk-pci,drive=drive_0,bus=pci.0,addr=0x5 -drive file=$2,if=none,id=drive_2,cache=none,aio=native -device virtio-blk-pci,drive=drive_2,bus=pci.0,addr=0x7
> >>
> >>
> >> 2. fio parameters for IOPS:
> >> fio --filename=/dev/vdb --direct=1 --ioengine=libaio --iodepth 32 --thread --numjobs=1 --rw=randread --bs=4k --size=100% --runtime=60s --group_reporting --name=test
> >>
> >>
> >> 3. fio parameters for throughput:
> >> fio --filename=/dev/vdb --direct=1 --ioengine=psync --thread --numjobs=3 --rw=randread --bs=1024k --size=100% --runtime=60s --name=randread --group_reporting --name=test
> >>
> >>
> >>
> >
* Re: [Qemu-devel] extremely low IOPS performance of QCOW2 image format on an SSD RAID1
2014-06-23 3:22 ` Fam Zheng
@ 2014-06-23 6:20 ` lihuiba
2014-06-23 8:25 ` Stefan Hajnoczi
0 siblings, 1 reply; 8+ messages in thread
From: lihuiba @ 2014-06-23 6:20 UTC (permalink / raw)
To: Fam Zheng; +Cc: kwolf, qiujian, qemu-devel, stefanha, mengcong
I think I have found the reason:
There's a cache in qemu that accelerates the translation of virtual LBAs to cluster offsets in the qcow2 image.
The cache has a fixed size of 16 x 8192 = 128K entries in my configuration, which corresponds to an 8GB (128K * 64KB)
mapping size. So when the "working set" of fio exceeds 8GB, the translation degrades to reading the
L2 table from disk, and the performance becomes extremely low.
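A quick sanity check of the 8GB figure (a sketch; the numbers assume the defaults described
above: 16 cached L2 tables, 64KB clusters, 8-byte L2 entries):
# entries per L2 table = 65536 / 8 = 8192; cached entries = 16 * 8192 = 131072
# coverage = 131072 entries * 64KB mapped per entry = 8 GiB
echo $(( 16 * (65536 / 8) * 65536 / (1024 * 1024 * 1024) ))   # prints 8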
At 2014-06-23 11:22:37, "Fam Zheng" <famz@redhat.com> wrote:
>Cc'ing more qcow2 experts.
>
>On Mon, 06/23 11:14, lihuiba wrote:
>> >Did you prefill the image? Amplification could come from cluster allocation.
>> Yes!
>> I forgot to mention that I created the qcow2 image with 'preallocation=metadata', and I have allocated
>> the data blocks with dd inside the VM.
>>
>>
>> Creating image in host:
>> qemu-img create -f qcow2 -o preallocation=metadata test.qcow2 100G
>>
>>
>> Allocating the blocks in VM:
>> dd if=/dev/zero of=/dev/vdb bs=1M
>> where vdb is the target image.
>>
>>
>>
>>
>>
>>
>>
>> At 2014-06-23 11:01:20, "Fam Zheng" <famz@redhat.com> wrote:
>> >On Mon, 06/23 10:06, lihuiba wrote:
>> >> Hi, all
>> >>
>> >>
>> >> I'm using a qcow2 image stored on an SSD RAID1 (2 x Intel S3500), and I'm benchmarking the
>> >> system with fio. Although the throughput in the VM (with KVM and virtio enabled) is acceptable (67%
>> >> of the throughput in the host), the IOPS performance is extremely low ---- only 2% of the IOPS in the host.
>> >>
>> >>
>> >> I was initially using qemu-1.1.2, and I also tried qemu-1.7.1 for comparison. There was no significant
>> >> difference.
>> >>
>> >>
>> >> In contrast, a raw image and LVM perform very well. They usually achieve 90%+ of the host throughput and
>> >> 60%+ of the host IOPS. So the problem must lie in the QCOW2 image format.
>> >>
>> >>
>> >> And I observed that, when I run the 4KB IOPS benchmark in the VM with a QCOW2 image, fio in the VM reports
>> >> it is reading at 9.x MB/s, while iostat in the host reports the SSD is being read at 150+ MB/s. So QEMU or QCOW2
>> >> must have amplified the amount of reading by nearly 16 times.
>> >>
>> >>
>> >> So, how can I fix or tune the performance issue of qcow2?
>> >
>> >Did you prefill the image? Amplification could come from cluster allocation.
>> >
>> >Fam
>> >
>> >>
>> >>
>> >> Thanks!
>> >>
>> >>
>> >>
>> >>
>> >> PS:
>> >> 1. qemu parameters:
>> >> -enable-kvm -cpu qemu64 -rtc base=utc,clock=host,driftfix=none -usb -device usb-tablet -nodefaults -nodefconfig -no-kvm-pit-reinjection -global kvm-pit.lost_tick_policy=discard -machine pc,accel=kvm -vga std -k en-us -smp 8 -m 4096 -boot order=cdn -vnc :1 -drive file=$1,if=none,id=drive_0,cache=none,aio=native -device virtio-blk-pci,drive=drive_0,bus=pci.0,addr=0x5 -drive file=$2,if=none,id=drive_2,cache=none,aio=native -device virtio-blk-pci,drive=drive_2,bus=pci.0,addr=0x7
>> >>
>> >>
>> >> 2. fio parameters for IOPS:
>> >> fio --filename=/dev/vdb --direct=1 --ioengine=libaio --iodepth 32 --thread --numjobs=1 --rw=randread --bs=4k --size=100% --runtime=60s --group_reporting --name=test
>> >>
>> >>
>> >> 3. fio parameters for throughput:
>> >> fio --filename=/dev/vdb --direct=1 --ioengine=psync --thread --numjobs=3 --rw=randread --bs=1024k --size=100% --runtime=60s --name=randread --group_reporting --name=test
>> >>
>> >>
>> >>
>> >
>
* Re: [Qemu-devel] extremely low IOPS performance of QCOW2 image format on an SSD RAID1
2014-06-23 3:14 ` lihuiba
2014-06-23 3:22 ` Fam Zheng
@ 2014-06-23 7:30 ` Stefan Hajnoczi
1 sibling, 0 replies; 8+ messages in thread
From: Stefan Hajnoczi @ 2014-06-23 7:30 UTC (permalink / raw)
To: lihuiba; +Cc: qiujian, Fam Zheng, qemu-devel, mengcong
On Mon, Jun 23, 2014 at 11:14:25AM +0800, lihuiba wrote:
> >Did you prefill the image? Amplification could come from cluster allocation.
> Yes!
> I forgot to mention that I created the qcow2 image with 'preallocation=metadata', and I have allocated
> the data blocks with dd inside the VM.
>
>
> Creating image in host:
> qemu-img create -f qcow2 -o preallocation=metadata test.qcow2 100G
>
>
> Allocating the blocks in VM:
> dd if=/dev/zero of=/dev/vdb bs=1M
> where vdb is the target image.
If you used dd inside the guest to preallocate the entire image, then
both benchmark read and write requests work similarly to raw image
access (the qcow2 metadata is cached in RAM and no metadata updates are
necessary, so accesses are very similar to raw).
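(One way to double-check that the dd pass allocated everything is to inspect the allocation
map from the host -- a sketch, assuming a qemu-img recent enough to provide the 'map'
subcommand, i.e. 1.7 or newer:)
# lists which guest ranges are backed by data in the image file; a fully pre-filled
# image should show the whole 100G mapped, with no unallocated gaps
qemu-img map test.qcow2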
Did you use the same file system on the host when comparing qcow2 to
raw image files?
Please post the full QEMU command-line and fio job files.
Stefan
* Re: [Qemu-devel] extremely low IOPS performance of QCOW2 image format on an SSD RAID1
2014-06-23 6:20 ` lihuiba
@ 2014-06-23 8:25 ` Stefan Hajnoczi
[not found] ` <225b854d.98bc.146ccc39a83.Coremail.magazine.lihuiba@163.com>
0 siblings, 1 reply; 8+ messages in thread
From: Stefan Hajnoczi @ 2014-06-23 8:25 UTC (permalink / raw)
To: lihuiba; +Cc: kwolf, qiujian, Fam Zheng, qemu-devel, mengcong
On Mon, Jun 23, 2014 at 02:20:25PM +0800, lihuiba wrote:
> I think I have found the reason:
> There's a cache in qemu that accelerates the translation of virtual LBAs to cluster offsets in the qcow2 image.
> The cache has a fixed size of 16 x 8192 = 128K entries in my configuration, which corresponds to an 8GB (128K * 64KB)
> mapping size. So when the "working set" of fio exceeds 8GB, the translation degrades to reading the
> L2 table from disk, and the performance becomes extremely low.
Can you confirm that making L2_CACHE_SIZE much bigger solves the
problem?
You also have the option of specifying the cluster size when creating
the qcow2 image file. A larger cluster size reduces the amount of
metadata overhead and therefore increases cache hits.
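(A minimal sketch of the cluster-size option, assuming the qemu-img syntax used elsewhere in
this thread; with 1MB clusters the 16 cached L2 tables cover 16 * (1MB / 8) * 1MB = 2TB of
guest data instead of 8GB:)
# larger clusters mean larger L2 tables, so each cached table maps far more guest data
qemu-img create -f qcow2 -o cluster_size=1M test.qcow2 100G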
Stefan
* Re: [Qemu-devel] extremely low IOPS performance of QCOW2 image format on an SSD RAID1
[not found] ` <225b854d.98bc.146ccc39a83.Coremail.magazine.lihuiba@163.com>
@ 2014-06-24 10:15 ` Kevin Wolf
0 siblings, 0 replies; 8+ messages in thread
From: Kevin Wolf @ 2014-06-24 10:15 UTC (permalink / raw)
To: lihuiba; +Cc: qiujian, Fam Zheng, qemu-devel, Stefan Hajnoczi, mengcong
On 24.06.2014 at 09:25, lihuiba wrote:
> >Can you confirm that making L2_CACHE_SIZE much bigger solves the
> >problem?
> Yes, it is confirmed.
> When I run fio randread with a 7GB or 8GB size, the result is close to that of a raw image.
> But when the size is increased to 9GB, the result drops dramatically. I have also modified
> qcow2-cache.c to print a log message on cache misses; when testing with the 9GB size, it shows
> lots of cache misses.
>
> >You also have the option of specifying the cluster size when creating
> >the qcow2 image file. A larger cluster size reduces the amount of
> >metadata overhead and therefore increases cache hits.
> I didn't find any command-line option to increase the size of the cache, so I increased
> cluster_size to 1MB or 2MB. This worked very well for me.
>
> BTW
> qemu-img version 1.7.1 has a bug when creating a qcow2 image with the options preallocation=metadata
> and cluster_size > 64K. It reports:
>
>
> # qemu-img create -f qcow2 -o preallocation=metadata,cluster_size=1M asdf.qcow2 100G
> Formatting 'asdf.qcow2', fmt=qcow2 size=107374182400 encryption=off cluster_size=1048576 preallocation='metadata' lazy_refcounts=off
> qemu-img: block/qcow2-cluster.c:1196: qcow2_alloc_cluster_offset: Assertion `n_start * (1ULL << 9) == offset_into_cluster(s, offset)' failed.
This appears to be fixed in current git master.
Kevin