From: "Denis V. Lunev" <den@virtuozzo.com>
To: Stefan Hajnoczi <stefanha@gmail.com>,
	Denis Plotnikov <dplotnikov@virtuozzo.com>
Cc: qemu-devel <qemu-devel@nongnu.org>,
	Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Subject: Re: QEMU 5.0 virtio-blk performance regression with high queue depths
Date: Wed, 16 Sep 2020 19:43:30 +0300
Message-ID: <dd5d6d0f-cc52-d3d5-0bbc-c57dcfef6842@virtuozzo.com>
In-Reply-To: <709da6a3-d158-270b-fb63-43ef65dfe668@virtuozzo.com>

On 9/16/20 5:07 PM, Denis V. Lunev wrote:
> On 9/16/20 4:32 PM, Stefan Hajnoczi wrote:
>> On Thu, Aug 27, 2020 at 3:24 PM Stefan Hajnoczi <stefanha@gmail.com> wrote:
>>> Hi Denis,
>>> A performance regression was found after the virtio-blk queue-size
>>> property was increased from 128 to 256 in QEMU 5.0 in commit
>>> c9b7d9ec21dfca716f0bb3b68dee75660d86629c ("virtio: increase virtqueue
>>> size for virtio-scsi and virtio-blk"). I wanted to let you know in case
>>> you have ideas or see something similar.
>> Ping, have you noticed performance regressions after switching to
>> virtio-blk queue-size 256?
> Oops, I have missed the original letter.
>
> Denis Plotnikov has left the team.
>
>
>>> Throughput and IOPS of the following fio benchmarks dropped by 30-40%:
>>>
>>>   # mkfs.xfs /dev/vdb
>>>   # mount /dev/vdb /mnt
>>>   # fio --rw=%s --bs=%s --iodepth=64 --runtime=1m --direct=1 --filename=/mnt/%s --name=job1 --ioengine=libaio --thread --group_reporting --numjobs=16 --size=512MB --time_based --output=/tmp/fio_result &> /dev/null
>>>     - rw: read write
>>>     - bs: 4k 64k
>>>
>>> Note that there are 16 threads submitting 64 requests each! The guest
>>> block device queue depth will be maxed out. The virtqueue should be full
>>> most of the time.
>>>
>>> Have you seen regressions after virtio-blk queue-size was increased in
>>> QEMU 5.0?
>>>
>>> Here are the details of the host storage:
>>>
>>>   # mkfs.xfs /dev/sdb # 60GB SSD drive
>>>   # mount /dev/sdb /mnt/test
>>>   # qemu-img create -f qcow2 /mnt/test/storage2.qcow2 40G
>>>
>>> The guest command-line is:
>>>
>>>   # MALLOC_PERTURB_=1 numactl \
>>>     -m 1  /usr/libexec/qemu-kvm \
>>>     -S  \
>>>     -name 'avocado-vt-vm1'  \
>>>     -sandbox on  \
>>>     -machine q35 \
>>>     -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
>>>     -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
>>>     -nodefaults \
>>>     -device VGA,bus=pcie.0,addr=0x2 \
>>>     -m 4096  \
>>>     -smp 2,maxcpus=2,cores=1,threads=1,dies=1,sockets=2  \
>>>     -cpu 'IvyBridge',+kvm_pv_unhalt \
>>>     -chardev socket,server,id=qmp_id_qmpmonitor1,nowait,path=/var/tmp/avocado_bapfdqao/monitor-qmpmonitor1-20200721-014154-5HJGMjxW  \
>>>     -mon chardev=qmp_id_qmpmonitor1,mode=control \
>>>     -chardev socket,server,id=qmp_id_catch_monitor,nowait,path=/var/tmp/avocado_bapfdqao/monitor-catch_monitor-20200721-014154-5HJGMjxW  \
>>>     -mon chardev=qmp_id_catch_monitor,mode=control \
>>>     -device pvpanic,ioport=0x505,id=id31BN83 \
>>>     -chardev socket,server,id=chardev_serial0,nowait,path=/var/tmp/avocado_bapfdqao/serial-serial0-20200721-014154-5HJGMjxW \
>>>     -device isa-serial,id=serial0,chardev=chardev_serial0  \
>>>     -chardev socket,id=seabioslog_id_20200721-014154-5HJGMjxW,path=/var/tmp/avocado_bapfdqao/seabios-20200721-014154-5HJGMjxW,server,nowait \
>>>     -device isa-debugcon,chardev=seabioslog_id_20200721-014154-5HJGMjxW,iobase=0x402 \
>>>     -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
>>>     -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
>>>     -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
>>>     -blockdev node-name=file_image1,driver=file,aio=threads,filename=rootfs.qcow2,cache.direct=on,cache.no-flush=off \
>>>     -blockdev node-name=drive_image1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 \
>>>     -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
>>>     -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,write-cache=on,bus=pcie-root-port-2,addr=0x0 \
>>>     -blockdev node-name=file_disk1,driver=file,aio=threads,filename=/mnt/test/storage2.qcow2,cache.direct=on,cache.no-flush=off \
>>>     -blockdev node-name=drive_disk1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_disk1 \
>>>     -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
>>>     -device virtio-blk-pci,id=disk1,drive=drive_disk1,bootindex=1,write-cache=on,bus=pcie-root-port-3,addr=0x0 \
>>>     -device pcie-root-port,id=pcie-root-port-4,port=0x4,addr=0x1.0x4,bus=pcie.0,chassis=5 \
>>>     -device virtio-net-pci,mac=9a:37:37:37:37:4e,id=idBMd7vy,netdev=idLb51aS,bus=pcie-root-port-4,addr=0x0  \
>>>     -netdev tap,id=idLb51aS,fd=14  \
>>>     -vnc :0  \
>>>     -rtc base=utc,clock=host,driftfix=slew  \
>>>     -boot menu=off,order=cdn,once=c,strict=off \
>>>     -enable-kvm \
>>>     -device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=6
> I will run a check today.
>
> Talking about our own performance measurements, we have not
> seen ANY performance degradation, let alone 30-40%.
> This looks quite strange to me.
>
> Though there is one quite important difference: we are always
> using O_DIRECT and the 'native' AIO engine.
>
> Den

I have put my hands on this and it looks like you are right. There is
a difference. It is not as significant for me as in your case, but I observe
a stable difference of around 10% between queue sizes of 128 and 256.
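
For the comparison, the ring size can be pinned per device through the
queue-size property of virtio-blk-pci, so only the -device line needs to
change against the command line quoted above. A minimal sketch, reusing
the node and device names from that command line (queue-size=128 restores
the pre-5.0 default):

  -device virtio-blk-pci,id=disk1,drive=drive_disk1,write-cache=on,bus=pcie-root-port-3,addr=0x0,queue-size=128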

I have checked with:
- QEMU 5.1
- Fedora 31 in guest
- qcow2 (64k, 1Mb) and raw images on the host
- nocache, with both the threads and native AIO modes (spelled out below)
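
To spell out what the nocache/native combination means in -blockdev terms,
here is a sketch only, with the filename taken from the quoted command line
above: cache.direct=on opens the image with O_DIRECT and aio=native selects
Linux AIO, while the threads variant simply swaps in aio=threads:

  -blockdev node-name=file_disk1,driver=file,aio=native,cache.direct=on,cache.no-flush=off,filename=/mnt/test/storage2.qcow2 \
  -blockdev node-name=drive_disk1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_disk1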

The test was run on a ThinkPad X1 Carbon gen 6 laptop.

For reference, at the peak I have seen 330k IOPS for reads
with native AIO, which looks awesome, and 220k IOPS with
threads.
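
As a side note for anyone reproducing this, a quick way to see which
virtqueue size a given build defaults to is the device help output
(newer QEMU versions also print the default value next to the property):

  /usr/libexec/qemu-kvm -device virtio-blk-pci,help | grep queue-size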

Den

