From: Stefan Hajnoczi <stefanha@redhat.com>
To: Wei Li <wei.d.li@oracle.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>,
Dongli Zhang <dongli.zhang@oracle.com>,
Paolo Bonzini <pbonzini@redhat.com>,
qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] Following up questions related to QEMU and I/O Thread
Date: Tue, 23 Apr 2019 13:04:53 +0100 [thread overview]
Message-ID: <20190423120453.GF32465@stefanha-x1.localdomain> (raw)
In-Reply-To: <3F7E854A-3C1D-4204-8C35-893FC0614796@oracle.com>
On Mon, Apr 22, 2019 at 09:21:53PM -0700, Wei Li wrote:
> 2. kvm_stat or perf record -a -e kvm:\* counters for vmexits and
> interrupt injections. If these counters vary greatly between queue
> sizes, then that is usually a clue. It's possible to get higher
> performance by spending more CPU cycles although your system doesn't
> have many CPUs available, so I'm not sure if this is the case.
>
> [wei]: vmexits look like a reason. I am using the FIO tool to read/write block storage via the sample command below. Interestingly, the kvm:kvm_exit count decreased from 846K to 395K after I increased num_queues from 2 to 4 while the vCPU count stayed at 2.
> 1). Does this mean using more queues than the vCPU count may increase IOPS by spending more CPU cycles?
> 2). Could you please help me better understand how more queues are able to spend more CPU cycles? Thanks!
> FIO command: fio --filename=/dev/sdb --direct=1 --rw=randrw --bs=4k --ioengine=libaio --iodepth=64 --numjobs=4 --time_based --group_reporting --name=iops --runtime=60 --eta-newline=1
>
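One way to compare the two configurations is to normalize the exit counts by the number of completed I/Os. A rough sketch (the kvm_exit counts and the 60-second runtime come from the measurements above; the IOPS figures are placeholders to be replaced with what fio actually reports):

```python
# Rough exits-per-request comparison between queue configurations.
# The kvm:kvm_exit counts are the ones quoted above (60 s fio run);
# the IOPS values are hypothetical -- substitute fio's reported numbers.
RUNTIME_S = 60

def exits_per_io(kvm_exits, iops, runtime_s=RUNTIME_S):
    """Average number of VM exits paid per completed I/O request."""
    total_ios = iops * runtime_s
    return kvm_exits / total_ios

two_queues = exits_per_io(846_000, iops=50_000)   # num_queues=2
four_queues = exits_per_io(395_000, iops=55_000)  # num_queues=4
print(f"2 queues: {two_queues:.4f} exits/IO")
print(f"4 queues: {four_queues:.4f} exits/IO")
```

If exits per I/O drop while IOPS rise, the extra queues are amortizing notification cost rather than simply burning more CPU.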
> 3. Power management and polling (kvm.ko halt_poll_ns, tuned profiles,
> and QEMU iothread poll-max-ns). It's expensive to wake a CPU when it
> goes into a low power mode due to idle. There are several features
> that can keep the CPU awake or even poll so that request latency is
> reduced. The reason why the number of queues may matter is that
> kicking multiple queues may keep the CPU awake more than batching
> multiple requests onto a small number of queues.
> [wei]: CPU wakeups could be another reason: I noticed that the kvm:kvm_vcpu_wakeup count decreased from 151K to 47K after I increased num_queues from 2 to 4 while the vCPU count stayed at 2.
This suggests that wakeups are involved in the performance difference.
> 1). Does this mean more queues may keep the CPU busier and awake, reducing the number of vCPU wakeups?
Yes, although it depends on how I/O requests are distributed across the
queues. You can check /proc/interrupts inside the guest to see
interrupt counts for the virtqueues.
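A quick way to tally the per-queue totals from /proc/interrupts is to sum the per-CPU columns for each virtio line. The sample text and device names below are made up for illustration; real output varies by guest:

```python
# Sum interrupt counts per virtio queue from /proc/interrupts-style text.
# The SAMPLE lines are illustrative, not copied from a real guest.
SAMPLE = """\
 24:  101200   98000  PCI-MSI  virtio2-request.0
 25:    4300  160000  PCI-MSI  virtio2-request.1
"""

def virtqueue_irq_counts(text, match="virtio"):
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or match not in fields[-1]:
            continue
        # fields[0] is the IRQ number, fields[-2:] are the chip and the
        # action name; the numeric columns in between are per-CPU counts.
        per_cpu = [int(f) for f in fields[1:-2] if f.isdigit()]
        counts[fields[-1]] = sum(per_cpu)
    return counts

print(virtqueue_irq_counts(SAMPLE))
```

In a real guest you would pass `open('/proc/interrupts').read()`; a very lopsided split across the request queues suggests I/O is not being distributed evenly.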
> 2). If using more queues than the vCPU count yields higher IOPS in this case, is it safe to use 4 queues with only 2 vCPUs, or is there any concern or impact of using more queues than vCPUs that I should keep in mind?
2 vs 4 queues should be functionally identical. The only difference is
performance.
> In addition, does virtio-scsi support a batch I/O submission feature, which may be able to increase IOPS by reducing the number of system calls?
I don't see obvious batching support in drivers/scsi/virtio_scsi.c. The
Linux block layer supports batching but I'm not sure if the SCSI layer
does.
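The effect batching would have on syscall/kick counts can be shown with a toy model. This is not virtio-scsi code, just the general idea: each `submit` call stands in for one expensive boundary crossing (a syscall, or a virtqueue kick causing a vmexit):

```python
# Toy model of batched vs one-at-a-time submission. Illustrative only.
class Ring:
    def __init__(self):
        self.submissions = 0   # boundary crossings (syscalls/kicks)
        self.completed = 0     # requests handed over

    def submit(self, requests):
        self.submissions += 1
        self.completed += len(requests)

def one_at_a_time(ring, reqs):
    for r in reqs:
        ring.submit([r])

def batched(ring, reqs, batch=32):
    # Hand over up to `batch` requests per boundary crossing.
    for i in range(0, len(reqs), batch):
        ring.submit(reqs[i:i + batch])

a, b = Ring(), Ring()
one_at_a_time(a, range(256))
batched(b, range(256))
print(a.submissions, b.submissions)  # 256 vs 8 crossings for the same work
```

The same number of requests completes either way; batching just divides the fixed per-crossing cost across many requests, which is why fewer exits per I/O can coexist with higher IOPS.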
Stefan
next prev parent reply other threads:[~2019-04-23 12:05 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-03-04 17:33 [Qemu-devel] Following up questions related to QEMU and I/O Thread Wei Li
2019-03-05 17:29 ` Stefan Hajnoczi
[not found] ` <2D7F11D0-4A02-4A0F-961D-854240376B17@oracle.com>
2019-04-01 9:07 ` Stefan Hajnoczi
2019-04-05 21:09 ` Wei Li
2019-04-16 14:01 ` Paolo Bonzini
2019-04-17 1:38 ` Wei Li
2019-04-17 12:17 ` Paolo Bonzini
2019-04-18 3:34 ` Wei Li
[not found] ` <CC372DF3-1AC6-46B5-98A5-21159497034A@oracle.com>
2019-04-15 17:34 ` Wei Li
2019-04-15 23:23 ` Dongli Zhang
2019-04-16 9:20 ` Stefan Hajnoczi
2019-04-17 1:42 ` Wei Li
[not found] ` <8E5AF770-69ED-4D44-8A25-B51344996D9E@oracle.com>
2019-04-23 4:21 ` Wei Li
2019-04-23 12:04 ` Stefan Hajnoczi [this message]
2019-04-26 8:14 ` Paolo Bonzini
2019-04-26 23:02 ` Wei Li
2019-04-27 4:24 ` Paolo Bonzini
2019-04-29 17:49 ` Wei Li
2019-04-29 13:40 ` Stefan Hajnoczi
2019-04-29 17:56 ` Wei Li
2019-05-01 16:36 ` Stefan Hajnoczi
2019-05-03 16:21 ` Wei Li
2019-05-03 18:05 ` Paolo Bonzini
2019-05-03 18:11 ` Wei Li
2019-04-30 11:21 ` Paolo Bonzini