From: Sagi Grimberg <sagi@grimberg.me>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Qemu Developers <qemu-devel@nongnu.org>
Subject: Re: virtio-blk using a single iothread
Date: Sun, 11 Jun 2023 15:27:57 +0300
Message-ID: <d8028f17-8d33-790b-8d3e-fa1170108774@grimberg.me>
In-Reply-To: <20230608160817.GK2138915@fedora>
On 6/8/23 19:08, Stefan Hajnoczi wrote:
> On Thu, Jun 08, 2023 at 10:40:57AM +0300, Sagi Grimberg wrote:
>> Hey Stefan, Paolo,
>>
>> I just had a report from a user experiencing lower virtio-blk
>> performance than expected. The user is running virtio-blk on top of
>> an nvme-tcp device. The guest has 12 CPU cores.
>>
>> The guest read/write throughput is capped at around 30% of the
>> available throughput from the host (~800MB/s from the guest vs.
>> ~2800MB/s from the host, over a 25Gb/s NIC). The workload running on
>> the guest is a multi-threaded fio workload.
>>
>> What we observe is that virtio-blk uses a single disk-wide iothread
>> to process all the vqs. nvme-tcp in particular (like other TCP-based
>> protocols) is negatively impacted by the lack of thread concurrency
>> that could distribute I/O requests across different TCP connections.
>>
>> We also attempted to move the iothread to a dedicated core, however
>> that did not yield any meaningful performance improvement. The reason
>> appears to be less about CPU utilization on the iothread core and more
>> about serialization over a single TCP connection.
>>
>> Moving to io=threads does increase throughput, but it sacrifices
>> latency significantly.
>>
>> So the user finds themselves with available host CPUs and TCP
>> connections that could easily be used to reach maximum throughput,
>> but without the ability to leverage them. True, other guests will use
>> different threads/contexts, but the goal here is to get the full
>> performance out of a single device.
>>
>> I've seen several discussions and attempts in the past to let a
>> virtio-blk device leverage multiple iothreads, but the discussions
>> paused around two years ago. So I wanted to ask: are there any plans
>> or anything in the works to address this limitation?
>>
>> I've seen that the SPDK folks are heading in this direction with
>> their vhost-blk implementation:
>> https://review.spdk.io/gerrit/c/spdk/spdk/+/16068
>
> Hi Sagi,
> Yes, there is an ongoing QEMU multi-queue block layer effort to make it
> possible for multiple IOThreads to process disk I/O for the same
> --blockdev in parallel.
Great to know.
> Most of my recent QEMU patches have been part of this effort. There is a
> work-in-progress branch that supports mapping virtio-blk virtqueues to
> specific IOThreads:
> https://gitlab.com/stefanha/qemu/-/commits/virtio-blk-iothread-vq-mapping
Thanks for the pointer.
> The syntax is:
>
> --device '{"driver":"virtio-blk-pci","iothread-vq-mapping":[{"iothread":"iothread0"},{"iothread":"iothread1"}],"drive":"drive0"}'
>
> This says "assign virtqueues round-robin to iothread0 and iothread1".
> Half the virtqueues will be processed by iothread0 and the other half by
> iothread1. There is also syntax for assigning specific virtqueues to
> each IOThread, but usually the automatic round-robin assignment is all
> that's needed.
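Guessing at the explicit per-virtqueue form from your description, I'd
expect something like a per-entry "vqs" list, e.g. (not verified against
the branch, so the exact property layout may differ):
--
--device '{"driver":"virtio-blk-pci","iothread-vq-mapping":[{"iothread":"iothread0","vqs":[0,1,2]},{"iothread":"iothread1","vqs":[3,4,5]}],"drive":"drive0"}'
--
In any case, the automatic round-robin assignment is all I need for the
test below.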
>
> This work is not finished yet. Basic I/O (e.g. fio) works without
> crashes, but expect to hit issues if you use blockjobs, hotplug, etc.
>
> Performance optimization work has just begun, so it won't deliver all
> the benefits yet. I ran a benchmark yesterday where going from 1 to 2
> IOThreads increased performance by 25%. That's much less than we're
> aiming for; attaching two independent virtio-blk devices improves the
> performance by ~100%. I know we can get there eventually. Some of the
> bottlenecks are known (e.g. block statistics collection causes lock
> contention) and others are yet to be investigated.
Hmm, I rebased this branch on top of mainline master and ran a naive
test, and it seems that performance regressed quite a bit :(
I'm running this test on my laptop (Intel(R) Core(TM) i7-8650U CPU
@ 1.90GHz), so this is more of a qualitative test for bandwidth only.
I use null_blk as the host device.
With mainline master I get ~9GB/s of 64k randread, and with your branch
I get ~5GB/s, regardless of whether I assign iothreads (one or two) or
not.
my qemu command:
--
taskset -c 0-3 build/qemu-system-x86_64 -cpu host -m 1G -enable-kvm -smp 4 \
  -drive file=/var/lib/libvirt/images/ubuntu-22/root-disk-clone.qcow2,format=qcow2 \
  -drive if=none,id=drive0,cache=none,aio=native,format=raw,file=/dev/nullb0 \
  -device virtio-blk-pci,drive=drive0,scsi=off \
  -nographic
--
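For the "one or two iothreads" runs I mentioned above, I swap the
-device line for iothread objects plus the JSON syntax you showed,
roughly like this (typed from memory, iothread names are arbitrary, so
treat it as a sketch):
--
taskset -c 0-3 build/qemu-system-x86_64 -cpu host -m 1G -enable-kvm -smp 4 \
  -object iothread,id=iothread0 \
  -object iothread,id=iothread1 \
  -drive file=/var/lib/libvirt/images/ubuntu-22/root-disk-clone.qcow2,format=qcow2 \
  -drive if=none,id=drive0,cache=none,aio=native,format=raw,file=/dev/nullb0 \
  -device '{"driver":"virtio-blk-pci","iothread-vq-mapping":[{"iothread":"iothread0"},{"iothread":"iothread1"}],"drive":"drive0"}' \
  -nographic
--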
my guest fio jobfile:
--
[global]
group_reporting
runtime=3000
time_based
loops=1
direct=1
invalidate=1
randrepeat=0
norandommap
exitall
cpus_allowed=0-3
cpus_allowed_policy=split
[read]
filename=/dev/vda
numjobs=4
iodepth=32
bs=64k
rw=randread
ioengine=io_uring
--
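(It is run from inside the guest as a plain jobfile, e.g. "fio job.fio";
the filename is arbitrary.)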
Maybe I'm doing something wrong? I didn't expect to find a regression
against mainline with the default setup.