From: Paolo Bonzini
Date: Mon, 26 Oct 2015 18:32:49 +0100
Message-ID: <562E63C1.8020802@redhat.com>
References: <562E48B9.6090600@redhat.com> <562E56B8.2030109@redhat.com> <562E5CD4.8010902@redhat.com>
Subject: Re: [Qemu-devel] 4k seq read splitting for virtio-blk - possible workarounds?
To: Andrey Korolyov
Cc: Sergey Fionov, Jens Axboe, Jeff Moyer, Peter Lieven, "qemu-devel@nongnu.org"

On 26/10/2015 18:18, Andrey Korolyov wrote:
> Yes, both cases are positive, thanks for the very detailed explanation
> and for the tips. Does this also mean that most current distros, which
> are using the 'broken' >=3.13 <4.2 driver, would bring sequential read
> performance, especially on rotating media or media with high request
> latency such as hybrid disks, to its knees for virtio, which is almost
> always the default selection?

Yes, this is why I said the conversion was premature.  On one hand I
totally agree that virtio-blk is a great guinea pig for the blk-mq
conversion; on the other hand, people are using it in production and
the effects weren't well understood.

It's a common misconception that virt doesn't benefit from the
elevator; in fact you get (well... used to get...) much better
performance from the deadline scheduler than from the noop scheduler.
Merging is the main reason, because it lowers the amount of work that
the host has to do.  Even if you don't get better performance, merging
gives you better CPU utilization, because the longer scatter/gather
lists take time to process in the host, and the effect is much larger
than a few extra milliwatts in a bare-metal controller.

Having a "real" multiqueue model in the host (real = one I/O thread and
one AIO context per guest queue, with each I/O thread able to service
multiple disks; rather than a "fake" multiqueue where you still have
one I/O thread and one AIO context per guest disk, so all the queues
really funnel into one in the host) should fix this, but it's at least
a few months away in QEMU... probably something like QEMU 2.8.

My plan is for 2.6 to have fine-grained critical sections (patches
written, will repost during the 2.5 hard freeze), 2.7 (unlikely 2.6) to
have fine-grained locks, and 2.8 or 2.9 to have multiqueue.

Paolo
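
For reference, a minimal sketch (not from the original thread) of how a
guest admin could check and, on a non-blk-mq kernel, select the elevator
through sysfs.  It assumes the virtio-blk disk shows up as /dev/vda; on
the >=3.13 <4.2 blk-mq kernels discussed above the scheduler file only
offers "none", so the write simply fails there, and it needs root in
any case.

    # Illustrative sketch: inspect and select the guest's I/O elevator
    # via sysfs.  Assumes a virtio-blk disk named vda (hypothetical name).
    from pathlib import Path

    sched = Path("/sys/block/vda/queue/scheduler")

    # On a single-queue kernel this prints something like
    # "noop [deadline] cfq"; on a blk-mq kernel it prints "none".
    print(sched.read_text().strip())

    # Pick the deadline elevator so adjacent 4k reads get merged before
    # they reach the host.  Requires root; raises OSError on blk-mq
    # kernels where no elevator is available.
    sched.write_text("deadline")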