From: Denis Plotnikov <dplotnikov@virtuozzo.com>
To: qemu-devel@nongnu.org
Cc: fam@euphon.net, kwolf@redhat.com, ehabkost@redhat.com,
qemu-block@nongnu.org, mst@redhat.com, stefanha@redhat.com,
mreitz@redhat.com, den@virtuozzo.com
Subject: [PATCH v1 0/4] virtio: fix IO request length in virtio SCSI/block
Date: Tue, 5 Nov 2019 19:11:01 +0300
Message-ID: <20191105161105.19016-1-dplotnikov@virtuozzo.com>
v1:
* make seg_max dependent on the virtqueue size
* don't expose seg_max as property
* add new machine types with increased queue size
* add test to check the new machine types
* check queue size for non-modern virtio devices
---
From: "Denis V. Lunev" <den@openvz.org>
Linux guests submit IO requests no longer than PAGE_SIZE * max_seg, where
max_seg is the field reported by the SCSI controller. Thus a typical 1 MB
sequential read results in the following IO pattern from the guest:
8,16 1 15754 2.766095122 2071 D R 2095104 + 1008 [dd]
8,16 1 15755 2.766108785 2071 D R 2096112 + 1008 [dd]
8,16 1 15756 2.766113486 2071 D R 2097120 + 32 [dd]
8,16 1 15757 2.767668961 0 C R 2095104 + 1008 [0]
8,16 1 15758 2.768534315 0 C R 2096112 + 1008 [0]
8,16 1 15759 2.768539782 0 C R 2097120 + 32 [0]
The IO was generated by
dd if=/dev/sda of=/dev/null bs=1024 iflag=direct
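(For reference, the lines above are standard blkparse output; a similar trace
can be captured inside the guest with something along the lines of the
pipeline below, where /dev/sda is the disk under test.)
blktrace -d /dev/sda -o - | blkparse -i -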
This effectively means that on rotational disks each megabyte of sequential
reading is split into 3 requests (as the trace above shows), which clearly
hurts both guest and host IO performance.
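For reference, the 1008-sector requests follow directly from the current
limit, assuming 4 KiB guest pages and the 2 descriptors reserved for request
metadata (see below):
  seg_max      = 128 - 2 = 126 data segments per request
  max request  = 126 * 4 KiB = 504 KiB = 1008 sectors
  1 MiB read   = 2048 sectors -> 1008 + 1008 + 32, i.e. 3 requests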
The cure is relatively simple: we should report a larger scatter-gather
capability for the SCSI controller. Fortunately the situation here is very
good: the VirtIO transport layer can accommodate 1024 items in one request
while we are using only 128, and this has been the case almost from the very
beginning. 2 items are dedicated to request metadata, thus we should publish
VIRTQUEUE_MAX_SIZE - 2 as max_seg.
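In code, making seg_max follow the configured virtqueue size (patch 2 of the
series) boils down to roughly the sketch below; this is not the literal patch,
and virtio-blk needs the analogous change:
  /* virtio_scsi_get_config(), sketch only: instead of the current
   *   virtio_stl_p(vdev, &scsiconf->seg_max, 128 - 2);
   * derive seg_max from the configured virtqueue size: */
  virtio_stl_p(vdev, &scsiconf->seg_max, s->conf.virtqueue_size - 2);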
The following pattern is observed after the patch:
8,16 1 9921 2.662721340 2063 D R 2095104 + 1024 [dd]
8,16 1 9922 2.662737585 2063 D R 2096128 + 1024 [dd]
8,16 1 9923 2.665188167 0 C R 2095104 + 1024 [0]
8,16 1 9924 2.665198777 0 C R 2096128 + 1024 [0]
which is much better.
The dark side of this patch is that we are tweaking a guest-visible
parameter, though this should be relatively safe as the transport layer
support described above has been present in QEMU/host Linux for a very long
time. The patch adds a configurable property for virtio-scsi with a new
default, and hardcodes the value for virtio-blk, which does not provide a
good configuration framework.
Unfortunately the commit cannot be applied as is. For the real cure we need
the guest to be fixed to accommodate that queue length, which has been done
only as of the 4.14 kernel. Thus we are going to expose the property and
tweak it at the machine type level.
The problem with old kernels is that they impose a
max_segments <= virtqueue_size restriction, and the guest crashes if it is
violated.
To fix the case described above for old kernels we can increase
virtqueue_size to 256 and max_segments to 254. The pitfall here is that
SeaBIOS does not accept virtqueue sizes larger than 128; however, the SeaBIOS
patch extending that limit to 256 is pending.
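For experiments the queue sizes can already be overridden per device on the
QEMU command line with the existing properties (drive0 and scsi0 below are
placeholder ids); the series only changes the defaults via new machine types:
  -device virtio-blk-pci,drive=drive0,queue-size=256
  -device virtio-scsi-pci,id=scsi0,virtqueue_size=256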
Denis Plotnikov (4):
virtio: protect non-modern devices from too big virtqueue size setting
virtio: make seg_max virtqueue size dependent
virtio: increase virtqueue sizes in new machine types
iotests: add test for virtio-scsi and virtio-blk machine type settings
hw/block/virtio-blk.c | 2 +-
hw/core/machine.c | 14 ++++
hw/i386/pc_piix.c | 16 +++-
hw/i386/pc_q35.c | 14 +++-
hw/scsi/virtio-scsi.c | 2 +-
hw/virtio/virtio-blk-pci.c | 9 +++
hw/virtio/virtio-scsi-pci.c | 10 +++
include/hw/boards.h | 6 ++
tests/qemu-iotests/267 | 154 ++++++++++++++++++++++++++++++++++++
tests/qemu-iotests/267.out | 1 +
tests/qemu-iotests/group | 1 +
11 files changed, 222 insertions(+), 7 deletions(-)
create mode 100755 tests/qemu-iotests/267
create mode 100644 tests/qemu-iotests/267.out
--
2.17.0