* [Qemu-devel] [PATCH 0/1 V2] Add vhost-pci-blk driver
@ 2018-11-05 20:56 Vitaly Mayatskikh
2018-11-05 20:56 ` [Qemu-devel] [PATCH 1/1 " Vitaly Mayatskikh
0 siblings, 1 reply; 7+ messages in thread
From: Vitaly Mayatskikh @ 2018-11-05 20:56 UTC (permalink / raw)
To: qemu-block
Cc: Kevin Wolf, Max Reitz, Michael S . Tsirkin, Maxim Levitsky,
qemu-devel, Vitaly Mayatskikh
V2 changes:
- checkpatch style fixes
- correct size detection of disk image placed on a file system
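
(For reference, the size-detection item concerns how the backend's capacity is derived. Below is a minimal standalone sketch of that logic, assuming Linux and loosely mirroring what vhost_blk_cfg_init() in the patch does. The helper name backend_size_bytes is made up for this illustration and is not part of the patch.)

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <linux/fs.h>   /* BLKGETSIZE64 */

/*
 * Hypothetical helper: store the size of `path` in bytes in *size_out.
 * Block devices are queried with BLKGETSIZE64, everything else uses the
 * st_size reported by fstat(). Returns 0 on success, -1 on failure.
 */
static int backend_size_bytes(const char *path, uint64_t *size_out)
{
    struct stat st;
    int ret = -1;
    int fd;

    assert(path != NULL && size_out != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }

    if (fstat(fd, &st) != 0) {
        goto out;
    }

    if (S_ISBLK(st.st_mode)) {
        uint64_t bytes = 0;

        /* Block device: st_size is not meaningful, ask the kernel. */
        if (ioctl(fd, BLKGETSIZE64, &bytes) != 0) {
            goto out;
        }
        *size_out = bytes;
    } else {
        /* Image placed on a file system: the file length is the size. */
        *size_out = (uint64_t)st.st_size;
    }
    ret = 0;

out:
    close(fd);
    return ret;
}
```

Dividing the result by 512 gives the capacity in sectors, which is what the patch stores in blkcfg.capacity.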
This driver moves virtio-blk host-side processing into the kernel (via the
new vhost_blk kernel driver). It brings virtual disk performance close to
bare-metal levels, especially for parallel loads.
For example, fio with numjobs=16 gets 101k randread IOPS using virtio-blk
and 1202k IOPS using vhost-blk, close to the 1480k of raw disk performance.
See the IOPS numbers below.
The kernel part, if you want to try it:
- vhost_blk: https://lkml.org/lkml/2018/11/2/648
- vhost num-queues scalability fix: https://lkml.org/lkml/2018/11/2/550
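
(On the QEMU side, the patch reaches this kernel driver by opening /dev/vhost-blk and issuing one backend-attach ioctl, written there as the bare expression _IOW(0xaf, 0x50, int). As a rough sketch of what that request number encodes: the macro name VHOST_BLK_SET_BACKEND below is an assumption made purely for readability, and nothing in this thread confirms the name used by the kernel driver.)

```c
#include <assert.h>
#include <stdio.h>
#include <linux/ioctl.h>   /* _IOW, _IOC_TYPE, _IOC_NR, _IOC_SIZE, _IOC_DIR */

/* Hypothetical name; the patch passes the bare expression to ioctl(). */
#define VHOST_BLK_SET_BACKEND _IOW(0xaf, 0x50, int)

/* Print the fields packed into an ioctl request number. */
static void describe_request(unsigned long req)
{
    /* 0xaf is the ioctl "type" byte that the patch uses for this request. */
    assert(_IOC_TYPE(req) == 0xaf);

    printf("type=0x%x nr=0x%x size=%u write=%d\n",
           (unsigned)_IOC_TYPE(req),
           (unsigned)_IOC_NR(req),
           (unsigned)_IOC_SIZE(req),
           _IOC_DIR(req) == _IOC_WRITE);
}
```

Calling describe_request(VHOST_BLK_SET_BACKEND) shows that the argument is an int-sized value passed from user space to the kernel. In the patch that value is the file descriptor of the backing store.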
# First column: fio numjobs; values are randread IOPS
# A: bare metal over block
# B: bare metal over file
# C: virtio-blk over block
# D: virtio-blk over file
# E: vhost-blk over block
# F: vhost-blk over file
#
#       A      B      C      D      E      F
 1   171k   151k   148k   151k   187k   175k
 2   328k   302k   249k   241k   334k   296k
 3   479k   437k   179k   174k   464k   404k
 4   622k   568k   143k   183k   580k   492k
 5   755k   697k   136k   128k   693k   579k
 6   887k   808k   131k   120k   782k   640k
 7  1004k   926k   126k   131k   863k   693k
 8  1099k  1015k   117k   115k   931k   712k
 9  1194k  1119k   115k   111k   991k   711k
10  1278k  1207k   109k   114k  1046k   695k
11  1345k  1280k   110k   108k  1091k   663k
12  1411k  1356k   104k   106k  1142k   629k
13  1466k  1423k   106k   106k  1170k   607k
14  1517k  1486k   103k   106k  1179k   589k
15  1552k  1543k   102k   102k  1191k   571k
16  1480k  1506k   101k   102k  1202k   566k
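
(The headline figures from the numjobs=16 row can be cross-checked with a little arithmetic. The sketch below only restates numbers from the table; the function names are illustrative and appear nowhere in the patch.)

```c
#include <assert.h>
#include <stdio.h>

/* Ratio of two IOPS figures given in thousands of IOPS. */
static double iops_ratio(double numerator_kiops, double denominator_kiops)
{
    assert(denominator_kiops > 0.0);
    return numerator_kiops / denominator_kiops;
}

/* Print the comparisons for the numjobs=16 row, block-device columns. */
static void print_numjobs16_summary(void)
{
    const double bare_metal = 1480.0; /* column A */
    const double virtio_blk = 101.0;  /* column C */
    const double vhost_blk  = 1202.0; /* column E */

    printf("vhost-blk / virtio-blk: %.1fx\n",
           iops_ratio(vhost_blk, virtio_blk));
    printf("vhost-blk / bare metal: %.0f%%\n",
           100.0 * iops_ratio(vhost_blk, bare_metal));
}
```

At 16 jobs on a block device, vhost-blk delivers roughly twelve times the IOPS of virtio-blk and about four fifths of bare metal.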
Vitaly Mayatskikh (1):
Add vhost-pci-blk driver
configure | 10 +
default-configs/virtio.mak | 1 +
hw/block/Makefile.objs | 1 +
hw/block/vhost-blk.c | 429 ++++++++++++++++++++++++++++++++++
hw/virtio/virtio-pci.c | 60 +++++
hw/virtio/virtio-pci.h | 19 ++
include/hw/virtio/vhost-blk.h | 43 ++++
7 files changed, 563 insertions(+)
create mode 100644 hw/block/vhost-blk.c
create mode 100644 include/hw/virtio/vhost-blk.h
--
2.17.1
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Qemu-devel] [PATCH 1/1 V2] Add vhost-pci-blk driver
2018-11-05 20:56 [Qemu-devel] [PATCH 0/1 V2] Add vhost-pci-blk driver Vitaly Mayatskikh
@ 2018-11-05 20:56 ` Vitaly Mayatskikh
2018-11-08 14:09 ` Dongli Zhang
2018-11-08 14:50 ` Kevin Wolf
0 siblings, 2 replies; 7+ messages in thread
From: Vitaly Mayatskikh @ 2018-11-05 20:56 UTC (permalink / raw)
To: qemu-block
Cc: Kevin Wolf, Max Reitz, Michael S . Tsirkin, Maxim Levitsky,
qemu-devel, Vitaly Mayatskikh
This driver uses kernel-mode acceleration for virtio-blk and
allows near-bare-metal disk performance inside a VM.
Signed-off-by: Vitaly Mayatskikh <v.mayatskih@gmail.com>
---
configure | 10 +
default-configs/virtio.mak | 1 +
hw/block/Makefile.objs | 1 +
hw/block/vhost-blk.c | 429 ++++++++++++++++++++++++++++++++++
hw/virtio/virtio-pci.c | 60 +++++
hw/virtio/virtio-pci.h | 19 ++
include/hw/virtio/vhost-blk.h | 43 ++++
7 files changed, 563 insertions(+)
create mode 100644 hw/block/vhost-blk.c
create mode 100644 include/hw/virtio/vhost-blk.h
diff --git a/configure b/configure
index 46ae1e8c76..787bc780da 100755
--- a/configure
+++ b/configure
@@ -371,6 +371,7 @@ vhost_crypto="no"
vhost_scsi="no"
vhost_vsock="no"
vhost_user=""
+vhost_blk=""
kvm="no"
hax="no"
hvf="no"
@@ -869,6 +870,7 @@ Linux)
vhost_crypto="yes"
vhost_scsi="yes"
vhost_vsock="yes"
+ vhost_blk="yes"
QEMU_INCLUDES="-I\$(SRC_PATH)/linux-headers -I$(pwd)/linux-headers $QEMU_INCLUDES"
supported_os="yes"
libudev="yes"
@@ -1263,6 +1265,10 @@ for opt do
;;
--enable-vhost-vsock) vhost_vsock="yes"
;;
+ --disable-vhost-blk) vhost_blk="no"
+ ;;
+ --enable-vhost-blk) vhost_blk="yes"
+ ;;
--disable-opengl) opengl="no"
;;
--enable-opengl) opengl="yes"
@@ -6000,6 +6006,7 @@ echo "vhost-crypto support $vhost_crypto"
echo "vhost-scsi support $vhost_scsi"
echo "vhost-vsock support $vhost_vsock"
echo "vhost-user support $vhost_user"
+echo "vhost-blk support $vhost_blk"
echo "Trace backends $trace_backends"
if have_backend "simple"; then
echo "Trace output file $trace_file-<pid>"
@@ -6461,6 +6468,9 @@ fi
if test "$vhost_user" = "yes" ; then
echo "CONFIG_VHOST_USER=y" >> $config_host_mak
fi
+if test "$vhost_blk" = "yes" ; then
+ echo "CONFIG_VHOST_BLK=y" >> $config_host_mak
+fi
if test "$blobs" = "yes" ; then
echo "INSTALL_BLOBS=yes" >> $config_host_mak
fi
diff --git a/default-configs/virtio.mak b/default-configs/virtio.mak
index 1304849018..765c0a2a04 100644
--- a/default-configs/virtio.mak
+++ b/default-configs/virtio.mak
@@ -1,5 +1,6 @@
CONFIG_VHOST_USER_SCSI=$(call land,$(CONFIG_VHOST_USER),$(CONFIG_LINUX))
CONFIG_VHOST_USER_BLK=$(call land,$(CONFIG_VHOST_USER),$(CONFIG_LINUX))
+CONFIG_VHOST_BLK=$(CONFIG_LINUX)
CONFIG_VIRTIO=y
CONFIG_VIRTIO_9P=y
CONFIG_VIRTIO_BALLOON=y
diff --git a/hw/block/Makefile.objs b/hw/block/Makefile.objs
index 53ce5751ae..857ce823fc 100644
--- a/hw/block/Makefile.objs
+++ b/hw/block/Makefile.objs
@@ -14,3 +14,4 @@ obj-$(CONFIG_SH4) += tc58128.o
obj-$(CONFIG_VIRTIO_BLK) += virtio-blk.o
obj-$(CONFIG_VIRTIO_BLK) += dataplane/
obj-$(CONFIG_VHOST_USER_BLK) += vhost-user-blk.o
+obj-$(CONFIG_VHOST_BLK) += vhost-blk.o
diff --git a/hw/block/vhost-blk.c b/hw/block/vhost-blk.c
new file mode 100644
index 0000000000..4ca8040ee7
--- /dev/null
+++ b/hw/block/vhost-blk.c
@@ -0,0 +1,429 @@
+/*
+ * vhost-blk host device
+ *
+ * Copyright(C) 2018 IBM Corporation
+ *
+ * Authors:
+ * Vitaly Mayatskikh <v.mayatskih@gmail.com>
+ *
+ * Largely based on the "vhost-user-blk.c" implemented by:
+ * Changpeng Liu <changpeng.liu@intel.com>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "qemu/cutils.h"
+#include "qom/object.h"
+#include "hw/qdev-core.h"
+#include "hw/virtio/vhost.h"
+#include "hw/virtio/vhost-blk.h"
+#include "hw/virtio/virtio.h"
+#include "hw/virtio/virtio-bus.h"
+#include "hw/virtio/virtio-access.h"
+#include <sys/ioctl.h>
+#include <linux/fs.h>
+
+static const int feature_bits[] = {
+ VIRTIO_BLK_F_SIZE_MAX,
+ VIRTIO_BLK_F_SEG_MAX,
+ VIRTIO_BLK_F_BLK_SIZE,
+ VIRTIO_BLK_F_TOPOLOGY,
+ VIRTIO_BLK_F_MQ,
+ VIRTIO_BLK_F_RO,
+ VIRTIO_BLK_F_FLUSH,
+ VIRTIO_BLK_F_CONFIG_WCE,
+ VIRTIO_F_VERSION_1,
+ VIRTIO_RING_F_INDIRECT_DESC,
+ VIRTIO_RING_F_EVENT_IDX,
+ VIRTIO_F_NOTIFY_ON_EMPTY,
+ VHOST_INVALID_FEATURE_BIT
+};
+
+static void vhost_blk_get_config(VirtIODevice *vdev, uint8_t *config)
+{
+ VHostBlk *s = VHOST_BLK(vdev);
+ memcpy(config, &s->blkcfg, sizeof(struct virtio_blk_config));
+}
+
+static void vhost_blk_set_config(VirtIODevice *vdev, const uint8_t *config)
+{
+ VHostBlk *s = VHOST_BLK(vdev);
+ struct virtio_blk_config *blkcfg = (struct virtio_blk_config *)config;
+ int ret;
+
+ if (blkcfg->wce == s->blkcfg.wce) {
+ return;
+ }
+
+ ret = vhost_dev_set_config(&s->dev, &blkcfg->wce,
+ offsetof(struct virtio_blk_config, wce),
+ sizeof(blkcfg->wce),
+ VHOST_SET_CONFIG_TYPE_MASTER);
+ if (ret) {
+ error_report("set device config space failed");
+ return;
+ }
+
+ s->blkcfg.wce = blkcfg->wce;
+}
+
+static int vhost_blk_handle_config_change(struct vhost_dev *dev)
+{
+ int ret;
+ struct virtio_blk_config blkcfg;
+ VHostBlk *s = VHOST_BLK(dev->vdev);
+
+ ret = vhost_dev_get_config(dev, (uint8_t *)&blkcfg,
+ sizeof(struct virtio_blk_config));
+ if (ret < 0) {
+ error_report("get config space failed");
+ return -1;
+ }
+
+ /* valid for resize only */
+ if (blkcfg.capacity != s->blkcfg.capacity) {
+ s->blkcfg.capacity = blkcfg.capacity;
+ memcpy(dev->vdev->config, &s->blkcfg, sizeof(struct virtio_blk_config));
+ virtio_notify_config(dev->vdev);
+ }
+
+ return 0;
+}
+
+const VhostDevConfigOps vhost_blk_ops = {
+ .vhost_dev_config_notifier = vhost_blk_handle_config_change,
+};
+
+static void vhost_blk_start(VirtIODevice *vdev)
+{
+ VHostBlk *s = VHOST_BLK(vdev);
+ BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
+ VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
+ int i, ret;
+
+ if (!k->set_guest_notifiers) {
+ error_report("binding does not support guest notifiers");
+ return;
+ }
+
+ ret = vhost_dev_enable_notifiers(&s->dev, vdev);
+ if (ret < 0) {
+ error_report("Error enabling host notifiers: %d", -ret);
+ return;
+ }
+
+ ret = k->set_guest_notifiers(qbus->parent, s->dev.nvqs, true);
+ if (ret < 0) {
+ error_report("Error binding guest notifier: %d", -ret);
+ goto err_host_notifiers;
+ }
+
+ s->dev.acked_features = vdev->guest_features;
+
+ ret = vhost_dev_start(&s->dev, vdev);
+ if (ret < 0) {
+ error_report("Error starting vhost: %d", -ret);
+ goto err_guest_notifiers;
+ }
+ /* guest_notifier_mask/pending not used yet, so just unmask
+ * everything here. virtio-pci will do the right thing by
+ * enabling/disabling irqfd.
+ */
+ for (i = 0; i < s->dev.nvqs; i++) {
+ vhost_virtqueue_mask(&s->dev, vdev, i, false);
+ }
+
+ s->bs_fd = open(blk_bs(s->blk)->filename, O_RDWR);
+ if (s->bs_fd < 0) {
+ error_report("Error opening backing store: %d", -errno);
+ goto err_ioctl;
+ }
+ ret = ioctl(s->vhostfd, _IOW(0xaf, 0x50, int), &s->bs_fd);
+ if (ret < 0) {
+ error_report("Error setting up backend: %d", -errno);
+ goto err_ioctl;
+ }
+
+ return;
+
+err_ioctl:
+ if (s->bs_fd > 0) {
+ close(s->bs_fd);
+ }
+ vhost_dev_stop(&s->dev, vdev);
+err_guest_notifiers:
+ k->set_guest_notifiers(qbus->parent, s->dev.nvqs, false);
+err_host_notifiers:
+ vhost_dev_disable_notifiers(&s->dev, vdev);
+}
+
+static void vhost_blk_stop(VirtIODevice *vdev)
+{
+ VHostBlk *s = VHOST_BLK(vdev);
+ BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
+ VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
+ int ret;
+
+ if (!k->set_guest_notifiers) {
+ return;
+ }
+
+ vhost_dev_stop(&s->dev, vdev);
+
+ ret = k->set_guest_notifiers(qbus->parent, s->dev.nvqs, false);
+ if (ret < 0) {
+ error_report("vhost guest notifier cleanup failed: %d", ret);
+ return;
+ }
+
+ vhost_dev_disable_notifiers(&s->dev, vdev);
+ close(s->bs_fd);
+}
+
+static void vhost_blk_set_status(VirtIODevice *vdev, uint8_t status)
+{
+ VHostBlk *s = VHOST_BLK(vdev);
+ bool should_start = status & VIRTIO_CONFIG_S_DRIVER_OK;
+
+ if (!vdev->vm_running) {
+ should_start = false;
+ }
+
+ if (s->dev.started == should_start) {
+ return;
+ }
+
+ if (should_start) {
+ vhost_blk_start(vdev);
+ } else {
+ vhost_blk_stop(vdev);
+ }
+
+}
+
+static uint64_t vhost_blk_get_features(VirtIODevice *vdev,
+ uint64_t features,
+ Error **errp)
+{
+ VHostBlk *s = VHOST_BLK(vdev);
+
+ /* Turn on pre-defined features */
+ virtio_add_feature(&features, VIRTIO_BLK_F_SIZE_MAX);
+ virtio_add_feature(&features, VIRTIO_BLK_F_SEG_MAX);
+ virtio_add_feature(&features, VIRTIO_BLK_F_TOPOLOGY);
+ virtio_add_feature(&features, VIRTIO_BLK_F_BLK_SIZE);
+ virtio_add_feature(&features, VIRTIO_BLK_F_FLUSH);
+ virtio_add_feature(&features, VIRTIO_BLK_F_RO);
+
+ if (s->config_wce) {
+ virtio_add_feature(&features, VIRTIO_BLK_F_CONFIG_WCE);
+ }
+ if (s->num_queues > 1) {
+ virtio_add_feature(&features, VIRTIO_BLK_F_MQ);
+ }
+
+ return vhost_get_features(&s->dev, feature_bits, features);
+}
+
+static void vhost_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
+{
+}
+
+static int vhost_blk_cfg_init(VHostBlk *s)
+{
+ int ret = -1;
+ uint64_t size64;
+ int size32;
+ int blksize;
+ int fd = -1;
+ struct stat st;
+ char *filename = blk_bs(s->blk)->filename;
+
+ fd = open(filename, O_RDWR);
+ if (fd < 0) {
+ error_report("Can't open device %s: %d", filename, errno);
+ goto out;
+ }
+
+ ret = fstat(fd, &st);
+ if (ret != 0) {
+ error_report("Can't fstat backend drive %s: %d", filename, errno);
+ goto out;
+ }
+ if (S_ISBLK(st.st_mode)) {
+ ret = ioctl(fd, BLKGETSIZE64, &size64);
+ if (ret != 0 && (errno == ENOTTY)) {
+ ret = ioctl(fd, BLKGETSIZE, &size32);
+ size64 = size32;
+ }
+ if (ret != 0) {
+ error_report("Can't get drive size: %d", errno);
+ goto out;
+ }
+ ret = ioctl(fd, BLKSSZGET, &blksize);
+ if (ret != 0) {
+ error_report("Can't get logical sector size, assuming 512: %d",
+ errno);
+ blksize = 512;
+ }
+ } else {
+ size64 = st.st_size;
+ blksize = st.st_blksize;
+ }
+
+ s->blkcfg.capacity = size64 / 512;
+ s->blkcfg.blk_size = blksize;
+ s->blkcfg.physical_block_exp = 0;
+ s->blkcfg.num_queues = s->num_queues;
+ /* TODO query actual block device */
+ s->blkcfg.size_max = 8192;
+ s->blkcfg.seg_max = 8192 / 512;
+ s->blkcfg.min_io_size = 512;
+ s->blkcfg.opt_io_size = 8192;
+
+out:
+ if (fd >= 0) {
+ close(fd);
+ }
+ return ret;
+}
+
+static void vhost_blk_device_realize(DeviceState *dev, Error **errp)
+{
+ VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+ VHostBlk *s = VHOST_BLK(vdev);
+ int i, ret;
+
+ if (!s->blk) {
+ error_setg(errp, "drive property not set");
+ return;
+ }
+ if (!blk_is_inserted(s->blk)) {
+ error_setg(errp, "Device needs media, but drive is empty");
+ return;
+ }
+
+ if (!s->num_queues || s->num_queues > VIRTIO_QUEUE_MAX) {
+ error_setg(errp, "vhost-blk: invalid number of IO queues");
+ return;
+ }
+
+ if (!s->queue_size) {
+ error_setg(errp, "vhost-blk: queue size must be non-zero");
+ return;
+ }
+
+ virtio_init(vdev, "virtio-blk", VIRTIO_ID_BLOCK,
+ sizeof(struct virtio_blk_config));
+
+ s->dev.max_queues = s->num_queues;
+ s->dev.nvqs = s->num_queues;
+ s->dev.vqs = g_new(struct vhost_virtqueue, s->dev.nvqs);
+ s->dev.vq_index = 0;
+ s->dev.backend_features = 0;
+
+ vhost_dev_set_config_notifier(&s->dev, &vhost_blk_ops);
+
+ for (i = 0; i < s->dev.max_queues; i++) {
+ virtio_add_queue(vdev, s->queue_size,
+ vhost_blk_handle_output);
+ }
+
+ s->vhostfd = open("/dev/vhost-blk", O_RDWR);
+ if (s->vhostfd < 0) {
+ error_setg_errno(errp, errno,
+ "vhost-blk: failed to open vhost device");
+ goto virtio_err;
+ }
+
+ ret = vhost_dev_init(&s->dev, (void *)(uintptr_t)s->vhostfd,
+ VHOST_BACKEND_TYPE_KERNEL, 0);
+ if (ret < 0) {
+ error_setg(errp, "vhost-blk: vhost initialization failed: %s",
+ strerror(-ret));
+ goto virtio_err;
+ }
+
+ vhost_blk_cfg_init(s);
+ blk_iostatus_enable(s->blk);
+ return;
+
+virtio_err:
+ g_free(s->dev.vqs);
+ virtio_cleanup(vdev);
+ close(s->vhostfd);
+}
+
+static void vhost_blk_device_unrealize(DeviceState *dev, Error **errp)
+{
+ VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+ VHostBlk *s = VHOST_BLK(dev);
+
+ vhost_blk_set_status(vdev, 0);
+ close(s->vhostfd);
+ vhost_dev_cleanup(&s->dev);
+ g_free(s->dev.vqs);
+ virtio_cleanup(vdev);
+}
+
+static void vhost_blk_instance_init(Object *obj)
+{
+ VHostBlk *s = VHOST_BLK(obj);
+
+ device_add_bootindex_property(obj, &s->bootindex, "bootindex",
+ "/disk@0,0", DEVICE(obj), NULL);
+}
+
+static const VMStateDescription vmstate_vhost_blk = {
+ .name = "vhost-blk",
+ .minimum_version_id = 1,
+ .version_id = 1,
+ .fields = (VMStateField[]) {
+ VMSTATE_VIRTIO_DEVICE,
+ VMSTATE_END_OF_LIST()
+ },
+};
+
+static Property vhost_blk_properties[] = {
+ DEFINE_PROP_DRIVE("drive", VHostBlk, blk),
+ DEFINE_PROP_UINT16("num-queues", VHostBlk, num_queues, 1),
+ DEFINE_PROP_UINT32("queue-size", VHostBlk, queue_size, 128),
+ DEFINE_PROP_BIT("config-wce", VHostBlk, config_wce, 0, true),
+ DEFINE_PROP_END_OF_LIST(),
+};
+
+static void vhost_blk_class_init(ObjectClass *klass, void *data)
+{
+ DeviceClass *dc = DEVICE_CLASS(klass);
+ VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
+
+ dc->props = vhost_blk_properties;
+ dc->vmsd = &vmstate_vhost_blk;
+ set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
+ vdc->realize = vhost_blk_device_realize;
+ vdc->unrealize = vhost_blk_device_unrealize;
+ vdc->get_config = vhost_blk_get_config;
+ vdc->set_config = vhost_blk_set_config;
+ vdc->get_features = vhost_blk_get_features;
+ vdc->set_status = vhost_blk_set_status;
+}
+
+static const TypeInfo vhost_blk_info = {
+ .name = TYPE_VHOST_BLK,
+ .parent = TYPE_VIRTIO_DEVICE,
+ .instance_size = sizeof(VHostBlk),
+ .instance_init = vhost_blk_instance_init,
+ .class_init = vhost_blk_class_init,
+};
+
+static void virtio_register_types(void)
+{
+ type_register_static(&vhost_blk_info);
+}
+
+type_init(virtio_register_types)
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index a954799267..ec00b54424 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -2060,6 +2060,63 @@ static const TypeInfo vhost_user_blk_pci_info = {
};
#endif
+#ifdef CONFIG_VHOST_BLK
+/* vhost-blk */
+
+static Property vhost_blk_pci_properties[] = {
+ DEFINE_PROP_UINT32("class", VirtIOPCIProxy, class_code, 0),
+ DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors,
+ DEV_NVECTORS_UNSPECIFIED),
+ DEFINE_PROP_END_OF_LIST(),
+};
+
+static void vhost_blk_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
+{
+ VHostBlkPCI *dev = VHOST_BLK_PCI(vpci_dev);
+ DeviceState *vdev = DEVICE(&dev->vdev);
+
+ if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) {
+ vpci_dev->nvectors = dev->vdev.num_queues + 1;
+ }
+
+ qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
+ object_property_set_bool(OBJECT(vdev), true, "realized", errp);
+}
+
+static void vhost_blk_pci_class_init(ObjectClass *klass, void *data)
+{
+ DeviceClass *dc = DEVICE_CLASS(klass);
+ VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
+ PCIDeviceClass *pcidev_k = PCI_DEVICE_CLASS(klass);
+
+ set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
+ dc->props = vhost_blk_pci_properties;
+ k->realize = vhost_blk_pci_realize;
+ pcidev_k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;
+ pcidev_k->device_id = PCI_DEVICE_ID_VIRTIO_BLOCK;
+ pcidev_k->revision = VIRTIO_PCI_ABI_VERSION;
+ pcidev_k->class_id = PCI_CLASS_STORAGE_SCSI;
+}
+
+static void vhost_blk_pci_instance_init(Object *obj)
+{
+ VHostBlkPCI *dev = VHOST_BLK_PCI(obj);
+
+ virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
+ TYPE_VHOST_BLK);
+ object_property_add_alias(obj, "bootindex", OBJECT(&dev->vdev),
+ "bootindex", &error_abort);
+}
+
+static const TypeInfo vhost_blk_pci_info = {
+ .name = TYPE_VHOST_BLK_PCI,
+ .parent = TYPE_VIRTIO_PCI,
+ .instance_size = sizeof(VHostBlkPCI),
+ .instance_init = vhost_blk_pci_instance_init,
+ .class_init = vhost_blk_pci_class_init,
+};
+#endif
+
/* virtio-scsi-pci */
static Property virtio_scsi_pci_properties[] = {
@@ -2723,6 +2780,9 @@ static void virtio_pci_register_types(void)
#ifdef CONFIG_VHOST_VSOCK
type_register_static(&vhost_vsock_pci_info);
#endif
+#ifdef CONFIG_VHOST_BLK
+ type_register_static(&vhost_blk_pci_info);
+#endif
}
type_init(virtio_pci_register_types)
diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
index 813082b0d7..4cca04a45a 100644
--- a/hw/virtio/virtio-pci.h
+++ b/hw/virtio/virtio-pci.h
@@ -31,6 +31,10 @@
#include "hw/virtio/vhost-user-blk.h"
#endif
+#ifdef CONFIG_VHOST_BLK
+#include "hw/virtio/vhost-blk.h"
+#endif
+
#ifdef CONFIG_VIRTFS
#include "hw/9pfs/virtio-9p.h"
#endif
@@ -50,6 +54,7 @@ typedef struct VirtIONetPCI VirtIONetPCI;
typedef struct VHostSCSIPCI VHostSCSIPCI;
typedef struct VHostUserSCSIPCI VHostUserSCSIPCI;
typedef struct VHostUserBlkPCI VHostUserBlkPCI;
+typedef struct VHostBlkPCI VHostBlkPCI;
typedef struct VirtIORngPCI VirtIORngPCI;
typedef struct VirtIOInputPCI VirtIOInputPCI;
typedef struct VirtIOInputHIDPCI VirtIOInputHIDPCI;
@@ -262,6 +267,20 @@ struct VHostUserBlkPCI {
};
#endif
+#ifdef CONFIG_VHOST_BLK
+/*
+ * vhost-blk-pci: This extends VirtioPCIProxy.
+ */
+#define TYPE_VHOST_BLK_PCI "vhost-blk-pci"
+#define VHOST_BLK_PCI(obj) \
+ OBJECT_CHECK(VHostBlkPCI, (obj), TYPE_VHOST_BLK_PCI)
+
+struct VHostBlkPCI {
+ VirtIOPCIProxy parent_obj;
+ VHostBlk vdev;
+};
+#endif
+
/*
* virtio-blk-pci: This extends VirtioPCIProxy.
*/
diff --git a/include/hw/virtio/vhost-blk.h b/include/hw/virtio/vhost-blk.h
new file mode 100644
index 0000000000..6dc5d9634a
--- /dev/null
+++ b/include/hw/virtio/vhost-blk.h
@@ -0,0 +1,43 @@
+/*
+ * vhost-blk host device
+ * Copyright(C) 2018 IBM Corporation.
+ *
+ * Authors:
+ * Vitaly Mayatskikh <v.mayatskih@gmail.com>
+ *
+ * Based on vhost-user-blk.h, Copyright Intel, Corp. 2017
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#ifndef VHOST_BLK_H
+#define VHOST_BLK_H
+
+#include "standard-headers/linux/virtio_blk.h"
+#include "qemu-common.h"
+#include "hw/qdev.h"
+#include "hw/block/block.h"
+#include "hw/virtio/vhost.h"
+#include "sysemu/block-backend.h"
+
+#define TYPE_VHOST_BLK "vhost-blk"
+#define VHOST_BLK(obj) \
+ OBJECT_CHECK(VHostBlk, (obj), TYPE_VHOST_BLK)
+
+typedef struct VHostBlk {
+ VirtIODevice parent_obj;
+ BlockConf conf;
+ BlockBackend *blk;
+ int32_t bootindex;
+ struct virtio_blk_config blkcfg;
+ uint16_t num_queues;
+ uint32_t queue_size;
+ uint32_t config_wce;
+ int vhostfd;
+ int bs_fd;
+ struct vhost_dev dev;
+} VHostBlk;
+
+#endif
--
2.17.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] [PATCH 1/1 V2] Add vhost-pci-blk driver
2018-11-05 20:56 ` [Qemu-devel] [PATCH 1/1 " Vitaly Mayatskikh
@ 2018-11-08 14:09 ` Dongli Zhang
2018-11-08 16:47 ` Michael S. Tsirkin
2018-11-08 14:50 ` Kevin Wolf
1 sibling, 1 reply; 7+ messages in thread
From: Dongli Zhang @ 2018-11-08 14:09 UTC (permalink / raw)
To: Vitaly Mayatskikh, qemu-block
Cc: Kevin Wolf, Michael S . Tsirkin, qemu-devel, Maxim Levitsky,
Max Reitz
It looks like the kernel-space vhost-blk can only process raw images.
How about verifying that only a raw image is used in the drive command line
when vhost-blk-pci is paired with it?
Otherwise, vhost-blk-pci might end up working with a qcow2 image without any
warning on the QEMU side.
Dongli Zhang
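
(To illustrate the hazard: the patch opens the drive's file name directly and hands that descriptor to the kernel, so a qcow2 file would be served as if its bytes were raw disk contents. Inside QEMU the natural check would look at the drive's configured format. The sketch below is a different, standalone technique that sniffs the qcow magic bytes in a file header; the function name is hypothetical and nothing like it exists in the patch.)

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* qcow and qcow2 images start with the bytes 'Q', 'F', 'I', 0xfb. */
static const unsigned char QCOW_MAGIC[4] = { 'Q', 'F', 'I', 0xfb };

/*
 * Hypothetical helper: return true if the first `len` bytes of an image
 * header carry the qcow magic. A raw image has no header, so a false
 * result does not prove that the image is raw.
 */
static bool looks_like_qcow(const unsigned char *header, size_t len)
{
    assert(header != NULL);
    return len >= sizeof(QCOW_MAGIC) &&
           memcmp(header, QCOW_MAGIC, sizeof(QCOW_MAGIC)) == 0;
}
```

A device model could refuse to start when such a check matches, instead of exposing image metadata to the guest as disk data.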
On 11/06/2018 04:56 AM, Vitaly Mayatskikh wrote:
> This driver uses the kernel-mode acceleration for virtio-blk and
> allows to get a near bare metal disk performance inside a VM.
>
> Signed-off-by: Vitaly Mayatskikh <v.mayatskih@gmail.com>
> ---
> configure | 10 +
> default-configs/virtio.mak | 1 +
> hw/block/Makefile.objs | 1 +
> hw/block/vhost-blk.c | 429 ++++++++++++++++++++++++++++++++++
> hw/virtio/virtio-pci.c | 60 +++++
> hw/virtio/virtio-pci.h | 19 ++
> include/hw/virtio/vhost-blk.h | 43 ++++
> 7 files changed, 563 insertions(+)
> create mode 100644 hw/block/vhost-blk.c
> create mode 100644 include/hw/virtio/vhost-blk.h
>
> diff --git a/configure b/configure
> index 46ae1e8c76..787bc780da 100755
> --- a/configure
> +++ b/configure
> @@ -371,6 +371,7 @@ vhost_crypto="no"
> vhost_scsi="no"
> vhost_vsock="no"
> vhost_user=""
> +vhost_blk=""
> kvm="no"
> hax="no"
> hvf="no"
> @@ -869,6 +870,7 @@ Linux)
> vhost_crypto="yes"
> vhost_scsi="yes"
> vhost_vsock="yes"
> + vhost_blk="yes"
> QEMU_INCLUDES="-I\$(SRC_PATH)/linux-headers -I$(pwd)/linux-headers $QEMU_INCLUDES"
> supported_os="yes"
> libudev="yes"
> @@ -1263,6 +1265,10 @@ for opt do
> ;;
> --enable-vhost-vsock) vhost_vsock="yes"
> ;;
> + --disable-vhost-blk) vhost_blk="no"
> + ;;
> + --enable-vhost-blk) vhost_blk="yes"
> + ;;
> --disable-opengl) opengl="no"
> ;;
> --enable-opengl) opengl="yes"
> @@ -6000,6 +6006,7 @@ echo "vhost-crypto support $vhost_crypto"
> echo "vhost-scsi support $vhost_scsi"
> echo "vhost-vsock support $vhost_vsock"
> echo "vhost-user support $vhost_user"
> +echo "vhost-blk support $vhost_blk"
> echo "Trace backends $trace_backends"
> if have_backend "simple"; then
> echo "Trace output file $trace_file-<pid>"
> @@ -6461,6 +6468,9 @@ fi
> if test "$vhost_user" = "yes" ; then
> echo "CONFIG_VHOST_USER=y" >> $config_host_mak
> fi
> +if test "$vhost_blk" = "yes" ; then
> + echo "CONFIG_VHOST_BLK=y" >> $config_host_mak
> +fi
> if test "$blobs" = "yes" ; then
> echo "INSTALL_BLOBS=yes" >> $config_host_mak
> fi
> diff --git a/default-configs/virtio.mak b/default-configs/virtio.mak
> index 1304849018..765c0a2a04 100644
> --- a/default-configs/virtio.mak
> +++ b/default-configs/virtio.mak
> @@ -1,5 +1,6 @@
> CONFIG_VHOST_USER_SCSI=$(call land,$(CONFIG_VHOST_USER),$(CONFIG_LINUX))
> CONFIG_VHOST_USER_BLK=$(call land,$(CONFIG_VHOST_USER),$(CONFIG_LINUX))
> +CONFIG_VHOST_BLK=$(CONFIG_LINUX)
> CONFIG_VIRTIO=y
> CONFIG_VIRTIO_9P=y
> CONFIG_VIRTIO_BALLOON=y
> diff --git a/hw/block/Makefile.objs b/hw/block/Makefile.objs
> index 53ce5751ae..857ce823fc 100644
> --- a/hw/block/Makefile.objs
> +++ b/hw/block/Makefile.objs
> @@ -14,3 +14,4 @@ obj-$(CONFIG_SH4) += tc58128.o
> obj-$(CONFIG_VIRTIO_BLK) += virtio-blk.o
> obj-$(CONFIG_VIRTIO_BLK) += dataplane/
> obj-$(CONFIG_VHOST_USER_BLK) += vhost-user-blk.o
> +obj-$(CONFIG_VHOST_BLK) += vhost-blk.o
> diff --git a/hw/block/vhost-blk.c b/hw/block/vhost-blk.c
> new file mode 100644
> index 0000000000..4ca8040ee7
> --- /dev/null
> +++ b/hw/block/vhost-blk.c
> @@ -0,0 +1,429 @@
> +/*
> + * vhost-blk host device
> + *
> + * Copyright(C) 2018 IBM Corporation
> + *
> + * Authors:
> + * Vitaly Mayatskikh <v.mayatskih@gmail.com>
> + *
> + * Largely based on the "vhost-user-blk.c" implemented by:
> + * Changpeng Liu <changpeng.liu@intel.com>
> + *
> + * This work is licensed under the terms of the GNU LGPL, version 2 or later.
> + * See the COPYING.LIB file in the top-level directory.
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "qemu/error-report.h"
> +#include "qemu/cutils.h"
> +#include "qom/object.h"
> +#include "hw/qdev-core.h"
> +#include "hw/virtio/vhost.h"
> +#include "hw/virtio/vhost-blk.h"
> +#include "hw/virtio/virtio.h"
> +#include "hw/virtio/virtio-bus.h"
> +#include "hw/virtio/virtio-access.h"
> +#include <sys/ioctl.h>
> +#include <linux/fs.h>
> +
> +static const int feature_bits[] = {
> + VIRTIO_BLK_F_SIZE_MAX,
> + VIRTIO_BLK_F_SEG_MAX,
> + VIRTIO_BLK_F_BLK_SIZE,
> + VIRTIO_BLK_F_TOPOLOGY,
> + VIRTIO_BLK_F_MQ,
> + VIRTIO_BLK_F_RO,
> + VIRTIO_BLK_F_FLUSH,
> + VIRTIO_BLK_F_CONFIG_WCE,
> + VIRTIO_F_VERSION_1,
> + VIRTIO_RING_F_INDIRECT_DESC,
> + VIRTIO_RING_F_EVENT_IDX,
> + VIRTIO_F_NOTIFY_ON_EMPTY,
> + VHOST_INVALID_FEATURE_BIT
> +};
> +
> +static void vhost_blk_get_config(VirtIODevice *vdev, uint8_t *config)
> +{
> + VHostBlk *s = VHOST_BLK(vdev);
> + memcpy(config, &s->blkcfg, sizeof(struct virtio_blk_config));
> +}
> +
> +static void vhost_blk_set_config(VirtIODevice *vdev, const uint8_t *config)
> +{
> + VHostBlk *s = VHOST_BLK(vdev);
> + struct virtio_blk_config *blkcfg = (struct virtio_blk_config *)config;
> + int ret;
> +
> + if (blkcfg->wce == s->blkcfg.wce) {
> + return;
> + }
> +
> + ret = vhost_dev_set_config(&s->dev, &blkcfg->wce,
> + offsetof(struct virtio_blk_config, wce),
> + sizeof(blkcfg->wce),
> + VHOST_SET_CONFIG_TYPE_MASTER);
> + if (ret) {
> + error_report("set device config space failed");
> + return;
> + }
> +
> + s->blkcfg.wce = blkcfg->wce;
> +}
> +
> +static int vhost_blk_handle_config_change(struct vhost_dev *dev)
> +{
> + int ret;
> + struct virtio_blk_config blkcfg;
> + VHostBlk *s = VHOST_BLK(dev->vdev);
> +
> + ret = vhost_dev_get_config(dev, (uint8_t *)&blkcfg,
> + sizeof(struct virtio_blk_config));
> + if (ret < 0) {
> + error_report("get config space failed");
> + return -1;
> + }
> +
> + /* valid for resize only */
> + if (blkcfg.capacity != s->blkcfg.capacity) {
> + s->blkcfg.capacity = blkcfg.capacity;
> + memcpy(dev->vdev->config, &s->blkcfg, sizeof(struct virtio_blk_config));
> + virtio_notify_config(dev->vdev);
> + }
> +
> + return 0;
> +}
> +
> +const VhostDevConfigOps vhost_blk_ops = {
> + .vhost_dev_config_notifier = vhost_blk_handle_config_change,
> +};
> +
> +static void vhost_blk_start(VirtIODevice *vdev)
> +{
> + VHostBlk *s = VHOST_BLK(vdev);
> + BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
> + VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
> + int i, ret;
> +
> + if (!k->set_guest_notifiers) {
> + error_report("binding does not support guest notifiers");
> + return;
> + }
> +
> + ret = vhost_dev_enable_notifiers(&s->dev, vdev);
> + if (ret < 0) {
> + error_report("Error enabling host notifiers: %d", -ret);
> + return;
> + }
> +
> + ret = k->set_guest_notifiers(qbus->parent, s->dev.nvqs, true);
> + if (ret < 0) {
> + error_report("Error binding guest notifier: %d", -ret);
> + goto err_host_notifiers;
> + }
> +
> + s->dev.acked_features = vdev->guest_features;
> +
> + ret = vhost_dev_start(&s->dev, vdev);
> + if (ret < 0) {
> + error_report("Error starting vhost: %d", -ret);
> + goto err_guest_notifiers;
> + }
> + /* guest_notifier_mask/pending not used yet, so just unmask
> + * everything here. virtio-pci will do the right thing by
> + * enabling/disabling irqfd.
> + */
> + for (i = 0; i < s->dev.nvqs; i++) {
> + vhost_virtqueue_mask(&s->dev, vdev, i, false);
> + }
> +
> + s->bs_fd = open(blk_bs(s->blk)->filename, O_RDWR);
> + if (s->bs_fd < 0) {
> + error_report("Error opening backing store: %d", -errno);
> + goto err_ioctl;
> + }
> + ret = ioctl(s->vhostfd, _IOW(0xaf, 0x50, int), &s->bs_fd);
> + if (ret < 0) {
> + error_report("Error setting up backend: %d", -errno);
> + goto err_ioctl;
> + }
> +
> + return;
> +
> +err_ioctl:
> + if (s->bs_fd > 0) {
> + close(s->bs_fd);
> + }
> + vhost_dev_stop(&s->dev, vdev);
> +err_guest_notifiers:
> + k->set_guest_notifiers(qbus->parent, s->dev.nvqs, false);
> +err_host_notifiers:
> + vhost_dev_disable_notifiers(&s->dev, vdev);
> +}
> +
> +static void vhost_blk_stop(VirtIODevice *vdev)
> +{
> + VHostBlk *s = VHOST_BLK(vdev);
> + BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
> + VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
> + int ret;
> +
> + if (!k->set_guest_notifiers) {
> + return;
> + }
> +
> + vhost_dev_stop(&s->dev, vdev);
> +
> + ret = k->set_guest_notifiers(qbus->parent, s->dev.nvqs, false);
> + if (ret < 0) {
> + error_report("vhost guest notifier cleanup failed: %d", ret);
> + return;
> + }
> +
> + vhost_dev_disable_notifiers(&s->dev, vdev);
> + close(s->bs_fd);
> +}
> +
> +static void vhost_blk_set_status(VirtIODevice *vdev, uint8_t status)
> +{
> + VHostBlk *s = VHOST_BLK(vdev);
> + bool should_start = status & VIRTIO_CONFIG_S_DRIVER_OK;
> +
> + if (!vdev->vm_running) {
> + should_start = false;
> + }
> +
> + if (s->dev.started == should_start) {
> + return;
> + }
> +
> + if (should_start) {
> + vhost_blk_start(vdev);
> + } else {
> + vhost_blk_stop(vdev);
> + }
> +
> +}
> +
> +static uint64_t vhost_blk_get_features(VirtIODevice *vdev,
> + uint64_t features,
> + Error **errp)
> +{
> + VHostBlk *s = VHOST_BLK(vdev);
> +
> + /* Turn on pre-defined features */
> + virtio_add_feature(&features, VIRTIO_BLK_F_SIZE_MAX);
> + virtio_add_feature(&features, VIRTIO_BLK_F_SEG_MAX);
> + virtio_add_feature(&features, VIRTIO_BLK_F_TOPOLOGY);
> + virtio_add_feature(&features, VIRTIO_BLK_F_BLK_SIZE);
> + virtio_add_feature(&features, VIRTIO_BLK_F_FLUSH);
> + virtio_add_feature(&features, VIRTIO_BLK_F_RO);
> +
> + if (s->config_wce) {
> + virtio_add_feature(&features, VIRTIO_BLK_F_CONFIG_WCE);
> + }
> + if (s->num_queues > 1) {
> + virtio_add_feature(&features, VIRTIO_BLK_F_MQ);
> + }
> +
> + return vhost_get_features(&s->dev, feature_bits, features);
> +}
> +
> +static void vhost_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
> +{
> +}
> +
> +static int vhost_blk_cfg_init(VHostBlk *s)
> +{
> + int ret;
> + uint64_t size64;
> + int size32;
> + int blksize;
> + int fd = -1;
> + struct stat st;
> + char *filename = blk_bs(s->blk)->filename;
> +
> + fd = open(filename, O_RDWR);
> + if (fd < 0) {
> + error_report("Can't open device %s: %d", filename, errno);
> + goto out;
> + }
> +
> + ret = fstat(fd, &st);
> + if (ret != 0) {
> + error_report("Can't fstat backend drive %s: %d", filename, errno);
> + goto out;
> + }
> + if (S_ISBLK(st.st_mode)) {
> + ret = ioctl(fd, BLKGETSIZE64, &size64);
> + if (ret != 0 && (errno == ENOTTY)) {
> + ret = ioctl(fd, BLKGETSIZE, &size32);
> + size64 = size32;
> + }
> + if (ret != 0) {
> + error_report("Can't get drive size: %d", errno);
> + goto out;
> + }
> + ret = ioctl(fd, BLKSSZGET, &blksize);
> + if (ret != 0) {
> + error_report("Can't get logical sector size, assuming 512: %d",
> + errno);
> + blksize = 512;
> + }
> + } else {
> + size64 = st.st_size;
> + blksize = st.st_blksize;
> + }
> +
> + s->blkcfg.capacity = size64 / 512;
> + s->blkcfg.blk_size = blksize;
> + s->blkcfg.physical_block_exp = 0;
> + s->blkcfg.num_queues = s->num_queues;
> + /* TODO query actual block device */
> + s->blkcfg.size_max = 8192;
> + s->blkcfg.seg_max = 8192 / 512;
> + s->blkcfg.min_io_size = 512;
> + s->blkcfg.opt_io_size = 8192;
> +
> +out:
> + if (fd >= 0) {
> + close(fd);
> + }
> + return ret;
> +}
> +
> +static void vhost_blk_device_realize(DeviceState *dev, Error **errp)
> +{
> + VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> + VHostBlk *s = VHOST_BLK(vdev);
> + int i, ret;
> +
> + if (!s->blk) {
> + error_setg(errp, "drive property not set");
> + return;
> + }
> + if (!blk_is_inserted(s->blk)) {
> + error_setg(errp, "Device needs media, but drive is empty");
> + return;
> + }
> +
> + if (!s->num_queues || s->num_queues > VIRTIO_QUEUE_MAX) {
> + error_setg(errp, "vhost-blk: invalid number of IO queues");
> + return;
> + }
> +
> + if (!s->queue_size) {
> + error_setg(errp, "vhost-blk: queue size must be non-zero");
> + return;
> + }
> +
> + virtio_init(vdev, "virtio-blk", VIRTIO_ID_BLOCK,
> + sizeof(struct virtio_blk_config));
> +
> + s->dev.max_queues = s->num_queues;
> + s->dev.nvqs = s->num_queues;
> + s->dev.vqs = g_new(struct vhost_virtqueue, s->dev.nvqs);
> + s->dev.vq_index = 0;
> + s->dev.backend_features = 0;
> +
> + vhost_dev_set_config_notifier(&s->dev, &vhost_blk_ops);
> +
> + for (i = 0; i < s->dev.max_queues; i++) {
> + virtio_add_queue(vdev, s->queue_size,
> + vhost_blk_handle_output);
> + }
> +
> + s->vhostfd = open("/dev/vhost-blk", O_RDWR);
> + if (s->vhostfd < 0) {
> + error_setg_errno(errp, errno,
> + "vhost-blk: failed to open vhost device");
> + goto virtio_err;
> + }
> +
> + ret = vhost_dev_init(&s->dev, (void *)(uintptr_t)s->vhostfd,
> + VHOST_BACKEND_TYPE_KERNEL, 0);
> + if (ret < 0) {
> + error_setg(errp, "vhost-blk: vhost initialization failed: %s",
> + strerror(-ret));
> + goto virtio_err;
> + }
> +
> + vhost_blk_cfg_init(s);
> + blk_iostatus_enable(s->blk);
> + return;
> +
> +virtio_err:
> + g_free(s->dev.vqs);
> + virtio_cleanup(vdev);
> + close(s->vhostfd);
> +}
> +
> +static void vhost_blk_device_unrealize(DeviceState *dev, Error **errp)
> +{
> + VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> + VHostBlk *s = VHOST_BLK(dev);
> +
> + vhost_blk_set_status(vdev, 0);
> + close(s->vhostfd);
> + vhost_dev_cleanup(&s->dev);
> + g_free(s->dev.vqs);
> + virtio_cleanup(vdev);
> +}
> +
> +static void vhost_blk_instance_init(Object *obj)
> +{
> + VHostBlk *s = VHOST_BLK(obj);
> +
> + device_add_bootindex_property(obj, &s->bootindex, "bootindex",
> + "/disk@0,0", DEVICE(obj), NULL);
> +}
> +
> +static const VMStateDescription vmstate_vhost_blk = {
> + .name = "vhost-blk",
> + .minimum_version_id = 1,
> + .version_id = 1,
> + .fields = (VMStateField[]) {
> + VMSTATE_VIRTIO_DEVICE,
> + VMSTATE_END_OF_LIST()
> + },
> +};
> +
> +static Property vhost_blk_properties[] = {
> + DEFINE_PROP_DRIVE("drive", VHostBlk, blk),
> + DEFINE_PROP_UINT16("num-queues", VHostBlk, num_queues, 1),
> + DEFINE_PROP_UINT32("queue-size", VHostBlk, queue_size, 128),
> + DEFINE_PROP_BIT("config-wce", VHostBlk, config_wce, 0, true),
> + DEFINE_PROP_END_OF_LIST(),
> +};
> +
> +static void vhost_blk_class_init(ObjectClass *klass, void *data)
> +{
> + DeviceClass *dc = DEVICE_CLASS(klass);
> + VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
> +
> + dc->props = vhost_blk_properties;
> + dc->vmsd = &vmstate_vhost_blk;
> + set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
> + vdc->realize = vhost_blk_device_realize;
> + vdc->unrealize = vhost_blk_device_unrealize;
> + vdc->get_config = vhost_blk_get_config;
> + vdc->set_config = vhost_blk_set_config;
> + vdc->get_features = vhost_blk_get_features;
> + vdc->set_status = vhost_blk_set_status;
> +}
> +
> +static const TypeInfo vhost_blk_info = {
> + .name = TYPE_VHOST_BLK,
> + .parent = TYPE_VIRTIO_DEVICE,
> + .instance_size = sizeof(VHostBlk),
> + .instance_init = vhost_blk_instance_init,
> + .class_init = vhost_blk_class_init,
> +};
> +
> +static void virtio_register_types(void)
> +{
> + type_register_static(&vhost_blk_info);
> +}
> +
> +type_init(virtio_register_types)
> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> index a954799267..ec00b54424 100644
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -2060,6 +2060,63 @@ static const TypeInfo vhost_user_blk_pci_info = {
> };
> #endif
>
> +#ifdef CONFIG_VHOST_BLK
> +/* vhost-blk */
> +
> +static Property vhost_blk_pci_properties[] = {
> + DEFINE_PROP_UINT32("class", VirtIOPCIProxy, class_code, 0),
> + DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors,
> + DEV_NVECTORS_UNSPECIFIED),
> + DEFINE_PROP_END_OF_LIST(),
> +};
> +
> +static void vhost_blk_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
> +{
> + VHostBlkPCI *dev = VHOST_BLK_PCI(vpci_dev);
> + DeviceState *vdev = DEVICE(&dev->vdev);
> +
> + if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) {
> + vpci_dev->nvectors = dev->vdev.num_queues + 1;
> + }
> +
> + qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
> + object_property_set_bool(OBJECT(vdev), true, "realized", errp);
> +}
> +
> +static void vhost_blk_pci_class_init(ObjectClass *klass, void *data)
> +{
> + DeviceClass *dc = DEVICE_CLASS(klass);
> + VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
> + PCIDeviceClass *pcidev_k = PCI_DEVICE_CLASS(klass);
> +
> + set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
> + dc->props = vhost_blk_pci_properties;
> + k->realize = vhost_blk_pci_realize;
> + pcidev_k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;
> + pcidev_k->device_id = PCI_DEVICE_ID_VIRTIO_BLOCK;
> + pcidev_k->revision = VIRTIO_PCI_ABI_VERSION;
> + pcidev_k->class_id = PCI_CLASS_STORAGE_SCSI;
> +}
> +
> +static void vhost_blk_pci_instance_init(Object *obj)
> +{
> + VHostBlkPCI *dev = VHOST_BLK_PCI(obj);
> +
> + virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
> + TYPE_VHOST_BLK);
> + object_property_add_alias(obj, "bootindex", OBJECT(&dev->vdev),
> + "bootindex", &error_abort);
> +}
> +
> +static const TypeInfo vhost_blk_pci_info = {
> + .name = TYPE_VHOST_BLK_PCI,
> + .parent = TYPE_VIRTIO_PCI,
> + .instance_size = sizeof(VHostBlkPCI),
> + .instance_init = vhost_blk_pci_instance_init,
> + .class_init = vhost_blk_pci_class_init,
> +};
> +#endif
> +
> /* virtio-scsi-pci */
>
> static Property virtio_scsi_pci_properties[] = {
> @@ -2723,6 +2780,9 @@ static void virtio_pci_register_types(void)
> #ifdef CONFIG_VHOST_VSOCK
> type_register_static(&vhost_vsock_pci_info);
> #endif
> +#ifdef CONFIG_VHOST_BLK
> + type_register_static(&vhost_blk_pci_info);
> +#endif
> }
>
> type_init(virtio_pci_register_types)
> diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
> index 813082b0d7..4cca04a45a 100644
> --- a/hw/virtio/virtio-pci.h
> +++ b/hw/virtio/virtio-pci.h
> @@ -31,6 +31,10 @@
> #include "hw/virtio/vhost-user-blk.h"
> #endif
>
> +#ifdef CONFIG_VHOST_BLK
> +#include "hw/virtio/vhost-blk.h"
> +#endif
> +
> #ifdef CONFIG_VIRTFS
> #include "hw/9pfs/virtio-9p.h"
> #endif
> @@ -50,6 +54,7 @@ typedef struct VirtIONetPCI VirtIONetPCI;
> typedef struct VHostSCSIPCI VHostSCSIPCI;
> typedef struct VHostUserSCSIPCI VHostUserSCSIPCI;
> typedef struct VHostUserBlkPCI VHostUserBlkPCI;
> +typedef struct VHostBlkPCI VHostBlkPCI;
> typedef struct VirtIORngPCI VirtIORngPCI;
> typedef struct VirtIOInputPCI VirtIOInputPCI;
> typedef struct VirtIOInputHIDPCI VirtIOInputHIDPCI;
> @@ -262,6 +267,20 @@ struct VHostUserBlkPCI {
> };
> #endif
>
> +#ifdef CONFIG_VHOST_BLK
> +/*
> + * vhost-blk-pci: This extends VirtioPCIProxy.
> + */
> +#define TYPE_VHOST_BLK_PCI "vhost-blk-pci"
> +#define VHOST_BLK_PCI(obj) \
> + OBJECT_CHECK(VHostBlkPCI, (obj), TYPE_VHOST_BLK_PCI)
> +
> +struct VHostBlkPCI {
> + VirtIOPCIProxy parent_obj;
> + VHostBlk vdev;
> +};
> +#endif
> +
> /*
> * virtio-blk-pci: This extends VirtioPCIProxy.
> */
> diff --git a/include/hw/virtio/vhost-blk.h b/include/hw/virtio/vhost-blk.h
> new file mode 100644
> index 0000000000..6dc5d9634a
> --- /dev/null
> +++ b/include/hw/virtio/vhost-blk.h
> @@ -0,0 +1,43 @@
> +/*
> + * vhost-blk host device
> + * Copyright(C) 2018 IBM Corporation.
> + *
> + * Authors:
> + * Vitaly Mayatskikh <v.mayatskih@gmail.com>
> + *
> + * Based on vhost-user-blk.h, Copyright Intel, Corp. 2017
> + *
> + * This work is licensed under the terms of the GNU LGPL, version 2 or later.
> + * See the COPYING.LIB file in the top-level directory.
> + *
> + */
> +
> +#ifndef VHOST_BLK_H
> +#define VHOST_BLK_H
> +
> +#include "standard-headers/linux/virtio_blk.h"
> +#include "qemu-common.h"
> +#include "hw/qdev.h"
> +#include "hw/block/block.h"
> +#include "hw/virtio/vhost.h"
> +#include "sysemu/block-backend.h"
> +
> +#define TYPE_VHOST_BLK "vhost-blk"
> +#define VHOST_BLK(obj) \
> + OBJECT_CHECK(VHostBlk, (obj), TYPE_VHOST_BLK)
> +
> +typedef struct VHostBlk {
> + VirtIODevice parent_obj;
> + BlockConf conf;
> + BlockBackend *blk;
> + int32_t bootindex;
> + struct virtio_blk_config blkcfg;
> + uint16_t num_queues;
> + uint32_t queue_size;
> + uint32_t config_wce;
> + int vhostfd;
> + int bs_fd;
> + struct vhost_dev dev;
> +} VHostBlk;
> +
> +#endif
>
* Re: [Qemu-devel] [PATCH 1/1 V2] Add vhost-pci-blk driver
2018-11-05 20:56 ` [Qemu-devel] [PATCH 1/1 " Vitaly Mayatskikh
2018-11-08 14:09 ` Dongli Zhang
@ 2018-11-08 14:50 ` Kevin Wolf
1 sibling, 0 replies; 7+ messages in thread
From: Kevin Wolf @ 2018-11-08 14:50 UTC (permalink / raw)
To: Vitaly Mayatskikh
Cc: qemu-block, Max Reitz, Michael S . Tsirkin, Maxim Levitsky,
qemu-devel
Am 05.11.2018 um 21:56 hat Vitaly Mayatskikh geschrieben:
> This driver uses kernel-mode acceleration for virtio-blk and
> achieves near bare-metal disk performance inside a VM.
>
> Signed-off-by: Vitaly Mayatskikh <v.mayatskih@gmail.com>
> ---
> configure | 10 +
> default-configs/virtio.mak | 1 +
> hw/block/Makefile.objs | 1 +
> hw/block/vhost-blk.c | 429 ++++++++++++++++++++++++++++++++++
> hw/virtio/virtio-pci.c | 60 +++++
> hw/virtio/virtio-pci.h | 19 ++
> include/hw/virtio/vhost-blk.h | 43 ++++
> 7 files changed, 563 insertions(+)
> create mode 100644 hw/block/vhost-blk.c
> create mode 100644 include/hw/virtio/vhost-blk.h
>
> diff --git a/configure b/configure
> index 46ae1e8c76..787bc780da 100755
> --- a/configure
> +++ b/configure
> @@ -371,6 +371,7 @@ vhost_crypto="no"
> vhost_scsi="no"
> vhost_vsock="no"
> vhost_user=""
> +vhost_blk=""
> kvm="no"
> hax="no"
> hvf="no"
> @@ -869,6 +870,7 @@ Linux)
> vhost_crypto="yes"
> vhost_scsi="yes"
> vhost_vsock="yes"
> + vhost_blk="yes"
> QEMU_INCLUDES="-I\$(SRC_PATH)/linux-headers -I$(pwd)/linux-headers $QEMU_INCLUDES"
> supported_os="yes"
> libudev="yes"
> @@ -1263,6 +1265,10 @@ for opt do
> ;;
> --enable-vhost-vsock) vhost_vsock="yes"
> ;;
> + --disable-vhost-blk) vhost_blk="no"
> + ;;
> + --enable-vhost-blk) vhost_blk="yes"
> + ;;
> --disable-opengl) opengl="no"
> ;;
> --enable-opengl) opengl="yes"
> @@ -6000,6 +6006,7 @@ echo "vhost-crypto support $vhost_crypto"
> echo "vhost-scsi support $vhost_scsi"
> echo "vhost-vsock support $vhost_vsock"
> echo "vhost-user support $vhost_user"
> +echo "vhost-blk support $vhost_blk"
> echo "Trace backends $trace_backends"
> if have_backend "simple"; then
> echo "Trace output file $trace_file-<pid>"
> @@ -6461,6 +6468,9 @@ fi
> if test "$vhost_user" = "yes" ; then
> echo "CONFIG_VHOST_USER=y" >> $config_host_mak
> fi
> +if test "$vhost_blk" = "yes" ; then
> + echo "CONFIG_VHOST_BLK=y" >> $config_host_mak
> +fi
> if test "$blobs" = "yes" ; then
> echo "INSTALL_BLOBS=yes" >> $config_host_mak
> fi
> diff --git a/default-configs/virtio.mak b/default-configs/virtio.mak
> index 1304849018..765c0a2a04 100644
> --- a/default-configs/virtio.mak
> +++ b/default-configs/virtio.mak
> @@ -1,5 +1,6 @@
> CONFIG_VHOST_USER_SCSI=$(call land,$(CONFIG_VHOST_USER),$(CONFIG_LINUX))
> CONFIG_VHOST_USER_BLK=$(call land,$(CONFIG_VHOST_USER),$(CONFIG_LINUX))
> +CONFIG_VHOST_BLK=$(CONFIG_LINUX)
> CONFIG_VIRTIO=y
> CONFIG_VIRTIO_9P=y
> CONFIG_VIRTIO_BALLOON=y
> diff --git a/hw/block/Makefile.objs b/hw/block/Makefile.objs
> index 53ce5751ae..857ce823fc 100644
> --- a/hw/block/Makefile.objs
> +++ b/hw/block/Makefile.objs
> @@ -14,3 +14,4 @@ obj-$(CONFIG_SH4) += tc58128.o
> obj-$(CONFIG_VIRTIO_BLK) += virtio-blk.o
> obj-$(CONFIG_VIRTIO_BLK) += dataplane/
> obj-$(CONFIG_VHOST_USER_BLK) += vhost-user-blk.o
> +obj-$(CONFIG_VHOST_BLK) += vhost-blk.o
> diff --git a/hw/block/vhost-blk.c b/hw/block/vhost-blk.c
> new file mode 100644
> index 0000000000..4ca8040ee7
> --- /dev/null
> +++ b/hw/block/vhost-blk.c
> @@ -0,0 +1,429 @@
> +/*
> + * vhost-blk host device
> + *
> + * Copyright(C) 2018 IBM Corporation
> + *
> + * Authors:
> + * Vitaly Mayatskikh <v.mayatskih@gmail.com>
> + *
> + * Largely based on the "vhost-user-blk.c" implemented by:
> + * Changpeng Liu <changpeng.liu@intel.com>
You mean contrib/vhost-user-blk/vhost-user-blk.c in the QEMU tree? That
one has the following license:
* This work is licensed under the terms of the GNU GPL, version 2 only.
* See the COPYING file in the top-level directory.
You have this on the other hand:
> + * This work is licensed under the terms of the GNU LGPL, version 2 or later.
> + * See the COPYING.LIB file in the top-level directory.
I think you need to get rid of any vhost-user-blk.c parts before you can
make it LGPL2+.
Kevin
* Re: [Qemu-devel] [PATCH 1/1 V2] Add vhost-pci-blk driver
2018-11-08 14:09 ` Dongli Zhang
@ 2018-11-08 16:47 ` Michael S. Tsirkin
2018-11-09 1:48 ` Dongli Zhang
0 siblings, 1 reply; 7+ messages in thread
From: Michael S. Tsirkin @ 2018-11-08 16:47 UTC (permalink / raw)
To: Dongli Zhang
Cc: Vitaly Mayatskikh, qemu-block, Kevin Wolf, qemu-devel,
Maxim Levitsky, Max Reitz
On Thu, Nov 08, 2018 at 10:09:00PM +0800, Dongli Zhang wrote:
> It looks like the kernel-space vhost-blk can only process raw images.
>
> How about verifying that only a raw image is used in the drive command line
> when vhost-blk-pci is paired with it?
>
> Otherwise, vhost-blk-pci might end up working with a qcow2 image without any
> warning on the QEMU side.
>
> Dongli Zhang
Raw being raw, can you really verify that?
* Re: [Qemu-devel] [PATCH 1/1 V2] Add vhost-pci-blk driver
2018-11-08 16:47 ` Michael S. Tsirkin
@ 2018-11-09 1:48 ` Dongli Zhang
2018-11-09 9:33 ` Kevin Wolf
0 siblings, 1 reply; 7+ messages in thread
From: Dongli Zhang @ 2018-11-09 1:48 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Kevin Wolf, Vitaly Mayatskikh, qemu-block, qemu-devel, Max Reitz,
Maxim Levitsky
On 11/09/2018 12:47 AM, Michael S. Tsirkin wrote:
> On Thu, Nov 08, 2018 at 10:09:00PM +0800, Dongli Zhang wrote:
>> It looks like the kernel-space vhost-blk can only process raw images.
>>
>> How about verifying that only a raw image is used in the drive command line
>> when vhost-blk-pci is paired with it?
>>
>> Otherwise, vhost-blk-pci might end up working with a qcow2 image without any
>> warning on the QEMU side.
>>
>> Dongli Zhang
>
> Raw being raw, can you really verify that?
>
I meant verifying the 'format=' property of '-drive', e.g., checking whether
BlockBackend->root->bs->drv->format_name is raw.
The user could still erroneously pass a qcow2 file with 'format=raw'. However,
if 'format=qcow2' is set explicitly, vhost-blk-pci would exit with an error, as
only raw is supported in kernel space.
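A minimal sketch of that check, using stand-in structs rather than QEMU's real types. All demo_* names are hypothetical; only the root->bs->drv->format_name chain is taken from the description above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins that mirror only the fields named above. */
struct demo_drv   { const char *format_name; };
struct demo_bs    { const struct demo_drv *drv; };
struct demo_child { const struct demo_bs *bs; };
struct demo_blk   { const struct demo_child *root; };

/* Returns true only if the node directly below the backend is "raw". */
static bool demo_blk_is_raw(const struct demo_blk *blk)
{
    if (!blk || !blk->root || !blk->root->bs || !blk->root->bs->drv) {
        return false;
    }
    const char *name = blk->root->bs->drv->format_name;
    return name != NULL && strcmp(name, "raw") == 0;
}
```

In realize, a false result would presumably turn into an error_setg() and an early return. This only inspects what the user declared; it cannot tell whether the bytes in the file really are raw.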
Dongli Zhang
* Re: [Qemu-devel] [PATCH 1/1 V2] Add vhost-pci-blk driver
2018-11-09 1:48 ` Dongli Zhang
@ 2018-11-09 9:33 ` Kevin Wolf
0 siblings, 0 replies; 7+ messages in thread
From: Kevin Wolf @ 2018-11-09 9:33 UTC (permalink / raw)
To: Dongli Zhang
Cc: Michael S. Tsirkin, Vitaly Mayatskikh, qemu-block, qemu-devel,
Max Reitz, Maxim Levitsky
Am 09.11.2018 um 02:48 hat Dongli Zhang geschrieben:
>
>
> On 11/09/2018 12:47 AM, Michael S. Tsirkin wrote:
> > On Thu, Nov 08, 2018 at 10:09:00PM +0800, Dongli Zhang wrote:
> >> It looks like the kernel-space vhost-blk can only process raw images.
> >>
> >> How about verifying that only a raw image is used in the drive command
> >> line when vhost-blk-pci is paired with it?
> >>
> >> Otherwise, vhost-blk-pci might end up working with a qcow2 image without
> >> any warning on the QEMU side.
> >>
> >> Dongli Zhang
> >
> > Raw being raw, can you really verify that?
>
> I meant verifying the 'format=' property of '-drive', e.g., checking whether
> BlockBackend->root->bs->drv->format_name is raw.
I don't like string comparisons for this, but at least check for "file"
and "host_device" then, because "raw" is a format driver that can be
used above any protocol driver, not just for local files.
Though rather than comparing strings, I'd check bs->drv == &bdrv_file.
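A rough sketch of the identity-based check, again with stand-in types. The demo_* structs and driver objects are hypothetical, and unwrapping a single "raw" layer to reach the protocol node is an assumption about how such a check might be structured, not something the patch does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; QEMU's real driver and node types are much larger. */
struct demo_drv { const char *format_name; };
struct demo_bs {
    const struct demo_drv *drv;
    const struct demo_bs *file;   /* node below a format driver, or NULL */
};

/* One object per driver, so comparing addresses identifies the driver. */
static const struct demo_drv demo_drv_raw         = { "raw" };
static const struct demo_drv demo_drv_file        = { "file" };
static const struct demo_drv demo_drv_host_device = { "host_device" };

/* True if the node is a local file or host block device protocol node. */
static bool demo_is_local_protocol(const struct demo_bs *bs)
{
    return bs != NULL &&
           (bs->drv == &demo_drv_file || bs->drv == &demo_drv_host_device);
}

/* Accept a local protocol node, optionally wrapped in one "raw" format node. */
static bool demo_vhost_blk_can_use(const struct demo_bs *bs)
{
    if (bs != NULL && bs->drv == &demo_drv_raw) {
        bs = bs->file;
    }
    return demo_is_local_protocol(bs);
}
```

With this shape, raw over a network protocol node is rejected even though the format is "raw", which is the case a plain format_name comparison would miss.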
> The user could still erroneously pass a qcow2 file with 'format=raw'.
> However, if 'format=qcow2' is set explicitly, vhost-blk-pci would exit
> with an error, as only raw is supported in kernel space.
Please note that the QEMU block layer offers various options even for
file-posix images, and silently ignoring them is probably not very nice.
So just checking the driver is not enough, you also need to check that
no feature is enabled that vhost can't provide.
At first sight, this includes at least detect-zeroes, copy-on-read and
cache.no-flush. There may be more, but this just needs some work to find
all of them. More interesting question: How will we make sure that the
list is updated when we add new options?
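To make the shape of such a check concrete, here is a small sketch. The demo_opts struct is a hypothetical, flattened view of the three options named above (in QEMU they live in different places, and detect-zeroes is not a plain boolean), so this only illustrates the "refuse instead of silently ignoring" idea. It does nothing to answer the maintenance question.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flattened view of the options named above. */
struct demo_opts {
    bool detect_zeroes;
    bool copy_on_read;
    bool cache_no_flush;
};

/*
 * Returns the name of the first enabled option that vhost could not
 * honour, or NULL if none of the listed options is set.
 */
static const char *demo_first_unsupported(const struct demo_opts *o)
{
    if (o->detect_zeroes) {
        return "detect-zeroes";
    }
    if (o->copy_on_read) {
        return "copy-on-read";
    }
    if (o->cache_no_flush) {
        return "cache.no-flush";
    }
    return NULL;
}
```

A realize function could report the returned name in its error message and fail, rather than starting the device with the option quietly dropped.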
Is it actually a good idea to use the block layer at all for this, or
would just taking a filename string be a better idea for this device?
Of course, if you want to do automatic switchover as Stefan suggested,
having a block node is required and we need to figure out the exact
conditions when we can switch and a way to keep them updated when we
change things.
Kevin