* [Qemu-devel] [PATCH 0/2] Introduce vhost-user-scsi and sample application
@ 2016-10-26 15:26 Felipe Franciosi
  2016-10-26 15:26 ` [Qemu-devel] [PATCH 1/2] vus: Introduce vhost-user-scsi host device Felipe Franciosi
  2016-10-26 15:26 ` [Qemu-devel] [PATCH 2/2] vus: Introduce a vhost-user-scsi sample application Felipe Franciosi
  0 siblings, 2 replies; 7+ messages in thread

From: Felipe Franciosi @ 2016-10-26 15:26 UTC (permalink / raw)
To: Paolo Bonzini, Stefan Hajnoczi, Marc-Andre Lureau, Michael S. Tsirkin
Cc: qemu-devel, Felipe Franciosi

Based on various discussions at the 2016 KVM Forum, I'm sending over a
vhost-user-scsi implementation for your consideration.

This patch set introduces a new vhost-user SCSI device. While heavily based
on vhost-scsi, it is implemented using vhost's userspace counterpart. The
device has been coded and tested to work with live migration.

A sample application based on the newly introduced libvhost-user is also
included. It makes use of libiscsi for simplicity.

For convenience, I'm maintaining an up-to-date version of these patches
(including some necessary fixes for libvhost-user still under discussion) at:

https://github.com/franciozzy/qemu/tree/vus-upstream

See the individual patches for build and use instructions.
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>

Felipe Franciosi (2):
  vus: Introduce vhost-user-scsi host device
  vus: Introduce a vhost-user-scsi sample application

 configure                           |  10 +
 hw/scsi/Makefile.objs               |   1 +
 hw/scsi/vhost-user-scsi.c           | 299 +++++++++++++
 hw/virtio/virtio-pci.c              |  58 +++
 hw/virtio/virtio-pci.h              |  15 +
 include/hw/virtio/vhost-user-scsi.h |  45 ++
 include/hw/virtio/virtio-scsi.h     |   5 +
 tests/Makefile.include              |   2 +
 tests/vhost-user-scsi.c             | 862 ++++++++++++++++++++++++++++++++++++
 9 files changed, 1297 insertions(+)
 create mode 100644 hw/scsi/vhost-user-scsi.c
 create mode 100644 include/hw/virtio/vhost-user-scsi.h
 create mode 100644 tests/vhost-user-scsi.c

--
1.9.4

^ permalink raw reply	[flat|nested] 7+ messages in thread
* [Qemu-devel] [PATCH 1/2] vus: Introduce vhost-user-scsi host device
  2016-10-26 15:26 [Qemu-devel] [PATCH 0/2] Introduce vhost-user-scsi and sample application Felipe Franciosi
@ 2016-10-26 15:26 ` Felipe Franciosi
  2016-10-27 12:12   ` Paolo Bonzini
  2016-10-26 15:26 ` [Qemu-devel] [PATCH 2/2] vus: Introduce a vhost-user-scsi sample application Felipe Franciosi
  1 sibling, 1 reply; 7+ messages in thread

From: Felipe Franciosi @ 2016-10-26 15:26 UTC (permalink / raw)
To: Paolo Bonzini, Stefan Hajnoczi, Marc-Andre Lureau, Michael S. Tsirkin
Cc: qemu-devel, Felipe Franciosi

This commit introduces a vhost-user device for SCSI. It is based on the
existing vhost-scsi implementation, but implemented over vhost-user
instead, and uses a chardev to connect to the backend. Unlike vhost-scsi
(today), VMs using vhost-user-scsi can be live migrated.

To use it, configure QEMU with --enable-vhost-user-scsi and start QEMU
with a command line equivalent to:

    qemu-system-x86_64 \
        -chardev socket,id=vus0,path=/tmp/vus.sock \
        -device vhost-user-scsi-pci,chardev=vus0,bus=pci.0,addr=0x4

A separate commit presents a sample application linked with libiscsi to
provide a backend for vhost-user-scsi.
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
---
 configure                           |  10 ++
 hw/scsi/Makefile.objs               |   1 +
 hw/scsi/vhost-user-scsi.c           | 299 ++++++++++++++++++++++++++++++++++++
 hw/virtio/virtio-pci.c              |  58 +++++++
 hw/virtio/virtio-pci.h              |  15 ++
 include/hw/virtio/vhost-user-scsi.h |  45 ++++++
 include/hw/virtio/virtio-scsi.h     |   5 +
 7 files changed, 433 insertions(+)
 create mode 100644 hw/scsi/vhost-user-scsi.c
 create mode 100644 include/hw/virtio/vhost-user-scsi.h

diff --git a/configure b/configure
index d3dafcb..0574ff2 100755
--- a/configure
+++ b/configure
@@ -228,6 +228,7 @@ xfs=""

 vhost_net="no"
 vhost_scsi="no"
+vhost_user_scsi="no"
 vhost_vsock="no"
 kvm="no"
 rdma=""
@@ -677,6 +678,7 @@ Haiku)
   kvm="yes"
   vhost_net="yes"
   vhost_scsi="yes"
+  vhost_user_scsi="yes"
   vhost_vsock="yes"
   QEMU_INCLUDES="-I\$(SRC_PATH)/linux-headers -I$(pwd)/linux-headers $QEMU_INCLUDES"
 ;;
@@ -1019,6 +1021,10 @@ for opt do
   ;;
   --enable-vhost-scsi) vhost_scsi="yes"
   ;;
+  --disable-vhost-user-scsi) vhost_user_scsi="no"
+  ;;
+  --enable-vhost-user-scsi) vhost_user_scsi="yes"
+  ;;
   --disable-vhost-vsock) vhost_vsock="no"
   ;;
   --enable-vhost-vsock) vhost_vsock="yes"
@@ -4951,6 +4957,7 @@ echo "posix_madvise     $posix_madvise"
 echo "libcap-ng support $cap_ng"
 echo "vhost-net support $vhost_net"
 echo "vhost-scsi support $vhost_scsi"
+echo "vhost-user-scsi support $vhost_user_scsi"
 echo "vhost-vsock support $vhost_vsock"
 echo "Trace backends    $trace_backends"
 if have_backend "simple"; then
@@ -5336,6 +5343,9 @@ fi
 if test "$vhost_scsi" = "yes" ; then
   echo "CONFIG_VHOST_SCSI=y" >> $config_host_mak
 fi
+if test "$vhost_user_scsi" = "yes" ; then
+  echo "CONFIG_VHOST_USER_SCSI=y" >> $config_host_mak
+fi
 if test "$vhost_net" = "yes" ; then
   echo "CONFIG_VHOST_NET_USED=y" >> $config_host_mak
 fi
diff --git a/hw/scsi/Makefile.objs b/hw/scsi/Makefile.objs
index 5a2248b..3338aad 100644
--- a/hw/scsi/Makefile.objs
+++ b/hw/scsi/Makefile.objs
@@ -11,4 +11,5 @@ obj-$(CONFIG_PSERIES) += spapr_vscsi.o
 ifeq ($(CONFIG_VIRTIO),y)
 obj-y += virtio-scsi.o virtio-scsi-dataplane.o
 obj-$(CONFIG_VHOST_SCSI) += vhost-scsi.o
+obj-$(CONFIG_VHOST_USER_SCSI) += vhost-user-scsi.o
 endif
diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
new file mode 100644
index 0000000..a7b9839
--- /dev/null
+++ b/hw/scsi/vhost-user-scsi.c
@@ -0,0 +1,299 @@
+/*
+ * vhost-user-scsi host device
+ *
+ * Copyright (c) 2016 Nutanix Inc. All rights reserved.
+ *
+ * Author:
+ *  Felipe Franciosi <felipe@nutanix.com>
+ *
+ * This work is largely based on the "vhost-scsi" implementation by:
+ *  Stefan Hajnoczi    <stefanha@linux.vnet.ibm.com>
+ *  Nicholas Bellinger <nab@risingtidesystems.com>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "migration/vmstate.h"
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "qemu/typedefs.h"
+#include "qom/object.h"
+#include "hw/fw-path-provider.h"
+#include "hw/qdev-core.h"
+#include "hw/virtio/vhost.h"
+#include "hw/virtio/vhost-backend.h"
+#include "hw/virtio/vhost-user-scsi.h"
+#include "hw/virtio/virtio.h"
+#include "hw/virtio/virtio-access.h"
+#include "hw/virtio/virtio-bus.h"
+#include "sysemu/char.h"
+
+/* Features supported by the host application */
+static const int user_feature_bits[] = {
+    VIRTIO_F_NOTIFY_ON_EMPTY,
+    VIRTIO_RING_F_INDIRECT_DESC,
+    VIRTIO_RING_F_EVENT_IDX,
+    VIRTIO_SCSI_F_HOTPLUG,
+    VHOST_INVALID_FEATURE_BIT
+};
+
+static int vhost_user_scsi_start(VHostUserSCSI *s)
+{
+    int ret, i;
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
+    BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
+    VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
+
+    if (!k->set_guest_notifiers) {
+        error_report("binding does not support guest notifiers");
+        return -ENOSYS;
+    }
+
+    ret = vhost_dev_enable_notifiers(&s->dev, vdev);
+    if (ret < 0) {
+        return ret;
+    }
+
+    s->dev.acked_features = vdev->guest_features;
+    ret = vhost_dev_start(&s->dev, vdev);
+    if (ret < 0) {
+        error_report("Error starting vhost-user device");
+        goto err_notifiers;
+    }
+
+    ret = k->set_guest_notifiers(qbus->parent, s->dev.nvqs, true);
+    if (ret < 0) {
+        error_report("Error binding guest notifier");
+        goto err_vhost_stop;
+    }
+
+    /* guest_notifier_mask/pending not used yet, so just unmask
+     * everything here. virtio-pci will do the right thing by
+     * enabling/disabling irqfd.
+     */
+    for (i = 0; i < s->dev.nvqs; i++) {
+        vhost_virtqueue_mask(&s->dev, vdev, s->dev.vq_index + i, false);
+    }
+
+    return ret;
+
+err_vhost_stop:
+    vhost_dev_stop(&s->dev, vdev);
+err_notifiers:
+    vhost_dev_disable_notifiers(&s->dev, vdev);
+    return ret;
+}
+
+static void vhost_user_scsi_stop(VHostUserSCSI *s)
+{
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
+    BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
+    VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
+    int ret = 0;
+
+    vhost_dev_stop(&s->dev, vdev);
+    vhost_dev_disable_notifiers(&s->dev, vdev);
+
+    if (k->set_guest_notifiers) {
+        ret = k->set_guest_notifiers(qbus->parent, s->dev.nvqs, false);
+        if (ret < 0) {
+            error_report("vhost guest notifier cleanup failed: %d", ret);
+        }
+    }
+    assert(ret >= 0);
+}
+
+static uint64_t vhost_user_scsi_get_features(VirtIODevice *vdev,
+                                             uint64_t features,
+                                             Error **errp)
+{
+    VHostUserSCSI *s = VHOST_USER_SCSI(vdev);
+
+    return vhost_get_features(&s->dev, user_feature_bits, features);
+}
+
+static void vhost_user_scsi_set_config(VirtIODevice *vdev,
+                                       const uint8_t *config)
+{
+    VirtIOSCSIConfig *scsiconf = (VirtIOSCSIConfig *)config;
+    VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(vdev);
+
+    if ((uint32_t)virtio_ldl_p(vdev, &scsiconf->sense_size) != vs->sense_size ||
+        (uint32_t)virtio_ldl_p(vdev, &scsiconf->cdb_size) != vs->cdb_size) {
+        error_report("vhost-user-scsi does not support changing the sense or CDB sizes");
+        exit(1);
+    }
+}
+
+static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
+{
+    VHostUserSCSI *s = (VHostUserSCSI *)vdev;
+    bool start = (status & VIRTIO_CONFIG_S_DRIVER_OK) && vdev->vm_running;
+
+    if (s->dev.started == start) {
+        return;
+    }
+
+    if (start) {
+        int ret;
+
+        ret = vhost_user_scsi_start(s);
+        if (ret < 0) {
+            error_report("unable to start vhost-user-scsi: %s", strerror(-ret));
+            exit(1);
+        }
+    } else {
+        vhost_user_scsi_stop(s);
+    }
+}
+
+static void vhost_dummy_handle_output(VirtIODevice *vdev, VirtQueue *vq)
+{
+}
+
+static void vhost_user_scsi_save(QEMUFile *f, void *opaque)
+{
+    VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
+    virtio_save(vdev, f);
+}
+
+static int vhost_user_scsi_load(QEMUFile *f, void *opaque, int version_id)
+{
+    VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
+    return virtio_load(vdev, f, version_id);
+}
+
+static void vhost_user_scsi_realize(DeviceState *dev, Error **errp)
+{
+    VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(dev);
+    VHostUserSCSI *s = VHOST_USER_SCSI(dev);
+    static int vhost_user_scsi_id;
+    Error *err = NULL;
+    int ret;
+
+    if (!vs->conf.chardev.chr) {
+        error_setg(errp, "vhost-user-scsi: missing chardev");
+        return;
+    }
+
+    virtio_scsi_common_realize(dev, &err, vhost_dummy_handle_output,
+                               vhost_dummy_handle_output,
+                               vhost_dummy_handle_output);
+    if (err != NULL) {
+        error_propagate(errp, err);
+        return;
+    }
+
+    s->dev.nvqs = VHOST_USER_SCSI_VQ_NUM_FIXED + vs->conf.num_queues;
+    s->dev.vqs = g_new(struct vhost_virtqueue, s->dev.nvqs);
+    s->dev.vq_index = 0;
+    s->dev.backend_features = 0;
+
+    ret = vhost_dev_init(&s->dev, (void *)&vs->conf.chardev,
+                         VHOST_BACKEND_TYPE_USER, 0);
+    if (ret < 0) {
+        error_setg(errp, "vhost-user-scsi: vhost initialization failed: %s",
+                   strerror(-ret));
+        return;
+    }
+
+    /* Channel and lun are both 0 for a bootable vhost-user-scsi disk */
+    s->channel = 0;
+    s->lun = 0;
+    s->target = vs->conf.boot_tpgt;
+
+    register_savevm(dev, "vhost-user-scsi", vhost_user_scsi_id++, 1,
+                    vhost_user_scsi_save, vhost_user_scsi_load, s);
+}
+
+static void vhost_user_scsi_unrealize(DeviceState *dev, Error **errp)
+{
+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+    VHostUserSCSI *s = VHOST_USER_SCSI(dev);
+
+    /* This will stop the vhost backend. */
+    vhost_user_scsi_set_status(vdev, 0);
+
+    vhost_dev_cleanup(&s->dev);
+    g_free(s->dev.vqs);
+
+    virtio_scsi_common_unrealize(dev, errp);
+}
+
+/*
+ * Implementation of an interface to adjust firmware path
+ * for the bootindex property handling.
+ */
+static char *vhost_user_scsi_get_fw_dev_path(FWPathProvider *p, BusState *bus,
+                                             DeviceState *dev)
+{
+    VHostUserSCSI *s = VHOST_USER_SCSI(dev);
+    /* format: /channel@<channel>/vhost-user-scsi@<target>,<lun> */
+    return g_strdup_printf("/channel@%x/%s@%x,%x", s->channel,
+                           qdev_fw_name(dev), s->target, s->lun);
+}
+
+static Property vhost_user_scsi_properties[] = {
+    DEFINE_PROP_CHR("chardev", VHostUserSCSI,
+                    parent_obj.conf.chardev),
+    DEFINE_PROP_UINT32("boot_tpgt", VHostUserSCSI,
+                       parent_obj.conf.boot_tpgt, 0),
+    DEFINE_PROP_UINT32("num_queues", VHostUserSCSI,
+                       parent_obj.conf.num_queues, 1),
+    DEFINE_PROP_UINT32("max_sectors", VHostUserSCSI,
+                       parent_obj.conf.max_sectors, 0xFFFF),
+    DEFINE_PROP_UINT32("cmd_per_lun", VHostUserSCSI,
+                       parent_obj.conf.cmd_per_lun, 128),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void vhost_user_scsi_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
+    FWPathProviderClass *fwc = FW_PATH_PROVIDER_CLASS(klass);
+
+    dc->props = vhost_user_scsi_properties;
+    set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
+    vdc->realize = vhost_user_scsi_realize;
+    vdc->unrealize = vhost_user_scsi_unrealize;
+    vdc->get_features = vhost_user_scsi_get_features;
+    vdc->set_config = vhost_user_scsi_set_config;
+    vdc->set_status = vhost_user_scsi_set_status;
+    fwc->get_dev_path = vhost_user_scsi_get_fw_dev_path;
+}
+
+static void vhost_user_scsi_instance_init(Object *obj)
+{
+    VHostUserSCSI *dev = VHOST_USER_SCSI(obj);
+    VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(obj);
+
+    /* Add the bootindex property for this object */
+    device_add_bootindex_property(obj, &dev->bootindex, "bootindex", NULL,
+                                  DEVICE(dev), NULL);
+
+    /* Set the boot index according to the device config */
+    object_property_set_int(obj, vs->conf.bootindex, "bootindex", NULL);
+}
+
+static const TypeInfo vhost_user_scsi_info = {
+    .name = TYPE_VHOST_USER_SCSI,
+    .parent = TYPE_VIRTIO_SCSI_COMMON,
+    .instance_size = sizeof(VHostUserSCSI),
+    .class_init = vhost_user_scsi_class_init,
+    .instance_init = vhost_user_scsi_instance_init,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_FW_PATH_PROVIDER },
+        { }
+    },
+};
+
+static void virtio_register_types(void)
+{
+    type_register_static(&vhost_user_scsi_info);
+}
+
+type_init(virtio_register_types)
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 06831de..b996d37 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -2098,6 +2098,61 @@ static const TypeInfo vhost_scsi_pci_info = {
 };
 #endif

+/* vhost-user-scsi-pci */
+#ifdef CONFIG_VHOST_USER_SCSI
+static Property vhost_user_scsi_pci_properties[] = {
+    DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors,
+                       DEV_NVECTORS_UNSPECIFIED),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void vhost_user_scsi_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
+{
+    VHostUserSCSIPCI *dev = VHOST_USER_SCSI_PCI(vpci_dev);
+    DeviceState *vdev = DEVICE(&dev->vdev);
+    VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(vdev);
+
+    if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) {
+        vpci_dev->nvectors = vs->conf.num_queues + 3;
+    }
+
+    qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
+    object_property_set_bool(OBJECT(vdev), true, "realized", errp);
+}
+
+static void vhost_user_scsi_pci_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
+    PCIDeviceClass *pcidev_k = PCI_DEVICE_CLASS(klass);
+    k->realize = vhost_user_scsi_pci_realize;
+    set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
+    dc->props = vhost_user_scsi_pci_properties;
+    pcidev_k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;
+    pcidev_k->device_id = PCI_DEVICE_ID_VIRTIO_SCSI;
+    pcidev_k->revision = 0x00;
+    pcidev_k->class_id = PCI_CLASS_STORAGE_SCSI;
+}
+
+static void vhost_user_scsi_pci_instance_init(Object *obj)
+{
+    VHostUserSCSIPCI *dev = VHOST_USER_SCSI_PCI(obj);
+
+    virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
+                                TYPE_VHOST_USER_SCSI);
+    object_property_add_alias(obj, "bootindex", OBJECT(&dev->vdev),
+                              "bootindex", &error_abort);
+}
+
+static const TypeInfo vhost_user_scsi_pci_info = {
+    .name = TYPE_VHOST_USER_SCSI_PCI,
+    .parent = TYPE_VIRTIO_PCI,
+    .instance_size = sizeof(VHostUserSCSIPCI),
+    .instance_init = vhost_user_scsi_pci_instance_init,
+    .class_init = vhost_user_scsi_pci_class_init,
+};
+#endif /* CONFIG_VHOST_USER_SCSI */
+
 /* vhost-vsock-pci */

 #ifdef CONFIG_VHOST_VSOCK
@@ -2577,6 +2632,9 @@ static void virtio_pci_register_types(void)
 #ifdef CONFIG_VHOST_SCSI
     type_register_static(&vhost_scsi_pci_info);
 #endif
+#ifdef CONFIG_VHOST_USER_SCSI
+    type_register_static(&vhost_user_scsi_pci_info);
+#endif
 #ifdef CONFIG_VHOST_VSOCK
     type_register_static(&vhost_vsock_pci_info);
 #endif
diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
index b4edea6..404b2c4 100644
--- a/hw/virtio/virtio-pci.h
+++ b/hw/virtio/virtio-pci.h
@@ -31,6 +31,9 @@
 #ifdef CONFIG_VHOST_SCSI
 #include "hw/virtio/vhost-scsi.h"
 #endif
+#ifdef CONFIG_VHOST_USER_SCSI
+#include "hw/virtio/vhost-user-scsi.h"
+#endif
 #ifdef CONFIG_VHOST_VSOCK
 #include "hw/virtio/vhost-vsock.h"
 #endif
@@ -42,6 +45,7 @@ typedef struct VirtIOBalloonPCI VirtIOBalloonPCI;
 typedef struct VirtIOSerialPCI VirtIOSerialPCI;
 typedef struct VirtIONetPCI VirtIONetPCI;
 typedef struct VHostSCSIPCI VHostSCSIPCI;
+typedef struct VHostUserSCSIPCI VHostUserSCSIPCI;
 typedef struct VirtIORngPCI VirtIORngPCI;
 typedef struct VirtIOInputPCI VirtIOInputPCI;
 typedef struct VirtIOInputHIDPCI VirtIOInputHIDPCI;
@@ -212,6 +216,17 @@ struct VHostSCSIPCI {
 };
 #endif

+#ifdef CONFIG_VHOST_USER_SCSI
+#define TYPE_VHOST_USER_SCSI_PCI "vhost-user-scsi-pci"
+#define VHOST_USER_SCSI_PCI(obj) \
+        OBJECT_CHECK(VHostUserSCSIPCI, (obj), TYPE_VHOST_USER_SCSI_PCI)
+
+struct VHostUserSCSIPCI {
+    VirtIOPCIProxy parent_obj;
+    VHostUserSCSI vdev;
+};
+#endif
+
 /*
  * virtio-blk-pci: This extends VirtioPCIProxy.
  */
diff --git a/include/hw/virtio/vhost-user-scsi.h b/include/hw/virtio/vhost-user-scsi.h
new file mode 100644
index 0000000..4a16181
--- /dev/null
+++ b/include/hw/virtio/vhost-user-scsi.h
@@ -0,0 +1,45 @@
+/*
+ * vhost-user-scsi host device
+ *
+ * Copyright (c) 2016 Nutanix Inc. All rights reserved.
+ *
+ * Author:
+ *  Felipe Franciosi <felipe@nutanix.com>
+ *
+ * This file is largely based on "vhost-scsi.h" by:
+ *  Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#ifndef VHOST_USER_SCSI_H
+#define VHOST_USER_SCSI_H
+
+#include "qemu-common.h"
+#include "hw/qdev.h"
+#include "hw/virtio/virtio-scsi.h"
+#include "hw/virtio/vhost.h"
+
+enum vhost_user_scsi_vq_list {
+    VHOST_USER_SCSI_VQ_CONTROL = 0,
+    VHOST_USER_SCSI_VQ_EVENT = 1,
+    VHOST_USER_SCSI_VQ_NUM_FIXED = 2,
+};
+
+#define TYPE_VHOST_USER_SCSI "vhost-user-scsi"
+#define VHOST_USER_SCSI(obj) \
+        OBJECT_CHECK(VHostUserSCSI, (obj), TYPE_VHOST_USER_SCSI)
+
+typedef struct VHostUserSCSI {
+    VirtIOSCSICommon parent_obj;
+
+    struct vhost_dev dev;
+    int32_t bootindex;
+    int channel;
+    int target;
+    int lun;
+} VHostUserSCSI;
+
+#endif /* VHOST_USER_SCSI_H */
diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h
index a1e0cfb..4b97386 100644
--- a/include/hw/virtio/virtio-scsi.h
+++ b/include/hw/virtio/virtio-scsi.h
@@ -21,6 +21,7 @@
 #include "hw/virtio/virtio.h"
 #include "hw/pci/pci.h"
 #include "hw/scsi/scsi.h"
+#include "sysemu/char.h"
 #include "sysemu/iothread.h"

 #define TYPE_VIRTIO_SCSI_COMMON "virtio-scsi-common"
@@ -53,6 +54,10 @@ struct VirtIOSCSIConf {
     char *wwpn;
     uint32_t boot_tpgt;
     IOThread *iothread;
+#ifdef CONFIG_VHOST_USER_SCSI
+    CharBackend chardev;
+    int32_t bootindex;
+#endif
 };

 struct VirtIOSCSI;

--
1.9.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] vus: Introduce vhost-user-scsi host device
  2016-10-26 15:26 ` [Qemu-devel] [PATCH 1/2] vus: Introduce vhost-user-scsi host device Felipe Franciosi
@ 2016-10-27 12:12   ` Paolo Bonzini
  0 siblings, 0 replies; 7+ messages in thread

From: Paolo Bonzini @ 2016-10-27 12:12 UTC (permalink / raw)
To: Felipe Franciosi, Stefan Hajnoczi, Marc-Andre Lureau, Michael S. Tsirkin
Cc: qemu-devel

On 26/10/2016 17:26, Felipe Franciosi wrote:
> This commit introduces a vhost-user device for SCSI. This is based
> on the existing vhost-scsi implementation, but done over vhost-user
> instead. It also uses a chardev to connect to the backend. Unlike
> vhost-scsi (today), VMs using vhost-user-scsi can be live migrated.
>
> To use it, one must configure Qemu with --enable-vhost-user-scsi and
> start Qemu with a command line equivalent to:
>
>   qemu-system-x86_64 \
>     -chardev socket,id=vus0,path=/tmp/vus.sock \
>     -device vhost-user-scsi-pci,chardev=vus0,bus=pci.0,addr=0x4
>
> A separate commit presents a sample application linked with libiscsi to
> provide a backend for vhost-user-scsi.

Hi, most of the code you copy can be kept in one place only, by using
the vhost_ops struct. Please make a hierarchy:

    virtio-scsi-common
        vhost-scsi-common
            vhost-scsi      (adds vhostfd+wwpn)
            vhost-user-scsi (adds chardev)

with abstract methods in vhost-scsi-common to abstract e.g. opening
/dev/scsi and setting/clearing the endpoint, which are only needed by
vhost-scsi's realize method. This should avoid the duplication.

Thanks!
Paolo
> + * See the COPYING.LIB file in the top-level directory. > + * > + */ > + > +#ifndef VHOST_USER_SCSI_H > +#define VHOST_USER_SCSI_H > + > +#include "qemu-common.h" > +#include "hw/qdev.h" > +#include "hw/virtio/virtio-scsi.h" > +#include "hw/virtio/vhost.h" > + > +enum vhost_user_scsi_vq_list { > + VHOST_USER_SCSI_VQ_CONTROL = 0, > + VHOST_USER_SCSI_VQ_EVENT = 1, > + VHOST_USER_SCSI_VQ_NUM_FIXED = 2, > +}; > + > +#define TYPE_VHOST_USER_SCSI "vhost-user-scsi" > +#define VHOST_USER_SCSI(obj) \ > + OBJECT_CHECK(VHostUserSCSI, (obj), TYPE_VHOST_USER_SCSI) > + > +typedef struct VHostUserSCSI { > + VirtIOSCSICommon parent_obj; > + > + struct vhost_dev dev; > + int32_t bootindex; > + int channel; > + int target; > + int lun; > +} VHostUserSCSI; > + > +#endif /* VHOST_USER_SCSI_H */ > diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h > index a1e0cfb..4b97386 100644 > --- a/include/hw/virtio/virtio-scsi.h > +++ b/include/hw/virtio/virtio-scsi.h > @@ -21,6 +21,7 @@ > #include "hw/virtio/virtio.h" > #include "hw/pci/pci.h" > #include "hw/scsi/scsi.h" > +#include "sysemu/char.h" > #include "sysemu/iothread.h" > > #define TYPE_VIRTIO_SCSI_COMMON "virtio-scsi-common" > @@ -53,6 +54,10 @@ struct VirtIOSCSIConf { > char *wwpn; > uint32_t boot_tpgt; > IOThread *iothread; > +#ifdef CONFIG_VHOST_USER_SCSI > + CharBackend chardev; > + int32_t bootindex; > +#endif > }; > > struct VirtIOSCSI; > ^ permalink raw reply [flat|nested] 7+ messages in thread
* [Qemu-devel] [PATCH 2/2] vus: Introduce a vhost-user-scsi sample application 2016-10-26 15:26 [Qemu-devel] [PATCH 0/2] Introduce vhost-user-scsi and sample application Felipe Franciosi 2016-10-26 15:26 ` [Qemu-devel] [PATCH 1/2] vus: Introduce vhost-user-scsi host device Felipe Franciosi @ 2016-10-26 15:26 ` Felipe Franciosi 2016-10-27 12:16 ` Paolo Bonzini 1 sibling, 1 reply; 7+ messages in thread From: Felipe Franciosi @ 2016-10-26 15:26 UTC (permalink / raw) To: Paolo Bonzini, Stefan Hajnoczi, Marc-Andre Lureau, Michael S. Tsirkin Cc: qemu-devel, Felipe Franciosi This commit introduces a vhost-user-scsi backend sample application. It must be linked with libiscsi and libvhost-user. To use it, compile with: make tests/vhost-user-scsi And run as follows: tests/vhost-user-scsi -u /tmp/vus.sock -i iscsi://uri_to_target/ The application is currently limited to a single LUN and processes requests synchronously (therefore only achieving QD1). The purpose of the code is to show how a backend can be implemented and to test the vhost-user-scsi QEMU implementation. If another instance of this vhost-user-scsi application is executed on a remote host, a VM can be live migrated to that host. 
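For context, the QEMU side of the socket would be configured along these lines. This invocation is a sketch rather than part of the patch: the socket path and guest image are placeholders, and the file-backed, shareable memory backend is the usual vhost-user requirement (the backend process must be able to map guest RAM):

```shell
# Start the backend first (as above), then point QEMU at the same socket.
# All paths below are illustrative placeholders.
qemu-system-x86_64 \
    -m 1G \
    -object memory-backend-file,id=mem0,size=1G,mem-path=/dev/shm,share=on \
    -numa node,memdev=mem0 \
    -chardev socket,id=vus0,path=/tmp/vus.sock \
    -device vhost-user-scsi-pci,chardev=vus0 \
    -drive file=guest.img,format=qcow2,if=virtio
```

QEMU acts as the socket client here, which matches the sample application's listen/accept model.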
Signed-off-by: Felipe Franciosi <felipe@nutanix.com> --- tests/Makefile.include | 2 + tests/vhost-user-scsi.c | 862 ++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 864 insertions(+) create mode 100644 tests/vhost-user-scsi.c diff --git a/tests/Makefile.include b/tests/Makefile.include index 7e6fd23..e61fe54 100644 --- a/tests/Makefile.include +++ b/tests/Makefile.include @@ -685,6 +685,8 @@ tests/test-filter-redirector$(EXESUF): tests/test-filter-redirector.o $(qtest-ob tests/test-x86-cpuid-compat$(EXESUF): tests/test-x86-cpuid-compat.o $(qtest-obj-y) tests/ivshmem-test$(EXESUF): tests/ivshmem-test.o contrib/ivshmem-server/ivshmem-server.o $(libqos-pc-obj-y) tests/vhost-user-bridge$(EXESUF): tests/vhost-user-bridge.o contrib/libvhost-user/libvhost-user.o $(test-util-obj-y) +tests/vhost-user-scsi.o-cflags := $(LIBISCSI_CFLAGS) +tests/vhost-user-scsi$(EXESUF): tests/vhost-user-scsi.o contrib/libvhost-user/libvhost-user.o $(test-util-obj-y) $(test-block-obj-y) tests/test-uuid$(EXESUF): tests/test-uuid.o $(test-util-obj-y) tests/test-arm-mptimer$(EXESUF): tests/test-arm-mptimer.o diff --git a/tests/vhost-user-scsi.c b/tests/vhost-user-scsi.c new file mode 100644 index 0000000..c92b3b2 --- /dev/null +++ b/tests/vhost-user-scsi.c @@ -0,0 +1,862 @@ +/* + * vhost-user-scsi sample application + * + * Copyright (c) 2016 Nutanix Inc. All rights reserved. + * + * Author: + * Felipe Franciosi <felipe@nutanix.com> + * + * This work is licensed under the terms of the GNU GPL, version 2 only. + * See the COPYING file in the top-level directory. 
+ */ + +#include "qemu/osdep.h" +#include "contrib/libvhost-user/libvhost-user.h" +#include "hw/virtio/virtio-scsi.h" +#include "iscsi/iscsi.h" + +#include <poll.h> + +#define VHOST_USER_SCSI_DEBUG 1 + +/** Log helpers **/ + +#define PPRE \ + struct timespec ts; \ + char timebuf[64]; \ + struct tm tm; \ + (void)clock_gettime(CLOCK_REALTIME, &ts); \ + (void)strftime(timebuf, 64, "%Y%m%d %T", gmtime_r(&ts.tv_sec, &tm)) + +#define PEXT(lvl, msg, ...) do { \ + PPRE; \ + fprintf(stderr, "%s.%06ld " lvl ": %s:%s():%d: " msg "\n", \ + timebuf, ts.tv_nsec/1000, \ + __FILE__, __func__, __LINE__, ## __VA_ARGS__); \ +} while(0) + +#define PNOR(lvl, msg, ...) do { \ + PPRE; \ + fprintf(stderr, "%s.%06ld " lvl ": " msg "\n", \ + timebuf, ts.tv_nsec/1000, ## __VA_ARGS__); \ +} while(0) + +#ifdef VHOST_USER_SCSI_DEBUG +#define PDBG(msg, ...) PEXT("DBG", msg, ## __VA_ARGS__) +#define PERR(msg, ...) PEXT("ERR", msg, ## __VA_ARGS__) +#define PLOG(msg, ...) PEXT("LOG", msg, ## __VA_ARGS__) +#else +#define PDBG(msg, ...) do { } while(0) +#define PERR(msg, ...) PNOR("ERR", msg, ## __VA_ARGS__) +#define PLOG(msg, ...) 
PNOR("LOG", msg, ## __VA_ARGS__) +#endif + +/** vhost-user-scsi specific definitions **/ + +/* TODO: MAX is defined at 8, should be 1024 */ +#define VUS_SCHED_MAX_FDS (1 + (2*VHOST_MAX_NR_VIRTQUEUE)) + +#define VDEV_SCSI_MAX_LUNS 1 // Only 1 LUN supported today +#define VDEV_SCSI_MAX_DEVS 1 // Only 1 device supported today + +#define ISCSI_INITIATOR "iqn.2016-10.com.nutanix:vhost-user-scsi" + +typedef void (*misc_cb) (short evt, void *pvt); + +typedef struct sched_data { + vu_watch_cb cb1; + misc_cb cb2; + void *pvt; + short evt; +} sched_data_t; + +typedef struct sched { + VuDev *vu_dev; + nfds_t nfds; + struct pollfd fds[VUS_SCHED_MAX_FDS]; + sched_data_t data[VUS_SCHED_MAX_FDS]; + int quit; +} sched_t; + +typedef struct iscsi_lun { + struct iscsi_context *iscsi_ctx; + int iscsi_lun; +} iscsi_lun_t; + +typedef struct vhost_scsi_dev { + VuDev vu_dev; + int server_sock; + sched_t sched; + iscsi_lun_t luns[VDEV_SCSI_MAX_LUNS]; +} vhost_scsi_dev_t; + +static vhost_scsi_dev_t *vhost_scsi_devs[VDEV_SCSI_MAX_DEVS]; + +static vhost_scsi_dev_t *vdev_scsi_find_by_vu(VuDev *vu_dev); + +/** poll-based scheduler for libvhost-user and misc callbacks **/ + +static int sched_add(sched_t *sched, int fd, short int evt, + vu_watch_cb cb1, misc_cb cb2, void *pvt) { + int i; + + assert(sched); + assert(fd >= 0); + assert(evt); + assert(cb1 || cb2); + assert(!(cb1 && cb2)); // only one of the cbs should be used + + for (i=0; i<sched->nfds && i<VUS_SCHED_MAX_FDS; i++) { + if (sched->fds[i].fd == fd) { + break; + } + } + if (i == VUS_SCHED_MAX_FDS) { + PERR("Error adding fd: max number of fds reached"); + return -1; + } + + sched->fds[i].fd = fd; + sched->fds[i].events = evt; + sched->data[i].cb1 = cb1; + sched->data[i].cb2 = cb2; + sched->data[i].pvt = pvt; + sched->data[i].evt = evt; + + if (sched->nfds <= i) { + sched->nfds = i+1; + } + + PDBG("sched@%p: add fd %d to slot %i", sched, fd, i); + + return 0; +} + +static int sched_del(sched_t *sched, int fd) { + int i; + + 
assert(sched); + assert(fd >= 0); + + for (i=0; i<sched->nfds; i++) { + if (sched->fds[i].fd == fd) { + break; + } + } + if (sched->nfds == i) { +#ifdef VUS_PEDANTIC_SCHEDULER + PERR("Error deleting fd %d: fd not found", fd); + return -1; +#else + return 0; +#endif + } + + sched->nfds--; + if (sched->nfds > 0) { + // Overwrite deleted entry with last entry from scheduler + memcpy(&sched->fds[i], &sched->fds[sched->nfds], + sizeof(struct pollfd)); + memcpy(&sched->data[i], &sched->data[sched->nfds], + sizeof(sched_data_t)); + } + memset(&sched->fds[sched->nfds], 0, sizeof(struct pollfd)); + memset(&sched->data[sched->nfds], 0, sizeof(sched_data_t)); + + PDBG("sched@%p: del fd %d from slot %i", sched, fd, i); + + return 0; +} + +static int sched_loop(sched_t *sched) { + int i, n; + + assert(sched); + assert(sched->nfds > 0); + + while (!sched->quit) { + n = poll(sched->fds, sched->nfds, -1); + if (n < 0) { + PERR("Error polling: %s", strerror(errno)); + return -1; + } + + for (i=0; i<sched->nfds && n; i++) { + if (sched->fds[i].revents != 0) { + + if (sched->data[i].cb1) { + int vu_evt = 0; + + if (sched->fds[i].revents & POLLIN) vu_evt |= VU_WATCH_IN; + if (sched->fds[i].revents & POLLOUT) vu_evt |= VU_WATCH_OUT; + if (sched->fds[i].revents & POLLPRI) vu_evt |= VU_WATCH_PRI; + if (sched->fds[i].revents & POLLERR) vu_evt |= VU_WATCH_ERR; + if (sched->fds[i].revents & POLLHUP) vu_evt |= VU_WATCH_HUP; + + PDBG("sched@%p: fd[%d] (%d): cb1(%p, %d, %p)", sched, i, + sched->fds[i].fd, sched->vu_dev, vu_evt, + sched->data[i].pvt); + + sched->data[i].cb1(sched->vu_dev, vu_evt, + sched->data[i].pvt); + } else { + PDBG("sched@%p: fd[%d] (%d): cbb(%hd, %p)", sched, i, + sched->fds[i].fd, sched->fds[i].revents, + sched->data[i].pvt); + + sched->data[i].cb2(sched->fds[i].revents, + sched->data[i].pvt); + } + + n--; + } + } + } + + return 0; +} + +/** from libiscsi's scsi-lowlevel.h **/ + +#define SCSI_CDB_MAX_SIZE 16 + +struct scsi_iovector { + struct scsi_iovec *iov; + int niov; 
+ int nalloc; + size_t offset; + int consumed; +}; + +struct scsi_allocated_memory { + struct scsi_allocated_memory *next; + char buf[0]; +}; + +struct scsi_data { + int size; + unsigned char *data; +}; + +enum scsi_sense_key { + SCSI_SENSE_NO_SENSE = 0x00, + SCSI_SENSE_RECOVERED_ERROR = 0x01, + SCSI_SENSE_NOT_READY = 0x02, + SCSI_SENSE_MEDIUM_ERROR = 0x03, + SCSI_SENSE_HARDWARE_ERROR = 0x04, + SCSI_SENSE_ILLEGAL_REQUEST = 0x05, + SCSI_SENSE_UNIT_ATTENTION = 0x06, + SCSI_SENSE_DATA_PROTECTION = 0x07, + SCSI_SENSE_BLANK_CHECK = 0x08, + SCSI_SENSE_VENDOR_SPECIFIC = 0x09, + SCSI_SENSE_COPY_ABORTED = 0x0a, + SCSI_SENSE_COMMAND_ABORTED = 0x0b, + SCSI_SENSE_OBSOLETE_ERROR_CODE = 0x0c, + SCSI_SENSE_OVERFLOW_COMMAND = 0x0d, + SCSI_SENSE_MISCOMPARE = 0x0e +}; + +struct scsi_sense { + unsigned char error_type; + enum scsi_sense_key key; + int ascq; + unsigned sense_specific:1; + unsigned ill_param_in_cdb:1; + unsigned bit_pointer_valid:1; + unsigned char bit_pointer; + uint16_t field_pointer; +}; + +enum scsi_residual { + SCSI_RESIDUAL_NO_RESIDUAL = 0, + SCSI_RESIDUAL_UNDERFLOW, + SCSI_RESIDUAL_OVERFLOW +}; + +struct scsi_task { + int status; + int cdb_size; + int xfer_dir; + int expxferlen; + unsigned char cdb[SCSI_CDB_MAX_SIZE]; + enum scsi_residual residual_status; + size_t residual; + struct scsi_sense sense; + struct scsi_data datain; + struct scsi_allocated_memory *mem; + void *ptr; + + uint32_t itt; + uint32_t cmdsn; + uint32_t lun; + + struct scsi_iovector iovector_in; + struct scsi_iovector iovector_out; +}; + +/** libiscsi integration **/ + +static int iscsi_add_lun(iscsi_lun_t *lun, char *iscsi_uri) { + struct iscsi_url *iscsi_url; + struct iscsi_context *iscsi_ctx; + int ret = 0; + + assert(lun); + + iscsi_ctx = iscsi_create_context(ISCSI_INITIATOR); + if (!iscsi_ctx) { + PERR("Unable to create iSCSI context"); + return -1; + } + + iscsi_url = iscsi_parse_full_url(iscsi_ctx, iscsi_uri); + if (!iscsi_url) { + PERR("Unable to parse iSCSI URL: %s", 
iscsi_get_error(iscsi_ctx)); + goto fail; + } + + iscsi_set_session_type(iscsi_ctx, ISCSI_SESSION_NORMAL); + iscsi_set_header_digest(iscsi_ctx, ISCSI_HEADER_DIGEST_NONE_CRC32C); + if (iscsi_full_connect_sync(iscsi_ctx, iscsi_url->portal, iscsi_url->lun)) { + PERR("Unable to login to iSCSI portal: %s", iscsi_get_error(iscsi_ctx)); + goto fail; + } + + lun->iscsi_ctx = iscsi_ctx; + lun->iscsi_lun = iscsi_url->lun; + + PDBG("Context %p created for lun 0: %s", iscsi_ctx, iscsi_uri); + +out: + if (iscsi_url) { + iscsi_destroy_url(iscsi_url); + } + return ret; + +fail: + (void)iscsi_destroy_context(iscsi_ctx); + ret = -1; + goto out; +} + +static struct scsi_task *scsi_task_new(int cdb_len, uint8_t *cdb, int dir, + int xfer_len) { + struct scsi_task *task; + + assert(cdb_len > 0); + assert(cdb); + + task = calloc(1, sizeof(struct scsi_task)); + if (!task) { + PERR("Error allocating task: %s", strerror(errno)); + return NULL; + } + + memcpy(task->cdb, cdb, cdb_len); + task->cdb_size = cdb_len; + task->xfer_dir = dir; + task->expxferlen = xfer_len; + + return task; +} + +static int get_cdb_len(uint8_t *cdb) { + switch(cdb[0] >> 5){ + case 0: + return 6; + case 1: + case 2: + return 10; + case 4: + return 16; + case 5: + return 12; + } + PERR("Unable to determine cdb len (0x%02hhX)", cdb[0]>>5); + return -1; +} + +static int handle_cmd_sync(struct iscsi_context *ctx, + VirtIOSCSICmdReq *req, + struct iovec *out, unsigned int out_len, + VirtIOSCSICmdResp *rsp, + struct iovec *in, unsigned int in_len) { + struct scsi_task *task; + uint32_t dir; + uint32_t len; + int cdb_len; + int i; + + if (!((!req->lun[1]) && (req->lun[2] == 0x40) && (!req->lun[3]))) { + // Ignore anything different than target=0, lun=0 + PDBG("Ignoring unconnected lun (0x%hhX, 0x%hhX)", + req->lun[1], req->lun[3]); + rsp->status = SCSI_STATUS_CHECK_CONDITION; + memset(rsp->sense, 0, sizeof(rsp->sense)); + rsp->sense_len = 18; + rsp->sense[0] = 0x70; + rsp->sense[2] = 0x05; // ILLEGAL_REQUEST + 
rsp->sense[7] = 10; + rsp->sense[12] = 0x24; + + return 0; + } + + cdb_len = get_cdb_len(req->cdb); + if (cdb_len == -1) { + return -1; + } + + len = 0; + if (!out_len && !in_len) { + dir = SCSI_XFER_NONE; + } else if (out_len) { + dir = SCSI_XFER_TO_DEV; + for (i=0; i<out_len; i++) { + len += out[i].iov_len; + } + } else { + dir = SCSI_XFER_FROM_DEV; + for (i=0; i<in_len; i++) { + len += in[i].iov_len; + } + } + + task = scsi_task_new(cdb_len, req->cdb, dir, len); + if (!task) { + PERR("Unable to create iscsi task"); + return -1; + } + + if (dir == SCSI_XFER_TO_DEV) { + task->iovector_out.iov = (struct scsi_iovec *)out; + task->iovector_out.niov = out_len; + } else if (dir == SCSI_XFER_FROM_DEV) { + task->iovector_in.iov = (struct scsi_iovec *)in; + task->iovector_in.niov = in_len; + } + + PDBG("Sending iscsi cmd (cdb_len=%d, dir=%d, task=%p)", + cdb_len, dir, task); + if (!iscsi_scsi_command_sync(ctx, 0, task, NULL)) { + PERR("Error serving SCSI command"); + free(task); + return -1; + } + + memset(rsp, 0, sizeof(*rsp)); + + rsp->status = task->status; + rsp->resid = task->residual; + + if (task->status == SCSI_STATUS_CHECK_CONDITION) { + rsp->response = VIRTIO_SCSI_S_FAILURE; + rsp->sense_len = task->datain.size - 2; + memcpy(rsp->sense, &task->datain.data[2], rsp->sense_len); + } + + free(task); + + PDBG("Filled in rsp: status=%hhX, resid=%u, response=%hhX, sense_len=%u", + rsp->status, rsp->resid, rsp->response, rsp->sense_len); + + return 0; +} + +/** libvhost-user callbacks **/ + +static void vus_panic_cb(VuDev *vu_dev, const char *buf) { + vhost_scsi_dev_t *vdev_scsi; + + assert(vu_dev); + + vdev_scsi = vdev_scsi_find_by_vu(vu_dev); + + if (buf) { + PERR("vu_panic: %s", buf); + } + + if (vdev_scsi) { + vdev_scsi->sched.quit = 1; + } +} + +static void vus_add_watch_cb(VuDev *vu_dev, int fd, int vu_evt, vu_watch_cb cb, + void *pvt) { + vhost_scsi_dev_t *vdev_scsi; + int poll_evt = 0; + + assert(vu_dev); + assert(fd >= 0); + assert(cb); + + vdev_scsi = 
vdev_scsi_find_by_vu(vu_dev); + if (!vdev_scsi) { + vus_panic_cb(vu_dev, NULL); + } + + /* TODO: VU_WATCH_* should match POLL*, check it */ + if (vu_evt & VU_WATCH_IN) poll_evt |= POLLIN; + if (vu_evt & VU_WATCH_OUT) poll_evt |= POLLOUT; + if (vu_evt & VU_WATCH_PRI) poll_evt |= POLLPRI; + if (vu_evt & VU_WATCH_ERR) poll_evt |= POLLERR; + if (vu_evt & VU_WATCH_HUP) poll_evt |= POLLHUP; + + if (sched_add(&vdev_scsi->sched, fd, poll_evt, cb, NULL, pvt)) { + vus_panic_cb(vu_dev, NULL); + } +} + +static void vus_del_watch_cb(VuDev *vu_dev, int fd) { + vhost_scsi_dev_t *vdev_scsi; + + assert(vu_dev); + assert(fd >= 0); + + vdev_scsi = vdev_scsi_find_by_vu(vu_dev); + if (!vdev_scsi) { + vus_panic_cb(vu_dev, NULL); + return; + } + + if (sched_del(&vdev_scsi->sched, fd)) { + vus_panic_cb(vu_dev, NULL); + } +} + +static void vus_proc_ctl(VuDev *vu_dev, int idx) { + /* Control VQ not implemented */ +} + +static void vus_proc_evt(VuDev *vu_dev, int idx) { + /* Event VQ not implemented */ +} + +static void vus_proc_req(VuDev *vu_dev, int idx) { + vhost_scsi_dev_t *vdev_scsi; + VuVirtq *vq; + + assert(vu_dev); + + vdev_scsi = vdev_scsi_find_by_vu(vu_dev); + if (!vdev_scsi) { + vus_panic_cb(vu_dev, NULL); + return; + } + + if ((idx < 0) || (idx >= VHOST_MAX_NR_VIRTQUEUE)) { + PERR("VQ Index out of range: %d", idx); + vus_panic_cb(vu_dev, NULL); + return; + } + + vq = vu_get_queue(vu_dev, idx); + if (!vq) { + PERR("Error fetching VQ (dev=%p, idx=%d)", vu_dev, idx); + vus_panic_cb(vu_dev, NULL); + return; + } + + PDBG("Got kicked on vq[%d]@%p", idx, vq); + + while(1) { + VuVirtqElement *elem; + VirtIOSCSICmdReq *req; + VirtIOSCSICmdResp *rsp; + + elem = vu_queue_pop(vu_dev, vq, sizeof(VuVirtqElement)); + if (!elem) { + PDBG("No more elements pending on vq[%d]@%p", idx, vq); + break; + } + PDBG("Popped elem@%p", elem); + + assert(!((elem->out_num > 1) && (elem->in_num > 1))); + assert((elem->out_num > 0) && (elem->in_num > 0)); + + if (elem->out_sg[0].iov_len < 
sizeof(VirtIOSCSICmdReq)) { + PERR("Invalid virtio-scsi req header"); + vus_panic_cb(vu_dev, NULL); + break; + } + req = (VirtIOSCSICmdReq *)elem->out_sg[0].iov_base; + + if (elem->in_sg[0].iov_len < sizeof(VirtIOSCSICmdResp)) { + PERR("Invalid virtio-scsi rsp header"); + vus_panic_cb(vu_dev, NULL); + break; + } + rsp = (VirtIOSCSICmdResp *)elem->in_sg[0].iov_base; + + if (handle_cmd_sync(vdev_scsi->luns[0].iscsi_ctx, + req, &elem->out_sg[1], elem->out_num-1, + rsp, &elem->in_sg[1], elem->in_num-1) != 0) { + vus_panic_cb(vu_dev, NULL); + break; + } + + vu_queue_push(vu_dev, vq, elem, 0); + vu_queue_notify(vu_dev, vq); + + free(elem); + } + +} + +static void vus_queue_set_started(VuDev *vu_dev, int idx, bool started) { + VuVirtq *vq; + + assert(vu_dev); + + if ((idx < 0) || (idx >= VHOST_MAX_NR_VIRTQUEUE)) { + PERR("VQ Index out of range: %d", idx); + vus_panic_cb(vu_dev, NULL); + return; + } + + vq = vu_get_queue(vu_dev, idx); + + switch(idx) { + case 0: + vu_set_queue_handler(vu_dev, vq, started?vus_proc_ctl:NULL); + break; + case 1: + vu_set_queue_handler(vu_dev, vq, started?vus_proc_evt:NULL); + break; + default: + vu_set_queue_handler(vu_dev, vq, started?vus_proc_req:NULL); + } +} + +static const VuDevIface vus_iface = { + .queue_set_started = vus_queue_set_started, +}; + +static void vus_vhost_cb(VuDev *vu_dev, int vu_evt, void *data) { + assert(vu_dev); + + if (!vu_dispatch(vu_dev) != 0) { + PERR("Error processing vhost message"); + vus_panic_cb(vu_dev, NULL); + } +} + +/** util **/ + +static int unix_sock_new(char *unix_fn) { + int sock; + struct sockaddr_un un; + size_t len; + + assert(unix_fn); + + sock = socket(AF_UNIX, SOCK_STREAM, 0); + if (sock <= 0) { + perror("socket"); + return -1; + } + + un.sun_family = AF_UNIX; + (void)snprintf(un.sun_path, sizeof(un.sun_path), "%s", unix_fn); + len = sizeof(un.sun_family) + strlen(un.sun_path); + + (void)unlink(unix_fn); + if (bind(sock, (struct sockaddr *)&un, len) < 0) { + perror("bind"); + goto fail; + } + + 
if (listen(sock, 1) < 0) { + perror("listen"); + goto fail; + } + + return sock; + +fail: + (void)close(sock); + + return -1; +} + +/** vhost-user-scsi **/ + +static vhost_scsi_dev_t *vdev_scsi_find_by_vu(VuDev *vu_dev) { + int i; + + assert(vu_dev); + + for (i=0; i<VDEV_SCSI_MAX_DEVS; i++) { + if (&vhost_scsi_devs[i]->vu_dev == vu_dev) { + return vhost_scsi_devs[i]; + } + } + + PERR("Unknown VuDev %p", vu_dev); + return NULL; +} + +static void vdev_scsi_deinit(vhost_scsi_dev_t *vdev_scsi) { + if (!vdev_scsi) { + return; + } + + if (vdev_scsi->server_sock >= 0) { + struct sockaddr_storage ss; + socklen_t sslen = sizeof(ss); + + if (getsockname(vdev_scsi->server_sock, (struct sockaddr *)&ss, + &sslen) == 0) { + struct sockaddr_un *su = (struct sockaddr_un *)&ss; + (void)unlink(su->sun_path); + } + + (void)close(vdev_scsi->server_sock); + } +} + +static vhost_scsi_dev_t *vdev_scsi_new(char *unix_fn) { + vhost_scsi_dev_t *vdev_scsi; + + assert(unix_fn); + + vdev_scsi = calloc(1, sizeof(vhost_scsi_dev_t)); + if (!vdev_scsi) { + perror("calloc"); + return NULL; + } + + vdev_scsi->server_sock = unix_sock_new(unix_fn); + if (vdev_scsi->server_sock < 0) { + free(vdev_scsi); + return NULL; + } + + vdev_scsi->sched.vu_dev = &vdev_scsi->vu_dev; + + return vdev_scsi; +} + +static int vdev_scsi_iscsi_add_lun(vhost_scsi_dev_t *vdev_scsi, + char *iscsi_uri, uint32_t lun) { + assert(vdev_scsi); + assert(iscsi_uri); + assert(lun < VDEV_SCSI_MAX_LUNS); + + if (vdev_scsi->luns[lun].iscsi_ctx) { + PERR("Lun %d already configured", lun); + return -1; + } + + if (iscsi_add_lun(&vdev_scsi->luns[lun], iscsi_uri) != 0) { + return -1; + } + + return 0; +} + +static int vdev_scsi_run(vhost_scsi_dev_t *vdev_scsi) { + int cli_sock; + int ret = 0; + + assert(vdev_scsi); + assert(vdev_scsi->server_sock >= 0); + + cli_sock = accept(vdev_scsi->server_sock, (void *)0, (void *)0); + if (cli_sock < 0) { + perror("accept"); + return -1; + } + + vu_init(&vdev_scsi->vu_dev, + cli_sock, + vus_panic_cb, + 
vus_add_watch_cb, + vus_del_watch_cb, + &vus_iface); + + ret = sched_add(&vdev_scsi->sched, cli_sock, POLLIN, vus_vhost_cb, NULL, 0); + if (ret) { + goto fail; + } + + if (sched_loop(&vdev_scsi->sched) != 0) { + goto fail; + } + +out: + vu_deinit(&vdev_scsi->vu_dev); + + return ret; + +fail: + ret = -1; + goto out; +} + +int main(int argc, char **argv) +{ + vhost_scsi_dev_t *vdev_scsi = NULL; + char *unix_fn = NULL; + char *iscsi_uri = NULL; + int opt, err = EXIT_SUCCESS; + + while ((opt = getopt(argc, argv, "u:i:")) != -1) { + switch (opt) { + case 'h': + goto help; + case 'u': + unix_fn = strdup(optarg); + break; + case 'i': + iscsi_uri = strdup(optarg); + break; + default: + goto help; + } + } + if (!unix_fn || !iscsi_uri) { + goto help; + } + + vdev_scsi = vdev_scsi_new(unix_fn); + if (!vdev_scsi) { + goto err; + } + vhost_scsi_devs[0] = vdev_scsi; + + if (vdev_scsi_iscsi_add_lun(vdev_scsi, iscsi_uri, 0) != 0) { + goto err; + } + + if (vdev_scsi_run(vdev_scsi) != 0) { + goto err; + } + +out: + if (vdev_scsi) { + vdev_scsi_deinit(vdev_scsi); + free(vdev_scsi); + } + if (unix_fn) { + free(unix_fn); + } + if (iscsi_uri) { + free(iscsi_uri); + } + + return err; + +err: + err = EXIT_FAILURE; + goto out; + +help: + fprintf(stderr, "Usage: %s [ -u unix_sock_path -i iscsi_uri ] | [ -h ]\n", + argv[0]); + fprintf(stderr, " -u path to unix socket\n"); + fprintf(stderr, " -i iscsi uri for lun 0\n"); + fprintf(stderr, " -h print help and quit\n"); + + goto err; +} -- 1.9.4 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] [PATCH 2/2] vus: Introduce a vhost-user-scsi sample application 2016-10-26 15:26 ` [Qemu-devel] [PATCH 2/2] vus: Introduce a vhost-user-scsi sample application Felipe Franciosi @ 2016-10-27 12:16 ` Paolo Bonzini 2016-10-27 12:48 ` Felipe Franciosi 0 siblings, 1 reply; 7+ messages in thread From: Paolo Bonzini @ 2016-10-27 12:16 UTC (permalink / raw) To: Felipe Franciosi, Stefan Hajnoczi, Marc-Andre Lureau, Michael S. Tsirkin Cc: qemu-devel On 26/10/2016 17:26, Felipe Franciosi wrote: > This commit introduces a vhost-user-scsi backend sample application. It > must be linked with libiscsi and libvhost-user. > > To use it, compile with: > make tests/vhost-user-scsi > > And run as follows: > tests/vhost-user-scsi -u /tmp/vus.sock -i iscsi://uri_to_target/ > > The application is currently limited at one LUN only and it processes > requests synchronously (therefore only achieving QD1). The purpose of > the code is to show how a backend can be implemented and to test the > vhost-user-scsi Qemu implementation. > > If a different instance of this vhost-user-scsi application is executed > at a remote host, a VM can be live migrated to such a host. Hi, the right directory for this is contrib/. Is it possible to use GSource and GIOChannel instead for the event loop? There is some dead code (for example cb2 as far as I can see) and having the millionth implementation of an event loop distracts from the meat of the code. 
:) Thanks, Paolo > Signed-off-by: Felipe Franciosi <felipe@nutanix.com> > --- > tests/Makefile.include | 2 + > tests/vhost-user-scsi.c | 862 ++++++++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 864 insertions(+) > create mode 100644 tests/vhost-user-scsi.c > > diff --git a/tests/Makefile.include b/tests/Makefile.include > index 7e6fd23..e61fe54 100644 > --- a/tests/Makefile.include > +++ b/tests/Makefile.include > @@ -685,6 +685,8 @@ tests/test-filter-redirector$(EXESUF): tests/test-filter-redirector.o $(qtest-ob > tests/test-x86-cpuid-compat$(EXESUF): tests/test-x86-cpuid-compat.o $(qtest-obj-y) > tests/ivshmem-test$(EXESUF): tests/ivshmem-test.o contrib/ivshmem-server/ivshmem-server.o $(libqos-pc-obj-y) > tests/vhost-user-bridge$(EXESUF): tests/vhost-user-bridge.o contrib/libvhost-user/libvhost-user.o $(test-util-obj-y) > +tests/vhost-user-scsi.o-cflags := $(LIBISCSI_CFLAGS) > +tests/vhost-user-scsi$(EXESUF): tests/vhost-user-scsi.o contrib/libvhost-user/libvhost-user.o $(test-util-obj-y) $(test-block-obj-y) > tests/test-uuid$(EXESUF): tests/test-uuid.o $(test-util-obj-y) > tests/test-arm-mptimer$(EXESUF): tests/test-arm-mptimer.o > > diff --git a/tests/vhost-user-scsi.c b/tests/vhost-user-scsi.c > new file mode 100644 > index 0000000..c92b3b2 > --- /dev/null > +++ b/tests/vhost-user-scsi.c > @@ -0,0 +1,862 @@ > +/* > + * vhost-user-scsi sample application > + * > + * Copyright (c) 2016 Nutanix Inc. All rights reserved. > + * > + * Author: > + * Felipe Franciosi <felipe@nutanix.com> > + * > + * This work is licensed under the terms of the GNU GPL, version 2 only. > + * See the COPYING file in the top-level directory. 
> + */ > + > +#include "qemu/osdep.h" > +#include "contrib/libvhost-user/libvhost-user.h" > +#include "hw/virtio/virtio-scsi.h" > +#include "iscsi/iscsi.h" > + > +#include <poll.h> > + > +#define VHOST_USER_SCSI_DEBUG 1 > + > +/** Log helpers **/ > + > +#define PPRE \ > + struct timespec ts; \ > + char timebuf[64]; \ > + struct tm tm; \ > + (void)clock_gettime(CLOCK_REALTIME, &ts); \ > + (void)strftime(timebuf, 64, "%Y%m%d %T", gmtime_r(&ts.tv_sec, &tm)) > + > +#define PEXT(lvl, msg, ...) do { \ > + PPRE; \ > + fprintf(stderr, "%s.%06ld " lvl ": %s:%s():%d: " msg "\n", \ > + timebuf, ts.tv_nsec/1000, \ > + __FILE__, __FUNCTION__, __LINE__, ## __VA_ARGS__); \ > +} while(0) > + > +#define PNOR(lvl, msg, ...) do { \ > + PPRE; \ > + fprintf(stderr, "%s.%06ld " lvl ": " msg "\n", \ > + timebuf, ts.tv_nsec/1000, ## __VA_ARGS__); \ > +} while(0); > + > +#ifdef VHOST_USER_SCSI_DEBUG > +#define PDBG(msg, ...) PEXT("DBG", msg, ## __VA_ARGS__) > +#define PERR(msg, ...) PEXT("ERR", msg, ## __VA_ARGS__) > +#define PLOG(msg, ...) PEXT("LOG", msg, ## __VA_ARGS__) > +#else > +#define PDBG(msg, ...) { } > +#define PERR(msg, ...) PNOR("ERR", msg, ## __VA_ARGS__) > +#define PLOG(msg, ...) 
PNOR("LOG", msg, ## __VA_ARGS__) > +#endif > + > +/** vhost-user-scsi specific definitions **/ > + > +/* TODO: MAX is defined at 8, should be 1024 */ > +#define VUS_SCHED_MAX_FDS (1 + (2*VHOST_MAX_NR_VIRTQUEUE)) > + > +#define VDEV_SCSI_MAX_LUNS 1 // Only 1 lun supported today > +#define VDEV_SCSI_MAX_DEVS 1 // Only 1 devices supported today > + > +#define ISCSI_INITIATOR "iqn.2016-10.com.nutanix:vhost-user-scsi" > + > +typedef void (*misc_cb) (short evt, void *pvt); > + > +typedef struct sched_data { > + vu_watch_cb cb1; > + misc_cb cb2; > + void *pvt; > + short evt; > +} sched_data_t; > + > +typedef struct sched { > + VuDev *vu_dev; > + nfds_t nfds; > + struct pollfd fds[VUS_SCHED_MAX_FDS]; > + sched_data_t data[VUS_SCHED_MAX_FDS]; > + int quit; > +} sched_t; > + > +typedef struct iscsi_lun { > + struct iscsi_context *iscsi_ctx; > + int iscsi_lun; > +} iscsi_lun_t; > + > +typedef struct vhost_scsi_dev { > + VuDev vu_dev; > + int server_sock; > + sched_t sched; > + iscsi_lun_t luns[VDEV_SCSI_MAX_LUNS]; > +} vhost_scsi_dev_t; > + > +static vhost_scsi_dev_t *vhost_scsi_devs[VDEV_SCSI_MAX_DEVS]; > + > +static vhost_scsi_dev_t *vdev_scsi_find_by_vu(VuDev *vu_dev); > + > +/** poll-based scheduler for libvhost-user and misc callbacks **/ > + > +static int sched_add(sched_t *sched, int fd, short int evt, > + vu_watch_cb cb1, misc_cb cb2, void *pvt) { > + int i; > + > + assert(sched); > + assert(fd >= 0); > + assert(evt); > + assert(cb1 || cb2); > + assert(!(cb1 && cb2)); // only one of the cbs should be used > + > + for (i=0; i<sched->nfds && i<VUS_SCHED_MAX_FDS; i++) { > + if (sched->fds[i].fd == fd) { > + break; > + } > + } > + if (i == VUS_SCHED_MAX_FDS) { > + PERR("Error adding fd: max number of fds reached"); > + return -1; > + } > + > + sched->fds[i].fd = fd; > + sched->fds[i].events = evt; > + sched->data[i].cb1 = cb1; > + sched->data[i].cb2 = cb2; > + sched->data[i].pvt = pvt; > + sched->data[i].evt = evt; > + > + if (sched->nfds <= i) { > + sched->nfds = i+1; > 
+ } > + > + PDBG("sched@%p: add fd %d to slot %i", sched, fd, i); > + > + return 0; > +} > + > +static int sched_del(sched_t *sched, int fd) { > + int i; > + > + assert(sched); > + assert(fd >= 0); > + > + for (i=0; i<sched->nfds; i++) { > + if (sched->fds[i].fd == fd) { > + break; > + } > + } > + if (sched->nfds == i) { > +#ifdef VUS_PEDANTIC_SCHEDULER > + PERR("Error deleting fd %d: fd not found", fd); > + return -1; > +#else > + return 0; > +#endif > + } > + > + sched->nfds--; > + if (sched->nfds > 0) { > + // Overwrite deleted entry with last entry from scheduler > + memcpy(&sched->fds[i], &sched->fds[sched->nfds], > + sizeof(struct pollfd)); > + memcpy(&sched->data[i], &sched->data[sched->nfds], > + sizeof(sched_data_t)); > + } > + memset(&sched->fds[sched->nfds], 0, sizeof(struct pollfd)); > + memset(&sched->data[sched->nfds], 0, sizeof(sched_data_t)); > + > + PDBG("sched@%p: del fd %d from slot %i", sched, fd, i); > + > + return 0; > +} > + > +static int sched_loop(sched_t *sched) { > + int i, n; > + > + assert(sched); > + assert(sched->nfds > 0); > + > + while (!sched->quit) { > + n = poll(sched->fds, sched->nfds, -1); > + if (n < 0) { > + PERR("Error polling: %s", strerror(errno)); > + return -1; > + } > + > + for (i=0; i<sched->nfds && n; i++) { > + if (sched->fds[i].revents != 0) { > + > + if (sched->data[i].cb1) { > + int vu_evt = 0; > + > + if (sched->fds[i].revents & POLLIN) vu_evt |= VU_WATCH_IN; > + if (sched->fds[i].revents & POLLOUT) vu_evt |= VU_WATCH_OUT; > + if (sched->fds[i].revents & POLLPRI) vu_evt |= VU_WATCH_PRI; > + if (sched->fds[i].revents & POLLERR) vu_evt |= VU_WATCH_ERR; > + if (sched->fds[i].revents & POLLHUP) vu_evt |= VU_WATCH_HUP; > + > + PDBG("sched@%p: fd[%d] (%d): cb1(%p, %d, %p)", sched, i, > + sched->fds[i].fd, sched->vu_dev, vu_evt, > + sched->data[i].pvt); > + > + sched->data[i].cb1(sched->vu_dev, vu_evt, > + sched->data[i].pvt); > + } else { > + PDBG("sched@%p: fd[%d] (%d): cb2(%hd, %p)", sched, i, > + sched->fds[i].fd,
sched->fds[i].revents, > + sched->data[i].pvt); > + > + sched->data[i].cb2(sched->fds[i].revents, > + sched->data[i].pvt); > + } > + > + n--; > + } > + } > + } > + > + return 0; > +} > + > +/** from libiscsi's scsi-lowlevel.h **/ > + > +#define SCSI_CDB_MAX_SIZE 16 > + > +struct scsi_iovector { > + struct scsi_iovec *iov; > + int niov; > + int nalloc; > + size_t offset; > + int consumed; > +}; > + > +struct scsi_allocated_memory { > + struct scsi_allocated_memory *next; > + char buf[0]; > +}; > + > +struct scsi_data { > + int size; > + unsigned char *data; > +}; > + > +enum scsi_sense_key { > + SCSI_SENSE_NO_SENSE = 0x00, > + SCSI_SENSE_RECOVERED_ERROR = 0x01, > + SCSI_SENSE_NOT_READY = 0x02, > + SCSI_SENSE_MEDIUM_ERROR = 0x03, > + SCSI_SENSE_HARDWARE_ERROR = 0x04, > + SCSI_SENSE_ILLEGAL_REQUEST = 0x05, > + SCSI_SENSE_UNIT_ATTENTION = 0x06, > + SCSI_SENSE_DATA_PROTECTION = 0x07, > + SCSI_SENSE_BLANK_CHECK = 0x08, > + SCSI_SENSE_VENDOR_SPECIFIC = 0x09, > + SCSI_SENSE_COPY_ABORTED = 0x0a, > + SCSI_SENSE_COMMAND_ABORTED = 0x0b, > + SCSI_SENSE_OBSOLETE_ERROR_CODE = 0x0c, > + SCSI_SENSE_OVERFLOW_COMMAND = 0x0d, > + SCSI_SENSE_MISCOMPARE = 0x0e > +}; > + > +struct scsi_sense { > + unsigned char error_type; > + enum scsi_sense_key key; > + int ascq; > + unsigned sense_specific:1; > + unsigned ill_param_in_cdb:1; > + unsigned bit_pointer_valid:1; > + unsigned char bit_pointer; > + uint16_t field_pointer; > +}; > + > +enum scsi_residual { > + SCSI_RESIDUAL_NO_RESIDUAL = 0, > + SCSI_RESIDUAL_UNDERFLOW, > + SCSI_RESIDUAL_OVERFLOW > +}; > + > +struct scsi_task { > + int status; > + int cdb_size; > + int xfer_dir; > + int expxferlen; > + unsigned char cdb[SCSI_CDB_MAX_SIZE]; > + enum scsi_residual residual_status; > + size_t residual; > + struct scsi_sense sense; > + struct scsi_data datain; > + struct scsi_allocated_memory *mem; > + void *ptr; > + > + uint32_t itt; > + uint32_t cmdsn; > + uint32_t lun; > + > + struct scsi_iovector iovector_in; > + struct scsi_iovector 
iovector_out; > +}; > + > +/** libiscsi integration **/ > + > +static int iscsi_add_lun(iscsi_lun_t *lun, char *iscsi_uri) { > + struct iscsi_url *iscsi_url; > + struct iscsi_context *iscsi_ctx; > + int ret = 0; > + > + assert(lun); > + > + iscsi_ctx = iscsi_create_context(ISCSI_INITIATOR); > + if (!iscsi_ctx) { > + PERR("Unable to create iSCSI context"); > + return -1; > + } > + > + iscsi_url = iscsi_parse_full_url(iscsi_ctx, iscsi_uri); > + if (!iscsi_url) { > + PERR("Unable to parse iSCSI URL: %s", iscsi_get_error(iscsi_ctx)); > + goto fail; > + } > + > + iscsi_set_session_type(iscsi_ctx, ISCSI_SESSION_NORMAL); > + iscsi_set_header_digest(iscsi_ctx, ISCSI_HEADER_DIGEST_NONE_CRC32C); > + if (iscsi_full_connect_sync(iscsi_ctx, iscsi_url->portal, iscsi_url->lun)) { > + PERR("Unable to login to iSCSI portal: %s", iscsi_get_error(iscsi_ctx)); > + goto fail; > + } > + > + lun->iscsi_ctx = iscsi_ctx; > + lun->iscsi_lun = iscsi_url->lun; > + > + PDBG("Context %p created for lun 0: %s", iscsi_ctx, iscsi_uri); > + > +out: > + if (iscsi_url) { > + iscsi_destroy_url(iscsi_url); > + } > + return ret; > + > +fail: > + (void)iscsi_destroy_context(iscsi_ctx); > + ret = -1; > + goto out; > +} > + > +static struct scsi_task *scsi_task_new(int cdb_len, uint8_t *cdb, int dir, > + int xfer_len) { > + struct scsi_task *task; > + > + assert(cdb_len > 0); > + assert(cdb); > + > + task = calloc(1, sizeof(struct scsi_task)); > + if (!task) { > + PERR("Error allocating task: %s", strerror(errno)); > + return NULL; > + } > + > + memcpy(task->cdb, cdb, cdb_len); > + task->cdb_size = cdb_len; > + task->xfer_dir = dir; > + task->expxferlen = xfer_len; > + > + return task; > +} > + > +static int get_cdb_len(uint8_t *cdb) { > + switch(cdb[0] >> 5){ > + case 0: > + return 6; > + case 1: > + case 2: > + return 10; > + case 4: > + return 16; > + case 5: > + return 12; > + } > + PERR("Unable to determine cdb len (0x%02hhX)", cdb[0]>>5); > + return -1; > +} > + > +static int handle_cmd_sync(struct 
iscsi_context *ctx, > + VirtIOSCSICmdReq *req, > + struct iovec *out, unsigned int out_len, > + VirtIOSCSICmdResp *rsp, > + struct iovec *in, unsigned int in_len) { > + struct scsi_task *task; > + uint32_t dir; > + uint32_t len; > + int cdb_len; > + int i; > + > + if (!((!req->lun[1]) && (req->lun[2] == 0x40) && (!req->lun[3]))) { > + // Ignore anything different than target=0, lun=0 > + PDBG("Ignoring unconnected lun (0x%hhX, 0x%hhX)", > + req->lun[1], req->lun[3]); > + rsp->status = SCSI_STATUS_CHECK_CONDITION; > + memset(rsp->sense, 0, sizeof(rsp->sense)); > + rsp->sense_len = 18; > + rsp->sense[0] = 0x70; > + rsp->sense[2] = 0x05; // ILLEGAL_REQUEST > + rsp->sense[7] = 10; > + rsp->sense[12] = 0x24; > + > + return 0; > + } > + > + cdb_len = get_cdb_len(req->cdb); > + if (cdb_len == -1) { > + return -1; > + } > + > + len = 0; > + if (!out_len && !in_len) { > + dir = SCSI_XFER_NONE; > + } else if (out_len) { > + dir = SCSI_XFER_TO_DEV; > + for (i=0; i<out_len; i++) { > + len += out[i].iov_len; > + } > + } else { > + dir = SCSI_XFER_FROM_DEV; > + for (i=0; i<in_len; i++) { > + len += in[i].iov_len; > + } > + } > + > + task = scsi_task_new(cdb_len, req->cdb, dir, len); > + if (!task) { > + PERR("Unable to create iscsi task"); > + return -1; > + } > + > + if (dir == SCSI_XFER_TO_DEV) { > + task->iovector_out.iov = (struct scsi_iovec *)out; > + task->iovector_out.niov = out_len; > + } else if (dir == SCSI_XFER_FROM_DEV) { > + task->iovector_in.iov = (struct scsi_iovec *)in; > + task->iovector_in.niov = in_len; > + } > + > + PDBG("Sending iscsi cmd (cdb_len=%d, dir=%d, task=%p)", > + cdb_len, dir, task); > + if (!iscsi_scsi_command_sync(ctx, 0, task, NULL)) { > + PERR("Error serving SCSI command"); > + free(task); > + return -1; > + } > + > + memset(rsp, 0, sizeof(*rsp)); > + > + rsp->status = task->status; > + rsp->resid = task->residual; > + > + if (task->status == SCSI_STATUS_CHECK_CONDITION) { > + rsp->response = VIRTIO_SCSI_S_FAILURE; > + rsp->sense_len = 
task->datain.size - 2; > + memcpy(rsp->sense, &task->datain.data[2], rsp->sense_len); > + } > + > + free(task); > + > + PDBG("Filled in rsp: status=%hhX, resid=%u, response=%hhX, sense_len=%u", > + rsp->status, rsp->resid, rsp->response, rsp->sense_len); > + > + return 0; > +} > + > +/** libvhost-user callbacks **/ > + > +static void vus_panic_cb(VuDev *vu_dev, const char *buf) { > + vhost_scsi_dev_t *vdev_scsi; > + > + assert(vu_dev); > + > + vdev_scsi = vdev_scsi_find_by_vu(vu_dev); > + > + if (buf) { > + PERR("vu_panic: %s", buf); > + } > + > + if (vdev_scsi) { > + vdev_scsi->sched.quit = 1; > + } > +} > + > +static void vus_add_watch_cb(VuDev *vu_dev, int fd, int vu_evt, vu_watch_cb cb, > + void *pvt) { > + vhost_scsi_dev_t *vdev_scsi; > + int poll_evt = 0; > + > + assert(vu_dev); > + assert(fd >= 0); > + assert(cb); > + > + vdev_scsi = vdev_scsi_find_by_vu(vu_dev); > + if (!vdev_scsi) { > + vus_panic_cb(vu_dev, NULL); > + return; > + } > + > + /* TODO: VU_WATCH_* should match POLL*, check it */ > + if (vu_evt & VU_WATCH_IN) poll_evt |= POLLIN; > + if (vu_evt & VU_WATCH_OUT) poll_evt |= POLLOUT; > + if (vu_evt & VU_WATCH_PRI) poll_evt |= POLLPRI; > + if (vu_evt & VU_WATCH_ERR) poll_evt |= POLLERR; > + if (vu_evt & VU_WATCH_HUP) poll_evt |= POLLHUP; > + > + if (sched_add(&vdev_scsi->sched, fd, poll_evt, cb, NULL, pvt)) { > + vus_panic_cb(vu_dev, NULL); > + } > +} > + > +static void vus_del_watch_cb(VuDev *vu_dev, int fd) { > + vhost_scsi_dev_t *vdev_scsi; > + > + assert(vu_dev); > + assert(fd >= 0); > + > + vdev_scsi = vdev_scsi_find_by_vu(vu_dev); > + if (!vdev_scsi) { > + vus_panic_cb(vu_dev, NULL); > + return; > + } > + > + if (sched_del(&vdev_scsi->sched, fd)) { > + vus_panic_cb(vu_dev, NULL); > + } > +} > + > +static void vus_proc_ctl(VuDev *vu_dev, int idx) { > + /* Control VQ not implemented */ > +} > + > +static void vus_proc_evt(VuDev *vu_dev, int idx) { > + /* Event VQ not implemented */ > +} > + > +static void vus_proc_req(VuDev *vu_dev, int idx) { > +
vhost_scsi_dev_t *vdev_scsi; > + VuVirtq *vq; > + > + assert(vu_dev); > + > + vdev_scsi = vdev_scsi_find_by_vu(vu_dev); > + if (!vdev_scsi) { > + vus_panic_cb(vu_dev, NULL); > + return; > + } > + > + if ((idx < 0) || (idx >= VHOST_MAX_NR_VIRTQUEUE)) { > + PERR("VQ Index out of range: %d", idx); > + vus_panic_cb(vu_dev, NULL); > + return; > + } > + > + vq = vu_get_queue(vu_dev, idx); > + if (!vq) { > + PERR("Error fetching VQ (dev=%p, idx=%d)", vu_dev, idx); > + vus_panic_cb(vu_dev, NULL); > + return; > + } > + > + PDBG("Got kicked on vq[%d]@%p", idx, vq); > + > + while(1) { > + VuVirtqElement *elem; > + VirtIOSCSICmdReq *req; > + VirtIOSCSICmdResp *rsp; > + > + elem = vu_queue_pop(vu_dev, vq, sizeof(VuVirtqElement)); > + if (!elem) { > + PDBG("No more elements pending on vq[%d]@%p", idx, vq); > + break; > + } > + PDBG("Popped elem@%p", elem); > + > + assert(!((elem->out_num > 1) && (elem->in_num > 1))); > + assert((elem->out_num > 0) && (elem->in_num > 0)); > + > + if (elem->out_sg[0].iov_len < sizeof(VirtIOSCSICmdReq)) { > + PERR("Invalid virtio-scsi req header"); > + vus_panic_cb(vu_dev, NULL); > + break; > + } > + req = (VirtIOSCSICmdReq *)elem->out_sg[0].iov_base; > + > + if (elem->in_sg[0].iov_len < sizeof(VirtIOSCSICmdResp)) { > + PERR("Invalid virtio-scsi rsp header"); > + vus_panic_cb(vu_dev, NULL); > + break; > + } > + rsp = (VirtIOSCSICmdResp *)elem->in_sg[0].iov_base; > + > + if (handle_cmd_sync(vdev_scsi->luns[0].iscsi_ctx, > + req, &elem->out_sg[1], elem->out_num-1, > + rsp, &elem->in_sg[1], elem->in_num-1) != 0) { > + vus_panic_cb(vu_dev, NULL); > + break; > + } > + > + vu_queue_push(vu_dev, vq, elem, 0); > + vu_queue_notify(vu_dev, vq); > + > + free(elem); > + } > + > +} > + > +static void vus_queue_set_started(VuDev *vu_dev, int idx, bool started) { > + VuVirtq *vq; > + > + assert(vu_dev); > + > + if ((idx < 0) || (idx >= VHOST_MAX_NR_VIRTQUEUE)) { > + PERR("VQ Index out of range: %d", idx); > + vus_panic_cb(vu_dev, NULL); > + return; > + } > + > + 
vq = vu_get_queue(vu_dev, idx); > + > + switch(idx) { > + case 0: > + vu_set_queue_handler(vu_dev, vq, started?vus_proc_ctl:NULL); > + break; > + case 1: > + vu_set_queue_handler(vu_dev, vq, started?vus_proc_evt:NULL); > + break; > + default: > + vu_set_queue_handler(vu_dev, vq, started?vus_proc_req:NULL); > + } > +} > + > +static const VuDevIface vus_iface = { > + .queue_set_started = vus_queue_set_started, > +}; > + > +static void vus_vhost_cb(VuDev *vu_dev, int vu_evt, void *data) { > + assert(vu_dev); > + > + if (!vu_dispatch(vu_dev)) { > + PERR("Error processing vhost message"); > + vus_panic_cb(vu_dev, NULL); > + } > +} > + > +/** util **/ > + > +static int unix_sock_new(char *unix_fn) { > + int sock; > + struct sockaddr_un un; > + size_t len; > + > + assert(unix_fn); > + > + sock = socket(AF_UNIX, SOCK_STREAM, 0); > + if (sock < 0) { > + perror("socket"); > + return -1; > + } > + > + un.sun_family = AF_UNIX; > + (void)snprintf(un.sun_path, sizeof(un.sun_path), "%s", unix_fn); > + len = sizeof(un.sun_family) + strlen(un.sun_path); > + > + (void)unlink(unix_fn); > + if (bind(sock, (struct sockaddr *)&un, len) < 0) { > + perror("bind"); > + goto fail; > + } > + > + if (listen(sock, 1) < 0) { > + perror("listen"); > + goto fail; > + } > + > + return sock; > + > +fail: > + (void)close(sock); > + > + return -1; > +} > + > +/** vhost-user-scsi **/ > + > +static vhost_scsi_dev_t *vdev_scsi_find_by_vu(VuDev *vu_dev) { > + int i; > + > + assert(vu_dev); > + > + for (i=0; i<VDEV_SCSI_MAX_DEVS; i++) { > + if (&vhost_scsi_devs[i]->vu_dev == vu_dev) { > + return vhost_scsi_devs[i]; > + } > + } > + > + PERR("Unknown VuDev %p", vu_dev); > + return NULL; > +} > + > +static void vdev_scsi_deinit(vhost_scsi_dev_t *vdev_scsi) { > + if (!vdev_scsi) { > + return; > + } > + > + if (vdev_scsi->server_sock >= 0) { > + struct sockaddr_storage ss; > + socklen_t sslen = sizeof(ss); > + > + if (getsockname(vdev_scsi->server_sock, (struct sockaddr *)&ss, > + &sslen) == 0) { > +
struct sockaddr_un *su = (struct sockaddr_un *)&ss; > + (void)unlink(su->sun_path); > + } > + > + (void)close(vdev_scsi->server_sock); > + } > +} > + > +static vhost_scsi_dev_t *vdev_scsi_new(char *unix_fn) { > + vhost_scsi_dev_t *vdev_scsi; > + > + assert(unix_fn); > + > + vdev_scsi = calloc(1, sizeof(vhost_scsi_dev_t)); > + if (!vdev_scsi) { > + perror("calloc"); > + return NULL; > + } > + > + vdev_scsi->server_sock = unix_sock_new(unix_fn); > + if (vdev_scsi->server_sock < 0) { > + free(vdev_scsi); > + return NULL; > + } > + > + vdev_scsi->sched.vu_dev = &vdev_scsi->vu_dev; > + > + return vdev_scsi; > +} > + > +static int vdev_scsi_iscsi_add_lun(vhost_scsi_dev_t *vdev_scsi, > + char *iscsi_uri, uint32_t lun) { > + assert(vdev_scsi); > + assert(iscsi_uri); > + assert(lun < VDEV_SCSI_MAX_LUNS); > + > + if (vdev_scsi->luns[lun].iscsi_ctx) { > + PERR("Lun %u already configured", lun); > + return -1; > + } > + > + if (iscsi_add_lun(&vdev_scsi->luns[lun], iscsi_uri) != 0) { > + return -1; > + } > + > + return 0; > +} > + > +static int vdev_scsi_run(vhost_scsi_dev_t *vdev_scsi) { > + int cli_sock; > + int ret = 0; > + > + assert(vdev_scsi); > + assert(vdev_scsi->server_sock >= 0); > + > + cli_sock = accept(vdev_scsi->server_sock, (void *)0, (void *)0); > + if (cli_sock < 0) { > + perror("accept"); > + return -1; > + } > + > + vu_init(&vdev_scsi->vu_dev, > + cli_sock, > + vus_panic_cb, > + vus_add_watch_cb, > + vus_del_watch_cb, > + &vus_iface); > + > + ret = sched_add(&vdev_scsi->sched, cli_sock, POLLIN, vus_vhost_cb, NULL, NULL); > + if (ret) { > + goto fail; > + } > + > + if (sched_loop(&vdev_scsi->sched) != 0) { > + goto fail; > + } > + > +out: > + vu_deinit(&vdev_scsi->vu_dev); > + > + return ret; > + > +fail: > + ret = -1; > + goto out; > +} > + > +int main(int argc, char **argv) > +{ > + vhost_scsi_dev_t *vdev_scsi = NULL; > + char *unix_fn = NULL; > + char *iscsi_uri = NULL; > + int opt, err = EXIT_SUCCESS; > + > + while ((opt = getopt(argc, argv, "u:i:h")) != -1) {
> + switch (opt) { > + case 'h': > + goto help; > + case 'u': > + unix_fn = strdup(optarg); > + break; > + case 'i': > + iscsi_uri = strdup(optarg); > + break; > + default: > + goto help; > + } > + } > + if (!unix_fn || !iscsi_uri) { > + goto help; > + } > + > + vdev_scsi = vdev_scsi_new(unix_fn); > + if (!vdev_scsi) { > + goto err; > + } > + vhost_scsi_devs[0] = vdev_scsi; > + > + if (vdev_scsi_iscsi_add_lun(vdev_scsi, iscsi_uri, 0) != 0) { > + goto err; > + } > + > + if (vdev_scsi_run(vdev_scsi) != 0) { > + goto err; > + } > + > +out: > + if (vdev_scsi) { > + vdev_scsi_deinit(vdev_scsi); > + free(vdev_scsi); > + } > + if (unix_fn) { > + free(unix_fn); > + } > + if (iscsi_uri) { > + free(iscsi_uri); > + } > + > + return err; > + > +err: > + err = EXIT_FAILURE; > + goto out; > + > +help: > + fprintf(stderr, "Usage: %s [ -u unix_sock_path -i iscsi_uri ] | [ -h ]\n", > + argv[0]); > + fprintf(stderr, " -u path to unix socket\n"); > + fprintf(stderr, " -i iscsi uri for lun 0\n"); > + fprintf(stderr, " -h print help and quit\n"); > + > + goto err; > +} >
* Re: [Qemu-devel] [PATCH 2/2] vus: Introduce a vhost-user-scsi sample application 2016-10-27 12:16 ` Paolo Bonzini @ 2016-10-27 12:48 ` Felipe Franciosi 2016-10-27 12:58 ` Paolo Bonzini 0 siblings, 1 reply; 7+ messages in thread From: Felipe Franciosi @ 2016-10-27 12:48 UTC (permalink / raw) To: Paolo Bonzini Cc: Felipe Franciosi, Stefan Hajnoczi, Marc-Andre Lureau, Michael S. Tsirkin, qemu-devel@nongnu.org Hello, > On 27 Oct 2016, at 13:16, Paolo Bonzini <pbonzini@redhat.com> wrote: > > > > On 26/10/2016 17:26, Felipe Franciosi wrote: >> This commit introduces a vhost-user-scsi backend sample application. It >> must be linked with libiscsi and libvhost-user. >> >> To use it, compile with: >> make tests/vhost-user-scsi >> >> And run as follows: >> tests/vhost-user-scsi -u /tmp/vus.sock -i iscsi://uri_to_target/ >> >> The application is currently limited at one LUN only and it processes >> requests synchronously (therefore only achieving QD1). The purpose of >> the code is to show how a backend can be implemented and to test the >> vhost-user-scsi Qemu implementation. >> >> If a different instance of this vhost-user-scsi application is executed >> at a remote host, a VM can be live migrated to such a host. > > Hi, > > the right directory for this is contrib/. Cool. I was following suit from vhost-user-bridge which lives in tests/ today. To me, it makes more sense for these to be in contrib/. I'll place my sample application there for v2 and perhaps we should move vhost-user-bridge later? > > Is it possible to use GSource and GIOChannel instead for the event loop? > There is some dead code (for example cb2 as far as I can see) and > having the millionth implementation of an event loop distracts from the > meat of the code. :) That's true. I'll have a stab at using glib's event loop. The cb2 was meant to be used for libiscsi's async submission, but I ended up with QD1 for simplicity. You're right, it looks pretty dead at the minute. 
:) Cheers, Felipe > > Thanks, > > Paolo > >> Signed-off-by: Felipe Franciosi <felipe@nutanix.com> >> --- >> tests/Makefile.include | 2 + >> tests/vhost-user-scsi.c | 862 ++++++++++++++++++++++++++++++++++++++++++++++++ >> 2 files changed, 864 insertions(+) >> create mode 100644 tests/vhost-user-scsi.c >> >> diff --git a/tests/Makefile.include b/tests/Makefile.include >> index 7e6fd23..e61fe54 100644 >> --- a/tests/Makefile.include >> +++ b/tests/Makefile.include >> @@ -685,6 +685,8 @@ tests/test-filter-redirector$(EXESUF): tests/test-filter-redirector.o $(qtest-ob >> tests/test-x86-cpuid-compat$(EXESUF): tests/test-x86-cpuid-compat.o $(qtest-obj-y) >> tests/ivshmem-test$(EXESUF): tests/ivshmem-test.o contrib/ivshmem-server/ivshmem-server.o $(libqos-pc-obj-y) >> tests/vhost-user-bridge$(EXESUF): tests/vhost-user-bridge.o contrib/libvhost-user/libvhost-user.o $(test-util-obj-y) >> +tests/vhost-user-scsi.o-cflags := $(LIBISCSI_CFLAGS) >> +tests/vhost-user-scsi$(EXESUF): tests/vhost-user-scsi.o contrib/libvhost-user/libvhost-user.o $(test-util-obj-y) $(test-block-obj-y) >> tests/test-uuid$(EXESUF): tests/test-uuid.o $(test-util-obj-y) >> tests/test-arm-mptimer$(EXESUF): tests/test-arm-mptimer.o >> >> diff --git a/tests/vhost-user-scsi.c b/tests/vhost-user-scsi.c >> new file mode 100644 >> index 0000000..c92b3b2 >> --- /dev/null >> +++ b/tests/vhost-user-scsi.c >> @@ -0,0 +1,862 @@ >> +/* >> + * vhost-user-scsi sample application >> + * >> + * Copyright (c) 2016 Nutanix Inc. All rights reserved. >> + * >> + * Author: >> + * Felipe Franciosi <felipe@nutanix.com> >> + * >> + * This work is licensed under the terms of the GNU GPL, version 2 only. >> + * See the COPYING file in the top-level directory. 
>> + */ >> + >> +#include "qemu/osdep.h" >> +#include "contrib/libvhost-user/libvhost-user.h" >> +#include "hw/virtio/virtio-scsi.h" >> +#include "iscsi/iscsi.h" >> + >> +#include <poll.h> >> + >> +#define VHOST_USER_SCSI_DEBUG 1 >> + >> +/** Log helpers **/ >> + >> +#define PPRE \ >> + struct timespec ts; \ >> + char timebuf[64]; \ >> + struct tm tm; \ >> + (void)clock_gettime(CLOCK_REALTIME, &ts); \ >> + (void)strftime(timebuf, 64, "%Y%m%d %T", gmtime_r(&ts.tv_sec, &tm)) >> + >> +#define PEXT(lvl, msg, ...) do { \ >> + PPRE; \ >> + fprintf(stderr, "%s.%06ld " lvl ": %s:%s():%d: " msg "\n", \ >> + timebuf, ts.tv_nsec/1000, \ >> + __FILE__, __FUNCTION__, __LINE__, ## __VA_ARGS__); \ >> +} while(0) >> + >> +#define PNOR(lvl, msg, ...) do { \ >> + PPRE; \ >> + fprintf(stderr, "%s.%06ld " lvl ": " msg "\n", \ >> + timebuf, ts.tv_nsec/1000, ## __VA_ARGS__); \ >> +} while(0); >> + >> +#ifdef VHOST_USER_SCSI_DEBUG >> +#define PDBG(msg, ...) PEXT("DBG", msg, ## __VA_ARGS__) >> +#define PERR(msg, ...) PEXT("ERR", msg, ## __VA_ARGS__) >> +#define PLOG(msg, ...) PEXT("LOG", msg, ## __VA_ARGS__) >> +#else >> +#define PDBG(msg, ...) { } >> +#define PERR(msg, ...) PNOR("ERR", msg, ## __VA_ARGS__) >> +#define PLOG(msg, ...) 
PNOR("LOG", msg, ## __VA_ARGS__) >> +#endif [...]
&elem->out_sg[1], elem->out_num-1, >> + rsp, &elem->in_sg[1], elem->in_num-1) != 0) { >> + vus_panic_cb(vu_dev, NULL); >> + break; >> + } >> + >> + vu_queue_push(vu_dev, vq, elem, 0); >> + vu_queue_notify(vu_dev, vq); >> + >> + free(elem); >> + } >> + >> +} >> + >> +static void vus_queue_set_started(VuDev *vu_dev, int idx, bool started) { >> + VuVirtq *vq; >> + >> + assert(vu_dev); >> + >> + if ((idx < 0) || (idx >= VHOST_MAX_NR_VIRTQUEUE)) { >> + PERR("VQ Index out of range: %d", idx); >> + vus_panic_cb(vu_dev, NULL); >> + return; >> + } >> + >> + vq = vu_get_queue(vu_dev, idx); >> + >> + switch(idx) { >> + case 0: >> + vu_set_queue_handler(vu_dev, vq, started?vus_proc_ctl:NULL); >> + break; >> + case 1: >> + vu_set_queue_handler(vu_dev, vq, started?vus_proc_evt:NULL); >> + break; >> + default: >> + vu_set_queue_handler(vu_dev, vq, started?vus_proc_req:NULL); >> + } >> +} >> + >> +static const VuDevIface vus_iface = { >> + .queue_set_started = vus_queue_set_started, >> +}; >> + >> +static void vus_vhost_cb(VuDev *vu_dev, int vu_evt, void *data) { >> + assert(vu_dev); >> + >> + if (!vu_dispatch(vu_dev) != 0) { >> + PERR("Error processing vhost message"); >> + vus_panic_cb(vu_dev, NULL); >> + } >> +} >> + >> +/** util **/ >> + >> +static int unix_sock_new(char *unix_fn) { >> + int sock; >> + struct sockaddr_un un; >> + size_t len; >> + >> + assert(unix_fn); >> + >> + sock = socket(AF_UNIX, SOCK_STREAM, 0); >> + if (sock <= 0) { >> + perror("socket"); >> + return -1; >> + } >> + >> + un.sun_family = AF_UNIX; >> + (void)snprintf(un.sun_path, sizeof(un.sun_path), "%s", unix_fn); >> + len = sizeof(un.sun_family) + strlen(un.sun_path); >> + >> + (void)unlink(unix_fn); >> + if (bind(sock, (struct sockaddr *)&un, len) < 0) { >> + perror("bind"); >> + goto fail; >> + } >> + >> + if (listen(sock, 1) < 0) { >> + perror("listen"); >> + goto fail; >> + } >> + >> + return sock; >> + >> +fail: >> + (void)close(sock); >> + >> + return -1; >> +} >> + >> +/** vhost-user-scsi **/ >> 
+ >> +static vhost_scsi_dev_t *vdev_scsi_find_by_vu(VuDev *vu_dev) { >> + int i; >> + >> + assert(vu_dev); >> + >> + for (i=0; i<VDEV_SCSI_MAX_DEVS; i++) { >> + if (&vhost_scsi_devs[i]->vu_dev == vu_dev) { >> + return vhost_scsi_devs[i]; >> + } >> + } >> + >> + PERR("Unknown VuDev %p", vu_dev); >> + return NULL; >> +} >> + >> +static void vdev_scsi_deinit(vhost_scsi_dev_t *vdev_scsi) { >> + if (!vdev_scsi) { >> + return; >> + } >> + >> + if (vdev_scsi->server_sock >= 0) { >> + struct sockaddr_storage ss; >> + socklen_t sslen = sizeof(ss); >> + >> + if (getsockname(vdev_scsi->server_sock, (struct sockaddr *)&ss, >> + &sslen) == 0) { >> + struct sockaddr_un *su = (struct sockaddr_un *)&ss; >> + (void)unlink(su->sun_path); >> + } >> + >> + (void)close(vdev_scsi->server_sock); >> + } >> +} >> + >> +static vhost_scsi_dev_t *vdev_scsi_new(char *unix_fn) { >> + vhost_scsi_dev_t *vdev_scsi; >> + >> + assert(unix_fn); >> + >> + vdev_scsi = calloc(1, sizeof(vhost_scsi_dev_t)); >> + if (!vdev_scsi) { >> + perror("calloc"); >> + return NULL; >> + } >> + >> + vdev_scsi->server_sock = unix_sock_new(unix_fn); >> + if (vdev_scsi->server_sock < 0) { >> + free(vdev_scsi); >> + return NULL; >> + } >> + >> + vdev_scsi->sched.vu_dev = &vdev_scsi->vu_dev; >> + >> + return vdev_scsi; >> +} >> + >> +static int vdev_scsi_iscsi_add_lun(vhost_scsi_dev_t *vdev_scsi, >> + char *iscsi_uri, uint32_t lun) { >> + assert(vdev_scsi); >> + assert(iscsi_uri); >> + assert(lun < VDEV_SCSI_MAX_LUNS); >> + >> + if (vdev_scsi->luns[lun].iscsi_ctx) { >> + PERR("Lun %d already configured", lun); >> + return -1; >> + } >> + >> + if (iscsi_add_lun(&vdev_scsi->luns[lun], iscsi_uri) != 0) { >> + return -1; >> + } >> + >> + return 0; >> +} >> + >> +static int vdev_scsi_run(vhost_scsi_dev_t *vdev_scsi) { >> + int cli_sock; >> + int ret = 0; >> + >> + assert(vdev_scsi); >> + assert(vdev_scsi->server_sock >= 0); >> + >> + cli_sock = accept(vdev_scsi->server_sock, (void *)0, (void *)0); >> + if (cli_sock < 0) { >> + 
perror("accept"); >> + return -1; >> + } >> + >> + vu_init(&vdev_scsi->vu_dev, >> + cli_sock, >> + vus_panic_cb, >> + vus_add_watch_cb, >> + vus_del_watch_cb, >> + &vus_iface); >> + >> + ret = sched_add(&vdev_scsi->sched, cli_sock, POLLIN, vus_vhost_cb, NULL, 0); >> + if (ret) { >> + goto fail; >> + } >> + >> + if (sched_loop(&vdev_scsi->sched) != 0) { >> + goto fail; >> + } >> + >> +out: >> + vu_deinit(&vdev_scsi->vu_dev); >> + >> + return ret; >> + >> +fail: >> + ret = -1; >> + goto out; >> +} >> + >> +int main(int argc, char **argv) >> +{ >> + vhost_scsi_dev_t *vdev_scsi = NULL; >> + char *unix_fn = NULL; >> + char *iscsi_uri = NULL; >> + int opt, err = EXIT_SUCCESS; >> + >> + while ((opt = getopt(argc, argv, "u:i:")) != -1) { >> + switch (opt) { >> + case 'h': >> + goto help; >> + case 'u': >> + unix_fn = strdup(optarg); >> + break; >> + case 'i': >> + iscsi_uri = strdup(optarg); >> + break; >> + default: >> + goto help; >> + } >> + } >> + if (!unix_fn || !iscsi_uri) { >> + goto help; >> + } >> + >> + vdev_scsi = vdev_scsi_new(unix_fn); >> + if (!vdev_scsi) { >> + goto err; >> + } >> + vhost_scsi_devs[0] = vdev_scsi; >> + >> + if (vdev_scsi_iscsi_add_lun(vdev_scsi, iscsi_uri, 0) != 0) { >> + goto err; >> + } >> + >> + if (vdev_scsi_run(vdev_scsi) != 0) { >> + goto err; >> + } >> + >> +out: >> + if (vdev_scsi) { >> + vdev_scsi_deinit(vdev_scsi); >> + free(vdev_scsi); >> + } >> + if (unix_fn) { >> + free(unix_fn); >> + } >> + if (iscsi_uri) { >> + free(iscsi_uri); >> + } >> + >> + return err; >> + >> +err: >> + err = EXIT_FAILURE; >> + goto out; >> + >> +help: >> + fprintf(stderr, "Usage: %s [ -u unix_sock_path -i iscsi_uri ] | [ -h ]\n", >> + argv[0]); >> + fprintf(stderr, " -u path to unix socket\n"); >> + fprintf(stderr, " -i iscsi uri for lun 0\n"); >> + fprintf(stderr, " -h print help and quit\n"); >> + >> + goto err; >> +} >> ^ permalink raw reply [flat|nested] 7+ messages in thread
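[Editor's note: the `get_cdb_len()` helper in the patch above relies on the SCSI convention that the top three bits of the opcode byte select the command "group", and each group implies a fixed CDB size. A standalone sketch of that mapping follows; the function name `cdb_group_len` and the example opcodes are illustrative, not part of the patch.]

```c
#include <stdint.h>

/* Sketch of the CDB-length lookup used by the patch's get_cdb_len():
 * bits 7-5 of the opcode byte are the SCSI command group, and groups
 * 0, 1/2, 4 and 5 carry 6-, 10-, 16- and 12-byte CDBs respectively.
 * Groups 3, 6 and 7 are reserved/vendor-specific, hence -1. */
static int cdb_group_len(uint8_t opcode)
{
    switch (opcode >> 5) {
    case 0:
        return 6;    /* group 0, e.g. INQUIRY (0x12) */
    case 1:
    case 2:
        return 10;   /* groups 1-2, e.g. READ(10) (0x28) */
    case 4:
        return 16;   /* group 4, e.g. READ(16) (0x88) */
    case 5:
        return 12;   /* group 5, e.g. READ(12) (0xa8) */
    }
    return -1;       /* reserved or vendor-specific group */
}
```

For instance, READ(10) has opcode 0x28; its top three bits are 001, so it falls in group 1 and the backend builds a 10-byte CDB for it.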
* Re: [Qemu-devel] [PATCH 2/2] vus: Introduce a vhost-user-scsi sample application
  2016-10-27 12:48     ` Felipe Franciosi
@ 2016-10-27 12:58       ` Paolo Bonzini
  0 siblings, 0 replies; 7+ messages in thread
From: Paolo Bonzini @ 2016-10-27 12:58 UTC (permalink / raw)
  To: Felipe Franciosi
  Cc: Stefan Hajnoczi, Marc-Andre Lureau, Michael S. Tsirkin,
	qemu-devel@nongnu.org, Victor Kaplansky

On 27/10/2016 14:48, Felipe Franciosi wrote:
> Hello,
>
>> On 27 Oct 2016, at 13:16, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>> On 26/10/2016 17:26, Felipe Franciosi wrote:
>>> This commit introduces a vhost-user-scsi backend sample application. It
>>> must be linked with libiscsi and libvhost-user.
>>>
>>> To use it, compile with:
>>>   make tests/vhost-user-scsi
>>>
>>> And run as follows:
>>>   tests/vhost-user-scsi -u /tmp/vus.sock -i iscsi://uri_to_target/
>>>
>>> The application is currently limited to one LUN only and processes
>>> requests synchronously (therefore only achieving QD1). The purpose of
>>> the code is to show how a backend can be implemented and to test the
>>> vhost-user-scsi Qemu implementation.
>>>
>>> If a different instance of this vhost-user-scsi application is executed
>>> at a remote host, a VM can be live migrated to such a host.
>>
>> Hi,
>>
>> the right directory for this is contrib/.
>
> Cool. I was following suit from vhost-user-bridge, which lives in
> tests/ today. To me, it makes more sense for these to be in contrib/.
> I'll place my sample application there for v2, and perhaps we should
> move vhost-user-bridge later?

Yes, that would make sense. Adding Victor in Cc.

Paolo
^ permalink raw reply	[flat|nested] 7+ messages in thread
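[Editor's note: the commit message above only shows how to start the backend; a guest is attached by pointing QEMU at the same unix socket. The sketch below is not from this thread: it assumes the `vhost-user-scsi-pci` device name added by patch 1/2 and vhost-user's usual requirement that guest RAM be a shared, file-backed memory region the backend can map. Paths and sizes are placeholders.]

```shell
# Start the backend first (command taken from the commit message above):
tests/vhost-user-scsi -u /tmp/vus.sock -i iscsi://uri_to_target/ &

# Then launch a guest attached to that socket. Guest memory must be
# shared (share=on) so the vhost-user backend can access the virtqueues.
qemu-system-x86_64 \
    -m 1G \
    -object memory-backend-file,id=mem,size=1G,mem-path=/dev/shm,share=on \
    -numa node,memdev=mem \
    -chardev socket,id=vus0,path=/tmp/vus.sock \
    -device vhost-user-scsi-pci,chardev=vus0
```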
end of thread, other threads:[~2016-10-27 12:58 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --
2016-10-26 15:26 [Qemu-devel] [PATCH 0/2] Introduce vhost-user-scsi and sample application Felipe Franciosi
2016-10-26 15:26 ` [Qemu-devel] [PATCH 1/2] vus: Introduce vhost-user-scsi host device Felipe Franciosi
2016-10-27 12:12   ` Paolo Bonzini
2016-10-26 15:26 ` [Qemu-devel] [PATCH 2/2] vus: Introduce a vhost-user-scsi sample application Felipe Franciosi
2016-10-27 12:16   ` Paolo Bonzini
2016-10-27 12:48     ` Felipe Franciosi
2016-10-27 12:58       ` Paolo Bonzini