* Re: [Qemu-devel] [PATCH v2 0/4] *** Introduce a new vhost-user-blk host device to Qemu ***
2017-08-10 10:12 [Qemu-devel] [PATCH v2 0/4] *** Introduce a new vhost-user-blk host device to Qemu *** Changpeng Liu
@ 2017-08-09 16:58 ` Michael S. Tsirkin
2017-08-10 10:12 ` [Qemu-devel] [PATCH v2 1/4] vhost-user: add new vhost user messages to support virtio config space Changpeng Liu
` (3 subsequent siblings)
4 siblings, 0 replies; 13+ messages in thread
From: Michael S. Tsirkin @ 2017-08-09 16:58 UTC (permalink / raw)
To: Changpeng Liu
Cc: qemu-devel, stefanha, pbonzini, marcandre.lureau, felipe,
james.r.harris
On Thu, Aug 10, 2017 at 06:12:27PM +0800, Changpeng Liu wrote:
> Althrough virtio scsi specification was designed as a replacement for virtio_blk,
> there are still many users using virtio_blk. Qemu 2.9 introduced a new device
> vhost user scsi which can process I/O in user space for virtio_scsi, this commit
> introduces a new vhost user block host device, which can support virtio_blk in
> Guest OS, and I/O processing in another I/O target.
>
> Due to the limitation for virtio_blk specification, virtio_blk device cannot get
> block information such as capacity, block size etc via the specification, several
> new vhost user messages were added to support deliver virtio config space
> information between Qemu and I/O target, VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG
> messages used for get/set config space from/to I/O target, VHOST_USER_SET_CONFIG_FD
> was added for event notifier in case the change of virtio config space. Also, those
> messages can be used for vhost device live migration as well.
As we are busy wrapping up a QEMU release, please remember to repost after the
release.
> Changpeng Liu (4):
> vhost-user: add new vhost user messages to support virtio config space
> vhost-user-blk: introduce a new vhost-user-blk host device
> contrib/libvhost-user: enable virtio config space messages
> contrib/vhost-user-blk: introduce a vhost-user-blk sample application
>
> .gitignore | 1 +
> Makefile | 3 +
> Makefile.objs | 2 +
> configure | 11 +
> contrib/libvhost-user/libvhost-user.c | 51 +++
> contrib/libvhost-user/libvhost-user.h | 14 +
> contrib/vhost-user-blk/Makefile.objs | 1 +
> contrib/vhost-user-blk/vhost-user-blk.c | 735 ++++++++++++++++++++++++++++++++
> docs/interop/vhost-user.txt | 31 ++
> hw/block/Makefile.objs | 3 +
> hw/block/vhost-user-blk.c | 360 ++++++++++++++++
> hw/virtio/vhost-user.c | 86 ++++
> hw/virtio/vhost.c | 63 +++
> hw/virtio/virtio-pci.c | 55 +++
> hw/virtio/virtio-pci.h | 18 +
> include/hw/virtio/vhost-backend.h | 8 +
> include/hw/virtio/vhost-user-blk.h | 40 ++
> include/hw/virtio/vhost.h | 16 +
> 18 files changed, 1498 insertions(+)
> create mode 100644 contrib/vhost-user-blk/Makefile.objs
> create mode 100644 contrib/vhost-user-blk/vhost-user-blk.c
> create mode 100644 hw/block/vhost-user-blk.c
> create mode 100644 include/hw/virtio/vhost-user-blk.h
>
> --
> 1.9.3
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Qemu-devel] [PATCH v2 1/4] vhost-user: add new vhost user messages to support virtio config space
2017-08-10 10:12 [Qemu-devel] [PATCH v2 0/4] *** Introduce a new vhost-user-blk host device to Qemu *** Changpeng Liu
2017-08-09 16:58 ` Michael S. Tsirkin
@ 2017-08-10 10:12 ` Changpeng Liu
2017-08-09 15:10 ` Marc-André Lureau
2017-08-10 10:12 ` [Qemu-devel] [PATCH v2 2/4] vhost-user-blk: introduce a new vhost-user-blk host device Changpeng Liu
` (2 subsequent siblings)
4 siblings, 1 reply; 13+ messages in thread
From: Changpeng Liu @ 2017-08-10 10:12 UTC (permalink / raw)
To: changpeng.liu, qemu-devel
Cc: stefanha, pbonzini, mst, marcandre.lureau, felipe, james.r.harris
Add VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG messages which can be
used for live migration for vhost user devices, also vhost user devices
can benifit from the messages to get/set virtio config space from/to the
I/O target besides Qemu. For the purpose to support virtio config space
change, VHOST_USER_SET_CONFIG_FD message is added as the event notifier
in case virtio config space change.
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
---
docs/interop/vhost-user.txt | 31 ++++++++++++++
hw/virtio/vhost-user.c | 86 +++++++++++++++++++++++++++++++++++++++
hw/virtio/vhost.c | 63 ++++++++++++++++++++++++++++
include/hw/virtio/vhost-backend.h | 8 ++++
include/hw/virtio/vhost.h | 16 ++++++++
5 files changed, 204 insertions(+)
diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
index 954771d..19dfc61 100644
--- a/docs/interop/vhost-user.txt
+++ b/docs/interop/vhost-user.txt
@@ -116,6 +116,10 @@ Depending on the request type, payload can be:
- 3: IOTLB invalidate
- 4: IOTLB access fail
+ * Virtio device config space
+
+ 256 Bytes static virito config space
+
In QEMU the vhost-user message is implemented with the following struct:
typedef struct VhostUserMsg {
@@ -129,6 +133,7 @@ typedef struct VhostUserMsg {
VhostUserMemory memory;
VhostUserLog log;
struct vhost_iotlb_msg iotlb;
+ uint8_t config[256];
};
} QEMU_PACKED VhostUserMsg;
@@ -596,6 +601,32 @@ Master message types
and expect this message once (per VQ) during device configuration
(ie. before the master starts the VQ).
+ * VHOST_USER_GET_CONFIG
+ Id: 24
+ Equivalent ioctl: N/A
+ Master payload: virtio device config space
+
+ Submitted by the vhost-user master to fetch the contents of the virtio
+ config space. The vhost-user master may cache the contents to avoid
+ repeated VHOST_USER_GET_CONFIG calls.
+
+* VHOST_USER_SET_CONFIG
+ Id: 25
+ Equivalent ioctl: N/A
+ Master payload: virtio device config space
+
+ Submitted by the vhost-user master when the guest writes to virtio
+ config space and also after live migration on the destination host.
+
+* VHOST_USER_SET_CONFIG_FD
+ Id: 26
+ Equivalent ioctl: N/A
+ Master payload: N/A
+
+ Sets the notifier file descriptor, which is passed as ancillary data.
+ Vhost-user master uses the file descriptor as event callback when the
+ virtio config space changed.
+
Slave message types
-------------------
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 093675e..4b402c5 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -26,6 +26,11 @@
#define VHOST_MEMORY_MAX_NREGIONS 8
#define VHOST_USER_F_PROTOCOL_FEATURES 30
+/*
+ * Maximum size of virtio device config space
+ */
+#define VHOST_USER_MAX_CONFIG_SIZE 256
+
enum VhostUserProtocolFeature {
VHOST_USER_PROTOCOL_F_MQ = 0,
VHOST_USER_PROTOCOL_F_LOG_SHMFD = 1,
@@ -65,6 +70,9 @@ typedef enum VhostUserRequest {
VHOST_USER_SET_SLAVE_REQ_FD = 21,
VHOST_USER_IOTLB_MSG = 22,
VHOST_USER_SET_VRING_ENDIAN = 23,
+ VHOST_USER_GET_CONFIG = 24,
+ VHOST_USER_SET_CONFIG = 25,
+ VHOST_USER_SET_CONFIG_FD = 26,
VHOST_USER_MAX
} VhostUserRequest;
@@ -109,6 +117,7 @@ typedef struct VhostUserMsg {
VhostUserMemory memory;
VhostUserLog log;
struct vhost_iotlb_msg iotlb;
+ uint8_t config[VHOST_USER_MAX_CONFIG_SIZE];
} payload;
} QEMU_PACKED VhostUserMsg;
@@ -922,6 +931,80 @@ static void vhost_user_set_iotlb_callback(struct vhost_dev *dev, int enabled)
/* No-op as the receive channel is not dedicated to IOTLB messages. */
}
+static int vhost_user_get_config(struct vhost_dev *dev, uint8_t *config,
+ size_t config_len)
+{
+ VhostUserMsg msg = {
+ .request = VHOST_USER_GET_CONFIG,
+ .flags = VHOST_USER_VERSION,
+ .size = config_len,
+ };
+
+ if (config_len == 0 || config_len > VHOST_USER_PAYLOAD_SIZE) {
+ error_report("bad config length");
+ return -1;
+ }
+
+ if (vhost_user_write(dev, &msg, NULL, 0) < 0) {
+ return -1;
+ }
+
+ if (vhost_user_read(dev, &msg) < 0) {
+ return -1;
+ }
+
+ if (msg.request != VHOST_USER_GET_CONFIG) {
+ error_report("Received unexpected msg type. Expected %d received %d",
+ VHOST_USER_GET_CONFIG, msg.request);
+ return -1;
+ }
+
+ if (msg.size != config_len) {
+ error_report("Received bad msg size.");
+ return -1;
+ }
+
+ memcpy(config, &msg.payload.config, config_len);
+
+ return 0;
+}
+
+static int vhost_user_set_config(struct vhost_dev *dev, const uint8_t *config,
+ size_t config_len)
+{
+ VhostUserMsg msg = {
+ .request = VHOST_USER_SET_CONFIG,
+ .flags = VHOST_USER_VERSION,
+ .size = config_len,
+ };
+
+ if (config_len == 0 || config_len > VHOST_USER_PAYLOAD_SIZE) {
+ error_report("bad config length");
+ return -1;
+ }
+
+ memcpy(&msg.payload.config, config, config_len);
+ if (vhost_user_write(dev, &msg, NULL, 0) < 0) {
+ return -1;
+ }
+
+ return 0;
+}
+
+static int vhost_user_set_config_fd(struct vhost_dev *dev, int fd)
+{
+ VhostUserMsg msg = {
+ .request = VHOST_USER_SET_CONFIG_FD,
+ .flags = VHOST_USER_VERSION,
+ };
+
+ if (vhost_user_write(dev, &msg, &fd, 1) < 0) {
+ return -1;
+ }
+
+ return 0;
+}
+
const VhostOps user_ops = {
.backend_type = VHOST_BACKEND_TYPE_USER,
.vhost_backend_init = vhost_user_init,
@@ -948,4 +1031,7 @@ const VhostOps user_ops = {
.vhost_net_set_mtu = vhost_user_net_set_mtu,
.vhost_set_iotlb_callback = vhost_user_set_iotlb_callback,
.vhost_send_device_iotlb_msg = vhost_user_send_device_iotlb_msg,
+ .vhost_get_config = vhost_user_get_config,
+ .vhost_set_config = vhost_user_set_config,
+ .vhost_set_config_fd = vhost_user_set_config_fd,
};
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 6eddb09..0d39a55 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1353,6 +1353,9 @@ void vhost_dev_cleanup(struct vhost_dev *hdev)
for (i = 0; i < hdev->nvqs; ++i) {
vhost_virtqueue_cleanup(hdev->vqs + i);
}
+ if (hdev->config_ops) {
+ event_notifier_cleanup(&hdev->config_notifier);
+ }
if (hdev->mem) {
/* those are only safe after successful init */
memory_listener_unregister(&hdev->memory_listener);
@@ -1496,6 +1499,66 @@ void vhost_ack_features(struct vhost_dev *hdev, const int *feature_bits,
}
}
+int vhost_dev_get_config(struct vhost_dev *hdev, uint8_t *config,
+ size_t config_len)
+{
+ assert(hdev->vhost_ops);
+
+ if (hdev->vhost_ops->vhost_get_config) {
+ return hdev->vhost_ops->vhost_get_config(hdev, config, config_len);
+ }
+
+ return 0;
+}
+
+int vhost_dev_set_config(struct vhost_dev *hdev, const uint8_t *config,
+ size_t config_len)
+{
+ assert(hdev->vhost_ops);
+
+ if (hdev->vhost_ops->vhost_set_config) {
+ return hdev->vhost_ops->vhost_set_config(hdev, config, config_len);
+ }
+
+ return 0;
+}
+
+static void vhost_dev_config_notifier_read(EventNotifier *n)
+{
+ struct vhost_dev *hdev = container_of(n, struct vhost_dev,
+ config_notifier);
+
+ if (event_notifier_test_and_clear(n)) {
+ if (hdev->config_ops) {
+ hdev->config_ops->vhost_dev_config_notifier(hdev);
+ }
+ }
+}
+
+int vhost_dev_set_config_notifier(struct vhost_dev *hdev,
+ const VhostDevConfigOps *ops)
+{
+ int r, fd;
+
+ assert(hdev->vhost_ops);
+
+ r = event_notifier_init(&hdev->config_notifier, 0);
+ if (r < 0) {
+ return r;
+ }
+
+ hdev->config_ops = ops;
+ event_notifier_set_handler(&hdev->config_notifier,
+ vhost_dev_config_notifier_read);
+
+ if (hdev->vhost_ops->vhost_set_config_fd) {
+ fd = event_notifier_get_fd(&hdev->config_notifier);
+ return hdev->vhost_ops->vhost_set_config_fd(hdev, fd);
+ }
+
+ return 0;
+}
+
/* Host notifiers must be enabled at this point. */
int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
{
diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
index a7a5f22..df6769e 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -84,6 +84,11 @@ typedef void (*vhost_set_iotlb_callback_op)(struct vhost_dev *dev,
int enabled);
typedef int (*vhost_send_device_iotlb_msg_op)(struct vhost_dev *dev,
struct vhost_iotlb_msg *imsg);
+typedef int (*vhost_set_config_op)(struct vhost_dev *dev, const uint8_t *config,
+ size_t config_len);
+typedef int (*vhost_get_config_op)(struct vhost_dev *dev, uint8_t *config,
+ size_t config_len);
+typedef int (*vhost_set_config_fd_op)(struct vhost_dev *dev, int fd);
typedef struct VhostOps {
VhostBackendType backend_type;
@@ -118,6 +123,9 @@ typedef struct VhostOps {
vhost_vsock_set_running_op vhost_vsock_set_running;
vhost_set_iotlb_callback_op vhost_set_iotlb_callback;
vhost_send_device_iotlb_msg_op vhost_send_device_iotlb_msg;
+ vhost_get_config_op vhost_get_config;
+ vhost_set_config_op vhost_set_config;
+ vhost_set_config_fd_op vhost_set_config_fd;
} VhostOps;
extern const VhostOps user_ops;
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 467dc77..ff172f2 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -46,6 +46,12 @@ struct vhost_iommu {
QLIST_ENTRY(vhost_iommu) iommu_next;
};
+typedef struct VhostDevConfigOps {
+ /* Vhost device config space changed callback
+ */
+ void (*vhost_dev_config_notifier)(struct vhost_dev *dev);
+} VhostDevConfigOps;
+
struct vhost_memory;
struct vhost_dev {
VirtIODevice *vdev;
@@ -76,6 +82,8 @@ struct vhost_dev {
QLIST_ENTRY(vhost_dev) entry;
QLIST_HEAD(, vhost_iommu) iommu_list;
IOMMUNotifier n;
+ EventNotifier config_notifier;
+ const VhostDevConfigOps *config_ops;
};
int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
@@ -106,4 +114,12 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
struct vhost_vring_file *file);
int vhost_device_iotlb_miss(struct vhost_dev *dev, uint64_t iova, int write);
+int vhost_dev_get_config(struct vhost_dev *dev, uint8_t *config,
+ size_t config_len);
+int vhost_dev_set_config(struct vhost_dev *dev, const uint8_t *config,
+ size_t config_len);
+/* notifier callback in case vhost device config space changed
+ */
+int vhost_dev_set_config_notifier(struct vhost_dev *dev,
+ const VhostDevConfigOps *ops);
#endif
--
1.9.3
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH v2 1/4] vhost-user: add new vhost user messages to support virtio config space
2017-08-10 10:12 ` [Qemu-devel] [PATCH v2 1/4] vhost-user: add new vhost user messages to support virtio config space Changpeng Liu
@ 2017-08-09 15:10 ` Marc-André Lureau
0 siblings, 0 replies; 13+ messages in thread
From: Marc-André Lureau @ 2017-08-09 15:10 UTC (permalink / raw)
To: Changpeng Liu; +Cc: qemu-devel, stefanha, pbonzini, mst, felipe, james r harris
Hi
----- Original Message -----
> Add VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG messages which can be
> used for live migration for vhost user devices, also vhost user devices
> can benifit from the messages to get/set virtio config space from/to the
benefit
> I/O target besides Qemu. For the purpose to support virtio config space
> change, VHOST_USER_SET_CONFIG_FD message is added as the event notifier
> in case virtio config space change.
>
> Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
> ---
> docs/interop/vhost-user.txt | 31 ++++++++++++++
> hw/virtio/vhost-user.c | 86
> +++++++++++++++++++++++++++++++++++++++
> hw/virtio/vhost.c | 63 ++++++++++++++++++++++++++++
> include/hw/virtio/vhost-backend.h | 8 ++++
> include/hw/virtio/vhost.h | 16 ++++++++
> 5 files changed, 204 insertions(+)
>
> diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
> index 954771d..19dfc61 100644
> --- a/docs/interop/vhost-user.txt
> +++ b/docs/interop/vhost-user.txt
> @@ -116,6 +116,10 @@ Depending on the request type, payload can be:
> - 3: IOTLB invalidate
> - 4: IOTLB access fail
>
> + * Virtio device config space
> +
> + 256 Bytes static virito config space
> +
trailing spaces
> In QEMU the vhost-user message is implemented with the following struct:
>
> typedef struct VhostUserMsg {
> @@ -129,6 +133,7 @@ typedef struct VhostUserMsg {
> VhostUserMemory memory;
> VhostUserLog log;
> struct vhost_iotlb_msg iotlb;
> + uint8_t config[256];
> };
> } QEMU_PACKED VhostUserMsg;
>
> @@ -596,6 +601,32 @@ Master message types
> and expect this message once (per VQ) during device configuration
> (ie. before the master starts the VQ).
>
> + * VHOST_USER_GET_CONFIG
> + Id: 24
> + Equivalent ioctl: N/A
> + Master payload: virtio device config space
> +
> + Submitted by the vhost-user master to fetch the contents of the virtio
> + config space. The vhost-user master may cache the contents to avoid
> + repeated VHOST_USER_GET_CONFIG calls.
> +
> +* VHOST_USER_SET_CONFIG
> + Id: 25
> + Equivalent ioctl: N/A
> + Master payload: virtio device config space
> +
> + Submitted by the vhost-user master when the guest writes to virtio
> + config space and also after live migration on the destination host.
> +
> +* VHOST_USER_SET_CONFIG_FD
> + Id: 26
> + Equivalent ioctl: N/A
> + Master payload: N/A
> +
> + Sets the notifier file descriptor, which is passed as ancillary data.
> + Vhost-user master uses the file descriptor as event callback when the
> + virtio config space changed.
So this is a fd for the slave to notify of config change? Shouldn't we use the "slave communication" instead?
> +
> Slave message types
> -------------------
>
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 093675e..4b402c5 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -26,6 +26,11 @@
> #define VHOST_MEMORY_MAX_NREGIONS 8
> #define VHOST_USER_F_PROTOCOL_FEATURES 30
>
> +/*
> + * Maximum size of virtio device config space
> + */
> +#define VHOST_USER_MAX_CONFIG_SIZE 256
> +
> enum VhostUserProtocolFeature {
> VHOST_USER_PROTOCOL_F_MQ = 0,
> VHOST_USER_PROTOCOL_F_LOG_SHMFD = 1,
> @@ -65,6 +70,9 @@ typedef enum VhostUserRequest {
> VHOST_USER_SET_SLAVE_REQ_FD = 21,
> VHOST_USER_IOTLB_MSG = 22,
> VHOST_USER_SET_VRING_ENDIAN = 23,
> + VHOST_USER_GET_CONFIG = 24,
> + VHOST_USER_SET_CONFIG = 25,
> + VHOST_USER_SET_CONFIG_FD = 26,
> VHOST_USER_MAX
> } VhostUserRequest;
>
> @@ -109,6 +117,7 @@ typedef struct VhostUserMsg {
> VhostUserMemory memory;
> VhostUserLog log;
> struct vhost_iotlb_msg iotlb;
> + uint8_t config[VHOST_USER_MAX_CONFIG_SIZE];
> } payload;
> } QEMU_PACKED VhostUserMsg;
>
> @@ -922,6 +931,80 @@ static void vhost_user_set_iotlb_callback(struct
> vhost_dev *dev, int enabled)
> /* No-op as the receive channel is not dedicated to IOTLB messages. */
> }
>
> +static int vhost_user_get_config(struct vhost_dev *dev, uint8_t *config,
> + size_t config_len)
> +{
> + VhostUserMsg msg = {
> + .request = VHOST_USER_GET_CONFIG,
> + .flags = VHOST_USER_VERSION,
> + .size = config_len,
> + };
> +
> + if (config_len == 0 || config_len > VHOST_USER_PAYLOAD_SIZE) {
> + error_report("bad config length");
> + return -1;
> + }
> +
> + if (vhost_user_write(dev, &msg, NULL, 0) < 0) {
> + return -1;
> + }
> +
> + if (vhost_user_read(dev, &msg) < 0) {
> + return -1;
> + }
> +
> + if (msg.request != VHOST_USER_GET_CONFIG) {
> + error_report("Received unexpected msg type. Expected %d received
> %d",
> + VHOST_USER_GET_CONFIG, msg.request);
> + return -1;
> + }
> +
> + if (msg.size != config_len) {
> + error_report("Received bad msg size.");
> + return -1;
> + }
> +
> + memcpy(config, &msg.payload.config, config_len);
> +
> + return 0;
> +}
> +
> +static int vhost_user_set_config(struct vhost_dev *dev, const uint8_t
> *config,
> + size_t config_len)
> +{
> + VhostUserMsg msg = {
> + .request = VHOST_USER_SET_CONFIG,
> + .flags = VHOST_USER_VERSION,
> + .size = config_len,
> + };
> +
> + if (config_len == 0 || config_len > VHOST_USER_PAYLOAD_SIZE) {
> + error_report("bad config length");
> + return -1;
> + }
> +
> + memcpy(&msg.payload.config, config, config_len);
> + if (vhost_user_write(dev, &msg, NULL, 0) < 0) {
> + return -1;
> + }
> +
> + return 0;
> +}
> +
> +static int vhost_user_set_config_fd(struct vhost_dev *dev, int fd)
> +{
> + VhostUserMsg msg = {
> + .request = VHOST_USER_SET_CONFIG_FD,
> + .flags = VHOST_USER_VERSION,
> + };
> +
> + if (vhost_user_write(dev, &msg, &fd, 1) < 0) {
> + return -1;
> + }
> +
> + return 0;
> +}
> +
> const VhostOps user_ops = {
> .backend_type = VHOST_BACKEND_TYPE_USER,
> .vhost_backend_init = vhost_user_init,
> @@ -948,4 +1031,7 @@ const VhostOps user_ops = {
> .vhost_net_set_mtu = vhost_user_net_set_mtu,
> .vhost_set_iotlb_callback = vhost_user_set_iotlb_callback,
> .vhost_send_device_iotlb_msg = vhost_user_send_device_iotlb_msg,
> + .vhost_get_config = vhost_user_get_config,
> + .vhost_set_config = vhost_user_set_config,
> + .vhost_set_config_fd = vhost_user_set_config_fd,
> };
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 6eddb09..0d39a55 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -1353,6 +1353,9 @@ void vhost_dev_cleanup(struct vhost_dev *hdev)
> for (i = 0; i < hdev->nvqs; ++i) {
> vhost_virtqueue_cleanup(hdev->vqs + i);
> }
> + if (hdev->config_ops) {
> + event_notifier_cleanup(&hdev->config_notifier);
> + }
> if (hdev->mem) {
> /* those are only safe after successful init */
> memory_listener_unregister(&hdev->memory_listener);
> @@ -1496,6 +1499,66 @@ void vhost_ack_features(struct vhost_dev *hdev, const
> int *feature_bits,
> }
> }
>
> +int vhost_dev_get_config(struct vhost_dev *hdev, uint8_t *config,
> + size_t config_len)
> +{
> + assert(hdev->vhost_ops);
> +
> + if (hdev->vhost_ops->vhost_get_config) {
> + return hdev->vhost_ops->vhost_get_config(hdev, config, config_len);
> + }
> +
> + return 0;
> +}
> +
> +int vhost_dev_set_config(struct vhost_dev *hdev, const uint8_t *config,
> + size_t config_len)
> +{
> + assert(hdev->vhost_ops);
> +
> + if (hdev->vhost_ops->vhost_set_config) {
> + return hdev->vhost_ops->vhost_set_config(hdev, config, config_len);
> + }
> +
> + return 0;
> +}
> +
> +static void vhost_dev_config_notifier_read(EventNotifier *n)
> +{
> + struct vhost_dev *hdev = container_of(n, struct vhost_dev,
> + config_notifier);
> +
> + if (event_notifier_test_and_clear(n)) {
> + if (hdev->config_ops) {
> + hdev->config_ops->vhost_dev_config_notifier(hdev);
> + }
> + }
> +}
> +
> +int vhost_dev_set_config_notifier(struct vhost_dev *hdev,
> + const VhostDevConfigOps *ops)
> +{
> + int r, fd;
> +
> + assert(hdev->vhost_ops);
> +
> + r = event_notifier_init(&hdev->config_notifier, 0);
> + if (r < 0) {
> + return r;
> + }
> +
> + hdev->config_ops = ops;
> + event_notifier_set_handler(&hdev->config_notifier,
> + vhost_dev_config_notifier_read);
> +
> + if (hdev->vhost_ops->vhost_set_config_fd) {
> + fd = event_notifier_get_fd(&hdev->config_notifier);
> + return hdev->vhost_ops->vhost_set_config_fd(hdev, fd);
> + }
> +
> + return 0;
> +}
> +
> /* Host notifiers must be enabled at this point. */
> int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
> {
> diff --git a/include/hw/virtio/vhost-backend.h
> b/include/hw/virtio/vhost-backend.h
> index a7a5f22..df6769e 100644
> --- a/include/hw/virtio/vhost-backend.h
> +++ b/include/hw/virtio/vhost-backend.h
> @@ -84,6 +84,11 @@ typedef void (*vhost_set_iotlb_callback_op)(struct
> vhost_dev *dev,
> int enabled);
> typedef int (*vhost_send_device_iotlb_msg_op)(struct vhost_dev *dev,
> struct vhost_iotlb_msg *imsg);
> +typedef int (*vhost_set_config_op)(struct vhost_dev *dev, const uint8_t
> *config,
> + size_t config_len);
> +typedef int (*vhost_get_config_op)(struct vhost_dev *dev, uint8_t *config,
> + size_t config_len);
> +typedef int (*vhost_set_config_fd_op)(struct vhost_dev *dev, int fd);
>
> typedef struct VhostOps {
> VhostBackendType backend_type;
> @@ -118,6 +123,9 @@ typedef struct VhostOps {
> vhost_vsock_set_running_op vhost_vsock_set_running;
> vhost_set_iotlb_callback_op vhost_set_iotlb_callback;
> vhost_send_device_iotlb_msg_op vhost_send_device_iotlb_msg;
> + vhost_get_config_op vhost_get_config;
> + vhost_set_config_op vhost_set_config;
> + vhost_set_config_fd_op vhost_set_config_fd;
> } VhostOps;
>
> extern const VhostOps user_ops;
> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> index 467dc77..ff172f2 100644
> --- a/include/hw/virtio/vhost.h
> +++ b/include/hw/virtio/vhost.h
> @@ -46,6 +46,12 @@ struct vhost_iommu {
> QLIST_ENTRY(vhost_iommu) iommu_next;
> };
>
> +typedef struct VhostDevConfigOps {
> + /* Vhost device config space changed callback
> + */
> + void (*vhost_dev_config_notifier)(struct vhost_dev *dev);
> +} VhostDevConfigOps;
> +
> struct vhost_memory;
> struct vhost_dev {
> VirtIODevice *vdev;
> @@ -76,6 +82,8 @@ struct vhost_dev {
> QLIST_ENTRY(vhost_dev) entry;
> QLIST_HEAD(, vhost_iommu) iommu_list;
> IOMMUNotifier n;
> + EventNotifier config_notifier;
> + const VhostDevConfigOps *config_ops;
> };
>
> int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
> @@ -106,4 +114,12 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
> struct vhost_vring_file *file);
>
> int vhost_device_iotlb_miss(struct vhost_dev *dev, uint64_t iova, int
> write);
> +int vhost_dev_get_config(struct vhost_dev *dev, uint8_t *config,
> + size_t config_len);
> +int vhost_dev_set_config(struct vhost_dev *dev, const uint8_t *config,
> + size_t config_len);
> +/* notifier callback in case vhost device config space changed
> + */
> +int vhost_dev_set_config_notifier(struct vhost_dev *dev,
> + const VhostDevConfigOps *ops);
> #endif
> --
> 1.9.3
Looks good otherwise
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Qemu-devel] [PATCH v2 2/4] vhost-user-blk: introduce a new vhost-user-blk host device
2017-08-10 10:12 [Qemu-devel] [PATCH v2 0/4] *** Introduce a new vhost-user-blk host device to Qemu *** Changpeng Liu
2017-08-09 16:58 ` Michael S. Tsirkin
2017-08-10 10:12 ` [Qemu-devel] [PATCH v2 1/4] vhost-user: add new vhost user messages to support virtio config space Changpeng Liu
@ 2017-08-10 10:12 ` Changpeng Liu
2017-08-09 15:39 ` Marc-André Lureau
2017-08-09 17:10 ` Michael S. Tsirkin
2017-08-10 10:12 ` [Qemu-devel] [PATCH v2 3/4] contrib/libvhost-user: enable virtio config space messages Changpeng Liu
2017-08-10 10:12 ` [Qemu-devel] [PATCH v2 4/4] contrib/vhost-user-blk: introduce a vhost-user-blk sample application Changpeng Liu
4 siblings, 2 replies; 13+ messages in thread
From: Changpeng Liu @ 2017-08-10 10:12 UTC (permalink / raw)
To: changpeng.liu, qemu-devel
Cc: stefanha, pbonzini, mst, marcandre.lureau, felipe, james.r.harris
This commit introduces a new vhost-user device for block, it uses a
chardev to connect with the backend, same with Qemu virito-blk device,
Guest OS still uses the virtio-blk frontend driver.
To use it, start Qemu with command line like this:
qemu-system-x86_64 \
-chardev socket,id=char0,path=/path/vhost.socket \
-device vhost-user-blk-pci,chardev=char0,num_queues=...
Different with exist Qemu virtio-blk host device, it makes more easy
for users to implement their own I/O processing logic, such as all
user space I/O stack against hardware block device. It uses the new
vhost messages(VHOST_USER_GET_CONFIG) to get block virtio config
information from backend process.
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
---
configure | 11 ++
hw/block/Makefile.objs | 3 +
hw/block/vhost-user-blk.c | 360 +++++++++++++++++++++++++++++++++++++
hw/virtio/virtio-pci.c | 55 ++++++
hw/virtio/virtio-pci.h | 18 ++
include/hw/virtio/vhost-user-blk.h | 40 +++++
6 files changed, 487 insertions(+)
create mode 100644 hw/block/vhost-user-blk.c
create mode 100644 include/hw/virtio/vhost-user-blk.h
diff --git a/configure b/configure
index dd73cce..1452c66 100755
--- a/configure
+++ b/configure
@@ -305,6 +305,7 @@ tcg="yes"
vhost_net="no"
vhost_scsi="no"
+vhost_user_blk="no"
vhost_vsock="no"
vhost_user=""
kvm="no"
@@ -779,6 +780,7 @@ Linux)
kvm="yes"
vhost_net="yes"
vhost_scsi="yes"
+ vhost_user_blk="yes"
vhost_vsock="yes"
QEMU_INCLUDES="-I\$(SRC_PATH)/linux-headers -I$(pwd)/linux-headers $QEMU_INCLUDES"
supported_os="yes"
@@ -1136,6 +1138,10 @@ for opt do
;;
--enable-vhost-scsi) vhost_scsi="yes"
;;
+ --disable-vhost-user-blk) vhost_user_blk="no"
+ ;;
+ --enable-vhost-user-blk) vhost_user_blk="yes"
+ ;;
--disable-vhost-vsock) vhost_vsock="no"
;;
--enable-vhost-vsock) vhost_vsock="yes"
@@ -1506,6 +1512,7 @@ disabled with --disable-FEATURE, default is enabled if available:
cap-ng libcap-ng support
attr attr and xattr support
vhost-net vhost-net acceleration support
+ vhost-user-blk VM virtio-blk acceleration in user space
spice spice
rbd rados block device (rbd)
libiscsi iscsi support
@@ -5365,6 +5372,7 @@ echo "posix_madvise $posix_madvise"
echo "libcap-ng support $cap_ng"
echo "vhost-net support $vhost_net"
echo "vhost-scsi support $vhost_scsi"
+echo "vhost-user-blk support $vhost_user_blk"
echo "vhost-vsock support $vhost_vsock"
echo "vhost-user support $vhost_user"
echo "Trace backends $trace_backends"
@@ -5776,6 +5784,9 @@ fi
if test "$vhost_scsi" = "yes" ; then
echo "CONFIG_VHOST_SCSI=y" >> $config_host_mak
fi
+if test "$vhost_user_blk" = "yes" ; then
+ echo "CONFIG_VHOST_USER_BLK=y" >> $config_host_mak
+fi
if test "$vhost_net" = "yes" -a "$vhost_user" = "yes"; then
echo "CONFIG_VHOST_NET_USED=y" >> $config_host_mak
fi
diff --git a/hw/block/Makefile.objs b/hw/block/Makefile.objs
index e0ed980..4c19a58 100644
--- a/hw/block/Makefile.objs
+++ b/hw/block/Makefile.objs
@@ -13,3 +13,6 @@ obj-$(CONFIG_SH4) += tc58128.o
obj-$(CONFIG_VIRTIO) += virtio-blk.o
obj-$(CONFIG_VIRTIO) += dataplane/
+ifeq ($(CONFIG_VIRTIO),y)
+obj-$(CONFIG_VHOST_USER_BLK) += vhost-user-blk.o
+endif
diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
new file mode 100644
index 0000000..8aa9fa9
--- /dev/null
+++ b/hw/block/vhost-user-blk.c
@@ -0,0 +1,360 @@
+/*
+ * vhost-user-blk host device
+ *
+ * Copyright IBM, Corp. 2011
+ * Copyright(C) 2017 Intel Corporation.
+ *
+ * Authors:
+ * Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
+ * Changpeng Liu <changpeng.liu@intel.com>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "qemu/typedefs.h"
+#include "qemu/cutils.h"
+#include "qom/object.h"
+#include "hw/qdev-core.h"
+#include "hw/virtio/vhost.h"
+#include "hw/virtio/vhost-user-blk.h"
+#include "hw/virtio/virtio.h"
+#include "hw/virtio/virtio-bus.h"
+#include "hw/virtio/virtio-access.h"
+
+static const int user_feature_bits[] = {
+ VIRTIO_BLK_F_SIZE_MAX,
+ VIRTIO_BLK_F_SEG_MAX,
+ VIRTIO_BLK_F_GEOMETRY,
+ VIRTIO_BLK_F_BLK_SIZE,
+ VIRTIO_BLK_F_TOPOLOGY,
+ VIRTIO_BLK_F_SCSI,
+ VIRTIO_BLK_F_MQ,
+ VIRTIO_BLK_F_RO,
+ VIRTIO_BLK_F_FLUSH,
+ VIRTIO_BLK_F_BARRIER,
+ VIRTIO_BLK_F_WCE,
+ VIRTIO_F_VERSION_1,
+ VIRTIO_RING_F_INDIRECT_DESC,
+ VIRTIO_RING_F_EVENT_IDX,
+ VIRTIO_F_NOTIFY_ON_EMPTY,
+ VHOST_INVALID_FEATURE_BIT
+};
+
+static void vhost_user_blk_update_config(VirtIODevice *vdev, uint8_t *config)
+{
+ VHostUserBlk *s = VHOST_USER_BLK(vdev);
+
+ memcpy(config, &s->blkcfg, sizeof(struct virtio_blk_config));
+}
+
+static void vhost_user_blk_set_config(VirtIODevice *vdev, const uint8_t *config)
+{
+ VHostUserBlk *s = VHOST_USER_BLK(vdev);
+ struct virtio_blk_config *blkcfg = (struct virtio_blk_config *)config;
+ int ret;
+
+ if (blkcfg->wce == s->blkcfg.wce) {
+ return;
+ }
+
+ ret = vhost_dev_set_config(&s->dev, config,
+ sizeof(struct virtio_blk_config));
+ if (ret) {
+ error_report("set device config space failed");
+ return;
+ }
+
+ s->blkcfg.wce = blkcfg->wce;
+}
+
+static void vhost_user_blk_handle_config_change(struct vhost_dev *dev)
+{
+ int ret;
+ struct virtio_blk_config blkcfg;
+ VHostUserBlk *s = VHOST_USER_BLK(dev->vdev);
+
+ ret = vhost_dev_get_config(dev, (uint8_t *)&blkcfg,
+ sizeof(struct virtio_blk_config));
+ if (ret < 0) {
+ error_report("get config space failed");
+ return;
+ }
+
+ memcpy(&s->blkcfg, &blkcfg, sizeof(struct virtio_blk_config));
+ memcpy(dev->vdev->config, &blkcfg, sizeof(struct virtio_blk_config));
+
+ virtio_notify_config(dev->vdev);
+}
+
+const VhostDevConfigOps blk_ops = {
+ .vhost_dev_config_notifier = vhost_user_blk_handle_config_change,
+};
+
+static void vhost_user_blk_start(VirtIODevice *vdev)
+{
+ VHostUserBlk *s = VHOST_USER_BLK(vdev);
+ BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
+ VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
+ int i, ret;
+
+ if (!k->set_guest_notifiers) {
+ error_report("binding does not support guest notifiers");
+ return;
+ }
+
+ ret = vhost_dev_enable_notifiers(&s->dev, vdev);
+ if (ret < 0) {
+ error_report("Error enabling host notifiers: %d", -ret);
+ return;
+ }
+
+ ret = k->set_guest_notifiers(qbus->parent, s->dev.nvqs, true);
+ if (ret < 0) {
+ error_report("Error binding guest notifier: %d", -ret);
+ goto err_host_notifiers;
+ }
+
+ s->dev.acked_features = vdev->guest_features;
+ ret = vhost_dev_start(&s->dev, vdev);
+ if (ret < 0) {
+ error_report("Error starting vhost: %d", -ret);
+ goto err_guest_notifiers;
+ }
+
+ /* guest_notifier_mask/pending not used yet, so just unmask
+ * everything here. virtio-pci will do the right thing by
+ * enabling/disabling irqfd.
+ */
+ for (i = 0; i < s->dev.nvqs; i++) {
+ vhost_virtqueue_mask(&s->dev, vdev, i, false);
+ }
+
+ return;
+
+err_guest_notifiers:
+ k->set_guest_notifiers(qbus->parent, s->dev.nvqs, false);
+err_host_notifiers:
+ vhost_dev_disable_notifiers(&s->dev, vdev);
+}
+
+static void vhost_user_blk_stop(VirtIODevice *vdev)
+{
+ VHostUserBlk *s = VHOST_USER_BLK(vdev);
+ BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
+ VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
+ int ret;
+
+ if (!k->set_guest_notifiers) {
+ return;
+ }
+
+ vhost_dev_stop(&s->dev, vdev);
+
+ ret = k->set_guest_notifiers(qbus->parent, s->dev.nvqs, false);
+ if (ret < 0) {
+ error_report("vhost guest notifier cleanup failed: %d", ret);
+ return;
+ }
+
+ vhost_dev_disable_notifiers(&s->dev, vdev);
+}
+
+static void vhost_user_blk_set_status(VirtIODevice *vdev, uint8_t status)
+{
+ VHostUserBlk *s = VHOST_USER_BLK(vdev);
+ bool should_start = status & VIRTIO_CONFIG_S_DRIVER_OK;
+
+ if (!vdev->vm_running) {
+ should_start = false;
+ }
+
+ if (s->dev.started == should_start) {
+ return;
+ }
+
+ if (should_start) {
+ vhost_user_blk_start(vdev);
+ } else {
+ vhost_user_blk_stop(vdev);
+ }
+
+}
+
+static uint64_t vhost_user_blk_get_features(VirtIODevice *vdev,
+ uint64_t features,
+ Error **errp)
+{
+ VHostUserBlk *s = VHOST_USER_BLK(vdev);
+ uint64_t get_features;
+
+ /* Turn on pre-defined features */
+ features |= s->host_features;
+
+ get_features = vhost_get_features(&s->dev, user_feature_bits, features);
+
+ return get_features;
+}
+
+static void vhost_user_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
+{
+
+}
+
+static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
+{
+ VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+ VHostUserBlk *s = VHOST_USER_BLK(vdev);
+ int i, ret;
+
+ if (!s->chardev.chr) {
+ error_setg(errp, "vhost-user-blk: chardev is mandatory");
+ return;
+ }
+
+ if (!s->num_queues || s->num_queues > VIRTIO_QUEUE_MAX) {
+ error_setg(errp, "vhost-user-blk: invalid number of IO queues");
+ return;
+ }
+
+ if (!s->queue_size) {
+ error_setg(errp, "vhost-user-blk: queue size must be non-zero");
+ return;
+ }
+
+ virtio_init(vdev, "virtio-blk", VIRTIO_ID_BLOCK,
+ sizeof(struct virtio_blk_config));
+
+ for (i = 0; i < s->num_queues; i++) {
+ virtio_add_queue(vdev, s->queue_size,
+ vhost_user_blk_handle_output);
+ }
+
+ s->dev.nvqs = s->num_queues;
+ s->dev.vqs = g_new(struct vhost_virtqueue, s->dev.nvqs);
+ s->dev.vq_index = 0;
+ s->dev.backend_features = 0;
+
+ ret = vhost_dev_init(&s->dev, &s->chardev, VHOST_BACKEND_TYPE_USER, 0);
+ if (ret < 0) {
+ error_setg(errp, "vhost-user-blk: vhost initialization failed: %s",
+ strerror(-ret));
+ goto virtio_err;
+ }
+
+ ret = vhost_dev_get_config(&s->dev, (uint8_t *)&s->blkcfg,
+ sizeof(struct virtio_blk_config));
+ if (ret < 0) {
+ error_setg(errp, "vhost-user-blk: get block config failed");
+ goto vhost_err;
+ }
+
+ if (s->blkcfg.num_queues != s->num_queues) {
+ s->blkcfg.num_queues = s->num_queues;
+ }
+
+ vhost_dev_set_config_notifier(&s->dev, &blk_ops);
+
+ return;
+
+vhost_err:
+ vhost_dev_cleanup(&s->dev);
+virtio_err:
+ g_free(s->dev.vqs);
+ virtio_cleanup(vdev);
+}
+
+static void vhost_user_blk_device_unrealize(DeviceState *dev, Error **errp)
+{
+ VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+ VHostUserBlk *s = VHOST_USER_BLK(dev);
+
+ vhost_user_blk_set_status(vdev, 0);
+ vhost_dev_cleanup(&s->dev);
+ g_free(s->dev.vqs);
+ virtio_cleanup(vdev);
+}
+
+static void vhost_user_blk_instance_init(Object *obj)
+{
+ VHostUserBlk *s = VHOST_USER_BLK(obj);
+
+ device_add_bootindex_property(obj, &s->bootindex, "bootindex",
+ "/disk@0,0", DEVICE(obj), NULL);
+}
+
+static const VMStateDescription vmstate_vhost_user_blk = {
+ .name = "vhost-user-blk",
+ .minimum_version_id = 1,
+ .version_id = 1,
+ .fields = (VMStateField[]) {
+ VMSTATE_VIRTIO_DEVICE,
+ VMSTATE_END_OF_LIST()
+ },
+};
+
+static Property vhost_user_blk_properties[] = {
+ DEFINE_PROP_CHR("chardev", VHostUserBlk, chardev),
+ DEFINE_PROP_UINT16("num_queues", VHostUserBlk, num_queues, 1),
+ DEFINE_PROP_UINT32("queue_size", VHostUserBlk, queue_size, 128),
+ DEFINE_PROP_BIT64("f_size_max", VHostUserBlk, host_features,
+ VIRTIO_BLK_F_SIZE_MAX, true),
+ DEFINE_PROP_BIT64("f_sizemax", VHostUserBlk, host_features,
+ VIRTIO_BLK_F_SIZE_MAX, true),
+ DEFINE_PROP_BIT64("f_segmax", VHostUserBlk, host_features,
+ VIRTIO_BLK_F_SEG_MAX, true),
+ DEFINE_PROP_BIT64("f_geometry", VHostUserBlk, host_features,
+ VIRTIO_BLK_F_GEOMETRY, true),
+ DEFINE_PROP_BIT64("f_readonly", VHostUserBlk, host_features,
+ VIRTIO_BLK_F_RO, false),
+ DEFINE_PROP_BIT64("f_blocksize", VHostUserBlk, host_features,
+ VIRTIO_BLK_F_BLK_SIZE, true),
+ DEFINE_PROP_BIT64("f_topology", VHostUserBlk, host_features,
+ VIRTIO_BLK_F_TOPOLOGY, true),
+ DEFINE_PROP_BIT64("f_multiqueue", VHostUserBlk, host_features,
+ VIRTIO_BLK_F_MQ, true),
+ DEFINE_PROP_BIT64("f_flush", VHostUserBlk, host_features,
+ VIRTIO_BLK_F_FLUSH, true),
+ DEFINE_PROP_BIT64("f_barrier", VHostUserBlk, host_features,
+ VIRTIO_BLK_F_BARRIER, false),
+ DEFINE_PROP_BIT64("f_scsi", VHostUserBlk, host_features,
+ VIRTIO_BLK_F_SCSI, false),
+ DEFINE_PROP_BIT64("f_wce", VHostUserBlk, host_features,
+ VIRTIO_BLK_F_WCE, false),
+ DEFINE_PROP_END_OF_LIST(),
+};
+
+static void vhost_user_blk_class_init(ObjectClass *klass, void *data)
+{
+ DeviceClass *dc = DEVICE_CLASS(klass);
+ VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
+
+ dc->props = vhost_user_blk_properties;
+ dc->vmsd = &vmstate_vhost_user_blk;
+ set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
+ vdc->realize = vhost_user_blk_device_realize;
+ vdc->unrealize = vhost_user_blk_device_unrealize;
+ vdc->get_config = vhost_user_blk_update_config;
+ vdc->set_config = vhost_user_blk_set_config;
+ vdc->get_features = vhost_user_blk_get_features;
+ vdc->set_status = vhost_user_blk_set_status;
+}
+
+static const TypeInfo vhost_user_blk_info = {
+ .name = TYPE_VHOST_USER_BLK,
+ .parent = TYPE_VIRTIO_DEVICE,
+ .instance_size = sizeof(VHostUserBlk),
+ .instance_init = vhost_user_blk_instance_init,
+ .class_init = vhost_user_blk_class_init,
+};
+
+static void virtio_register_types(void)
+{
+ type_register_static(&vhost_user_blk_info);
+}
+
+type_init(virtio_register_types)
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 8b0d6b6..be9a992 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -2012,6 +2012,58 @@ static const TypeInfo virtio_blk_pci_info = {
.class_init = virtio_blk_pci_class_init,
};
+#ifdef CONFIG_VHOST_USER_BLK
+/* vhost-user-blk */
+
+static Property vhost_user_blk_pci_properties[] = {
+ DEFINE_PROP_UINT32("class", VirtIOPCIProxy, class_code, 0),
+ DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
+ DEFINE_PROP_END_OF_LIST(),
+};
+
+static void vhost_user_blk_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
+{
+ VHostUserBlkPCI *dev = VHOST_USER_BLK_PCI(vpci_dev);
+ DeviceState *vdev = DEVICE(&dev->vdev);
+
+ qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
+ object_property_set_bool(OBJECT(vdev), true, "realized", errp);
+}
+
+static void vhost_user_blk_pci_class_init(ObjectClass *klass, void *data)
+{
+ DeviceClass *dc = DEVICE_CLASS(klass);
+ VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
+ PCIDeviceClass *pcidev_k = PCI_DEVICE_CLASS(klass);
+
+ set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
+ dc->props = vhost_user_blk_pci_properties;
+ k->realize = vhost_user_blk_pci_realize;
+ pcidev_k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;
+ pcidev_k->device_id = PCI_DEVICE_ID_VIRTIO_BLOCK;
+ pcidev_k->revision = VIRTIO_PCI_ABI_VERSION;
+ pcidev_k->class_id = PCI_CLASS_STORAGE_SCSI;
+}
+
+static void vhost_user_blk_pci_instance_init(Object *obj)
+{
+ VHostUserBlkPCI *dev = VHOST_USER_BLK_PCI(obj);
+
+ virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
+ TYPE_VHOST_USER_BLK);
+ object_property_add_alias(obj, "bootindex", OBJECT(&dev->vdev),
+ "bootindex", &error_abort);
+}
+
+static const TypeInfo vhost_user_blk_pci_info = {
+ .name = TYPE_VHOST_USER_BLK_PCI,
+ .parent = TYPE_VIRTIO_PCI,
+ .instance_size = sizeof(VHostUserBlkPCI),
+ .instance_init = vhost_user_blk_pci_instance_init,
+ .class_init = vhost_user_blk_pci_class_init,
+};
+#endif
+
/* virtio-scsi-pci */
static Property virtio_scsi_pci_properties[] = {
@@ -2658,6 +2710,9 @@ static void virtio_pci_register_types(void)
type_register_static(&virtio_9p_pci_info);
#endif
type_register_static(&virtio_blk_pci_info);
+#ifdef CONFIG_VHOST_USER_BLK
+ type_register_static(&vhost_user_blk_pci_info);
+#endif
type_register_static(&virtio_scsi_pci_info);
type_register_static(&virtio_balloon_pci_info);
type_register_static(&virtio_serial_pci_info);
diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
index 69f5959..19a0d01 100644
--- a/hw/virtio/virtio-pci.h
+++ b/hw/virtio/virtio-pci.h
@@ -27,6 +27,9 @@
#include "hw/virtio/virtio-gpu.h"
#include "hw/virtio/virtio-crypto.h"
#include "hw/virtio/vhost-user-scsi.h"
+#ifdef CONFIG_VHOST_USER_BLK
+#include "hw/virtio/vhost-user-blk.h"
+#endif
#ifdef CONFIG_VIRTFS
#include "hw/9pfs/virtio-9p.h"
@@ -46,6 +49,7 @@ typedef struct VirtIOSerialPCI VirtIOSerialPCI;
typedef struct VirtIONetPCI VirtIONetPCI;
typedef struct VHostSCSIPCI VHostSCSIPCI;
typedef struct VHostUserSCSIPCI VHostUserSCSIPCI;
+typedef struct VHostUserBlkPCI VHostUserBlkPCI;
typedef struct VirtIORngPCI VirtIORngPCI;
typedef struct VirtIOInputPCI VirtIOInputPCI;
typedef struct VirtIOInputHIDPCI VirtIOInputHIDPCI;
@@ -241,6 +245,20 @@ struct VHostUserSCSIPCI {
VHostUserSCSI vdev;
};
+#ifdef CONFIG_VHOST_USER_BLK
+/*
+ * vhost-user-blk-pci: This extends VirtioPCIProxy.
+ */
+#define TYPE_VHOST_USER_BLK_PCI "vhost-user-blk-pci"
+#define VHOST_USER_BLK_PCI(obj) \
+ OBJECT_CHECK(VHostUserBlkPCI, (obj), TYPE_VHOST_USER_BLK_PCI)
+
+struct VHostUserBlkPCI {
+ VirtIOPCIProxy parent_obj;
+ VHostUserBlk vdev;
+};
+#endif
+
/*
* virtio-blk-pci: This extends VirtioPCIProxy.
*/
diff --git a/include/hw/virtio/vhost-user-blk.h b/include/hw/virtio/vhost-user-blk.h
new file mode 100644
index 0000000..77d20f0
--- /dev/null
+++ b/include/hw/virtio/vhost-user-blk.h
@@ -0,0 +1,40 @@
+/*
+ * vhost-user-blk host device
+ * Copyright IBM, Corp. 2011
+ * Copyright(C) 2017 Intel Corporation.
+ *
+ * Authors:
+ * Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
+ * Changpeng Liu <changpeng.liu@intel.com>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#ifndef VHOST_USER_BLK_H
+#define VHOST_USER_BLK_H
+
+#include "standard-headers/linux/virtio_blk.h"
+#include "qemu-common.h"
+#include "hw/qdev.h"
+#include "hw/block/block.h"
+#include "chardev/char-fe.h"
+#include "hw/virtio/vhost.h"
+
+#define TYPE_VHOST_USER_BLK "vhost-user-blk"
+#define VHOST_USER_BLK(obj) \
+ OBJECT_CHECK(VHostUserBlk, (obj), TYPE_VHOST_USER_BLK)
+
+typedef struct VHostUserBlk {
+ VirtIODevice parent_obj;
+ CharBackend chardev;
+ int32_t bootindex;
+ uint64_t host_features;
+ struct virtio_blk_config blkcfg;
+ uint16_t num_queues;
+ uint32_t queue_size;
+ struct vhost_dev dev;
+} VHostUserBlk;
+
+#endif
--
1.9.3
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH v2 2/4] vhost-user-blk: introduce a new vhost-user-blk host device
2017-08-10 10:12 ` [Qemu-devel] [PATCH v2 2/4] vhost-user-blk: introduce a new vhost-user-blk host device Changpeng Liu
@ 2017-08-09 15:39 ` Marc-André Lureau
2017-08-10 0:42 ` Liu, Changpeng
2017-08-09 17:10 ` Michael S. Tsirkin
1 sibling, 1 reply; 13+ messages in thread
From: Marc-André Lureau @ 2017-08-09 15:39 UTC (permalink / raw)
To: Changpeng Liu; +Cc: qemu-devel, stefanha, pbonzini, mst, felipe, james r harris
Hi
----- Original Message -----
> This commit introduces a new vhost-user device for block, it uses a
> chardev to connect with the backend, same with Qemu virito-blk device,
> Guest OS still uses the virtio-blk frontend driver.
>
> To use it, start Qemu with command line like this:
>
> qemu-system-x86_64 \
> -chardev socket,id=char0,path=/path/vhost.socket \
> -device vhost-user-blk-pci,chardev=char0,num_queues=...
>
> Different with exist Qemu virtio-blk host device, it makes more easy
> for users to implement their own I/O processing logic, such as all
> user space I/O stack against hardware block device. It uses the new
> vhost messages(VHOST_USER_GET_CONFIG) to get block virtio config
> information from backend process.
>
> Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
> ---
> configure | 11 ++
> hw/block/Makefile.objs | 3 +
> hw/block/vhost-user-blk.c | 360
> +++++++++++++++++++++++++++++++++++++
> hw/virtio/virtio-pci.c | 55 ++++++
> hw/virtio/virtio-pci.h | 18 ++
> include/hw/virtio/vhost-user-blk.h | 40 +++++
> 6 files changed, 487 insertions(+)
> create mode 100644 hw/block/vhost-user-blk.c
> create mode 100644 include/hw/virtio/vhost-user-blk.h
>
> diff --git a/configure b/configure
> index dd73cce..1452c66 100755
> --- a/configure
> +++ b/configure
> @@ -305,6 +305,7 @@ tcg="yes"
>
> vhost_net="no"
> vhost_scsi="no"
> +vhost_user_blk="no"
> vhost_vsock="no"
> vhost_user=""
> kvm="no"
> @@ -779,6 +780,7 @@ Linux)
> kvm="yes"
> vhost_net="yes"
> vhost_scsi="yes"
> + vhost_user_blk="yes"
> vhost_vsock="yes"
> QEMU_INCLUDES="-I\$(SRC_PATH)/linux-headers -I$(pwd)/linux-headers
> $QEMU_INCLUDES"
> supported_os="yes"
> @@ -1136,6 +1138,10 @@ for opt do
> ;;
> --enable-vhost-scsi) vhost_scsi="yes"
> ;;
> + --disable-vhost-user-blk) vhost_user_blk="no"
> + ;;
> + --enable-vhost-user-blk) vhost_user_blk="yes"
> + ;;
I suggest we don't add yet another configure option, but reuse the recently introduced --enable-vhost-user (that should cover all vhost-user devices for now, but may learn to enable specific devices if needed in the future).
> --disable-vhost-vsock) vhost_vsock="no"
> ;;
> --enable-vhost-vsock) vhost_vsock="yes"
> @@ -1506,6 +1512,7 @@ disabled with --disable-FEATURE, default is enabled if
> available:
> cap-ng libcap-ng support
> attr attr and xattr support
> vhost-net vhost-net acceleration support
> + vhost-user-blk VM virtio-blk acceleration in user space
> spice spice
> rbd rados block device (rbd)
> libiscsi iscsi support
> @@ -5365,6 +5372,7 @@ echo "posix_madvise $posix_madvise"
> echo "libcap-ng support $cap_ng"
> echo "vhost-net support $vhost_net"
> echo "vhost-scsi support $vhost_scsi"
> +echo "vhost-user-blk support $vhost_user_blk"
> echo "vhost-vsock support $vhost_vsock"
> echo "vhost-user support $vhost_user"
> echo "Trace backends $trace_backends"
> @@ -5776,6 +5784,9 @@ fi
> if test "$vhost_scsi" = "yes" ; then
> echo "CONFIG_VHOST_SCSI=y" >> $config_host_mak
> fi
> +if test "$vhost_user_blk" = "yes" ; then
> + echo "CONFIG_VHOST_USER_BLK=y" >> $config_host_mak
> +fi
> if test "$vhost_net" = "yes" -a "$vhost_user" = "yes"; then
> echo "CONFIG_VHOST_NET_USED=y" >> $config_host_mak
> fi
> diff --git a/hw/block/Makefile.objs b/hw/block/Makefile.objs
> index e0ed980..4c19a58 100644
> --- a/hw/block/Makefile.objs
> +++ b/hw/block/Makefile.objs
> @@ -13,3 +13,6 @@ obj-$(CONFIG_SH4) += tc58128.o
>
> obj-$(CONFIG_VIRTIO) += virtio-blk.o
> obj-$(CONFIG_VIRTIO) += dataplane/
> +ifeq ($(CONFIG_VIRTIO),y)
> +obj-$(CONFIG_VHOST_USER_BLK) += vhost-user-blk.o
> +endif
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> new file mode 100644
> index 0000000..8aa9fa9
> --- /dev/null
> +++ b/hw/block/vhost-user-blk.c
> @@ -0,0 +1,360 @@
> +/*
> + * vhost-user-blk host device
> + *
> + * Copyright IBM, Corp. 2011
> + * Copyright(C) 2017 Intel Corporation.
> + *
> + * Authors:
> + * Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
> + * Changpeng Liu <changpeng.liu@intel.com>
> + *
> + * This work is licensed under the terms of the GNU LGPL, version 2 or
> later.
> + * See the COPYING.LIB file in the top-level directory.
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "qemu/error-report.h"
> +#include "qemu/typedefs.h"
> +#include "qemu/cutils.h"
> +#include "qom/object.h"
> +#include "hw/qdev-core.h"
> +#include "hw/virtio/vhost.h"
> +#include "hw/virtio/vhost-user-blk.h"
> +#include "hw/virtio/virtio.h"
> +#include "hw/virtio/virtio-bus.h"
> +#include "hw/virtio/virtio-access.h"
> +
> +static const int user_feature_bits[] = {
> + VIRTIO_BLK_F_SIZE_MAX,
> + VIRTIO_BLK_F_SEG_MAX,
> + VIRTIO_BLK_F_GEOMETRY,
> + VIRTIO_BLK_F_BLK_SIZE,
> + VIRTIO_BLK_F_TOPOLOGY,
> + VIRTIO_BLK_F_SCSI,
> + VIRTIO_BLK_F_MQ,
> + VIRTIO_BLK_F_RO,
> + VIRTIO_BLK_F_FLUSH,
> + VIRTIO_BLK_F_BARRIER,
> + VIRTIO_BLK_F_WCE,
> + VIRTIO_F_VERSION_1,
> + VIRTIO_RING_F_INDIRECT_DESC,
> + VIRTIO_RING_F_EVENT_IDX,
> + VIRTIO_F_NOTIFY_ON_EMPTY,
> + VHOST_INVALID_FEATURE_BIT
> +};
> +
> +static void vhost_user_blk_update_config(VirtIODevice *vdev, uint8_t
> *config)
> +{
> + VHostUserBlk *s = VHOST_USER_BLK(vdev);
> +
> + memcpy(config, &s->blkcfg, sizeof(struct virtio_blk_config));
> +}
> +
> +static void vhost_user_blk_set_config(VirtIODevice *vdev, const uint8_t
> *config)
> +{
> + VHostUserBlk *s = VHOST_USER_BLK(vdev);
> + struct virtio_blk_config *blkcfg = (struct virtio_blk_config *)config;
> + int ret;
> +
> + if (blkcfg->wce == s->blkcfg.wce) {
> + return;
Is write-cache the only config change the slave is interested in?
> + }
> +
> + ret = vhost_dev_set_config(&s->dev, config,
> + sizeof(struct virtio_blk_config));
> + if (ret) {
> + error_report("set device config space failed");
> + return;
> + }
> +
> + s->blkcfg.wce = blkcfg->wce;
> +}
> +
> +static void vhost_user_blk_handle_config_change(struct vhost_dev *dev)
> +{
> + int ret;
> + struct virtio_blk_config blkcfg;
> + VHostUserBlk *s = VHOST_USER_BLK(dev->vdev);
> +
> + ret = vhost_dev_get_config(dev, (uint8_t *)&blkcfg,
> + sizeof(struct virtio_blk_config));
> + if (ret < 0) {
> + error_report("get config space failed");
> + return;
> + }
> +
> + memcpy(&s->blkcfg, &blkcfg, sizeof(struct virtio_blk_config));
> + memcpy(dev->vdev->config, &blkcfg, sizeof(struct virtio_blk_config));
Why do you need to have s->blkcfg if you can use dev->vdev->config ?
> + virtio_notify_config(dev->vdev);
> +}
> +
> +const VhostDevConfigOps blk_ops = {
> + .vhost_dev_config_notifier = vhost_user_blk_handle_config_change,
> +};
> +
> +static void vhost_user_blk_start(VirtIODevice *vdev)
> +{
> + VHostUserBlk *s = VHOST_USER_BLK(vdev);
> + BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
> + VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
> + int i, ret;
> +
> + if (!k->set_guest_notifiers) {
> + error_report("binding does not support guest notifiers");
> + return;
> + }
> +
> + ret = vhost_dev_enable_notifiers(&s->dev, vdev);
> + if (ret < 0) {
> + error_report("Error enabling host notifiers: %d", -ret);
> + return;
> + }
> +
> + ret = k->set_guest_notifiers(qbus->parent, s->dev.nvqs, true);
> + if (ret < 0) {
> + error_report("Error binding guest notifier: %d", -ret);
> + goto err_host_notifiers;
> + }
> +
> + s->dev.acked_features = vdev->guest_features;
> + ret = vhost_dev_start(&s->dev, vdev);
> + if (ret < 0) {
> + error_report("Error starting vhost: %d", -ret);
> + goto err_guest_notifiers;
> + }
> +
> + /* guest_notifier_mask/pending not used yet, so just unmask
> + * everything here. virtio-pci will do the right thing by
> + * enabling/disabling irqfd.
> + */
> + for (i = 0; i < s->dev.nvqs; i++) {
> + vhost_virtqueue_mask(&s->dev, vdev, i, false);
> + }
> +
> + return;
> +
> +err_guest_notifiers:
> + k->set_guest_notifiers(qbus->parent, s->dev.nvqs, false);
> +err_host_notifiers:
> + vhost_dev_disable_notifiers(&s->dev, vdev);
> +}
> +
> +static void vhost_user_blk_stop(VirtIODevice *vdev)
> +{
> + VHostUserBlk *s = VHOST_USER_BLK(vdev);
> + BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
> + VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
> + int ret;
> +
> + if (!k->set_guest_notifiers) {
> + return;
> + }
> +
> + vhost_dev_stop(&s->dev, vdev);
> +
> + ret = k->set_guest_notifiers(qbus->parent, s->dev.nvqs, false);
> + if (ret < 0) {
> + error_report("vhost guest notifier cleanup failed: %d", ret);
> + return;
> + }
> +
> + vhost_dev_disable_notifiers(&s->dev, vdev);
> +}
> +
> +static void vhost_user_blk_set_status(VirtIODevice *vdev, uint8_t status)
> +{
> + VHostUserBlk *s = VHOST_USER_BLK(vdev);
> + bool should_start = status & VIRTIO_CONFIG_S_DRIVER_OK;
> +
> + if (!vdev->vm_running) {
> + should_start = false;
> + }
> +
> + if (s->dev.started == should_start) {
> + return;
> + }
> +
> + if (should_start) {
> + vhost_user_blk_start(vdev);
> + } else {
> + vhost_user_blk_stop(vdev);
> + }
> +
> +}
> +
> +static uint64_t vhost_user_blk_get_features(VirtIODevice *vdev,
> + uint64_t features,
> + Error **errp)
> +{
> + VHostUserBlk *s = VHOST_USER_BLK(vdev);
> + uint64_t get_features;
> +
> + /* Turn on pre-defined features */
> + features |= s->host_features;
> +
> + get_features = vhost_get_features(&s->dev, user_feature_bits, features);
> +
> + return get_features;
> +}
> +
> +static void vhost_user_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
> +{
> +
> +}
> +
> +static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
> +{
> + VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> + VHostUserBlk *s = VHOST_USER_BLK(vdev);
> + int i, ret;
> +
> + if (!s->chardev.chr) {
> + error_setg(errp, "vhost-user-blk: chardev is mandatory");
> + return;
> + }
> +
> + if (!s->num_queues || s->num_queues > VIRTIO_QUEUE_MAX) {
> + error_setg(errp, "vhost-user-blk: invalid number of IO queues");
> + return;
> + }
> +
> + if (!s->queue_size) {
> + error_setg(errp, "vhost-user-blk: queue size must be non-zero");
> + return;
> + }
> +
> + virtio_init(vdev, "virtio-blk", VIRTIO_ID_BLOCK,
> + sizeof(struct virtio_blk_config));
> +
> + for (i = 0; i < s->num_queues; i++) {
> + virtio_add_queue(vdev, s->queue_size,
> + vhost_user_blk_handle_output);
> + }
> +
> + s->dev.nvqs = s->num_queues;
> + s->dev.vqs = g_new(struct vhost_virtqueue, s->dev.nvqs);
> + s->dev.vq_index = 0;
> + s->dev.backend_features = 0;
> +
> + ret = vhost_dev_init(&s->dev, &s->chardev, VHOST_BACKEND_TYPE_USER, 0);
> + if (ret < 0) {
> + error_setg(errp, "vhost-user-blk: vhost initialization failed: %s",
> + strerror(-ret));
> + goto virtio_err;
> + }
> +
> + ret = vhost_dev_get_config(&s->dev, (uint8_t *)&s->blkcfg,
> + sizeof(struct virtio_blk_config));
> + if (ret < 0) {
> + error_setg(errp, "vhost-user-blk: get block config failed");
> + goto vhost_err;
> + }
> +
> + if (s->blkcfg.num_queues != s->num_queues) {
> + s->blkcfg.num_queues = s->num_queues;
> + }
> +
> + vhost_dev_set_config_notifier(&s->dev, &blk_ops);
> +
> + return;
> +
> +vhost_err:
> + vhost_dev_cleanup(&s->dev);
> +virtio_err:
> + g_free(s->dev.vqs);
> + virtio_cleanup(vdev);
> +}
> +
> +static void vhost_user_blk_device_unrealize(DeviceState *dev, Error **errp)
> +{
> + VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> + VHostUserBlk *s = VHOST_USER_BLK(dev);
> +
> + vhost_user_blk_set_status(vdev, 0);
> + vhost_dev_cleanup(&s->dev);
> + g_free(s->dev.vqs);
> + virtio_cleanup(vdev);
> +}
> +
> +static void vhost_user_blk_instance_init(Object *obj)
> +{
> + VHostUserBlk *s = VHOST_USER_BLK(obj);
> +
> + device_add_bootindex_property(obj, &s->bootindex, "bootindex",
> + "/disk@0,0", DEVICE(obj), NULL);
> +}
> +
> +static const VMStateDescription vmstate_vhost_user_blk = {
> + .name = "vhost-user-blk",
> + .minimum_version_id = 1,
> + .version_id = 1,
> + .fields = (VMStateField[]) {
> + VMSTATE_VIRTIO_DEVICE,
> + VMSTATE_END_OF_LIST()
> + },
> +};
> +
> +static Property vhost_user_blk_properties[] = {
> + DEFINE_PROP_CHR("chardev", VHostUserBlk, chardev),
> + DEFINE_PROP_UINT16("num_queues", VHostUserBlk, num_queues, 1),
> + DEFINE_PROP_UINT32("queue_size", VHostUserBlk, queue_size, 128),
> + DEFINE_PROP_BIT64("f_size_max", VHostUserBlk, host_features,
> + VIRTIO_BLK_F_SIZE_MAX, true),
> + DEFINE_PROP_BIT64("f_sizemax", VHostUserBlk, host_features,
> + VIRTIO_BLK_F_SIZE_MAX, true),
> + DEFINE_PROP_BIT64("f_segmax", VHostUserBlk, host_features,
> + VIRTIO_BLK_F_SEG_MAX, true),
> + DEFINE_PROP_BIT64("f_geometry", VHostUserBlk, host_features,
> + VIRTIO_BLK_F_GEOMETRY, true),
> + DEFINE_PROP_BIT64("f_readonly", VHostUserBlk, host_features,
> + VIRTIO_BLK_F_RO, false),
> + DEFINE_PROP_BIT64("f_blocksize", VHostUserBlk, host_features,
> + VIRTIO_BLK_F_BLK_SIZE, true),
> + DEFINE_PROP_BIT64("f_topology", VHostUserBlk, host_features,
> + VIRTIO_BLK_F_TOPOLOGY, true),
> + DEFINE_PROP_BIT64("f_multiqueue", VHostUserBlk, host_features,
> + VIRTIO_BLK_F_MQ, true),
> + DEFINE_PROP_BIT64("f_flush", VHostUserBlk, host_features,
> + VIRTIO_BLK_F_FLUSH, true),
> + DEFINE_PROP_BIT64("f_barrier", VHostUserBlk, host_features,
> + VIRTIO_BLK_F_BARRIER, false),
> + DEFINE_PROP_BIT64("f_scsi", VHostUserBlk, host_features,
> + VIRTIO_BLK_F_SCSI, false),
> + DEFINE_PROP_BIT64("f_wce", VHostUserBlk, host_features,
> + VIRTIO_BLK_F_WCE, false),
> + DEFINE_PROP_END_OF_LIST(),
> +};
> +
> +static void vhost_user_blk_class_init(ObjectClass *klass, void *data)
> +{
> + DeviceClass *dc = DEVICE_CLASS(klass);
> + VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
> +
> + dc->props = vhost_user_blk_properties;
> + dc->vmsd = &vmstate_vhost_user_blk;
> + set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
> + vdc->realize = vhost_user_blk_device_realize;
> + vdc->unrealize = vhost_user_blk_device_unrealize;
> + vdc->get_config = vhost_user_blk_update_config;
> + vdc->set_config = vhost_user_blk_set_config;
> + vdc->get_features = vhost_user_blk_get_features;
> + vdc->set_status = vhost_user_blk_set_status;
> +}
> +
> +static const TypeInfo vhost_user_blk_info = {
> + .name = TYPE_VHOST_USER_BLK,
> + .parent = TYPE_VIRTIO_DEVICE,
> + .instance_size = sizeof(VHostUserBlk),
> + .instance_init = vhost_user_blk_instance_init,
> + .class_init = vhost_user_blk_class_init,
> +};
> +
> +static void virtio_register_types(void)
> +{
> + type_register_static(&vhost_user_blk_info);
> +}
> +
> +type_init(virtio_register_types)
> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> index 8b0d6b6..be9a992 100644
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -2012,6 +2012,58 @@ static const TypeInfo virtio_blk_pci_info = {
> .class_init = virtio_blk_pci_class_init,
> };
>
> +#ifdef CONFIG_VHOST_USER_BLK
> +/* vhost-user-blk */
> +
> +static Property vhost_user_blk_pci_properties[] = {
> + DEFINE_PROP_UINT32("class", VirtIOPCIProxy, class_code, 0),
> + DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
> + DEFINE_PROP_END_OF_LIST(),
> +};
> +
> +static void vhost_user_blk_pci_realize(VirtIOPCIProxy *vpci_dev, Error
> **errp)
> +{
> + VHostUserBlkPCI *dev = VHOST_USER_BLK_PCI(vpci_dev);
> + DeviceState *vdev = DEVICE(&dev->vdev);
> +
> + qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
> + object_property_set_bool(OBJECT(vdev), true, "realized", errp);
> +}
> +
> +static void vhost_user_blk_pci_class_init(ObjectClass *klass, void *data)
> +{
> + DeviceClass *dc = DEVICE_CLASS(klass);
> + VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
> + PCIDeviceClass *pcidev_k = PCI_DEVICE_CLASS(klass);
> +
> + set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
> + dc->props = vhost_user_blk_pci_properties;
> + k->realize = vhost_user_blk_pci_realize;
> + pcidev_k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;
> + pcidev_k->device_id = PCI_DEVICE_ID_VIRTIO_BLOCK;
> + pcidev_k->revision = VIRTIO_PCI_ABI_VERSION;
> + pcidev_k->class_id = PCI_CLASS_STORAGE_SCSI;
> +}
> +
> +static void vhost_user_blk_pci_instance_init(Object *obj)
> +{
> + VHostUserBlkPCI *dev = VHOST_USER_BLK_PCI(obj);
> +
> + virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
> + TYPE_VHOST_USER_BLK);
> + object_property_add_alias(obj, "bootindex", OBJECT(&dev->vdev),
> + "bootindex", &error_abort);
> +}
> +
> +static const TypeInfo vhost_user_blk_pci_info = {
> + .name = TYPE_VHOST_USER_BLK_PCI,
> + .parent = TYPE_VIRTIO_PCI,
> + .instance_size = sizeof(VHostUserBlkPCI),
> + .instance_init = vhost_user_blk_pci_instance_init,
> + .class_init = vhost_user_blk_pci_class_init,
> +};
> +#endif
> +
> /* virtio-scsi-pci */
>
> static Property virtio_scsi_pci_properties[] = {
> @@ -2658,6 +2710,9 @@ static void virtio_pci_register_types(void)
> type_register_static(&virtio_9p_pci_info);
> #endif
> type_register_static(&virtio_blk_pci_info);
> +#ifdef CONFIG_VHOST_USER_BLK
> + type_register_static(&vhost_user_blk_pci_info);
> +#endif
> type_register_static(&virtio_scsi_pci_info);
> type_register_static(&virtio_balloon_pci_info);
> type_register_static(&virtio_serial_pci_info);
> diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
> index 69f5959..19a0d01 100644
> --- a/hw/virtio/virtio-pci.h
> +++ b/hw/virtio/virtio-pci.h
> @@ -27,6 +27,9 @@
> #include "hw/virtio/virtio-gpu.h"
> #include "hw/virtio/virtio-crypto.h"
> #include "hw/virtio/vhost-user-scsi.h"
> +#ifdef CONFIG_VHOST_USER_BLK
> +#include "hw/virtio/vhost-user-blk.h"
> +#endif
>
> #ifdef CONFIG_VIRTFS
> #include "hw/9pfs/virtio-9p.h"
> @@ -46,6 +49,7 @@ typedef struct VirtIOSerialPCI VirtIOSerialPCI;
> typedef struct VirtIONetPCI VirtIONetPCI;
> typedef struct VHostSCSIPCI VHostSCSIPCI;
> typedef struct VHostUserSCSIPCI VHostUserSCSIPCI;
> +typedef struct VHostUserBlkPCI VHostUserBlkPCI;
> typedef struct VirtIORngPCI VirtIORngPCI;
> typedef struct VirtIOInputPCI VirtIOInputPCI;
> typedef struct VirtIOInputHIDPCI VirtIOInputHIDPCI;
> @@ -241,6 +245,20 @@ struct VHostUserSCSIPCI {
> VHostUserSCSI vdev;
> };
>
> +#ifdef CONFIG_VHOST_USER_BLK
> +/*
> + * vhost-user-blk-pci: This extends VirtioPCIProxy.
> + */
> +#define TYPE_VHOST_USER_BLK_PCI "vhost-user-blk-pci"
> +#define VHOST_USER_BLK_PCI(obj) \
> + OBJECT_CHECK(VHostUserBlkPCI, (obj), TYPE_VHOST_USER_BLK_PCI)
> +
> +struct VHostUserBlkPCI {
> + VirtIOPCIProxy parent_obj;
> + VHostUserBlk vdev;
> +};
> +#endif
> +
> /*
> * virtio-blk-pci: This extends VirtioPCIProxy.
> */
> diff --git a/include/hw/virtio/vhost-user-blk.h
> b/include/hw/virtio/vhost-user-blk.h
> new file mode 100644
> index 0000000..77d20f0
> --- /dev/null
> +++ b/include/hw/virtio/vhost-user-blk.h
> @@ -0,0 +1,40 @@
> +/*
> + * vhost-user-blk host device
> + * Copyright IBM, Corp. 2011
> + * Copyright(C) 2017 Intel Corporation.
> + *
> + * Authors:
> + * Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
> + * Changpeng Liu <changpeng.liu@intel.com>
> + *
> + * This work is licensed under the terms of the GNU LGPL, version 2 or
> later.
> + * See the COPYING.LIB file in the top-level directory.
> + *
> + */
> +
> +#ifndef VHOST_USER_BLK_H
> +#define VHOST_USER_BLK_H
> +
> +#include "standard-headers/linux/virtio_blk.h"
> +#include "qemu-common.h"
> +#include "hw/qdev.h"
> +#include "hw/block/block.h"
> +#include "chardev/char-fe.h"
> +#include "hw/virtio/vhost.h"
> +
> +#define TYPE_VHOST_USER_BLK "vhost-user-blk"
> +#define VHOST_USER_BLK(obj) \
> + OBJECT_CHECK(VHostUserBlk, (obj), TYPE_VHOST_USER_BLK)
> +
> +typedef struct VHostUserBlk {
> + VirtIODevice parent_obj;
> + CharBackend chardev;
> + int32_t bootindex;
> + uint64_t host_features;
> + struct virtio_blk_config blkcfg;
> + uint16_t num_queues;
> + uint32_t queue_size;
> + struct vhost_dev dev;
> +} VHostUserBlk;
> +
> +#endif
> --
> 1.9.3
>
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH v2 2/4] vhost-user-blk: introduce a new vhost-user-blk host device
2017-08-09 15:39 ` Marc-André Lureau
@ 2017-08-10 0:42 ` Liu, Changpeng
0 siblings, 0 replies; 13+ messages in thread
From: Liu, Changpeng @ 2017-08-10 0:42 UTC (permalink / raw)
To: Marc-André Lureau
Cc: qemu-devel@nongnu.org, stefanha@gmail.com, pbonzini@redhat.com,
mst@redhat.com, felipe@nutanix.com, Harris, James R
> -----Original Message-----
> From: Marc-André Lureau [mailto:marcandre.lureau@redhat.com]
> Sent: Wednesday, August 9, 2017 11:39 PM
> To: Liu, Changpeng <changpeng.liu@intel.com>
> Cc: qemu-devel@nongnu.org; stefanha@gmail.com; pbonzini@redhat.com;
> mst@redhat.com; felipe@nutanix.com; Harris, James R
> <james.r.harris@intel.com>
> Subject: Re: [PATCH v2 2/4] vhost-user-blk: introduce a new vhost-user-blk host
> device
>
> Hi
>
> ----- Original Message -----
> > This commit introduces a new vhost-user device for block, it uses a
> > chardev to connect with the backend, same with Qemu virito-blk device,
> > Guest OS still uses the virtio-blk frontend driver.
> >
> > To use it, start Qemu with command line like this:
> >
> > qemu-system-x86_64 \
> > -chardev socket,id=char0,path=/path/vhost.socket \
> > -device vhost-user-blk-pci,chardev=char0,num_queues=...
> >
> > Different with exist Qemu virtio-blk host device, it makes more easy
> > for users to implement their own I/O processing logic, such as all
> > user space I/O stack against hardware block device. It uses the new
> > vhost messages(VHOST_USER_GET_CONFIG) to get block virtio config
> > information from backend process.
> >
> > Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
> > ---
> > configure | 11 ++
> > hw/block/Makefile.objs | 3 +
> > hw/block/vhost-user-blk.c | 360
> > +++++++++++++++++++++++++++++++++++++
> > hw/virtio/virtio-pci.c | 55 ++++++
> > hw/virtio/virtio-pci.h | 18 ++
> > include/hw/virtio/vhost-user-blk.h | 40 +++++
> > 6 files changed, 487 insertions(+)
> > create mode 100644 hw/block/vhost-user-blk.c
> > create mode 100644 include/hw/virtio/vhost-user-blk.h
> >
> > diff --git a/configure b/configure
> > index dd73cce..1452c66 100755
> > --- a/configure
> > +++ b/configure
> > @@ -305,6 +305,7 @@ tcg="yes"
> >
> > vhost_net="no"
> > vhost_scsi="no"
> > +vhost_user_blk="no"
> > vhost_vsock="no"
> > vhost_user=""
> > kvm="no"
> > @@ -779,6 +780,7 @@ Linux)
> > kvm="yes"
> > vhost_net="yes"
> > vhost_scsi="yes"
> > + vhost_user_blk="yes"
> > vhost_vsock="yes"
> > QEMU_INCLUDES="-I\$(SRC_PATH)/linux-headers -I$(pwd)/linux-headers
> > $QEMU_INCLUDES"
> > supported_os="yes"
> > @@ -1136,6 +1138,10 @@ for opt do
> > ;;
> > --enable-vhost-scsi) vhost_scsi="yes"
> > ;;
> > + --disable-vhost-user-blk) vhost_user_blk="no"
> > + ;;
> > + --enable-vhost-user-blk) vhost_user_blk="yes"
> > + ;;
>
> I suggest we don't add yet another configure option, but reuse the recently
> introduced --enable-vhost-user (that should cover all vhost-user devices for now,
> but may learn to enable specific devices if needed in the future).
Yes, I noticed there is a new vhost-user configuration, sounds good to me if other devices
such as vhost-net and vhost-scsi also use the same configuration option.
>
> > --disable-vhost-vsock) vhost_vsock="no"
> > ;;
> > --enable-vhost-vsock) vhost_vsock="yes"
> > @@ -1506,6 +1512,7 @@ disabled with --disable-FEATURE, default is enabled if
> > available:
> > cap-ng libcap-ng support
> > attr attr and xattr support
> > vhost-net vhost-net acceleration support
> > + vhost-user-blk VM virtio-blk acceleration in user space
> > spice spice
> > rbd rados block device (rbd)
> > libiscsi iscsi support
> > @@ -5365,6 +5372,7 @@ echo "posix_madvise $posix_madvise"
> > echo "libcap-ng support $cap_ng"
> > echo "vhost-net support $vhost_net"
> > echo "vhost-scsi support $vhost_scsi"
> > +echo "vhost-user-blk support $vhost_user_blk"
> > echo "vhost-vsock support $vhost_vsock"
> > echo "vhost-user support $vhost_user"
> > echo "Trace backends $trace_backends"
> > @@ -5776,6 +5784,9 @@ fi
> > if test "$vhost_scsi" = "yes" ; then
> > echo "CONFIG_VHOST_SCSI=y" >> $config_host_mak
> > fi
> > +if test "$vhost_user_blk" = "yes" ; then
> > + echo "CONFIG_VHOST_USER_BLK=y" >> $config_host_mak
> > +fi
> > if test "$vhost_net" = "yes" -a "$vhost_user" = "yes"; then
> > echo "CONFIG_VHOST_NET_USED=y" >> $config_host_mak
> > fi
> > diff --git a/hw/block/Makefile.objs b/hw/block/Makefile.objs
> > index e0ed980..4c19a58 100644
> > --- a/hw/block/Makefile.objs
> > +++ b/hw/block/Makefile.objs
> > @@ -13,3 +13,6 @@ obj-$(CONFIG_SH4) += tc58128.o
> >
> > obj-$(CONFIG_VIRTIO) += virtio-blk.o
> > obj-$(CONFIG_VIRTIO) += dataplane/
> > +ifeq ($(CONFIG_VIRTIO),y)
> > +obj-$(CONFIG_VHOST_USER_BLK) += vhost-user-blk.o
> > +endif
> > diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> > new file mode 100644
> > index 0000000..8aa9fa9
> > --- /dev/null
> > +++ b/hw/block/vhost-user-blk.c
> > @@ -0,0 +1,360 @@
> > +/*
> > + * vhost-user-blk host device
> > + *
> > + * Copyright IBM, Corp. 2011
> > + * Copyright(C) 2017 Intel Corporation.
> > + *
> > + * Authors:
> > + * Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
> > + * Changpeng Liu <changpeng.liu@intel.com>
> > + *
> > + * This work is licensed under the terms of the GNU LGPL, version 2 or
> > later.
> > + * See the COPYING.LIB file in the top-level directory.
> > + *
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "qapi/error.h"
> > +#include "qemu/error-report.h"
> > +#include "qemu/typedefs.h"
> > +#include "qemu/cutils.h"
> > +#include "qom/object.h"
> > +#include "hw/qdev-core.h"
> > +#include "hw/virtio/vhost.h"
> > +#include "hw/virtio/vhost-user-blk.h"
> > +#include "hw/virtio/virtio.h"
> > +#include "hw/virtio/virtio-bus.h"
> > +#include "hw/virtio/virtio-access.h"
> > +
> > +static const int user_feature_bits[] = {
> > + VIRTIO_BLK_F_SIZE_MAX,
> > + VIRTIO_BLK_F_SEG_MAX,
> > + VIRTIO_BLK_F_GEOMETRY,
> > + VIRTIO_BLK_F_BLK_SIZE,
> > + VIRTIO_BLK_F_TOPOLOGY,
> > + VIRTIO_BLK_F_SCSI,
> > + VIRTIO_BLK_F_MQ,
> > + VIRTIO_BLK_F_RO,
> > + VIRTIO_BLK_F_FLUSH,
> > + VIRTIO_BLK_F_BARRIER,
> > + VIRTIO_BLK_F_WCE,
> > + VIRTIO_F_VERSION_1,
> > + VIRTIO_RING_F_INDIRECT_DESC,
> > + VIRTIO_RING_F_EVENT_IDX,
> > + VIRTIO_F_NOTIFY_ON_EMPTY,
> > + VHOST_INVALID_FEATURE_BIT
> > +};
> > +
> > +static void vhost_user_blk_update_config(VirtIODevice *vdev, uint8_t
> > *config)
> > +{
> > + VHostUserBlk *s = VHOST_USER_BLK(vdev);
> > +
> > + memcpy(config, &s->blkcfg, sizeof(struct virtio_blk_config));
> > +}
> > +
> > +static void vhost_user_blk_set_config(VirtIODevice *vdev, const uint8_t
> > *config)
> > +{
> > + VHostUserBlk *s = VHOST_USER_BLK(vdev);
> > + struct virtio_blk_config *blkcfg = (struct virtio_blk_config *)config;
> > + int ret;
> > +
> > + if (blkcfg->wce == s->blkcfg.wce) {
> > + return;
>
> Is write-cache the only config change the slave is interested in?
>
> > + }
> > +
> > + ret = vhost_dev_set_config(&s->dev, config,
> > + sizeof(struct virtio_blk_config));
> > + if (ret) {
> > + error_report("set device config space failed");
> > + return;
> > + }
> > +
> > + s->blkcfg.wce = blkcfg->wce;
> > +}
> > +
> > +static void vhost_user_blk_handle_config_change(struct vhost_dev *dev)
> > +{
> > + int ret;
> > + struct virtio_blk_config blkcfg;
> > + VHostUserBlk *s = VHOST_USER_BLK(dev->vdev);
> > +
> > + ret = vhost_dev_get_config(dev, (uint8_t *)&blkcfg,
> > + sizeof(struct virtio_blk_config));
> > + if (ret < 0) {
> > + error_report("get config space failed");
> > + return;
> > + }
> > +
> > + memcpy(&s->blkcfg, &blkcfg, sizeof(struct virtio_blk_config));
> > + memcpy(dev->vdev->config, &blkcfg, sizeof(struct virtio_blk_config));
>
> Why do you need to have s->blkcfg if you can use dev->vdev->config ?
Save a copy to local to avoid frequently get config from slave via socket message.
>
> > + virtio_notify_config(dev->vdev);
> > +}
> > +
> > +const VhostDevConfigOps blk_ops = {
> > + .vhost_dev_config_notifier = vhost_user_blk_handle_config_change,
> > +};
> > +
> > +static void vhost_user_blk_start(VirtIODevice *vdev)
> > +{
> > + VHostUserBlk *s = VHOST_USER_BLK(vdev);
> > + BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
> > + VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
> > + int i, ret;
> > +
> > + if (!k->set_guest_notifiers) {
> > + error_report("binding does not support guest notifiers");
> > + return;
> > + }
> > +
> > + ret = vhost_dev_enable_notifiers(&s->dev, vdev);
> > + if (ret < 0) {
> > + error_report("Error enabling host notifiers: %d", -ret);
> > + return;
> > + }
> > +
> > + ret = k->set_guest_notifiers(qbus->parent, s->dev.nvqs, true);
> > + if (ret < 0) {
> > + error_report("Error binding guest notifier: %d", -ret);
> > + goto err_host_notifiers;
> > + }
> > +
> > + s->dev.acked_features = vdev->guest_features;
> > + ret = vhost_dev_start(&s->dev, vdev);
> > + if (ret < 0) {
> > + error_report("Error starting vhost: %d", -ret);
> > + goto err_guest_notifiers;
> > + }
> > +
> > + /* guest_notifier_mask/pending not used yet, so just unmask
> > + * everything here. virtio-pci will do the right thing by
> > + * enabling/disabling irqfd.
> > + */
> > + for (i = 0; i < s->dev.nvqs; i++) {
> > + vhost_virtqueue_mask(&s->dev, vdev, i, false);
> > + }
> > +
> > + return;
> > +
> > +err_guest_notifiers:
> > + k->set_guest_notifiers(qbus->parent, s->dev.nvqs, false);
> > +err_host_notifiers:
> > + vhost_dev_disable_notifiers(&s->dev, vdev);
> > +}
> > +
> > +static void vhost_user_blk_stop(VirtIODevice *vdev)
> > +{
> > + VHostUserBlk *s = VHOST_USER_BLK(vdev);
> > + BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
> > + VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
> > + int ret;
> > +
> > + if (!k->set_guest_notifiers) {
> > + return;
> > + }
> > +
> > + vhost_dev_stop(&s->dev, vdev);
> > +
> > + ret = k->set_guest_notifiers(qbus->parent, s->dev.nvqs, false);
> > + if (ret < 0) {
> > + error_report("vhost guest notifier cleanup failed: %d", ret);
> > + return;
> > + }
> > +
> > + vhost_dev_disable_notifiers(&s->dev, vdev);
> > +}
> > +
> > +static void vhost_user_blk_set_status(VirtIODevice *vdev, uint8_t status)
> > +{
> > + VHostUserBlk *s = VHOST_USER_BLK(vdev);
> > + bool should_start = status & VIRTIO_CONFIG_S_DRIVER_OK;
> > +
> > + if (!vdev->vm_running) {
> > + should_start = false;
> > + }
> > +
> > + if (s->dev.started == should_start) {
> > + return;
> > + }
> > +
> > + if (should_start) {
> > + vhost_user_blk_start(vdev);
> > + } else {
> > + vhost_user_blk_stop(vdev);
> > + }
> > +
> > +}
> > +
> > +static uint64_t vhost_user_blk_get_features(VirtIODevice *vdev,
> > + uint64_t features,
> > + Error **errp)
> > +{
> > + VHostUserBlk *s = VHOST_USER_BLK(vdev);
> > + uint64_t get_features;
> > +
> > + /* Turn on pre-defined features */
> > + features |= s->host_features;
> > +
> > + get_features = vhost_get_features(&s->dev, user_feature_bits, features);
> > +
> > + return get_features;
> > +}
> > +
> > +static void vhost_user_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
> > +{
> > +
> > +}
> > +
> > +static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
> > +{
> > + VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> > + VHostUserBlk *s = VHOST_USER_BLK(vdev);
> > + int i, ret;
> > +
> > + if (!s->chardev.chr) {
> > + error_setg(errp, "vhost-user-blk: chardev is mandatory");
> > + return;
> > + }
> > +
> > + if (!s->num_queues || s->num_queues > VIRTIO_QUEUE_MAX) {
> > + error_setg(errp, "vhost-user-blk: invalid number of IO queues");
> > + return;
> > + }
> > +
> > + if (!s->queue_size) {
> > + error_setg(errp, "vhost-user-blk: queue size must be non-zero");
> > + return;
> > + }
> > +
> > + virtio_init(vdev, "virtio-blk", VIRTIO_ID_BLOCK,
> > + sizeof(struct virtio_blk_config));
> > +
> > + for (i = 0; i < s->num_queues; i++) {
> > + virtio_add_queue(vdev, s->queue_size,
> > + vhost_user_blk_handle_output);
> > + }
> > +
> > + s->dev.nvqs = s->num_queues;
> > + s->dev.vqs = g_new(struct vhost_virtqueue, s->dev.nvqs);
> > + s->dev.vq_index = 0;
> > + s->dev.backend_features = 0;
> > +
> > + ret = vhost_dev_init(&s->dev, &s->chardev, VHOST_BACKEND_TYPE_USER,
> 0);
> > + if (ret < 0) {
> > + error_setg(errp, "vhost-user-blk: vhost initialization failed: %s",
> > + strerror(-ret));
> > + goto virtio_err;
> > + }
> > +
> > + ret = vhost_dev_get_config(&s->dev, (uint8_t *)&s->blkcfg,
> > + sizeof(struct virtio_blk_config));
> > + if (ret < 0) {
> > + error_setg(errp, "vhost-user-blk: get block config failed");
> > + goto vhost_err;
> > + }
> > +
> > + if (s->blkcfg.num_queues != s->num_queues) {
> > + s->blkcfg.num_queues = s->num_queues;
> > + }
> > +
> > + vhost_dev_set_config_notifier(&s->dev, &blk_ops);
> > +
> > + return;
> > +
> > +vhost_err:
> > + vhost_dev_cleanup(&s->dev);
> > +virtio_err:
> > + g_free(s->dev.vqs);
> > + virtio_cleanup(vdev);
> > +}
> > +
> > +static void vhost_user_blk_device_unrealize(DeviceState *dev, Error **errp)
> > +{
> > + VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> > + VHostUserBlk *s = VHOST_USER_BLK(dev);
> > +
> > + vhost_user_blk_set_status(vdev, 0);
> > + vhost_dev_cleanup(&s->dev);
> > + g_free(s->dev.vqs);
> > + virtio_cleanup(vdev);
> > +}
> > +
> > +static void vhost_user_blk_instance_init(Object *obj)
> > +{
> > + VHostUserBlk *s = VHOST_USER_BLK(obj);
> > +
> > + device_add_bootindex_property(obj, &s->bootindex, "bootindex",
> > + "/disk@0,0", DEVICE(obj), NULL);
> > +}
> > +
> > +static const VMStateDescription vmstate_vhost_user_blk = {
> > + .name = "vhost-user-blk",
> > + .minimum_version_id = 1,
> > + .version_id = 1,
> > + .fields = (VMStateField[]) {
> > + VMSTATE_VIRTIO_DEVICE,
> > + VMSTATE_END_OF_LIST()
> > + },
> > +};
> > +
> > +static Property vhost_user_blk_properties[] = {
> > + DEFINE_PROP_CHR("chardev", VHostUserBlk, chardev),
> > + DEFINE_PROP_UINT16("num_queues", VHostUserBlk, num_queues, 1),
> > + DEFINE_PROP_UINT32("queue_size", VHostUserBlk, queue_size, 128),
> > + DEFINE_PROP_BIT64("f_size_max", VHostUserBlk, host_features,
> > + VIRTIO_BLK_F_SIZE_MAX, true),
> > + DEFINE_PROP_BIT64("f_sizemax", VHostUserBlk, host_features,
> > + VIRTIO_BLK_F_SIZE_MAX, true),
> > + DEFINE_PROP_BIT64("f_segmax", VHostUserBlk, host_features,
> > + VIRTIO_BLK_F_SEG_MAX, true),
> > + DEFINE_PROP_BIT64("f_geometry", VHostUserBlk, host_features,
> > + VIRTIO_BLK_F_GEOMETRY, true),
> > + DEFINE_PROP_BIT64("f_readonly", VHostUserBlk, host_features,
> > + VIRTIO_BLK_F_RO, false),
> > + DEFINE_PROP_BIT64("f_blocksize", VHostUserBlk, host_features,
> > + VIRTIO_BLK_F_BLK_SIZE, true),
> > + DEFINE_PROP_BIT64("f_topology", VHostUserBlk, host_features,
> > + VIRTIO_BLK_F_TOPOLOGY, true),
> > + DEFINE_PROP_BIT64("f_multiqueue", VHostUserBlk, host_features,
> > + VIRTIO_BLK_F_MQ, true),
> > + DEFINE_PROP_BIT64("f_flush", VHostUserBlk, host_features,
> > + VIRTIO_BLK_F_FLUSH, true),
> > + DEFINE_PROP_BIT64("f_barrier", VHostUserBlk, host_features,
> > + VIRTIO_BLK_F_BARRIER, false),
> > + DEFINE_PROP_BIT64("f_scsi", VHostUserBlk, host_features,
> > + VIRTIO_BLK_F_SCSI, false),
> > + DEFINE_PROP_BIT64("f_wce", VHostUserBlk, host_features,
> > + VIRTIO_BLK_F_WCE, false),
> > + DEFINE_PROP_END_OF_LIST(),
> > +};
> > +
> > +static void vhost_user_blk_class_init(ObjectClass *klass, void *data)
> > +{
> > + DeviceClass *dc = DEVICE_CLASS(klass);
> > + VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
> > +
> > + dc->props = vhost_user_blk_properties;
> > + dc->vmsd = &vmstate_vhost_user_blk;
> > + set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
> > + vdc->realize = vhost_user_blk_device_realize;
> > + vdc->unrealize = vhost_user_blk_device_unrealize;
> > + vdc->get_config = vhost_user_blk_update_config;
> > + vdc->set_config = vhost_user_blk_set_config;
> > + vdc->get_features = vhost_user_blk_get_features;
> > + vdc->set_status = vhost_user_blk_set_status;
> > +}
> > +
> > +static const TypeInfo vhost_user_blk_info = {
> > + .name = TYPE_VHOST_USER_BLK,
> > + .parent = TYPE_VIRTIO_DEVICE,
> > + .instance_size = sizeof(VHostUserBlk),
> > + .instance_init = vhost_user_blk_instance_init,
> > + .class_init = vhost_user_blk_class_init,
> > +};
> > +
> > +static void virtio_register_types(void)
> > +{
> > + type_register_static(&vhost_user_blk_info);
> > +}
> > +
> > +type_init(virtio_register_types)
> > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > index 8b0d6b6..be9a992 100644
> > --- a/hw/virtio/virtio-pci.c
> > +++ b/hw/virtio/virtio-pci.c
> > @@ -2012,6 +2012,58 @@ static const TypeInfo virtio_blk_pci_info = {
> > .class_init = virtio_blk_pci_class_init,
> > };
> >
> > +#ifdef CONFIG_VHOST_USER_BLK
> > +/* vhost-user-blk */
> > +
> > +static Property vhost_user_blk_pci_properties[] = {
> > + DEFINE_PROP_UINT32("class", VirtIOPCIProxy, class_code, 0),
> > + DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
> > + DEFINE_PROP_END_OF_LIST(),
> > +};
> > +
> > +static void vhost_user_blk_pci_realize(VirtIOPCIProxy *vpci_dev, Error
> > **errp)
> > +{
> > + VHostUserBlkPCI *dev = VHOST_USER_BLK_PCI(vpci_dev);
> > + DeviceState *vdev = DEVICE(&dev->vdev);
> > +
> > + qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
> > + object_property_set_bool(OBJECT(vdev), true, "realized", errp);
> > +}
> > +
> > +static void vhost_user_blk_pci_class_init(ObjectClass *klass, void *data)
> > +{
> > + DeviceClass *dc = DEVICE_CLASS(klass);
> > + VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
> > + PCIDeviceClass *pcidev_k = PCI_DEVICE_CLASS(klass);
> > +
> > + set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
> > + dc->props = vhost_user_blk_pci_properties;
> > + k->realize = vhost_user_blk_pci_realize;
> > + pcidev_k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;
> > + pcidev_k->device_id = PCI_DEVICE_ID_VIRTIO_BLOCK;
> > + pcidev_k->revision = VIRTIO_PCI_ABI_VERSION;
> > + pcidev_k->class_id = PCI_CLASS_STORAGE_SCSI;
> > +}
> > +
> > +static void vhost_user_blk_pci_instance_init(Object *obj)
> > +{
> > + VHostUserBlkPCI *dev = VHOST_USER_BLK_PCI(obj);
> > +
> > + virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
> > + TYPE_VHOST_USER_BLK);
> > + object_property_add_alias(obj, "bootindex", OBJECT(&dev->vdev),
> > + "bootindex", &error_abort);
> > +}
> > +
> > +static const TypeInfo vhost_user_blk_pci_info = {
> > + .name = TYPE_VHOST_USER_BLK_PCI,
> > + .parent = TYPE_VIRTIO_PCI,
> > + .instance_size = sizeof(VHostUserBlkPCI),
> > + .instance_init = vhost_user_blk_pci_instance_init,
> > + .class_init = vhost_user_blk_pci_class_init,
> > +};
> > +#endif
> > +
> > /* virtio-scsi-pci */
> >
> > static Property virtio_scsi_pci_properties[] = {
> > @@ -2658,6 +2710,9 @@ static void virtio_pci_register_types(void)
> > type_register_static(&virtio_9p_pci_info);
> > #endif
> > type_register_static(&virtio_blk_pci_info);
> > +#ifdef CONFIG_VHOST_USER_BLK
> > + type_register_static(&vhost_user_blk_pci_info);
> > +#endif
> > type_register_static(&virtio_scsi_pci_info);
> > type_register_static(&virtio_balloon_pci_info);
> > type_register_static(&virtio_serial_pci_info);
> > diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
> > index 69f5959..19a0d01 100644
> > --- a/hw/virtio/virtio-pci.h
> > +++ b/hw/virtio/virtio-pci.h
> > @@ -27,6 +27,9 @@
> > #include "hw/virtio/virtio-gpu.h"
> > #include "hw/virtio/virtio-crypto.h"
> > #include "hw/virtio/vhost-user-scsi.h"
> > +#ifdef CONFIG_VHOST_USER_BLK
> > +#include "hw/virtio/vhost-user-blk.h"
> > +#endif
> >
> > #ifdef CONFIG_VIRTFS
> > #include "hw/9pfs/virtio-9p.h"
> > @@ -46,6 +49,7 @@ typedef struct VirtIOSerialPCI VirtIOSerialPCI;
> > typedef struct VirtIONetPCI VirtIONetPCI;
> > typedef struct VHostSCSIPCI VHostSCSIPCI;
> > typedef struct VHostUserSCSIPCI VHostUserSCSIPCI;
> > +typedef struct VHostUserBlkPCI VHostUserBlkPCI;
> > typedef struct VirtIORngPCI VirtIORngPCI;
> > typedef struct VirtIOInputPCI VirtIOInputPCI;
> > typedef struct VirtIOInputHIDPCI VirtIOInputHIDPCI;
> > @@ -241,6 +245,20 @@ struct VHostUserSCSIPCI {
> > VHostUserSCSI vdev;
> > };
> >
> > +#ifdef CONFIG_VHOST_USER_BLK
> > +/*
> > + * vhost-user-blk-pci: This extends VirtioPCIProxy.
> > + */
> > +#define TYPE_VHOST_USER_BLK_PCI "vhost-user-blk-pci"
> > +#define VHOST_USER_BLK_PCI(obj) \
> > + OBJECT_CHECK(VHostUserBlkPCI, (obj), TYPE_VHOST_USER_BLK_PCI)
> > +
> > +struct VHostUserBlkPCI {
> > + VirtIOPCIProxy parent_obj;
> > + VHostUserBlk vdev;
> > +};
> > +#endif
> > +
> > /*
> > * virtio-blk-pci: This extends VirtioPCIProxy.
> > */
> > diff --git a/include/hw/virtio/vhost-user-blk.h
> > b/include/hw/virtio/vhost-user-blk.h
> > new file mode 100644
> > index 0000000..77d20f0
> > --- /dev/null
> > +++ b/include/hw/virtio/vhost-user-blk.h
> > @@ -0,0 +1,40 @@
> > +/*
> > + * vhost-user-blk host device
> > + * Copyright IBM, Corp. 2011
> > + * Copyright(C) 2017 Intel Corporation.
> > + *
> > + * Authors:
> > + * Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
> > + * Changpeng Liu <changpeng.liu@intel.com>
> > + *
> > + * This work is licensed under the terms of the GNU LGPL, version 2 or
> > later.
> > + * See the COPYING.LIB file in the top-level directory.
> > + *
> > + */
> > +
> > +#ifndef VHOST_USER_BLK_H
> > +#define VHOST_USER_BLK_H
> > +
> > +#include "standard-headers/linux/virtio_blk.h"
> > +#include "qemu-common.h"
> > +#include "hw/qdev.h"
> > +#include "hw/block/block.h"
> > +#include "chardev/char-fe.h"
> > +#include "hw/virtio/vhost.h"
> > +
> > +#define TYPE_VHOST_USER_BLK "vhost-user-blk"
> > +#define VHOST_USER_BLK(obj) \
> > + OBJECT_CHECK(VHostUserBlk, (obj), TYPE_VHOST_USER_BLK)
> > +
> > +typedef struct VHostUserBlk {
> > + VirtIODevice parent_obj;
> > + CharBackend chardev;
> > + int32_t bootindex;
> > + uint64_t host_features;
> > + struct virtio_blk_config blkcfg;
> > + uint16_t num_queues;
> > + uint32_t queue_size;
> > + struct vhost_dev dev;
> > +} VHostUserBlk;
> > +
> > +#endif
> > --
> > 1.9.3
> >
> >
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH v2 2/4] vhost-user-blk: introduce a new vhost-user-blk host device
2017-08-10 10:12 ` [Qemu-devel] [PATCH v2 2/4] vhost-user-blk: introduce a new vhost-user-blk host device Changpeng Liu
2017-08-09 15:39 ` Marc-André Lureau
@ 2017-08-09 17:10 ` Michael S. Tsirkin
2017-08-10 9:29 ` Paolo Bonzini
1 sibling, 1 reply; 13+ messages in thread
From: Michael S. Tsirkin @ 2017-08-09 17:10 UTC (permalink / raw)
To: Changpeng Liu
Cc: qemu-devel, stefanha, pbonzini, marcandre.lureau, felipe,
james.r.harris
I only had time for a quick look. More review when
you repost after release.
On Thu, Aug 10, 2017 at 06:12:29PM +0800, Changpeng Liu wrote:
> This commit introduces a new vhost-user device for block, it uses a
> chardev to connect with the backend, same with Qemu virito-blk device,
> Guest OS still uses the virtio-blk frontend driver.
>
> To use it, start Qemu with command line like this:
>
> qemu-system-x86_64 \
> -chardev socket,id=char0,path=/path/vhost.socket \
> -device vhost-user-blk-pci,chardev=char0,num_queues=...
>
> Different with exist Qemu virtio-blk host device, it makes more easy
> for users to implement their own I/O processing logic, such as all
> user space I/O stack against hardware block device. It uses the new
> vhost messages(VHOST_USER_GET_CONFIG) to get block virtio config
> information from backend process.
I took a quick look. I think I would prefer a more direct approach
where qemu is more of a driver. So user specifies properties and
they get sent to backend at init time. Only handle geometry changes
specially.
>
> Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
> ---
> configure | 11 ++
> hw/block/Makefile.objs | 3 +
> hw/block/vhost-user-blk.c | 360 +++++++++++++++++++++++++++++++++++++
> hw/virtio/virtio-pci.c | 55 ++++++
> hw/virtio/virtio-pci.h | 18 ++
> include/hw/virtio/vhost-user-blk.h | 40 +++++
> 6 files changed, 487 insertions(+)
> create mode 100644 hw/block/vhost-user-blk.c
> create mode 100644 include/hw/virtio/vhost-user-blk.h
>
> diff --git a/configure b/configure
> index dd73cce..1452c66 100755
> --- a/configure
> +++ b/configure
> @@ -305,6 +305,7 @@ tcg="yes"
>
> vhost_net="no"
> vhost_scsi="no"
> +vhost_user_blk="no"
> vhost_vsock="no"
> vhost_user=""
> kvm="no"
> @@ -779,6 +780,7 @@ Linux)
> kvm="yes"
> vhost_net="yes"
> vhost_scsi="yes"
> + vhost_user_blk="yes"
> vhost_vsock="yes"
> QEMU_INCLUDES="-I\$(SRC_PATH)/linux-headers -I$(pwd)/linux-headers $QEMU_INCLUDES"
> supported_os="yes"
> @@ -1136,6 +1138,10 @@ for opt do
> ;;
> --enable-vhost-scsi) vhost_scsi="yes"
> ;;
> + --disable-vhost-user-blk) vhost_user_blk="no"
> + ;;
> + --enable-vhost-user-blk) vhost_user_blk="yes"
> + ;;
> --disable-vhost-vsock) vhost_vsock="no"
> ;;
> --enable-vhost-vsock) vhost_vsock="yes"
> @@ -1506,6 +1512,7 @@ disabled with --disable-FEATURE, default is enabled if available:
> cap-ng libcap-ng support
> attr attr and xattr support
> vhost-net vhost-net acceleration support
> + vhost-user-blk VM virtio-blk acceleration in user space
> spice spice
> rbd rados block device (rbd)
> libiscsi iscsi support
> @@ -5365,6 +5372,7 @@ echo "posix_madvise $posix_madvise"
> echo "libcap-ng support $cap_ng"
> echo "vhost-net support $vhost_net"
> echo "vhost-scsi support $vhost_scsi"
> +echo "vhost-user-blk support $vhost_user_blk"
> echo "vhost-vsock support $vhost_vsock"
> echo "vhost-user support $vhost_user"
> echo "Trace backends $trace_backends"
> @@ -5776,6 +5784,9 @@ fi
> if test "$vhost_scsi" = "yes" ; then
> echo "CONFIG_VHOST_SCSI=y" >> $config_host_mak
> fi
> +if test "$vhost_user_blk" = "yes" ; then
> + echo "CONFIG_VHOST_USER_BLK=y" >> $config_host_mak
> +fi
> if test "$vhost_net" = "yes" -a "$vhost_user" = "yes"; then
> echo "CONFIG_VHOST_NET_USED=y" >> $config_host_mak
> fi
> diff --git a/hw/block/Makefile.objs b/hw/block/Makefile.objs
> index e0ed980..4c19a58 100644
> --- a/hw/block/Makefile.objs
> +++ b/hw/block/Makefile.objs
> @@ -13,3 +13,6 @@ obj-$(CONFIG_SH4) += tc58128.o
>
> obj-$(CONFIG_VIRTIO) += virtio-blk.o
> obj-$(CONFIG_VIRTIO) += dataplane/
> +ifeq ($(CONFIG_VIRTIO),y)
> +obj-$(CONFIG_VHOST_USER_BLK) += vhost-user-blk.o
> +endif
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> new file mode 100644
> index 0000000..8aa9fa9
> --- /dev/null
> +++ b/hw/block/vhost-user-blk.c
> @@ -0,0 +1,360 @@
> +/*
> + * vhost-user-blk host device
> + *
> + * Copyright IBM, Corp. 2011
> + * Copyright(C) 2017 Intel Corporation.
> + *
> + * Authors:
> + * Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
> + * Changpeng Liu <changpeng.liu@intel.com>
> + *
> + * This work is licensed under the terms of the GNU LGPL, version 2 or later.
> + * See the COPYING.LIB file in the top-level directory.
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "qemu/error-report.h"
> +#include "qemu/typedefs.h"
> +#include "qemu/cutils.h"
> +#include "qom/object.h"
> +#include "hw/qdev-core.h"
> +#include "hw/virtio/vhost.h"
> +#include "hw/virtio/vhost-user-blk.h"
> +#include "hw/virtio/virtio.h"
> +#include "hw/virtio/virtio-bus.h"
> +#include "hw/virtio/virtio-access.h"
> +
> +static const int user_feature_bits[] = {
> + VIRTIO_BLK_F_SIZE_MAX,
> + VIRTIO_BLK_F_SEG_MAX,
> + VIRTIO_BLK_F_GEOMETRY,
> + VIRTIO_BLK_F_BLK_SIZE,
> + VIRTIO_BLK_F_TOPOLOGY,
> + VIRTIO_BLK_F_SCSI,
I don't think we want to support this.
> + VIRTIO_BLK_F_MQ,
> + VIRTIO_BLK_F_RO,
> + VIRTIO_BLK_F_FLUSH,
> + VIRTIO_BLK_F_BARRIER,
> + VIRTIO_BLK_F_WCE,
> + VIRTIO_F_VERSION_1,
How about forcing all remotes to implement this instead?
> + VIRTIO_RING_F_INDIRECT_DESC,
> + VIRTIO_RING_F_EVENT_IDX,
> + VIRTIO_F_NOTIFY_ON_EMPTY,
> + VHOST_INVALID_FEATURE_BIT
No reason to let remote play with that.
> +};
I think a more reasonable set of features is what Linux uses:
VIRTIO_BLK_F_SEG_MAX, VIRTIO_BLK_F_SIZE_MAX, VIRTIO_BLK_F_GEOMETRY,
VIRTIO_BLK_F_RO, VIRTIO_BLK_F_BLK_SIZE,
VIRTIO_BLK_F_FLUSH, VIRTIO_BLK_F_TOPOLOGY, VIRTIO_BLK_F_CONFIG_WCE,
VIRTIO_BLK_F_MQ,
and maybe
> VIRTIO_RING_F_INDIRECT_DESC,
> VIRTIO_RING_F_EVENT_IDX,
others should be forced by qemu.
> +
> +static void vhost_user_blk_update_config(VirtIODevice *vdev, uint8_t *config)
> +{
> + VHostUserBlk *s = VHOST_USER_BLK(vdev);
> +
> + memcpy(config, &s->blkcfg, sizeof(struct virtio_blk_config));
> +}
> +
> +static void vhost_user_blk_set_config(VirtIODevice *vdev, const uint8_t *config)
> +{
> + VHostUserBlk *s = VHOST_USER_BLK(vdev);
> + struct virtio_blk_config *blkcfg = (struct virtio_blk_config *)config;
> + int ret;
> +
> + if (blkcfg->wce == s->blkcfg.wce) {
> + return;
> + }
> +
> + ret = vhost_dev_set_config(&s->dev, config,
> + sizeof(struct virtio_blk_config));
> + if (ret) {
> + error_report("set device config space failed");
> + return;
> + }
> +
> + s->blkcfg.wce = blkcfg->wce;
> +}
> +
> +static void vhost_user_blk_handle_config_change(struct vhost_dev *dev)
> +{
> + int ret;
> + struct virtio_blk_config blkcfg;
> + VHostUserBlk *s = VHOST_USER_BLK(dev->vdev);
> +
> + ret = vhost_dev_get_config(dev, (uint8_t *)&blkcfg,
> + sizeof(struct virtio_blk_config));
> + if (ret < 0) {
> + error_report("get config space failed");
> + return;
> + }
> +
> + memcpy(&s->blkcfg, &blkcfg, sizeof(struct virtio_blk_config));
> + memcpy(dev->vdev->config, &blkcfg, sizeof(struct virtio_blk_config));
Will break if virtio_blk_config becomes larger than 256
bytes. Better add a build time assertion.
> +
> + virtio_notify_config(dev->vdev);
> +}
> +
> +const VhostDevConfigOps blk_ops = {
> + .vhost_dev_config_notifier = vhost_user_blk_handle_config_change,
> +};
> +
> +static void vhost_user_blk_start(VirtIODevice *vdev)
> +{
> + VHostUserBlk *s = VHOST_USER_BLK(vdev);
> + BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
> + VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
> + int i, ret;
> +
> + if (!k->set_guest_notifiers) {
> + error_report("binding does not support guest notifiers");
> + return;
> + }
> +
> + ret = vhost_dev_enable_notifiers(&s->dev, vdev);
> + if (ret < 0) {
> + error_report("Error enabling host notifiers: %d", -ret);
> + return;
> + }
> +
> + ret = k->set_guest_notifiers(qbus->parent, s->dev.nvqs, true);
> + if (ret < 0) {
> + error_report("Error binding guest notifier: %d", -ret);
> + goto err_host_notifiers;
> + }
> +
> + s->dev.acked_features = vdev->guest_features;
> + ret = vhost_dev_start(&s->dev, vdev);
> + if (ret < 0) {
> + error_report("Error starting vhost: %d", -ret);
> + goto err_guest_notifiers;
> + }
> +
> + /* guest_notifier_mask/pending not used yet, so just unmask
> + * everything here. virtio-pci will do the right thing by
> + * enabling/disabling irqfd.
> + */
> + for (i = 0; i < s->dev.nvqs; i++) {
> + vhost_virtqueue_mask(&s->dev, vdev, i, false);
> + }
> +
> + return;
> +
> +err_guest_notifiers:
> + k->set_guest_notifiers(qbus->parent, s->dev.nvqs, false);
> +err_host_notifiers:
> + vhost_dev_disable_notifiers(&s->dev, vdev);
> +}
> +
> +static void vhost_user_blk_stop(VirtIODevice *vdev)
> +{
> + VHostUserBlk *s = VHOST_USER_BLK(vdev);
> + BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
> + VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
> + int ret;
> +
> + if (!k->set_guest_notifiers) {
> + return;
> + }
> +
> + vhost_dev_stop(&s->dev, vdev);
> +
> + ret = k->set_guest_notifiers(qbus->parent, s->dev.nvqs, false);
> + if (ret < 0) {
> + error_report("vhost guest notifier cleanup failed: %d", ret);
> + return;
> + }
> +
> + vhost_dev_disable_notifiers(&s->dev, vdev);
> +}
> +
> +static void vhost_user_blk_set_status(VirtIODevice *vdev, uint8_t status)
> +{
> + VHostUserBlk *s = VHOST_USER_BLK(vdev);
> + bool should_start = status & VIRTIO_CONFIG_S_DRIVER_OK;
> +
> + if (!vdev->vm_running) {
> + should_start = false;
> + }
> +
> + if (s->dev.started == should_start) {
> + return;
> + }
> +
> + if (should_start) {
> + vhost_user_blk_start(vdev);
> + } else {
> + vhost_user_blk_stop(vdev);
> + }
> +
> +}
> +
> +static uint64_t vhost_user_blk_get_features(VirtIODevice *vdev,
> + uint64_t features,
> + Error **errp)
> +{
> + VHostUserBlk *s = VHOST_USER_BLK(vdev);
> + uint64_t get_features;
> +
> + /* Turn on pre-defined features */
> + features |= s->host_features;
> +
> + get_features = vhost_get_features(&s->dev, user_feature_bits, features);
> +
> + return get_features;
> +}
> +
> +static void vhost_user_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
> +{
ever called? assert here?
> +
> +}
> +
> +static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
> +{
> + VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> + VHostUserBlk *s = VHOST_USER_BLK(vdev);
> + int i, ret;
> +
> + if (!s->chardev.chr) {
> + error_setg(errp, "vhost-user-blk: chardev is mandatory");
> + return;
> + }
> +
> + if (!s->num_queues || s->num_queues > VIRTIO_QUEUE_MAX) {
> + error_setg(errp, "vhost-user-blk: invalid number of IO queues");
> + return;
> + }
> +
> + if (!s->queue_size) {
> + error_setg(errp, "vhost-user-blk: queue size must be non-zero");
> + return;
> + }
> +
> + virtio_init(vdev, "virtio-blk", VIRTIO_ID_BLOCK,
> + sizeof(struct virtio_blk_config));
> +
> + for (i = 0; i < s->num_queues; i++) {
> + virtio_add_queue(vdev, s->queue_size,
> + vhost_user_blk_handle_output);
> + }
> +
> + s->dev.nvqs = s->num_queues;
> + s->dev.vqs = g_new(struct vhost_virtqueue, s->dev.nvqs);
> + s->dev.vq_index = 0;
> + s->dev.backend_features = 0;
> +
> + ret = vhost_dev_init(&s->dev, &s->chardev, VHOST_BACKEND_TYPE_USER, 0);
> + if (ret < 0) {
> + error_setg(errp, "vhost-user-blk: vhost initialization failed: %s",
> + strerror(-ret));
> + goto virtio_err;
> + }
> +
> + ret = vhost_dev_get_config(&s->dev, (uint8_t *)&s->blkcfg,
> + sizeof(struct virtio_blk_config));
> + if (ret < 0) {
> + error_setg(errp, "vhost-user-blk: get block config failed");
> + goto vhost_err;
> + }
> +
> + if (s->blkcfg.num_queues != s->num_queues) {
> + s->blkcfg.num_queues = s->num_queues;
> + }
> +
> + vhost_dev_set_config_notifier(&s->dev, &blk_ops);
> +
> + return;
> +
> +vhost_err:
> + vhost_dev_cleanup(&s->dev);
> +virtio_err:
> + g_free(s->dev.vqs);
> + virtio_cleanup(vdev);
> +}
> +
> +static void vhost_user_blk_device_unrealize(DeviceState *dev, Error **errp)
> +{
> + VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> + VHostUserBlk *s = VHOST_USER_BLK(dev);
> +
> + vhost_user_blk_set_status(vdev, 0);
> + vhost_dev_cleanup(&s->dev);
> + g_free(s->dev.vqs);
> + virtio_cleanup(vdev);
> +}
> +
> +static void vhost_user_blk_instance_init(Object *obj)
> +{
> + VHostUserBlk *s = VHOST_USER_BLK(obj);
> +
> + device_add_bootindex_property(obj, &s->bootindex, "bootindex",
> + "/disk@0,0", DEVICE(obj), NULL);
> +}
> +
> +static const VMStateDescription vmstate_vhost_user_blk = {
> + .name = "vhost-user-blk",
> + .minimum_version_id = 1,
> + .version_id = 1,
> + .fields = (VMStateField[]) {
> + VMSTATE_VIRTIO_DEVICE,
> + VMSTATE_END_OF_LIST()
> + },
> +};
> +
> +static Property vhost_user_blk_properties[] = {
> + DEFINE_PROP_CHR("chardev", VHostUserBlk, chardev),
> + DEFINE_PROP_UINT16("num_queues", VHostUserBlk, num_queues, 1),
> + DEFINE_PROP_UINT32("queue_size", VHostUserBlk, queue_size, 128),
> + DEFINE_PROP_BIT64("f_size_max", VHostUserBlk, host_features,
> + VIRTIO_BLK_F_SIZE_MAX, true),
> + DEFINE_PROP_BIT64("f_sizemax", VHostUserBlk, host_features,
> + VIRTIO_BLK_F_SIZE_MAX, true),
> + DEFINE_PROP_BIT64("f_segmax", VHostUserBlk, host_features,
> + VIRTIO_BLK_F_SEG_MAX, true),
> + DEFINE_PROP_BIT64("f_geometry", VHostUserBlk, host_features,
> + VIRTIO_BLK_F_GEOMETRY, true),
> + DEFINE_PROP_BIT64("f_readonly", VHostUserBlk, host_features,
> + VIRTIO_BLK_F_RO, false),
> + DEFINE_PROP_BIT64("f_blocksize", VHostUserBlk, host_features,
> + VIRTIO_BLK_F_BLK_SIZE, true),
> + DEFINE_PROP_BIT64("f_topology", VHostUserBlk, host_features,
> + VIRTIO_BLK_F_TOPOLOGY, true),
> + DEFINE_PROP_BIT64("f_multiqueue", VHostUserBlk, host_features,
> + VIRTIO_BLK_F_MQ, true),
> + DEFINE_PROP_BIT64("f_flush", VHostUserBlk, host_features,
> + VIRTIO_BLK_F_FLUSH, true),
> + DEFINE_PROP_BIT64("f_barrier", VHostUserBlk, host_features,
> + VIRTIO_BLK_F_BARRIER, false),
> + DEFINE_PROP_BIT64("f_scsi", VHostUserBlk, host_features,
> + VIRTIO_BLK_F_SCSI, false),
> + DEFINE_PROP_BIT64("f_wce", VHostUserBlk, host_features,
> + VIRTIO_BLK_F_WCE, false),
> + DEFINE_PROP_END_OF_LIST(),
> +};
> +
So if you let users specify these, why do you need to query
them from the backend with GET_CONFIG?
> +static void vhost_user_blk_class_init(ObjectClass *klass, void *data)
> +{
> + DeviceClass *dc = DEVICE_CLASS(klass);
> + VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
> +
> + dc->props = vhost_user_blk_properties;
> + dc->vmsd = &vmstate_vhost_user_blk;
> + set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
> + vdc->realize = vhost_user_blk_device_realize;
> + vdc->unrealize = vhost_user_blk_device_unrealize;
> + vdc->get_config = vhost_user_blk_update_config;
> + vdc->set_config = vhost_user_blk_set_config;
> + vdc->get_features = vhost_user_blk_get_features;
> + vdc->set_status = vhost_user_blk_set_status;
> +}
> +
Looks like this will pass config accesses directly to backend.
I am not sure it's a good approach.
> +static const TypeInfo vhost_user_blk_info = {
> + .name = TYPE_VHOST_USER_BLK,
> + .parent = TYPE_VIRTIO_DEVICE,
> + .instance_size = sizeof(VHostUserBlk),
> + .instance_init = vhost_user_blk_instance_init,
> + .class_init = vhost_user_blk_class_init,
> +};
> +
> +static void virtio_register_types(void)
> +{
> + type_register_static(&vhost_user_blk_info);
> +}
> +
> +type_init(virtio_register_types)
> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> index 8b0d6b6..be9a992 100644
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -2012,6 +2012,58 @@ static const TypeInfo virtio_blk_pci_info = {
> .class_init = virtio_blk_pci_class_init,
> };
>
> +#ifdef CONFIG_VHOST_USER_BLK
> +/* vhost-user-blk */
> +
> +static Property vhost_user_blk_pci_properties[] = {
> + DEFINE_PROP_UINT32("class", VirtIOPCIProxy, class_code, 0),
> + DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
> + DEFINE_PROP_END_OF_LIST(),
> +};
> +
> +static void vhost_user_blk_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
> +{
> + VHostUserBlkPCI *dev = VHOST_USER_BLK_PCI(vpci_dev);
> + DeviceState *vdev = DEVICE(&dev->vdev);
> +
> + qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
> + object_property_set_bool(OBJECT(vdev), true, "realized", errp);
> +}
> +
> +static void vhost_user_blk_pci_class_init(ObjectClass *klass, void *data)
> +{
> + DeviceClass *dc = DEVICE_CLASS(klass);
> + VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
> + PCIDeviceClass *pcidev_k = PCI_DEVICE_CLASS(klass);
> +
> + set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
> + dc->props = vhost_user_blk_pci_properties;
> + k->realize = vhost_user_blk_pci_realize;
> + pcidev_k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;
> + pcidev_k->device_id = PCI_DEVICE_ID_VIRTIO_BLOCK;
> + pcidev_k->revision = VIRTIO_PCI_ABI_VERSION;
> + pcidev_k->class_id = PCI_CLASS_STORAGE_SCSI;
> +}
> +
> +static void vhost_user_blk_pci_instance_init(Object *obj)
> +{
> + VHostUserBlkPCI *dev = VHOST_USER_BLK_PCI(obj);
> +
> + virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
> + TYPE_VHOST_USER_BLK);
> + object_property_add_alias(obj, "bootindex", OBJECT(&dev->vdev),
> + "bootindex", &error_abort);
> +}
> +
> +static const TypeInfo vhost_user_blk_pci_info = {
> + .name = TYPE_VHOST_USER_BLK_PCI,
> + .parent = TYPE_VIRTIO_PCI,
> + .instance_size = sizeof(VHostUserBlkPCI),
> + .instance_init = vhost_user_blk_pci_instance_init,
> + .class_init = vhost_user_blk_pci_class_init,
> +};
> +#endif
> +
> /* virtio-scsi-pci */
>
> static Property virtio_scsi_pci_properties[] = {
> @@ -2658,6 +2710,9 @@ static void virtio_pci_register_types(void)
> type_register_static(&virtio_9p_pci_info);
> #endif
> type_register_static(&virtio_blk_pci_info);
> +#ifdef CONFIG_VHOST_USER_BLK
> + type_register_static(&vhost_user_blk_pci_info);
> +#endif
> type_register_static(&virtio_scsi_pci_info);
> type_register_static(&virtio_balloon_pci_info);
> type_register_static(&virtio_serial_pci_info);
> diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
> index 69f5959..19a0d01 100644
> --- a/hw/virtio/virtio-pci.h
> +++ b/hw/virtio/virtio-pci.h
> @@ -27,6 +27,9 @@
> #include "hw/virtio/virtio-gpu.h"
> #include "hw/virtio/virtio-crypto.h"
> #include "hw/virtio/vhost-user-scsi.h"
> +#ifdef CONFIG_VHOST_USER_BLK
> +#include "hw/virtio/vhost-user-blk.h"
> +#endif
>
> #ifdef CONFIG_VIRTFS
> #include "hw/9pfs/virtio-9p.h"
> @@ -46,6 +49,7 @@ typedef struct VirtIOSerialPCI VirtIOSerialPCI;
> typedef struct VirtIONetPCI VirtIONetPCI;
> typedef struct VHostSCSIPCI VHostSCSIPCI;
> typedef struct VHostUserSCSIPCI VHostUserSCSIPCI;
> +typedef struct VHostUserBlkPCI VHostUserBlkPCI;
> typedef struct VirtIORngPCI VirtIORngPCI;
> typedef struct VirtIOInputPCI VirtIOInputPCI;
> typedef struct VirtIOInputHIDPCI VirtIOInputHIDPCI;
> @@ -241,6 +245,20 @@ struct VHostUserSCSIPCI {
> VHostUserSCSI vdev;
> };
>
> +#ifdef CONFIG_VHOST_USER_BLK
> +/*
> + * vhost-user-blk-pci: This extends VirtioPCIProxy.
> + */
> +#define TYPE_VHOST_USER_BLK_PCI "vhost-user-blk-pci"
> +#define VHOST_USER_BLK_PCI(obj) \
> + OBJECT_CHECK(VHostUserBlkPCI, (obj), TYPE_VHOST_USER_BLK_PCI)
> +
> +struct VHostUserBlkPCI {
> + VirtIOPCIProxy parent_obj;
> + VHostUserBlk vdev;
> +};
> +#endif
> +
> /*
> * virtio-blk-pci: This extends VirtioPCIProxy.
> */
> diff --git a/include/hw/virtio/vhost-user-blk.h b/include/hw/virtio/vhost-user-blk.h
> new file mode 100644
> index 0000000..77d20f0
> --- /dev/null
> +++ b/include/hw/virtio/vhost-user-blk.h
> @@ -0,0 +1,40 @@
> +/*
> + * vhost-user-blk host device
> + * Copyright IBM, Corp. 2011
> + * Copyright(C) 2017 Intel Corporation.
> + *
> + * Authors:
> + * Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
> + * Changpeng Liu <changpeng.liu@intel.com>
> + *
> + * This work is licensed under the terms of the GNU LGPL, version 2 or later.
> + * See the COPYING.LIB file in the top-level directory.
> + *
> + */
> +
> +#ifndef VHOST_USER_BLK_H
> +#define VHOST_USER_BLK_H
> +
> +#include "standard-headers/linux/virtio_blk.h"
> +#include "qemu-common.h"
> +#include "hw/qdev.h"
> +#include "hw/block/block.h"
> +#include "chardev/char-fe.h"
> +#include "hw/virtio/vhost.h"
> +
> +#define TYPE_VHOST_USER_BLK "vhost-user-blk"
> +#define VHOST_USER_BLK(obj) \
> + OBJECT_CHECK(VHostUserBlk, (obj), TYPE_VHOST_USER_BLK)
> +
> +typedef struct VHostUserBlk {
> + VirtIODevice parent_obj;
> + CharBackend chardev;
> + int32_t bootindex;
> + uint64_t host_features;
> + struct virtio_blk_config blkcfg;
> + uint16_t num_queues;
> + uint32_t queue_size;
> + struct vhost_dev dev;
> +} VHostUserBlk;
> +
> +#endif
> --
> 1.9.3
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH v2 2/4] vhost-user-blk: introduce a new vhost-user-blk host device
2017-08-09 17:10 ` Michael S. Tsirkin
@ 2017-08-10 9:29 ` Paolo Bonzini
0 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2017-08-10 9:29 UTC (permalink / raw)
To: Michael S. Tsirkin, Changpeng Liu
Cc: qemu-devel, stefanha, marcandre.lureau, felipe, james.r.harris
On 09/08/2017 19:10, Michael S. Tsirkin wrote:
> So user specifies properties and
> they get sent to backend at init time. Only handle geometry changes
> specially.
So QEMU would get the configuration, set these properties, and send the
result to the backend via SET_CONFIG?
vhost-user-blk-pci.cyls=uint32
vhost-user-blk-pci.secs=uint32
vhost-user-blk-pci.heads=uint32
vhost-user-blk-pci.serial=str
vhost-user-blk-pci.min_io_size=uint16
vhost-user-blk-pci.opt_io_size=uint32
vhost-user-blk-pci.logical_block_size=uint16
vhost-user-blk-pci.physical_block_size=uint16
If the properties are incompatible (e.g. too small logical block size)
SET_CONFIG fails and QEMU would fail to realize the device. This makes
sense, I think.
Thanks,
Paolo
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Qemu-devel] [PATCH v2 3/4] contrib/libvhost-user: enable virtio config space messages
2017-08-10 10:12 [Qemu-devel] [PATCH v2 0/4] *** Introduce a new vhost-user-blk host device to Qemu *** Changpeng Liu
` (2 preceding siblings ...)
2017-08-10 10:12 ` [Qemu-devel] [PATCH v2 2/4] vhost-user-blk: introduce a new vhost-user-blk host device Changpeng Liu
@ 2017-08-10 10:12 ` Changpeng Liu
2017-08-09 18:34 ` Marc-André Lureau
2017-08-10 10:12 ` [Qemu-devel] [PATCH v2 4/4] contrib/vhost-user-blk: introduce a vhost-user-blk sample application Changpeng Liu
4 siblings, 1 reply; 13+ messages in thread
From: Changpeng Liu @ 2017-08-10 10:12 UTC (permalink / raw)
To: changpeng.liu, qemu-devel
Cc: stefanha, pbonzini, mst, marcandre.lureau, felipe, james.r.harris
Enable VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG/VHOST_USER_SET_CONFIG_FD
messages in libvhost-user library, users can implement their own I/O target
based on the library. This enable the virtio config space delivered between
Qemu host device and the I/O target, also event notifier is added in case
of virtio config space changed.
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
---
contrib/libvhost-user/libvhost-user.c | 51 +++++++++++++++++++++++++++++++++++
contrib/libvhost-user/libvhost-user.h | 14 ++++++++++
2 files changed, 65 insertions(+)
diff --git a/contrib/libvhost-user/libvhost-user.c b/contrib/libvhost-user/libvhost-user.c
index 9efb9da..002cf15 100644
--- a/contrib/libvhost-user/libvhost-user.c
+++ b/contrib/libvhost-user/libvhost-user.c
@@ -63,6 +63,9 @@ vu_request_to_string(int req)
REQ(VHOST_USER_SET_VRING_ENABLE),
REQ(VHOST_USER_SEND_RARP),
REQ(VHOST_USER_INPUT_GET_CONFIG),
+ REQ(VHOST_USER_GET_CONFIG),
+ REQ(VHOST_USER_SET_CONFIG),
+ REQ(VHOST_USER_SET_CONFIG_FD),
REQ(VHOST_USER_MAX),
};
#undef REQ
@@ -744,6 +747,43 @@ vu_set_vring_enable_exec(VuDev *dev, VhostUserMsg *vmsg)
}
static bool
+vu_get_config(VuDev *dev, VhostUserMsg *vmsg)
+{
+ if (dev->iface->get_config) {
+ dev->iface->get_config(dev, vmsg->payload.config, vmsg->size);
+ }
+
+ return true;
+}
+
+static bool
+vu_set_config(VuDev *dev, VhostUserMsg *vmsg)
+{
+ if (dev->iface->set_config) {
+ dev->iface->set_config(dev, vmsg->payload.config, vmsg->size);
+ }
+
+ return false;
+}
+
+static bool
+vu_set_config_fd(VuDev *dev, VhostUserMsg *vmsg)
+{
+ if (vmsg->fd_num != 1) {
+ vu_panic(dev, "Invalid config_fd message");
+ return false;
+ }
+
+ if (dev->config_fd != -1) {
+ close(dev->config_fd);
+ }
+ dev->config_fd = vmsg->fds[0];
+ DPRINT("Got config_fd: %d\n", vmsg->fds[0]);
+
+ return false;
+}
+
+static bool
vu_process_message(VuDev *dev, VhostUserMsg *vmsg)
{
int do_reply = 0;
@@ -806,6 +846,12 @@ vu_process_message(VuDev *dev, VhostUserMsg *vmsg)
return vu_get_queue_num_exec(dev, vmsg);
case VHOST_USER_SET_VRING_ENABLE:
return vu_set_vring_enable_exec(dev, vmsg);
+ case VHOST_USER_GET_CONFIG:
+ return vu_get_config(dev, vmsg);
+ case VHOST_USER_SET_CONFIG:
+ return vu_set_config(dev, vmsg);
+ case VHOST_USER_SET_CONFIG_FD:
+ return vu_set_config_fd(dev, vmsg);
default:
vmsg_close_fds(vmsg);
vu_panic(dev, "Unhandled request: %d", vmsg->request);
@@ -878,6 +924,10 @@ vu_deinit(VuDev *dev)
vu_close_log(dev);
+ if (dev->config_fd != -1) {
+ close(dev->config_fd);
+ }
+
if (dev->sock != -1) {
close(dev->sock);
}
@@ -907,6 +957,7 @@ vu_init(VuDev *dev,
dev->remove_watch = remove_watch;
dev->iface = iface;
dev->log_call_fd = -1;
+ dev->config_fd = -1;
for (i = 0; i < VHOST_MAX_NR_VIRTQUEUE; i++) {
dev->vq[i] = (VuVirtq) {
.call_fd = -1, .kick_fd = -1, .err_fd = -1,
diff --git a/contrib/libvhost-user/libvhost-user.h b/contrib/libvhost-user/libvhost-user.h
index 53ef222..899dee1 100644
--- a/contrib/libvhost-user/libvhost-user.h
+++ b/contrib/libvhost-user/libvhost-user.h
@@ -30,6 +30,8 @@
#define VHOST_MEMORY_MAX_NREGIONS 8
+#define VHOST_USER_MAX_CONFIG_SIZE 256
+
enum VhostUserProtocolFeature {
VHOST_USER_PROTOCOL_F_MQ = 0,
VHOST_USER_PROTOCOL_F_LOG_SHMFD = 1,
@@ -62,6 +64,9 @@ typedef enum VhostUserRequest {
VHOST_USER_SET_VRING_ENABLE = 18,
VHOST_USER_SEND_RARP = 19,
VHOST_USER_INPUT_GET_CONFIG = 20,
+ VHOST_USER_GET_CONFIG = 24,
+ VHOST_USER_SET_CONFIG = 25,
+ VHOST_USER_SET_CONFIG_FD = 26,
VHOST_USER_MAX
} VhostUserRequest;
@@ -105,6 +110,7 @@ typedef struct VhostUserMsg {
struct vhost_vring_addr addr;
VhostUserMemory memory;
VhostUserLog log;
+ uint8_t config[VHOST_USER_MAX_CONFIG_SIZE];
} payload;
int fds[VHOST_MEMORY_MAX_NREGIONS];
@@ -132,6 +138,9 @@ typedef void (*vu_set_features_cb) (VuDev *dev, uint64_t features);
typedef int (*vu_process_msg_cb) (VuDev *dev, VhostUserMsg *vmsg,
int *do_reply);
typedef void (*vu_queue_set_started_cb) (VuDev *dev, int qidx, bool started);
+typedef int (*vu_get_config_cb) (VuDev *dev, uint8_t *config, size_t len);
+typedef int (*vu_set_config_cb) (VuDev *dev, const uint8_t *config,
+ size_t len);
typedef struct VuDevIface {
/* called by VHOST_USER_GET_FEATURES to get the features bitmask */
@@ -148,6 +157,10 @@ typedef struct VuDevIface {
vu_process_msg_cb process_msg;
/* tells when queues can be processed */
vu_queue_set_started_cb queue_set_started;
+ /* get the config space of the device */
+ vu_get_config_cb get_config;
+ /* set the config space of the device */
+ vu_set_config_cb set_config;
} VuDevIface;
typedef void (*vu_queue_handler_cb) (VuDev *dev, int qidx);
@@ -212,6 +225,7 @@ struct VuDev {
VuDevRegion regions[VHOST_MEMORY_MAX_NREGIONS];
VuVirtq vq[VHOST_MAX_NR_VIRTQUEUE];
int log_call_fd;
+ int config_fd;
uint64_t log_size;
uint8_t *log_table;
uint64_t features;
--
1.9.3
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH v2 3/4] contrib/libvhost-user: enable virtio config space messages
2017-08-10 10:12 ` [Qemu-devel] [PATCH v2 3/4] contrib/libvhost-user: enable virtio config space messages Changpeng Liu
@ 2017-08-09 18:34 ` Marc-André Lureau
0 siblings, 0 replies; 13+ messages in thread
From: Marc-André Lureau @ 2017-08-09 18:34 UTC (permalink / raw)
To: Changpeng Liu; +Cc: qemu-devel, stefanha, pbonzini, mst, felipe, james r harris
Hi
----- Original Message -----
> Enable VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG/VHOST_USER_SET_CONFIG_FD
> messages in libvhost-user library, users can implement their own I/O target
> based on the library. This enable the virtio config space delivered between
> Qemu host device and the I/O target, also event notifier is added in case
> of virtio config space changed.
>
> Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
> ---
> contrib/libvhost-user/libvhost-user.c | 51
> +++++++++++++++++++++++++++++++++++
> contrib/libvhost-user/libvhost-user.h | 14 ++++++++++
> 2 files changed, 65 insertions(+)
>
> diff --git a/contrib/libvhost-user/libvhost-user.c
> b/contrib/libvhost-user/libvhost-user.c
> index 9efb9da..002cf15 100644
> --- a/contrib/libvhost-user/libvhost-user.c
> +++ b/contrib/libvhost-user/libvhost-user.c
> @@ -63,6 +63,9 @@ vu_request_to_string(int req)
> REQ(VHOST_USER_SET_VRING_ENABLE),
> REQ(VHOST_USER_SEND_RARP),
> REQ(VHOST_USER_INPUT_GET_CONFIG),
> + REQ(VHOST_USER_GET_CONFIG),
> + REQ(VHOST_USER_SET_CONFIG),
> + REQ(VHOST_USER_SET_CONFIG_FD),
> REQ(VHOST_USER_MAX),
> };
> #undef REQ
> @@ -744,6 +747,43 @@ vu_set_vring_enable_exec(VuDev *dev, VhostUserMsg *vmsg)
> }
>
> static bool
> +vu_get_config(VuDev *dev, VhostUserMsg *vmsg)
> +{
> + if (dev->iface->get_config) {
> + dev->iface->get_config(dev, vmsg->payload.config, vmsg->size);
better check the return value on error to avoid sending garbage back to master.
> + }
> +
> + return true;
> +}
> +
> +static bool
> +vu_set_config(VuDev *dev, VhostUserMsg *vmsg)
> +{
> + if (dev->iface->set_config) {
> + dev->iface->set_config(dev, vmsg->payload.config, vmsg->size);
you could perhaps make that function return void instead (since error isn't reported to master)
> + }
> +
> + return false;
> +}
> +
> +static bool
> +vu_set_config_fd(VuDev *dev, VhostUserMsg *vmsg)
> +{
> + if (vmsg->fd_num != 1) {
> + vu_panic(dev, "Invalid config_fd message");
> + return false;
> + }
> +
> + if (dev->config_fd != -1) {
> + close(dev->config_fd);
> + }
> + dev->config_fd = vmsg->fds[0];
> + DPRINT("Got config_fd: %d\n", vmsg->fds[0]);
> +
> + return false;
> +}
> +
> +static bool
> vu_process_message(VuDev *dev, VhostUserMsg *vmsg)
> {
> int do_reply = 0;
> @@ -806,6 +846,12 @@ vu_process_message(VuDev *dev, VhostUserMsg *vmsg)
> return vu_get_queue_num_exec(dev, vmsg);
> case VHOST_USER_SET_VRING_ENABLE:
> return vu_set_vring_enable_exec(dev, vmsg);
> + case VHOST_USER_GET_CONFIG:
> + return vu_get_config(dev, vmsg);
> + case VHOST_USER_SET_CONFIG:
> + return vu_set_config(dev, vmsg);
> + case VHOST_USER_SET_CONFIG_FD:
> + return vu_set_config_fd(dev, vmsg);
> default:
> vmsg_close_fds(vmsg);
> vu_panic(dev, "Unhandled request: %d", vmsg->request);
> @@ -878,6 +924,10 @@ vu_deinit(VuDev *dev)
>
> vu_close_log(dev);
>
> + if (dev->config_fd != -1) {
> + close(dev->config_fd);
> + }
> +
> if (dev->sock != -1) {
> close(dev->sock);
> }
> @@ -907,6 +957,7 @@ vu_init(VuDev *dev,
> dev->remove_watch = remove_watch;
> dev->iface = iface;
> dev->log_call_fd = -1;
> + dev->config_fd = -1;
> for (i = 0; i < VHOST_MAX_NR_VIRTQUEUE; i++) {
> dev->vq[i] = (VuVirtq) {
> .call_fd = -1, .kick_fd = -1, .err_fd = -1,
> diff --git a/contrib/libvhost-user/libvhost-user.h
> b/contrib/libvhost-user/libvhost-user.h
> index 53ef222..899dee1 100644
> --- a/contrib/libvhost-user/libvhost-user.h
> +++ b/contrib/libvhost-user/libvhost-user.h
> @@ -30,6 +30,8 @@
>
> #define VHOST_MEMORY_MAX_NREGIONS 8
>
> +#define VHOST_USER_MAX_CONFIG_SIZE 256
> +
> enum VhostUserProtocolFeature {
> VHOST_USER_PROTOCOL_F_MQ = 0,
> VHOST_USER_PROTOCOL_F_LOG_SHMFD = 1,
> @@ -62,6 +64,9 @@ typedef enum VhostUserRequest {
> VHOST_USER_SET_VRING_ENABLE = 18,
> VHOST_USER_SEND_RARP = 19,
> VHOST_USER_INPUT_GET_CONFIG = 20,
> + VHOST_USER_GET_CONFIG = 24,
> + VHOST_USER_SET_CONFIG = 25,
> + VHOST_USER_SET_CONFIG_FD = 26,
> VHOST_USER_MAX
> } VhostUserRequest;
>
> @@ -105,6 +110,7 @@ typedef struct VhostUserMsg {
> struct vhost_vring_addr addr;
> VhostUserMemory memory;
> VhostUserLog log;
> + uint8_t config[VHOST_USER_MAX_CONFIG_SIZE];
> } payload;
>
> int fds[VHOST_MEMORY_MAX_NREGIONS];
> @@ -132,6 +138,9 @@ typedef void (*vu_set_features_cb) (VuDev *dev, uint64_t
> features);
> typedef int (*vu_process_msg_cb) (VuDev *dev, VhostUserMsg *vmsg,
> int *do_reply);
> typedef void (*vu_queue_set_started_cb) (VuDev *dev, int qidx, bool
> started);
> +typedef int (*vu_get_config_cb) (VuDev *dev, uint8_t *config, size_t len);
> +typedef int (*vu_set_config_cb) (VuDev *dev, const uint8_t *config,
> + size_t len);
>
> typedef struct VuDevIface {
> /* called by VHOST_USER_GET_FEATURES to get the features bitmask */
> @@ -148,6 +157,10 @@ typedef struct VuDevIface {
> vu_process_msg_cb process_msg;
> /* tells when queues can be processed */
> vu_queue_set_started_cb queue_set_started;
> + /* get the config space of the device */
> + vu_get_config_cb get_config;
> + /* set the config space of the device */
> + vu_set_config_cb set_config;
> } VuDevIface;
>
> typedef void (*vu_queue_handler_cb) (VuDev *dev, int qidx);
> @@ -212,6 +225,7 @@ struct VuDev {
> VuDevRegion regions[VHOST_MEMORY_MAX_NREGIONS];
> VuVirtq vq[VHOST_MAX_NR_VIRTQUEUE];
> int log_call_fd;
> + int config_fd;
> uint64_t log_size;
> uint8_t *log_table;
> uint64_t features;
> --
> 1.9.3
>
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Qemu-devel] [PATCH v2 4/4] contrib/vhost-user-blk: introduce a vhost-user-blk sample application
2017-08-10 10:12 [Qemu-devel] [PATCH v2 0/4] *** Introduce a new vhost-user-blk host device to Qemu *** Changpeng Liu
` (3 preceding siblings ...)
2017-08-10 10:12 ` [Qemu-devel] [PATCH v2 3/4] contrib/libvhost-user: enable virtio config space messages Changpeng Liu
@ 2017-08-10 10:12 ` Changpeng Liu
2017-08-09 18:27 ` Marc-André Lureau
4 siblings, 1 reply; 13+ messages in thread
From: Changpeng Liu @ 2017-08-10 10:12 UTC (permalink / raw)
To: changpeng.liu, qemu-devel
Cc: stefanha, pbonzini, mst, marcandre.lureau, felipe, james.r.harris
This commit introcudes a vhost-user-blk backend device, it uses UNIX
domain socket to communicate with Qemu. The vhost-user-blk sample
application should be used with Qemu vhost-user-blk-pci device.
To use it, complie with:
make vhost-user-blk
and start like this:
vhost-user-blk -b /dev/sdb -s /path/vhost.socket
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
---
.gitignore | 1 +
Makefile | 3 +
Makefile.objs | 2 +
contrib/vhost-user-blk/Makefile.objs | 1 +
contrib/vhost-user-blk/vhost-user-blk.c | 735 ++++++++++++++++++++++++++++++++
5 files changed, 742 insertions(+)
create mode 100644 contrib/vhost-user-blk/Makefile.objs
create mode 100644 contrib/vhost-user-blk/vhost-user-blk.c
diff --git a/.gitignore b/.gitignore
index cf65316..dbe5c13 100644
--- a/.gitignore
+++ b/.gitignore
@@ -51,6 +51,7 @@
/module_block.h
/vscclient
/vhost-user-scsi
+/vhost-user-blk
/fsdev/virtfs-proxy-helper
*.[1-9]
*.a
diff --git a/Makefile b/Makefile
index 97a58a0..e68e339 100644
--- a/Makefile
+++ b/Makefile
@@ -270,6 +270,7 @@ dummy := $(call unnest-vars,, \
ivshmem-server-obj-y \
libvhost-user-obj-y \
vhost-user-scsi-obj-y \
+ vhost-user-blk-obj-y \
qga-vss-dll-obj-y \
block-obj-y \
block-obj-m \
@@ -478,6 +479,8 @@ ivshmem-server$(EXESUF): $(ivshmem-server-obj-y) $(COMMON_LDADDS)
endif
vhost-user-scsi$(EXESUF): $(vhost-user-scsi-obj-y)
$(call LINK, $^)
+vhost-user-blk$(EXESUF): $(vhost-user-blk-obj-y)
+ $(call LINK, $^)
module_block.h: $(SRC_PATH)/scripts/modules/module_block.py config-host.mak
$(call quiet-command,$(PYTHON) $< $@ \
diff --git a/Makefile.objs b/Makefile.objs
index 24a4ea0..6b81548 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -114,6 +114,8 @@ vhost-user-scsi.o-cflags := $(LIBISCSI_CFLAGS)
vhost-user-scsi.o-libs := $(LIBISCSI_LIBS)
vhost-user-scsi-obj-y = contrib/vhost-user-scsi/
vhost-user-scsi-obj-y += contrib/libvhost-user/libvhost-user.o
+vhost-user-blk-obj-y = contrib/vhost-user-blk/
+vhost-user-blk-obj-y += contrib/libvhost-user/libvhost-user.o
######################################################################
trace-events-subdirs =
diff --git a/contrib/vhost-user-blk/Makefile.objs b/contrib/vhost-user-blk/Makefile.objs
new file mode 100644
index 0000000..72e2cdc
--- /dev/null
+++ b/contrib/vhost-user-blk/Makefile.objs
@@ -0,0 +1 @@
+vhost-user-blk-obj-y = vhost-user-blk.o
diff --git a/contrib/vhost-user-blk/vhost-user-blk.c b/contrib/vhost-user-blk/vhost-user-blk.c
new file mode 100644
index 0000000..9b90164
--- /dev/null
+++ b/contrib/vhost-user-blk/vhost-user-blk.c
@@ -0,0 +1,735 @@
+/*
+ * vhost-user-blk sample application
+ *
+ * Copyright IBM, Corp. 2007
+ * Copyright (c) 2016 Nutanix Inc. All rights reserved.
+ * Copyright (c) 2017 Intel Corporation. All rights reserved.
+ *
+ * Author:
+ * Anthony Liguori <aliguori@us.ibm.com>
+ * Felipe Franciosi <felipe@nutanix.com>
+ * Changpeng Liu <changpeng.liu@intel.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 only.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/virtio/virtio-blk.h"
+#include "contrib/libvhost-user/libvhost-user.h"
+
+#include <glib.h>
+
+/* Small compat shim from glib 2.32 */
+#ifndef G_SOURCE_CONTINUE
+#define G_SOURCE_CONTINUE TRUE
+#endif
+#ifndef G_SOURCE_REMOVE
+#define G_SOURCE_REMOVE FALSE
+#endif
+
+/* And this is the final byte of request*/
+#define VIRTIO_BLK_S_OK 0
+#define VIRTIO_BLK_S_IOERR 1
+#define VIRTIO_BLK_S_UNSUPP 2
+
+typedef struct vhost_blk_dev {
+ VuDev vu_dev;
+ int server_sock;
+ int blk_fd;
+ struct virtio_blk_config blkcfg;
+ char *blk_name;
+ GMainLoop *loop;
+ GTree *fdmap; /* fd -> gsource context id */
+} vhost_blk_dev_t;
+
+typedef struct vhost_blk_request {
+ VuVirtqElement *elem;
+ int64_t sector_num;
+ size_t size;
+ struct virtio_blk_inhdr *in;
+ struct virtio_blk_outhdr *out;
+ vhost_blk_dev_t *vdev_blk;
+ struct VuVirtq *vq;
+} vhost_blk_request_t;
+
+/** refer util/iov.c **/
+static size_t vu_blk_iov_size(const struct iovec *iov,
+ const unsigned int iov_cnt)
+{
+ size_t len;
+ unsigned int i;
+
+ len = 0;
+ for (i = 0; i < iov_cnt; i++) {
+ len += iov[i].iov_len;
+ }
+ return len;
+}
+
+/** glib event loop integration for libvhost-user and misc callbacks **/
+
+QEMU_BUILD_BUG_ON((int)G_IO_IN != (int)VU_WATCH_IN);
+QEMU_BUILD_BUG_ON((int)G_IO_OUT != (int)VU_WATCH_OUT);
+QEMU_BUILD_BUG_ON((int)G_IO_PRI != (int)VU_WATCH_PRI);
+QEMU_BUILD_BUG_ON((int)G_IO_ERR != (int)VU_WATCH_ERR);
+QEMU_BUILD_BUG_ON((int)G_IO_HUP != (int)VU_WATCH_HUP);
+
+typedef struct vu_blk_gsrc {
+ GSource parent;
+ vhost_blk_dev_t *vdev_blk;
+ GPollFD gfd;
+ vu_watch_cb vu_cb;
+} vu_blk_gsrc_t;
+
+static gint vu_blk_fdmap_compare(gconstpointer a, gconstpointer b)
+{
+ return (b > a) - (b < a);
+}
+
+static gboolean vu_blk_gsrc_prepare(GSource *src, gint *timeout)
+{
+ assert(timeout);
+
+ *timeout = -1;
+ return FALSE;
+}
+
+static gboolean vu_blk_gsrc_check(GSource *src)
+{
+ vu_blk_gsrc_t *vu_blk_src = (vu_blk_gsrc_t *)src;
+
+ assert(vu_blk_src);
+
+ return vu_blk_src->gfd.revents & vu_blk_src->gfd.events;
+}
+
+static gboolean vu_blk_gsrc_dispatch(GSource *src,
+ GSourceFunc cb, gpointer data)
+{
+ vhost_blk_dev_t *vdev_blk;
+ vu_blk_gsrc_t *vu_blk_src = (vu_blk_gsrc_t *)src;
+
+ assert(vu_blk_src);
+ assert(!(vu_blk_src->vu_cb && cb));
+
+ vdev_blk = vu_blk_src->vdev_blk;
+
+ assert(vdev_blk);
+
+ if (cb) {
+ return cb(data);
+ }
+ if (vu_blk_src->vu_cb) {
+ vu_blk_src->vu_cb(&vdev_blk->vu_dev, vu_blk_src->gfd.revents, data);
+ }
+ return G_SOURCE_CONTINUE;
+}
+
+static GSourceFuncs vu_blk_gsrc_funcs = {
+ vu_blk_gsrc_prepare,
+ vu_blk_gsrc_check,
+ vu_blk_gsrc_dispatch,
+ NULL
+};
+
+static int vu_blk_gsrc_new(vhost_blk_dev_t *vdev_blk, int fd,
+ GIOCondition cond, vu_watch_cb vu_cb,
+ GSourceFunc gsrc_cb, gpointer data)
+{
+ GSource *vu_blk_gsrc;
+ vu_blk_gsrc_t *vu_blk_src;
+ guint id;
+
+ assert(vdev_blk);
+ assert(fd >= 0);
+ assert(vu_cb || gsrc_cb);
+ assert(!(vu_cb && gsrc_cb));
+
+ vu_blk_gsrc = g_source_new(&vu_blk_gsrc_funcs, sizeof(vu_blk_gsrc_t));
+ if (!vu_blk_gsrc) {
+ fprintf(stderr, "Error creating GSource for new watch\n");
+ return -1;
+ }
+ vu_blk_src = (vu_blk_gsrc_t *)vu_blk_gsrc;
+
+ vu_blk_src->vdev_blk = vdev_blk;
+ vu_blk_src->gfd.fd = fd;
+ vu_blk_src->gfd.events = cond;
+ vu_blk_src->vu_cb = vu_cb;
+
+ g_source_add_poll(vu_blk_gsrc, &vu_blk_src->gfd);
+ g_source_set_callback(vu_blk_gsrc, gsrc_cb, data, NULL);
+ id = g_source_attach(vu_blk_gsrc, NULL);
+ assert(id);
+ g_source_unref(vu_blk_gsrc);
+
+ g_tree_insert(vdev_blk->fdmap, (gpointer)(uintptr_t)fd,
+ (gpointer)(uintptr_t)id);
+
+ return 0;
+}
+
+static void vu_blk_panic_cb(VuDev *vu_dev, const char *buf)
+{
+ vhost_blk_dev_t *vdev_blk;
+
+ assert(vu_dev);
+
+ vdev_blk = container_of(vu_dev, vhost_blk_dev_t, vu_dev);
+
+ if (buf) {
+ fprintf(stderr, "vu_blk_panic_cb: %s\n", buf);
+ }
+
+ if (vdev_blk) {
+ assert(vdev_blk->loop);
+ g_main_loop_quit(vdev_blk->loop);
+ }
+}
+
+static void vu_blk_add_watch_cb(VuDev *vu_dev, int fd, int vu_evt,
+ vu_watch_cb cb, void *pvt) {
+ vhost_blk_dev_t *vdev_blk;
+ guint id;
+
+ assert(vu_dev);
+ assert(fd >= 0);
+ assert(cb);
+
+ vdev_blk = container_of(vu_dev, vhost_blk_dev_t, vu_dev);
+ if (!vdev_blk) {
+ vu_blk_panic_cb(vu_dev, NULL);
+ return;
+ }
+
+ id = (guint)(uintptr_t)g_tree_lookup(vdev_blk->fdmap,
+ (gpointer)(uintptr_t)fd);
+ if (id) {
+ GSource *vu_blk_src = g_main_context_find_source_by_id(NULL, id);
+ assert(vu_blk_src);
+ g_source_destroy(vu_blk_src);
+ (void)g_tree_remove(vdev_blk->fdmap, (gpointer)(uintptr_t)fd);
+ }
+
+ if (vu_blk_gsrc_new(vdev_blk, fd, vu_evt, cb, NULL, pvt)) {
+ vu_blk_panic_cb(vu_dev, NULL);
+ }
+}
+
+static void vu_blk_del_watch_cb(VuDev *vu_dev, int fd)
+{
+ vhost_blk_dev_t *vdev_blk;
+ guint id;
+
+ assert(vu_dev);
+ assert(fd >= 0);
+
+ vdev_blk = container_of(vu_dev, vhost_blk_dev_t, vu_dev);
+ if (!vdev_blk) {
+ vu_blk_panic_cb(vu_dev, NULL);
+ return;
+ }
+
+ id = (guint)(uintptr_t)g_tree_lookup(vdev_blk->fdmap,
+ (gpointer)(uintptr_t)fd);
+ if (id) {
+ GSource *vu_blk_src = g_main_context_find_source_by_id(NULL, id);
+ assert(vu_blk_src);
+ g_source_destroy(vu_blk_src);
+ (void)g_tree_remove(vdev_blk->fdmap, (gpointer)(uintptr_t)fd);
+ }
+}
+
+static void vu_blk_req_complete(vhost_blk_request_t *req)
+{
+ VuDev *vu_dev = &req->vdev_blk->vu_dev;
+
+ /* IO size with 1 extra status byte */
+ vu_queue_push(vu_dev, req->vq, req->elem,
+ req->size + 1);
+ vu_queue_notify(vu_dev, req->vq);
+
+ if (req->elem) {
+ free(req->elem);
+ }
+ if (req) {
+ free(req);
+ }
+}
+
+static int vu_blk_open(const char *file_name)
+{
+ int fd;
+
+ fd = open(file_name, O_RDWR | O_DIRECT);
+ if (fd < 0) {
+ fprintf(stderr, "Cannot open file %s, %s\n", file_name,
+ strerror(errno));
+ return -1;
+ }
+
+ return fd;
+}
+
+static void vu_blk_close(int fd)
+{
+ if (fd >= 0) {
+ close(fd);
+ }
+}
+
+static ssize_t
+vu_blk_readv(vhost_blk_request_t *req, struct iovec *iov, uint32_t iovcnt)
+{
+ vhost_blk_dev_t *vdev_blk = req->vdev_blk;
+ ssize_t rc;
+
+ if (!iovcnt) {
+ fprintf(stderr, "Invalid Read IOV count\n");
+ return -1;
+ }
+
+ req->size = vu_blk_iov_size(iov, iovcnt);
+ rc = preadv(vdev_blk->blk_fd, iov, iovcnt, req->sector_num * 512);
+ if (rc < 0) {
+ fprintf(stderr, "Block %s, Sector %"PRIu64", Size %lu Read Failed\n",
+ vdev_blk->blk_name, req->sector_num, req->size);
+ return -1;
+ }
+
+ return rc;
+}
+
+static ssize_t
+vu_blk_writev(vhost_blk_request_t *req, struct iovec *iov, uint32_t iovcnt)
+{
+ vhost_blk_dev_t *vdev_blk = req->vdev_blk;
+ ssize_t rc;
+
+ if (!iovcnt) {
+ fprintf(stderr, "Invalid Write IOV count\n");
+ return -1;
+ }
+
+ req->size = vu_blk_iov_size(iov, iovcnt);
+ rc = pwritev(vdev_blk->blk_fd, iov, iovcnt, req->sector_num * 512);
+ if (rc < 0) {
+ fprintf(stderr, "Block %s, Sector %"PRIu64", Size %lu Write Failed\n",
+ vdev_blk->blk_name, req->sector_num, req->size);
+ return -1;
+ }
+
+ return rc;
+}
+
+static void
+vu_blk_flush(vhost_blk_request_t *req)
+{
+ vhost_blk_dev_t *vdev_blk = req->vdev_blk;
+
+ if (vdev_blk->blk_fd) {
+ fsync(vdev_blk->blk_fd);
+ }
+}
+
+
+static int vu_virtio_blk_process_req(vhost_blk_dev_t *vdev_blk,
+ VuVirtq *vq)
+{
+ VuVirtqElement *elem;
+ uint32_t type;
+ unsigned in_num;
+ unsigned out_num;
+ vhost_blk_request_t *req;
+
+ elem = vu_queue_pop(&vdev_blk->vu_dev, vq, sizeof(VuVirtqElement));
+ if (!elem) {
+ return -1;
+ }
+
+ /* refer to hw/block/virtio_blk.c */
+ if (elem->out_num < 1 || elem->in_num < 1) {
+ fprintf(stderr, "virtio-blk request missing headers\n");
+ free(elem);
+ return -1;
+ }
+
+ req = calloc(1, sizeof(*req));
+ assert(req);
+ req->vdev_blk = vdev_blk;
+ req->vq = vq;
+ req->elem = elem;
+
+ in_num = elem->in_num;
+ out_num = elem->out_num;
+
+ /* don't support VIRTIO_F_ANY_LAYOUT and virtio 1.0 only */
+ if (elem->out_sg[0].iov_len < sizeof(struct virtio_blk_outhdr)) {
+ fprintf(stderr, "Invalid outhdr size\n");
+ goto err;
+ }
+ req->out = (struct virtio_blk_outhdr *)elem->out_sg[0].iov_base;
+ out_num--;
+
+ if (elem->in_sg[in_num - 1].iov_len < sizeof(struct virtio_blk_inhdr)) {
+ fprintf(stderr, "Invalid inhdr size\n");
+ goto err;
+ }
+ req->in = (struct virtio_blk_inhdr *)elem->in_sg[in_num - 1].iov_base;
+ in_num--;
+
+ type = le32_to_cpu(req->out->type);
+ switch (type & ~(VIRTIO_BLK_T_OUT | VIRTIO_BLK_T_BARRIER)) {
+ case VIRTIO_BLK_T_IN: {
+ ssize_t ret = 0;
+ bool is_write = type & VIRTIO_BLK_T_OUT;
+ req->sector_num = le64_to_cpu(req->out->sector);
+ if (is_write) {
+ ret = vu_blk_writev(req, &elem->out_sg[1], out_num);
+ } else {
+ ret = vu_blk_readv(req, &elem->in_sg[0], in_num);
+ }
+ if (ret >= 0) {
+ req->in->status = VIRTIO_BLK_S_OK;
+ } else {
+ req->in->status = VIRTIO_BLK_S_IOERR;
+ }
+ vu_blk_req_complete(req);
+ break;
+ }
+ case VIRTIO_BLK_T_FLUSH: {
+ vu_blk_flush(req);
+ req->in->status = VIRTIO_BLK_S_OK;
+ vu_blk_req_complete(req);
+ break;
+ }
+ case VIRTIO_BLK_T_GET_ID: {
+ size_t size = MIN(vu_blk_iov_size(&elem->in_sg[0], in_num),
+ VIRTIO_BLK_ID_BYTES);
+ snprintf(elem->in_sg[0].iov_base, size, "%s", "vhost_user_blk");
+ req->in->status = VIRTIO_BLK_S_OK;
+ req->size = elem->in_sg[0].iov_len;
+ vu_blk_req_complete(req);
+ break;
+ }
+ default: {
+ req->in->status = VIRTIO_BLK_S_UNSUPP;
+ vu_blk_req_complete(req);
+ break;
+ }
+ }
+
+ return 0;
+
+err:
+ free(elem);
+ free(req);
+ return -1;
+}
+
+static void vu_blk_process_vq(VuDev *vu_dev, int idx)
+{
+ vhost_blk_dev_t *vdev_blk;
+ VuVirtq *vq;
+ int ret;
+
+ if ((idx < 0) || (idx >= VHOST_MAX_NR_VIRTQUEUE)) {
+ fprintf(stderr, "VQ Index out of range: %d\n", idx);
+ vu_blk_panic_cb(vu_dev, NULL);
+ return;
+ }
+
+ vdev_blk = container_of(vu_dev, vhost_blk_dev_t, vu_dev);
+ assert(vdev_blk);
+
+ vq = vu_get_queue(vu_dev, idx);
+ assert(vq);
+
+ while (1) {
+ ret = vu_virtio_blk_process_req(vdev_blk, vq);
+ if (ret) {
+ break;
+ }
+ }
+}
+
+static void vu_blk_queue_set_started(VuDev *vu_dev, int idx, bool started)
+{
+ VuVirtq *vq;
+
+ assert(vu_dev);
+
+ if ((idx < 0) || (idx >= VHOST_MAX_NR_VIRTQUEUE)) {
+ fprintf(stderr, "VQ Index out of range: %d\n", idx);
+ vu_blk_panic_cb(vu_dev, NULL);
+ return;
+ }
+
+ vq = vu_get_queue(vu_dev, idx);
+ vu_set_queue_handler(vu_dev, vq, started ? vu_blk_process_vq : NULL);
+}
+
+static uint64_t
+vu_blk_get_features(VuDev *dev)
+{
+ return 1ull << VIRTIO_BLK_F_SIZE_MAX |
+ 1ull << VIRTIO_BLK_F_SEG_MAX |
+ 1ull << VIRTIO_BLK_F_TOPOLOGY |
+ 1ull << VIRTIO_BLK_F_BLK_SIZE |
+ 1ull << VIRTIO_F_VERSION_1 |
+ 1ull << VHOST_USER_F_PROTOCOL_FEATURES;
+}
+
+static int
+vu_blk_get_config(VuDev *vu_dev, uint8_t *config, size_t len)
+{
+ vhost_blk_dev_t *vdev_blk;
+
+ if (len != sizeof(struct virtio_blk_config)) {
+ return -1;
+ }
+ vdev_blk = container_of(vu_dev, vhost_blk_dev_t, vu_dev);
+ memcpy(config, &vdev_blk->blkcfg, len);
+
+ return 0;
+}
+
+static const VuDevIface vu_blk_iface = {
+ .get_features = vu_blk_get_features,
+ .queue_set_started = vu_blk_queue_set_started,
+ .get_config = vu_blk_get_config,
+};
+
+static gboolean vu_blk_vhost_cb(gpointer data)
+{
+ VuDev *vu_dev = (VuDev *)data;
+
+ assert(vu_dev);
+
+ if (!vu_dispatch(vu_dev) != 0) {
+ fprintf(stderr, "Error processing vhost message\n");
+ vu_blk_panic_cb(vu_dev, NULL);
+ return G_SOURCE_REMOVE;
+ }
+
+ return G_SOURCE_CONTINUE;
+}
+
+static int unix_sock_new(char *unix_fn)
+{
+ int sock;
+ struct sockaddr_un un;
+ size_t len;
+
+ assert(unix_fn);
+
+ sock = socket(AF_UNIX, SOCK_STREAM, 0);
+ if (sock <= 0) {
+ perror("socket");
+ return -1;
+ }
+
+ un.sun_family = AF_UNIX;
+ (void)snprintf(un.sun_path, sizeof(un.sun_path), "%s", unix_fn);
+ len = sizeof(un.sun_family) + strlen(un.sun_path);
+
+ (void)unlink(unix_fn);
+ if (bind(sock, (struct sockaddr *)&un, len) < 0) {
+ perror("bind");
+ goto fail;
+ }
+
+ if (listen(sock, 1) < 0) {
+ perror("listen");
+ goto fail;
+ }
+
+ return sock;
+
+fail:
+ (void)close(sock);
+
+ return -1;
+}
+
+static int vdev_blk_run(struct vhost_blk_dev *vdev_blk)
+{
+ int cli_sock;
+ int ret = 0;
+
+ assert(vdev_blk);
+ assert(vdev_blk->server_sock >= 0);
+ assert(vdev_blk->loop);
+
+ cli_sock = accept(vdev_blk->server_sock, (void *)0, (void *)0);
+ if (cli_sock < 0) {
+ perror("accept");
+ return -1;
+ }
+
+ vu_init(&vdev_blk->vu_dev,
+ cli_sock,
+ vu_blk_panic_cb,
+ vu_blk_add_watch_cb,
+ vu_blk_del_watch_cb,
+ &vu_blk_iface);
+
+ if (vu_blk_gsrc_new(vdev_blk, cli_sock, G_IO_IN, NULL, vu_blk_vhost_cb,
+ &vdev_blk->vu_dev)) {
+ ret = -1;
+ goto out;
+ }
+
+ g_main_loop_run(vdev_blk->loop);
+
+out:
+ vu_deinit(&vdev_blk->vu_dev);
+ return ret;
+}
+
+static void vdev_blk_deinit(struct vhost_blk_dev *vdev_blk)
+{
+ if (!vdev_blk) {
+ return;
+ }
+
+ if (vdev_blk->server_sock >= 0) {
+ struct sockaddr_storage ss;
+ socklen_t sslen = sizeof(ss);
+
+ if (getsockname(vdev_blk->server_sock, (struct sockaddr *)&ss,
+ &sslen) == 0) {
+ struct sockaddr_un *su = (struct sockaddr_un *)&ss;
+ (void)unlink(su->sun_path);
+ }
+
+ (void)close(vdev_blk->server_sock);
+ vdev_blk->server_sock = -1;
+ }
+
+ if (vdev_blk->loop) {
+ g_main_loop_unref(vdev_blk->loop);
+ vdev_blk->loop = NULL;
+ }
+
+ if (vdev_blk->blk_fd) {
+ vu_blk_close(vdev_blk->blk_fd);
+ }
+}
+
+static void
+vu_blk_initialize_config(int fd, struct virtio_blk_config *config)
+{
+ off64_t capacity;
+
+ capacity = lseek64(fd, 0, SEEK_END);
+ config->capacity = capacity >> 9;
+ config->blk_size = 512;
+ config->size_max = 65536;
+ config->seg_max = 128 - 2;
+ config->min_io_size = 1;
+ config->opt_io_size = 1;
+ config->num_queues = 1;
+}
+
+static vhost_blk_dev_t *
+vdev_blk_new(char *unix_fn, char *blk_file)
+{
+ vhost_blk_dev_t *vdev_blk = NULL;
+
+ vdev_blk = calloc(1, sizeof(struct vhost_blk_dev));
+ if (!vdev_blk) {
+ fprintf(stderr, "calloc: %s", strerror(errno));
+ return NULL;
+ }
+
+ vdev_blk->server_sock = unix_sock_new(unix_fn);
+ if (vdev_blk->server_sock < 0) {
+ goto err;
+ }
+
+ vdev_blk->loop = g_main_loop_new(NULL, FALSE);
+ if (!vdev_blk->loop) {
+ fprintf(stderr, "Error creating glib event loop");
+ goto err;
+ }
+
+ vdev_blk->fdmap = g_tree_new(vu_blk_fdmap_compare);
+ if (!vdev_blk->fdmap) {
+ fprintf(stderr, "Error creating glib tree for fdmap");
+ goto err;
+ }
+
+ vdev_blk->blk_fd = vu_blk_open(blk_file);
+ if (vdev_blk->blk_fd < 0) {
+ fprintf(stderr, "Error open block device %s\n", blk_file);
+ goto err;
+ }
+ vdev_blk->blk_name = blk_file;
+
+ /* fill virtio_blk_config with block parameters */
+ vu_blk_initialize_config(vdev_blk->blk_fd, &vdev_blk->blkcfg);
+
+ return vdev_blk;
+
+err:
+ vdev_blk_deinit(vdev_blk);
+ free(vdev_blk);
+
+ return NULL;
+}
+
+int main(int argc, char **argv)
+{
+ int opt;
+ char *unix_socket = NULL;
+ char *blk_file = NULL;
+ vhost_blk_dev_t *vdev_blk = NULL;
+
+ while ((opt = getopt(argc, argv, "b:h:s:")) != -1) {
+ switch (opt) {
+ case 'b':
+ blk_file = strdup(optarg);
+ break;
+ case 's':
+ unix_socket = strdup(optarg);
+ break;
+ case 'h':
+ default:
+ printf("Usage: %s [-b block device or file, -s UNIX domain socket]"
+ " | [ -h ]\n", argv[0]);
+ break;
+ }
+ }
+
+ if (!unix_socket || !blk_file) {
+ printf("Usage: %s [-b block device or file, -s UNIX domain socket] |"
+ " [ -h ]\n", argv[0]);
+ return -1;
+ }
+
+ vdev_blk = vdev_blk_new(unix_socket, blk_file);
+ if (!vdev_blk) {
+ goto err;
+ }
+
+ if (vdev_blk_run(vdev_blk) != 0) {
+ goto err;
+ }
+
+err:
+ if (vdev_blk) {
+ vdev_blk_deinit(vdev_blk);
+ free(vdev_blk);
+ }
+ if (unix_socket) {
+ free(unix_socket);
+ }
+ if (blk_file) {
+ free(blk_file);
+ }
+
+ return 0;
+}
+
--
1.9.3
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH v2 4/4] contrib/vhost-user-blk: introduce a vhost-user-blk sample application
2017-08-10 10:12 ` [Qemu-devel] [PATCH v2 4/4] contrib/vhost-user-blk: introduce a vhost-user-blk sample application Changpeng Liu
@ 2017-08-09 18:27 ` Marc-André Lureau
0 siblings, 0 replies; 13+ messages in thread
From: Marc-André Lureau @ 2017-08-09 18:27 UTC (permalink / raw)
To: Changpeng Liu; +Cc: qemu-devel, stefanha, pbonzini, mst, felipe, james r harris
Hi
----- Original Message -----
> This commit introcudes a vhost-user-blk backend device, it uses UNIX
> domain socket to communicate with Qemu. The vhost-user-blk sample
> application should be used with Qemu vhost-user-blk-pci device.
>
> To use it, complie with:
> make vhost-user-blk
>
> and start like this:
> vhost-user-blk -b /dev/sdb -s /path/vhost.socket
I guess it could be a regular file instead (fallocate/trunc to desired size).
>
> Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
> ---
> .gitignore | 1 +
> Makefile | 3 +
> Makefile.objs | 2 +
> contrib/vhost-user-blk/Makefile.objs | 1 +
> contrib/vhost-user-blk/vhost-user-blk.c | 735
> ++++++++++++++++++++++++++++++++
> 5 files changed, 742 insertions(+)
> create mode 100644 contrib/vhost-user-blk/Makefile.objs
> create mode 100644 contrib/vhost-user-blk/vhost-user-blk.c
>
> diff --git a/.gitignore b/.gitignore
> index cf65316..dbe5c13 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -51,6 +51,7 @@
> /module_block.h
> /vscclient
> /vhost-user-scsi
> +/vhost-user-blk
> /fsdev/virtfs-proxy-helper
> *.[1-9]
> *.a
> diff --git a/Makefile b/Makefile
> index 97a58a0..e68e339 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -270,6 +270,7 @@ dummy := $(call unnest-vars,, \
> ivshmem-server-obj-y \
> libvhost-user-obj-y \
> vhost-user-scsi-obj-y \
> + vhost-user-blk-obj-y \
> qga-vss-dll-obj-y \
> block-obj-y \
> block-obj-m \
> @@ -478,6 +479,8 @@ ivshmem-server$(EXESUF): $(ivshmem-server-obj-y)
> $(COMMON_LDADDS)
> endif
> vhost-user-scsi$(EXESUF): $(vhost-user-scsi-obj-y)
> $(call LINK, $^)
> +vhost-user-blk$(EXESUF): $(vhost-user-blk-obj-y)
> + $(call LINK, $^)
>
> module_block.h: $(SRC_PATH)/scripts/modules/module_block.py config-host.mak
> $(call quiet-command,$(PYTHON) $< $@ \
> diff --git a/Makefile.objs b/Makefile.objs
> index 24a4ea0..6b81548 100644
> --- a/Makefile.objs
> +++ b/Makefile.objs
> @@ -114,6 +114,8 @@ vhost-user-scsi.o-cflags := $(LIBISCSI_CFLAGS)
> vhost-user-scsi.o-libs := $(LIBISCSI_LIBS)
> vhost-user-scsi-obj-y = contrib/vhost-user-scsi/
> vhost-user-scsi-obj-y += contrib/libvhost-user/libvhost-user.o
> +vhost-user-blk-obj-y = contrib/vhost-user-blk/
> +vhost-user-blk-obj-y += contrib/libvhost-user/libvhost-user.o
>
> ######################################################################
> trace-events-subdirs =
> diff --git a/contrib/vhost-user-blk/Makefile.objs
> b/contrib/vhost-user-blk/Makefile.objs
> new file mode 100644
> index 0000000..72e2cdc
> --- /dev/null
> +++ b/contrib/vhost-user-blk/Makefile.objs
> @@ -0,0 +1 @@
> +vhost-user-blk-obj-y = vhost-user-blk.o
> diff --git a/contrib/vhost-user-blk/vhost-user-blk.c
> b/contrib/vhost-user-blk/vhost-user-blk.c
My bad I didn't review vhost-user-scsi.c and you reproduce a lot of code here.
Imho, there is no need for memory allocation failure check; it's a test app & glib will terminate if allocation fails anyway.
I should also say that libvhost-user is supposed to be glib-free, and it's not fully (it is 99%). That also create some confusion I believe. And some docs is lacking.
> new file mode 100644
> index 0000000..9b90164
> --- /dev/null
> +++ b/contrib/vhost-user-blk/vhost-user-blk.c
> @@ -0,0 +1,735 @@
> +/*
> + * vhost-user-blk sample application
> + *
> + * Copyright IBM, Corp. 2007
> + * Copyright (c) 2016 Nutanix Inc. All rights reserved.
> + * Copyright (c) 2017 Intel Corporation. All rights reserved.
> + *
> + * Author:
> + * Anthony Liguori <aliguori@us.ibm.com>
> + * Felipe Franciosi <felipe@nutanix.com>
> + * Changpeng Liu <changpeng.liu@intel.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 only.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "hw/virtio/virtio-blk.h"
> +#include "contrib/libvhost-user/libvhost-user.h"
> +
> +#include <glib.h>
> +
> +/* Small compat shim from glib 2.32 */
> +#ifndef G_SOURCE_CONTINUE
> +#define G_SOURCE_CONTINUE TRUE
> +#endif
> +#ifndef G_SOURCE_REMOVE
> +#define G_SOURCE_REMOVE FALSE
> +#endif
> +
Should probably be in glib-compat.h
> +/* And this is the final byte of request*/
> +#define VIRTIO_BLK_S_OK 0
> +#define VIRTIO_BLK_S_IOERR 1
> +#define VIRTIO_BLK_S_UNSUPP 2
> +
> +typedef struct vhost_blk_dev {
> + VuDev vu_dev;
> + int server_sock;
> + int blk_fd;
> + struct virtio_blk_config blkcfg;
> + char *blk_name;
> + GMainLoop *loop;
> + GTree *fdmap; /* fd -> gsource context id */
why a tree? I would rather have hashmap, or even a fixed size array, since the app isn't supposed to have so many FD open...
> +} vhost_blk_dev_t;
> +
> +typedef struct vhost_blk_request {
> + VuVirtqElement *elem;
> + int64_t sector_num;
> + size_t size;
> + struct virtio_blk_inhdr *in;
> + struct virtio_blk_outhdr *out;
> + vhost_blk_dev_t *vdev_blk;
> + struct VuVirtq *vq;
> +} vhost_blk_request_t;
> +
> +/** refer util/iov.c **/
> +static size_t vu_blk_iov_size(const struct iovec *iov,
> + const unsigned int iov_cnt)
> +{
> + size_t len;
> + unsigned int i;
> +
> + len = 0;
> + for (i = 0; i < iov_cnt; i++) {
> + len += iov[i].iov_len;
> + }
> + return len;
> +}
Hmm I wonder if you could link with libqemuutil or just util/iov.c. Not worth the change for that simple function though.
> +
> +/** glib event loop integration for libvhost-user and misc callbacks **/
> +
> +QEMU_BUILD_BUG_ON((int)G_IO_IN != (int)VU_WATCH_IN);
> +QEMU_BUILD_BUG_ON((int)G_IO_OUT != (int)VU_WATCH_OUT);
> +QEMU_BUILD_BUG_ON((int)G_IO_PRI != (int)VU_WATCH_PRI);
> +QEMU_BUILD_BUG_ON((int)G_IO_ERR != (int)VU_WATCH_ERR);
> +QEMU_BUILD_BUG_ON((int)G_IO_HUP != (int)VU_WATCH_HUP);
We may want to add this in libvhost-user.c for now.
> +
> +typedef struct vu_blk_gsrc {
> + GSource parent;
> + vhost_blk_dev_t *vdev_blk;
> + GPollFD gfd;
> + vu_watch_cb vu_cb;
> +} vu_blk_gsrc_t;
> +
> +static gint vu_blk_fdmap_compare(gconstpointer a, gconstpointer b)
> +{
> + return (b > a) - (b < a);
> +}
interesting, but a - b should probably work fine too, no?
> +
> +static gboolean vu_blk_gsrc_prepare(GSource *src, gint *timeout)
> +{
> + assert(timeout);
> +
> + *timeout = -1;
> + return FALSE;
> +}
> +
> +static gboolean vu_blk_gsrc_check(GSource *src)
> +{
> + vu_blk_gsrc_t *vu_blk_src = (vu_blk_gsrc_t *)src;
> +
> + assert(vu_blk_src);
> +
> + return vu_blk_src->gfd.revents & vu_blk_src->gfd.events;
> +}
> +
> +static gboolean vu_blk_gsrc_dispatch(GSource *src,
> + GSourceFunc cb, gpointer data)
> +{
> + vhost_blk_dev_t *vdev_blk;
> + vu_blk_gsrc_t *vu_blk_src = (vu_blk_gsrc_t *)src;
> +
> + assert(vu_blk_src);
> + assert(!(vu_blk_src->vu_cb && cb));
> +
> + vdev_blk = vu_blk_src->vdev_blk;
> +
> + assert(vdev_blk);
> +
> + if (cb) {
> + return cb(data);
> + }
> + if (vu_blk_src->vu_cb) {
> + vu_blk_src->vu_cb(&vdev_blk->vu_dev, vu_blk_src->gfd.revents, data);
> + }
That's quite weird for a source dispatch. You are supposed to call only the GSourceFunc cb, casting it to the required type if necessary afaik.
> + return G_SOURCE_CONTINUE;
> +}
> +
> +static GSourceFuncs vu_blk_gsrc_funcs = {
> + vu_blk_gsrc_prepare,
> + vu_blk_gsrc_check,
> + vu_blk_gsrc_dispatch,
> + NULL
> +};
> +
> +static int vu_blk_gsrc_new(vhost_blk_dev_t *vdev_blk, int fd,
> + GIOCondition cond, vu_watch_cb vu_cb,
> + GSourceFunc gsrc_cb, gpointer data)
> +{
> + GSource *vu_blk_gsrc;
> + vu_blk_gsrc_t *vu_blk_src;
> + guint id;
> +
> + assert(vdev_blk);
> + assert(fd >= 0);
> + assert(vu_cb || gsrc_cb);
> + assert(!(vu_cb && gsrc_cb));
> +
> + vu_blk_gsrc = g_source_new(&vu_blk_gsrc_funcs, sizeof(vu_blk_gsrc_t));
> + if (!vu_blk_gsrc) {
> + fprintf(stderr, "Error creating GSource for new watch\n");
> + return -1;
> + }
> + vu_blk_src = (vu_blk_gsrc_t *)vu_blk_gsrc;
> +
> + vu_blk_src->vdev_blk = vdev_blk;
> + vu_blk_src->gfd.fd = fd;
> + vu_blk_src->gfd.events = cond;
> + vu_blk_src->vu_cb = vu_cb;
> +
> + g_source_add_poll(vu_blk_gsrc, &vu_blk_src->gfd);
> + g_source_set_callback(vu_blk_gsrc, gsrc_cb, data, NULL);
oh, so the source can handle both case. Not how you are supposed to use GSource. I should patch vhost-user-scsi to simplify the code, or feel free to do it before I do if you know how to fix it.
> + id = g_source_attach(vu_blk_gsrc, NULL);
> + assert(id);
> + g_source_unref(vu_blk_gsrc);
> +
> + g_tree_insert(vdev_blk->fdmap, (gpointer)(uintptr_t)fd,
> + (gpointer)(uintptr_t)id);
> +
> + return 0;
> +}
> +
> +static void vu_blk_panic_cb(VuDev *vu_dev, const char *buf)
> +{
> + vhost_blk_dev_t *vdev_blk;
> +
> + assert(vu_dev);
> +
> + vdev_blk = container_of(vu_dev, vhost_blk_dev_t, vu_dev);
> +
> + if (buf) {
> + fprintf(stderr, "vu_blk_panic_cb: %s\n", buf);
> + }
> +
> + if (vdev_blk) {
> + assert(vdev_blk->loop);
> + g_main_loop_quit(vdev_blk->loop);
> + }
> +}
> +
> +static void vu_blk_add_watch_cb(VuDev *vu_dev, int fd, int vu_evt,
> + vu_watch_cb cb, void *pvt) {
> + vhost_blk_dev_t *vdev_blk;
> + guint id;
> +
> + assert(vu_dev);
> + assert(fd >= 0);
> + assert(cb);
> +
> + vdev_blk = container_of(vu_dev, vhost_blk_dev_t, vu_dev);
> + if (!vdev_blk) {
> + vu_blk_panic_cb(vu_dev, NULL);
> + return;
> + }
> +
> + id = (guint)(uintptr_t)g_tree_lookup(vdev_blk->fdmap,
> + (gpointer)(uintptr_t)fd);
> + if (id) {
> + GSource *vu_blk_src = g_main_context_find_source_by_id(NULL, id);
> + assert(vu_blk_src);
> + g_source_destroy(vu_blk_src);
> + (void)g_tree_remove(vdev_blk->fdmap, (gpointer)(uintptr_t)fd);
> + }
> +
> + if (vu_blk_gsrc_new(vdev_blk, fd, vu_evt, cb, NULL, pvt)) {
> + vu_blk_panic_cb(vu_dev, NULL);
g_io_add_watch() should be enough I suppose (or g_unix_fd_add but it requires 2.36)
> + }
> +}
> +
> +static void vu_blk_del_watch_cb(VuDev *vu_dev, int fd)
> +{
> + vhost_blk_dev_t *vdev_blk;
> + guint id;
> +
> + assert(vu_dev);
> + assert(fd >= 0);
> +
> + vdev_blk = container_of(vu_dev, vhost_blk_dev_t, vu_dev);
> + if (!vdev_blk) {
> + vu_blk_panic_cb(vu_dev, NULL);
> + return;
> + }
> +
> + id = (guint)(uintptr_t)g_tree_lookup(vdev_blk->fdmap,
> + (gpointer)(uintptr_t)fd);
> + if (id) {
> + GSource *vu_blk_src = g_main_context_find_source_by_id(NULL, id);
> + assert(vu_blk_src);
> + g_source_destroy(vu_blk_src);
> + (void)g_tree_remove(vdev_blk->fdmap, (gpointer)(uintptr_t)fd);
> + }
> +}
> +
> +static void vu_blk_req_complete(vhost_blk_request_t *req)
> +{
> + VuDev *vu_dev = &req->vdev_blk->vu_dev;
> +
> + /* IO size with 1 extra status byte */
> + vu_queue_push(vu_dev, req->vq, req->elem,
> + req->size + 1);
> + vu_queue_notify(vu_dev, req->vq);
> +
> + if (req->elem) {
> + free(req->elem);
> + }
I realize we need to better document vu_queue_pop() about allocation, and probably introduce vu_queue_elem_free().
> + if (req) {
> + free(req);
> + }
> +}
> +
> +static int vu_blk_open(const char *file_name)
> +{
> + int fd;
> +
> + fd = open(file_name, O_RDWR | O_DIRECT);
> + if (fd < 0) {
> + fprintf(stderr, "Cannot open file %s, %s\n", file_name,
> + strerror(errno));
> + return -1;
> + }
> +
> + return fd;
> +}
> +
> +static void vu_blk_close(int fd)
> +{
> + if (fd >= 0) {
> + close(fd);
> + }
> +}
> +
> +static ssize_t
> +vu_blk_readv(vhost_blk_request_t *req, struct iovec *iov, uint32_t iovcnt)
> +{
> + vhost_blk_dev_t *vdev_blk = req->vdev_blk;
> + ssize_t rc;
> +
> + if (!iovcnt) {
> + fprintf(stderr, "Invalid Read IOV count\n");
> + return -1;
> + }
> +
> + req->size = vu_blk_iov_size(iov, iovcnt);
> + rc = preadv(vdev_blk->blk_fd, iov, iovcnt, req->sector_num * 512);
> + if (rc < 0) {
> + fprintf(stderr, "Block %s, Sector %"PRIu64", Size %lu Read
> Failed\n",
> + vdev_blk->blk_name, req->sector_num, req->size);
> + return -1;
> + }
> +
> + return rc;
> +}
> +
> +static ssize_t
> +vu_blk_writev(vhost_blk_request_t *req, struct iovec *iov, uint32_t iovcnt)
> +{
> + vhost_blk_dev_t *vdev_blk = req->vdev_blk;
> + ssize_t rc;
> +
> + if (!iovcnt) {
> + fprintf(stderr, "Invalid Write IOV count\n");
> + return -1;
> + }
> +
> + req->size = vu_blk_iov_size(iov, iovcnt);
> + rc = pwritev(vdev_blk->blk_fd, iov, iovcnt, req->sector_num * 512);
> + if (rc < 0) {
> + fprintf(stderr, "Block %s, Sector %"PRIu64", Size %lu Write
> Failed\n",
> + vdev_blk->blk_name, req->sector_num, req->size);
> + return -1;
> + }
> +
> + return rc;
> +}
> +
> +static void
> +vu_blk_flush(vhost_blk_request_t *req)
> +{
> + vhost_blk_dev_t *vdev_blk = req->vdev_blk;
> +
> + if (vdev_blk->blk_fd) {
> + fsync(vdev_blk->blk_fd);
> + }
> +}
> +
> +
> +static int vu_virtio_blk_process_req(vhost_blk_dev_t *vdev_blk,
> + VuVirtq *vq)
> +{
> + VuVirtqElement *elem;
> + uint32_t type;
> + unsigned in_num;
> + unsigned out_num;
> + vhost_blk_request_t *req;
> +
> + elem = vu_queue_pop(&vdev_blk->vu_dev, vq, sizeof(VuVirtqElement));
> + if (!elem) {
> + return -1;
> + }
> +
> + /* refer to hw/block/virtio_blk.c */
> + if (elem->out_num < 1 || elem->in_num < 1) {
> + fprintf(stderr, "virtio-blk request missing headers\n");
> + free(elem);
> + return -1;
> + }
> +
> + req = calloc(1, sizeof(*req));
> + assert(req);
> + req->vdev_blk = vdev_blk;
> + req->vq = vq;
> + req->elem = elem;
> +
> + in_num = elem->in_num;
> + out_num = elem->out_num;
> +
> + /* don't support VIRTIO_F_ANY_LAYOUT and virtio 1.0 only */
> + if (elem->out_sg[0].iov_len < sizeof(struct virtio_blk_outhdr)) {
> + fprintf(stderr, "Invalid outhdr size\n");
> + goto err;
> + }
> + req->out = (struct virtio_blk_outhdr *)elem->out_sg[0].iov_base;
> + out_num--;
> +
> + if (elem->in_sg[in_num - 1].iov_len < sizeof(struct virtio_blk_inhdr)) {
> + fprintf(stderr, "Invalid inhdr size\n");
> + goto err;
> + }
> + req->in = (struct virtio_blk_inhdr *)elem->in_sg[in_num - 1].iov_base;
> + in_num--;
> +
> + type = le32_to_cpu(req->out->type);
> + switch (type & ~(VIRTIO_BLK_T_OUT | VIRTIO_BLK_T_BARRIER)) {
> + case VIRTIO_BLK_T_IN: {
> + ssize_t ret = 0;
> + bool is_write = type & VIRTIO_BLK_T_OUT;
> + req->sector_num = le64_to_cpu(req->out->sector);
> + if (is_write) {
> + ret = vu_blk_writev(req, &elem->out_sg[1], out_num);
> + } else {
> + ret = vu_blk_readv(req, &elem->in_sg[0], in_num);
> + }
> + if (ret >= 0) {
> + req->in->status = VIRTIO_BLK_S_OK;
> + } else {
> + req->in->status = VIRTIO_BLK_S_IOERR;
> + }
> + vu_blk_req_complete(req);
> + break;
> + }
> + case VIRTIO_BLK_T_FLUSH: {
> + vu_blk_flush(req);
> + req->in->status = VIRTIO_BLK_S_OK;
> + vu_blk_req_complete(req);
> + break;
> + }
> + case VIRTIO_BLK_T_GET_ID: {
> + size_t size = MIN(vu_blk_iov_size(&elem->in_sg[0], in_num),
> + VIRTIO_BLK_ID_BYTES);
> + snprintf(elem->in_sg[0].iov_base, size, "%s", "vhost_user_blk");
> + req->in->status = VIRTIO_BLK_S_OK;
> + req->size = elem->in_sg[0].iov_len;
> + vu_blk_req_complete(req);
> + break;
> + }
> + default: {
> + req->in->status = VIRTIO_BLK_S_UNSUPP;
> + vu_blk_req_complete(req);
> + break;
> + }
> + }
> +
> + return 0;
> +
> +err:
> + free(elem);
> + free(req);
> + return -1;
> +}
> +
> +static void vu_blk_process_vq(VuDev *vu_dev, int idx)
> +{
> + vhost_blk_dev_t *vdev_blk;
> + VuVirtq *vq;
> + int ret;
> +
> + if ((idx < 0) || (idx >= VHOST_MAX_NR_VIRTQUEUE)) {
> + fprintf(stderr, "VQ Index out of range: %d\n", idx);
> + vu_blk_panic_cb(vu_dev, NULL);
> + return;
> + }
> +
> + vdev_blk = container_of(vu_dev, vhost_blk_dev_t, vu_dev);
> + assert(vdev_blk);
> +
> + vq = vu_get_queue(vu_dev, idx);
> + assert(vq);
> +
> + while (1) {
> + ret = vu_virtio_blk_process_req(vdev_blk, vq);
> + if (ret) {
> + break;
> + }
> + }
> +}
> +
> +static void vu_blk_queue_set_started(VuDev *vu_dev, int idx, bool started)
> +{
> + VuVirtq *vq;
> +
> + assert(vu_dev);
> +
> + if ((idx < 0) || (idx >= VHOST_MAX_NR_VIRTQUEUE)) {
> + fprintf(stderr, "VQ Index out of range: %d\n", idx);
> + vu_blk_panic_cb(vu_dev, NULL);
> + return;
> + }
> +
> + vq = vu_get_queue(vu_dev, idx);
> + vu_set_queue_handler(vu_dev, vq, started ? vu_blk_process_vq : NULL);
> +}
> +
> +static uint64_t
> +vu_blk_get_features(VuDev *dev)
> +{
> + return 1ull << VIRTIO_BLK_F_SIZE_MAX |
> + 1ull << VIRTIO_BLK_F_SEG_MAX |
> + 1ull << VIRTIO_BLK_F_TOPOLOGY |
> + 1ull << VIRTIO_BLK_F_BLK_SIZE |
> + 1ull << VIRTIO_F_VERSION_1 |
> + 1ull << VHOST_USER_F_PROTOCOL_FEATURES;
> +}
> +
> +static int
> +vu_blk_get_config(VuDev *vu_dev, uint8_t *config, size_t len)
> +{
> + vhost_blk_dev_t *vdev_blk;
> +
> + if (len != sizeof(struct virtio_blk_config)) {
> + return -1;
> + }
> + vdev_blk = container_of(vu_dev, vhost_blk_dev_t, vu_dev);
> + memcpy(config, &vdev_blk->blkcfg, len);
> +
> + return 0;
> +}
> +
> +static const VuDevIface vu_blk_iface = {
> + .get_features = vu_blk_get_features,
> + .queue_set_started = vu_blk_queue_set_started,
> + .get_config = vu_blk_get_config,
> +};
> +
> +static gboolean vu_blk_vhost_cb(gpointer data)
> +{
> + VuDev *vu_dev = (VuDev *)data;
> +
> + assert(vu_dev);
> +
> + if (!vu_dispatch(vu_dev) != 0) {
> + fprintf(stderr, "Error processing vhost message\n");
> + vu_blk_panic_cb(vu_dev, NULL);
> + return G_SOURCE_REMOVE;
> + }
> +
> + return G_SOURCE_CONTINUE;
> +}
> +
> +static int unix_sock_new(char *unix_fn)
> +{
> + int sock;
> + struct sockaddr_un un;
> + size_t len;
> +
> + assert(unix_fn);
> +
> + sock = socket(AF_UNIX, SOCK_STREAM, 0);
> + if (sock <= 0) {
> + perror("socket");
> + return -1;
> + }
> +
> + un.sun_family = AF_UNIX;
> + (void)snprintf(un.sun_path, sizeof(un.sun_path), "%s", unix_fn);
> + len = sizeof(un.sun_family) + strlen(un.sun_path);
> +
> + (void)unlink(unix_fn);
> + if (bind(sock, (struct sockaddr *)&un, len) < 0) {
> + perror("bind");
> + goto fail;
> + }
> +
> + if (listen(sock, 1) < 0) {
> + perror("listen");
> + goto fail;
> + }
> +
> + return sock;
> +
> +fail:
> + (void)close(sock);
> +
> + return -1;
> +}
> +
> +static int vdev_blk_run(struct vhost_blk_dev *vdev_blk)
> +{
> + int cli_sock;
> + int ret = 0;
> +
> + assert(vdev_blk);
> + assert(vdev_blk->server_sock >= 0);
> + assert(vdev_blk->loop);
> +
> + cli_sock = accept(vdev_blk->server_sock, (void *)0, (void *)0);
> + if (cli_sock < 0) {
> + perror("accept");
> + return -1;
> + }
> +
> + vu_init(&vdev_blk->vu_dev,
> + cli_sock,
> + vu_blk_panic_cb,
> + vu_blk_add_watch_cb,
> + vu_blk_del_watch_cb,
> + &vu_blk_iface);
> +
> + if (vu_blk_gsrc_new(vdev_blk, cli_sock, G_IO_IN, NULL, vu_blk_vhost_cb,
> + &vdev_blk->vu_dev)) {
> + ret = -1;
> + goto out;
> + }
> +
> + g_main_loop_run(vdev_blk->loop);
> +
> +out:
> + vu_deinit(&vdev_blk->vu_dev);
> + return ret;
> +}
> +
> +static void vdev_blk_deinit(struct vhost_blk_dev *vdev_blk)
> +{
> + if (!vdev_blk) {
> + return;
> + }
> +
> + if (vdev_blk->server_sock >= 0) {
> + struct sockaddr_storage ss;
> + socklen_t sslen = sizeof(ss);
> +
> + if (getsockname(vdev_blk->server_sock, (struct sockaddr *)&ss,
> + &sslen) == 0) {
> + struct sockaddr_un *su = (struct sockaddr_un *)&ss;
> + (void)unlink(su->sun_path);
> + }
> +
> + (void)close(vdev_blk->server_sock);
> + vdev_blk->server_sock = -1;
> + }
> +
> + if (vdev_blk->loop) {
> + g_main_loop_unref(vdev_blk->loop);
> + vdev_blk->loop = NULL;
> + }
> +
> + if (vdev_blk->blk_fd) {
> + vu_blk_close(vdev_blk->blk_fd);
> + }
> +}
> +
> +static void
> +vu_blk_initialize_config(int fd, struct virtio_blk_config *config)
> +{
> + off64_t capacity;
> +
> + capacity = lseek64(fd, 0, SEEK_END);
> + config->capacity = capacity >> 9;
> + config->blk_size = 512;
> + config->size_max = 65536;
> + config->seg_max = 128 - 2;
> + config->min_io_size = 1;
> + config->opt_io_size = 1;
> + config->num_queues = 1;
> +}
> +
> +static vhost_blk_dev_t *
> +vdev_blk_new(char *unix_fn, char *blk_file)
> +{
> + vhost_blk_dev_t *vdev_blk = NULL;
> +
> + vdev_blk = calloc(1, sizeof(struct vhost_blk_dev));
> + if (!vdev_blk) {
> + fprintf(stderr, "calloc: %s", strerror(errno));
> + return NULL;
> + }
> +
> + vdev_blk->server_sock = unix_sock_new(unix_fn);
> + if (vdev_blk->server_sock < 0) {
> + goto err;
> + }
> +
> + vdev_blk->loop = g_main_loop_new(NULL, FALSE);
> + if (!vdev_blk->loop) {
> + fprintf(stderr, "Error creating glib event loop");
> + goto err;
> + }
> +
> + vdev_blk->fdmap = g_tree_new(vu_blk_fdmap_compare);
> + if (!vdev_blk->fdmap) {
> + fprintf(stderr, "Error creating glib tree for fdmap");
> + goto err;
> + }
> +
> + vdev_blk->blk_fd = vu_blk_open(blk_file);
> + if (vdev_blk->blk_fd < 0) {
> + fprintf(stderr, "Error open block device %s\n", blk_file);
> + goto err;
> + }
> + vdev_blk->blk_name = blk_file;
> +
> + /* fill virtio_blk_config with block parameters */
> + vu_blk_initialize_config(vdev_blk->blk_fd, &vdev_blk->blkcfg);
> +
> + return vdev_blk;
> +
> +err:
> + vdev_blk_deinit(vdev_blk);
> + free(vdev_blk);
> +
> + return NULL;
> +}
> +
> +int main(int argc, char **argv)
> +{
> + int opt;
> + char *unix_socket = NULL;
> + char *blk_file = NULL;
> + vhost_blk_dev_t *vdev_blk = NULL;
> +
> + while ((opt = getopt(argc, argv, "b:h:s:")) != -1) {
> + switch (opt) {
> + case 'b':
> + blk_file = strdup(optarg);
> + break;
> + case 's':
> + unix_socket = strdup(optarg);
> + break;
> + case 'h':
> + default:
> + printf("Usage: %s [-b block device or file, -s UNIX domain
> socket]"
> + " | [ -h ]\n", argv[0]);
> + break;
> + }
> + }
> +
> + if (!unix_socket || !blk_file) {
> + printf("Usage: %s [-b block device or file, -s UNIX domain socket]
> |"
> + " [ -h ]\n", argv[0]);
> + return -1;
> + }
> +
> + vdev_blk = vdev_blk_new(unix_socket, blk_file);
> + if (!vdev_blk) {
> + goto err;
> + }
> +
> + if (vdev_blk_run(vdev_blk) != 0) {
> + goto err;
> + }
> +
> +err:
> + if (vdev_blk) {
> + vdev_blk_deinit(vdev_blk);
> + free(vdev_blk);
> + }
> + if (unix_socket) {
> + free(unix_socket);
> + }
> + if (blk_file) {
> + free(blk_file);
> + }
> +
> + return 0;
> +}
So beside the glib-style issues shared with v-u-s, it has overall same code quality to me, so it may be acceptable as is for contrib/. However, I would like to improve v-u-s before to avoid duplicating effort.
^ permalink raw reply [flat|nested] 13+ messages in thread