qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] Fix vhost reconnect issues
@ 2023-08-04  5:29 Li Feng
  2023-08-04  5:29 ` [PATCH 1/2] vhost-user: fix lost reconnect Li Feng
                   ` (3 more replies)
  0 siblings, 4 replies; 27+ messages in thread
From: Li Feng @ 2023-08-04  5:29 UTC (permalink / raw)
  To: Paolo Bonzini, Fam Zheng, Michael S. Tsirkin, Raphael Norwitz,
	open list:All patches CC here
  Cc: Li Feng

The patchset fixes the regression issue of vhost reconnect.
It's a serious bug that the vhost-user will lose the reconnect forever.

The 2nd patch enhances the error handle of vhost-user-scsi.

This patchset's parent commit is:
https://lore.kernel.org/all/20230731121018.2856310-1-fengli@smartx.com/

Li Feng (2):
  vhost-user: fix lost reconnect
  vhost: Add Error parameter to vhost_scsi_common_start()

 hw/scsi/vhost-scsi-common.c           | 17 ++++++++++-------
 hw/scsi/vhost-scsi.c                  |  5 +++--
 hw/scsi/vhost-user-scsi.c             | 14 ++++++++------
 hw/virtio/vhost-user.c                | 10 +---------
 include/hw/virtio/vhost-scsi-common.h |  2 +-
 5 files changed, 23 insertions(+), 25 deletions(-)

-- 
2.41.0



^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 1/2] vhost-user: fix lost reconnect
  2023-08-04  5:29 [PATCH 0/2] Fix vhost reconnect issues Li Feng
@ 2023-08-04  5:29 ` Li Feng
  2023-08-14 12:11   ` Raphael Norwitz
  2023-08-04  5:29 ` [PATCH 2/2] vhost: Add Error parameter to vhost_scsi_common_start() Li Feng
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 27+ messages in thread
From: Li Feng @ 2023-08-04  5:29 UTC (permalink / raw)
  To: Michael S. Tsirkin, Paolo Bonzini, Fam Zheng, Raphael Norwitz,
	open list:All patches CC here
  Cc: Li Feng

When the vhost-user is reconnecting to the backend, and if the vhost-user fails
at the get_features in vhost_dev_init(), then the reconnect will fail
and it will not be retriggered forever.

The reason is:
When the vhost-user fail at get_features, the vhost_dev_cleanup will be called
immediately.

vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'.

The reconnect path is:
vhost_user_blk_event
   vhost_user_async_close(.. vhost_user_blk_disconnect ..)
     qemu_chr_fe_set_handlers <----- clear the notifier callback
       schedule vhost_user_async_close_bh

The vhost->vdev is null, so the vhost_user_blk_disconnect will not be
called, then the event fd callback will not be reinstalled.

With this patch, the vhost_user_blk_disconnect will call the
vhost_dev_cleanup() again, it's safe.

All vhost-user devices have this issue, including vhost-user-blk/scsi.

Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling")

Signed-off-by: Li Feng <fengli@smartx.com>
---
 hw/virtio/vhost-user.c | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 8dcf049d42..697b403fe2 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -2648,16 +2648,8 @@ typedef struct {
 static void vhost_user_async_close_bh(void *opaque)
 {
     VhostAsyncCallback *data = opaque;
-    struct vhost_dev *vhost = data->vhost;
 
-    /*
-     * If the vhost_dev has been cleared in the meantime there is
-     * nothing left to do as some other path has completed the
-     * cleanup.
-     */
-    if (vhost->vdev) {
-        data->cb(data->dev);
-    }
+    data->cb(data->dev);
 
     g_free(data);
 }
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 2/2] vhost: Add Error parameter to vhost_scsi_common_start()
  2023-08-04  5:29 [PATCH 0/2] Fix vhost reconnect issues Li Feng
  2023-08-04  5:29 ` [PATCH 1/2] vhost-user: fix lost reconnect Li Feng
@ 2023-08-04  5:29 ` Li Feng
  2023-08-14 12:11   ` Raphael Norwitz
  2023-08-24  7:41 ` [PATCH v2 0/2] Fix vhost reconnect issues Li Feng
  2023-08-30  4:57 ` [PATCH v3 0/2] Fix vhost reconnect issues Li Feng
  3 siblings, 1 reply; 27+ messages in thread
From: Li Feng @ 2023-08-04  5:29 UTC (permalink / raw)
  To: Michael S. Tsirkin, Paolo Bonzini, Fam Zheng, Raphael Norwitz,
	open list:All patches CC here
  Cc: Li Feng

Add a Error parameter to report the real error, like vhost-user-blk.

Signed-off-by: Li Feng <fengli@smartx.com>
---
 hw/scsi/vhost-scsi-common.c           | 17 ++++++++++-------
 hw/scsi/vhost-scsi.c                  |  5 +++--
 hw/scsi/vhost-user-scsi.c             | 14 ++++++++------
 include/hw/virtio/vhost-scsi-common.h |  2 +-
 4 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/hw/scsi/vhost-scsi-common.c b/hw/scsi/vhost-scsi-common.c
index a61cd0e907..392587dfb5 100644
--- a/hw/scsi/vhost-scsi-common.c
+++ b/hw/scsi/vhost-scsi-common.c
@@ -16,6 +16,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qapi/error.h"
 #include "qemu/error-report.h"
 #include "qemu/module.h"
 #include "hw/virtio/vhost.h"
@@ -25,7 +26,7 @@
 #include "hw/virtio/virtio-access.h"
 #include "hw/fw-path-provider.h"
 
-int vhost_scsi_common_start(VHostSCSICommon *vsc)
+int vhost_scsi_common_start(VHostSCSICommon *vsc, Error **errp)
 {
     int ret, i;
     VirtIODevice *vdev = VIRTIO_DEVICE(vsc);
@@ -35,18 +36,19 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
     VirtIOSCSICommon *vs = (VirtIOSCSICommon *)vsc;
 
     if (!k->set_guest_notifiers) {
-        error_report("binding does not support guest notifiers");
+        error_setg(errp, "binding does not support guest notifiers");
         return -ENOSYS;
     }
 
     ret = vhost_dev_enable_notifiers(&vsc->dev, vdev);
     if (ret < 0) {
+        error_setg_errno(errp, -ret, "Error enabling host notifiers");
         return ret;
     }
 
     ret = k->set_guest_notifiers(qbus->parent, vsc->dev.nvqs, true);
     if (ret < 0) {
-        error_report("Error binding guest notifier");
+        error_setg_errno(errp, -ret, "Error binding guest notifier");
         goto err_host_notifiers;
     }
 
@@ -54,7 +56,7 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
 
     ret = vhost_dev_prepare_inflight(&vsc->dev, vdev);
     if (ret < 0) {
-        error_report("Error setting inflight format: %d", -ret);
+        error_setg_errno(errp, -ret, "Error setting inflight format: %d", -ret);
         goto err_guest_notifiers;
     }
 
@@ -64,21 +66,22 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
                                         vs->conf.virtqueue_size,
                                         vsc->inflight);
             if (ret < 0) {
-                error_report("Error getting inflight: %d", -ret);
+                error_setg_errno(errp, -ret, "Error getting inflight: %d",
+                                 -ret);
                 goto err_guest_notifiers;
             }
         }
 
         ret = vhost_dev_set_inflight(&vsc->dev, vsc->inflight);
         if (ret < 0) {
-            error_report("Error setting inflight: %d", -ret);
+            error_setg_errno(errp, -ret, "Error setting inflight: %d", -ret);
             goto err_guest_notifiers;
         }
     }
 
     ret = vhost_dev_start(&vsc->dev, vdev, true);
     if (ret < 0) {
-        error_report("Error start vhost dev");
+        error_setg_errno(errp, -ret, "Error start vhost dev");
         goto err_guest_notifiers;
     }
 
diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
index 443f67daa4..01a3ab4277 100644
--- a/hw/scsi/vhost-scsi.c
+++ b/hw/scsi/vhost-scsi.c
@@ -75,6 +75,7 @@ static int vhost_scsi_start(VHostSCSI *s)
     int ret, abi_version;
     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
     const VhostOps *vhost_ops = vsc->dev.vhost_ops;
+    Error *local_err = NULL;
 
     ret = vhost_ops->vhost_scsi_get_abi_version(&vsc->dev, &abi_version);
     if (ret < 0) {
@@ -88,14 +89,14 @@ static int vhost_scsi_start(VHostSCSI *s)
         return -ENOSYS;
     }
 
-    ret = vhost_scsi_common_start(vsc);
+    ret = vhost_scsi_common_start(vsc, &local_err);
     if (ret < 0) {
         return ret;
     }
 
     ret = vhost_scsi_set_endpoint(s);
     if (ret < 0) {
-        error_report("Error setting vhost-scsi endpoint");
+        error_reportf_err(local_err, "Error setting vhost-scsi endpoint");
         vhost_scsi_common_stop(vsc);
     }
 
diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
index a7fa8e8df2..d368171e28 100644
--- a/hw/scsi/vhost-user-scsi.c
+++ b/hw/scsi/vhost-user-scsi.c
@@ -43,12 +43,12 @@ enum VhostUserProtocolFeature {
     VHOST_USER_PROTOCOL_F_RESET_DEVICE = 13,
 };
 
-static int vhost_user_scsi_start(VHostUserSCSI *s)
+static int vhost_user_scsi_start(VHostUserSCSI *s, Error **errp)
 {
     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
     int ret;
 
-    ret = vhost_scsi_common_start(vsc);
+    ret = vhost_scsi_common_start(vsc, errp);
     s->started_vu = (ret < 0 ? false : true);
 
     return ret;
@@ -73,6 +73,7 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
     VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(dev);
     bool should_start = virtio_device_should_start(vdev, status);
+    Error *local_err = NULL;
     int ret;
 
     if (!s->connected) {
@@ -84,9 +85,10 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
     }
 
     if (should_start) {
-        ret = vhost_user_scsi_start(s);
+        ret = vhost_user_scsi_start(s, &local_err);
         if (ret < 0) {
-            error_report("unable to start vhost-user-scsi: %s", strerror(-ret));
+            error_reportf_err(local_err, "unable to start vhost-user-scsi: %s",
+                              strerror(-ret));
             qemu_chr_fe_disconnect(&vs->conf.chardev);
         }
     } else {
@@ -139,7 +141,7 @@ static void vhost_user_scsi_handle_output(VirtIODevice *vdev, VirtQueue *vq)
      * Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
      * vhost here instead of waiting for .set_status().
      */
-    ret = vhost_user_scsi_start(s);
+    ret = vhost_user_scsi_start(s, &local_err);
     if (ret < 0) {
         error_reportf_err(local_err, "vhost-user-scsi: vhost start failed: ");
         qemu_chr_fe_disconnect(&vs->conf.chardev);
@@ -184,7 +186,7 @@ static int vhost_user_scsi_connect(DeviceState *dev, Error **errp)
 
     /* restore vhost state */
     if (virtio_device_started(vdev, vdev->status)) {
-        ret = vhost_user_scsi_start(s);
+        ret = vhost_user_scsi_start(s, errp);
         if (ret < 0) {
             return ret;
         }
diff --git a/include/hw/virtio/vhost-scsi-common.h b/include/hw/virtio/vhost-scsi-common.h
index 18f115527c..c5d2c09455 100644
--- a/include/hw/virtio/vhost-scsi-common.h
+++ b/include/hw/virtio/vhost-scsi-common.h
@@ -39,7 +39,7 @@ struct VHostSCSICommon {
     struct vhost_inflight *inflight;
 };
 
-int vhost_scsi_common_start(VHostSCSICommon *vsc);
+int vhost_scsi_common_start(VHostSCSICommon *vsc, Error **errp);
 void vhost_scsi_common_stop(VHostSCSICommon *vsc);
 char *vhost_scsi_common_get_fw_dev_path(FWPathProvider *p, BusState *bus,
                                         DeviceState *dev);
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] vhost-user: fix lost reconnect
  2023-08-04  5:29 ` [PATCH 1/2] vhost-user: fix lost reconnect Li Feng
@ 2023-08-14 12:11   ` Raphael Norwitz
  2023-08-17  6:40     ` Li Feng
  0 siblings, 1 reply; 27+ messages in thread
From: Raphael Norwitz @ 2023-08-14 12:11 UTC (permalink / raw)
  To: Li Feng
  Cc: Michael S. Tsirkin, Paolo Bonzini, Fam Zheng,
	open list:All patches CC here

Why can’t we rather fix this by adding a “event_cb” param to vhost_user_async_close and then call qemu_chr_fe_set_handlers in vhost_user_async_close_bh()?

Even if calling vhost_dev_cleanup() twice is safe today I worry future changes may easily stumble over the reconnect case and introduce crashes or double frees.


> On Aug 4, 2023, at 1:29 AM, Li Feng <fengli@smartx.com> wrote:
> 
> When the vhost-user is reconnecting to the backend, and if the vhost-user fails
> at the get_features in vhost_dev_init(), then the reconnect will fail
> and it will not be retriggered forever.
> 
> The reason is:
> When the vhost-user fail at get_features, the vhost_dev_cleanup will be called
> immediately.
> 
> vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'.
> 
> The reconnect path is:
> vhost_user_blk_event
>   vhost_user_async_close(.. vhost_user_blk_disconnect ..)
>     qemu_chr_fe_set_handlers <----- clear the notifier callback
>       schedule vhost_user_async_close_bh
> 
> The vhost->vdev is null, so the vhost_user_blk_disconnect will not be
> called, then the event fd callback will not be reinstalled.
> 
> With this patch, the vhost_user_blk_disconnect will call the
> vhost_dev_cleanup() again, it's safe.
> 
> All vhost-user devices have this issue, including vhost-user-blk/scsi.
> 
> Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling")
> 
> Signed-off-by: Li Feng <fengli@smartx.com>
> ---
> hw/virtio/vhost-user.c | 10 +---------
> 1 file changed, 1 insertion(+), 9 deletions(-)
> 
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 8dcf049d42..697b403fe2 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -2648,16 +2648,8 @@ typedef struct {
> static void vhost_user_async_close_bh(void *opaque)
> {
>     VhostAsyncCallback *data = opaque;
> -    struct vhost_dev *vhost = data->vhost;
> 
> -    /*
> -     * If the vhost_dev has been cleared in the meantime there is
> -     * nothing left to do as some other path has completed the
> -     * cleanup.
> -     */
> -    if (vhost->vdev) {
> -        data->cb(data->dev);
> -    }
> +    data->cb(data->dev);
> 
>     g_free(data);
> }
> -- 
> 2.41.0
> 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] vhost: Add Error parameter to vhost_scsi_common_start()
  2023-08-04  5:29 ` [PATCH 2/2] vhost: Add Error parameter to vhost_scsi_common_start() Li Feng
@ 2023-08-14 12:11   ` Raphael Norwitz
  2023-08-17  6:49     ` Li Feng
  0 siblings, 1 reply; 27+ messages in thread
From: Raphael Norwitz @ 2023-08-14 12:11 UTC (permalink / raw)
  To: Li Feng
  Cc: Michael S. Tsirkin, Paolo Bonzini, Fam Zheng,
	open list:All patches CC here

Thanks for the cleanup! A few comments.

> On Aug 4, 2023, at 1:29 AM, Li Feng <fengli@smartx.com> wrote:
> 
> Add a Error parameter to report the real error, like vhost-user-blk.
> 
> Signed-off-by: Li Feng <fengli@smartx.com>
> ---
> hw/scsi/vhost-scsi-common.c           | 17 ++++++++++-------
> hw/scsi/vhost-scsi.c                  |  5 +++--
> hw/scsi/vhost-user-scsi.c             | 14 ++++++++------
> include/hw/virtio/vhost-scsi-common.h |  2 +-
> 4 files changed, 22 insertions(+), 16 deletions(-)
> 
> diff --git a/hw/scsi/vhost-scsi-common.c b/hw/scsi/vhost-scsi-common.c
> index a61cd0e907..392587dfb5 100644
> --- a/hw/scsi/vhost-scsi-common.c
> +++ b/hw/scsi/vhost-scsi-common.c
> @@ -16,6 +16,7 @@
>  */
> 
> #include "qemu/osdep.h"
> +#include "qapi/error.h"
> #include "qemu/error-report.h"
> #include "qemu/module.h"
> #include "hw/virtio/vhost.h"
> @@ -25,7 +26,7 @@
> #include "hw/virtio/virtio-access.h"
> #include "hw/fw-path-provider.h"
> 
> -int vhost_scsi_common_start(VHostSCSICommon *vsc)
> +int vhost_scsi_common_start(VHostSCSICommon *vsc, Error **errp)
> {
>     int ret, i;
>     VirtIODevice *vdev = VIRTIO_DEVICE(vsc);
> @@ -35,18 +36,19 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
>     VirtIOSCSICommon *vs = (VirtIOSCSICommon *)vsc;
> 
>     if (!k->set_guest_notifiers) {
> -        error_report("binding does not support guest notifiers");
> +        error_setg(errp, "binding does not support guest notifiers");
>         return -ENOSYS;
>     }
> 
>     ret = vhost_dev_enable_notifiers(&vsc->dev, vdev);
>     if (ret < 0) {
> +        error_setg_errno(errp, -ret, "Error enabling host notifiers");
>         return ret;
>     }
> 
>     ret = k->set_guest_notifiers(qbus->parent, vsc->dev.nvqs, true);
>     if (ret < 0) {
> -        error_report("Error binding guest notifier");
> +        error_setg_errno(errp, -ret, "Error binding guest notifier");
>         goto err_host_notifiers;
>     }
> 
> @@ -54,7 +56,7 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
> 
>     ret = vhost_dev_prepare_inflight(&vsc->dev, vdev);
>     if (ret < 0) {
> -        error_report("Error setting inflight format: %d", -ret);

Curious why you’re adding the error value to the string. Isn’t it redundant since we pass it in as the second param?

> +        error_setg_errno(errp, -ret, "Error setting inflight format: %d", -ret);
>         goto err_guest_notifiers;
>     }
> 
> @@ -64,21 +66,22 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
>                                         vs->conf.virtqueue_size,
>                                         vsc->inflight);
>             if (ret < 0) {
> -                error_report("Error getting inflight: %d", -ret);

Ditto

> +                error_setg_errno(errp, -ret, "Error getting inflight: %d",
> +                                 -ret);
>                 goto err_guest_notifiers;
>             }
>         }
> 
>         ret = vhost_dev_set_inflight(&vsc->dev, vsc->inflight);
>         if (ret < 0) {
> -            error_report("Error setting inflight: %d", -ret);
> +            error_setg_errno(errp, -ret, "Error setting inflight: %d", -ret);
>             goto err_guest_notifiers;
>         }
>     }
> 
>     ret = vhost_dev_start(&vsc->dev, vdev, true);
>     if (ret < 0) {
> -        error_report("Error start vhost dev");

“Error starting vhost dev”?

> +        error_setg_errno(errp, -ret, "Error start vhost dev");
>         goto err_guest_notifiers;
>     }
> 
> diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
> index 443f67daa4..01a3ab4277 100644
> --- a/hw/scsi/vhost-scsi.c
> +++ b/hw/scsi/vhost-scsi.c
> @@ -75,6 +75,7 @@ static int vhost_scsi_start(VHostSCSI *s)
>     int ret, abi_version;
>     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
>     const VhostOps *vhost_ops = vsc->dev.vhost_ops;
> +    Error *local_err = NULL;
> 
>     ret = vhost_ops->vhost_scsi_get_abi_version(&vsc->dev, &abi_version);
>     if (ret < 0) {
> @@ -88,14 +89,14 @@ static int vhost_scsi_start(VHostSCSI *s)
>         return -ENOSYS;
>     }
> 
> -    ret = vhost_scsi_common_start(vsc);
> +    ret = vhost_scsi_common_start(vsc, &local_err);
>     if (ret < 0) {
>         return ret;
>     }
> 
>     ret = vhost_scsi_set_endpoint(s);
>     if (ret < 0) {
> -        error_report("Error setting vhost-scsi endpoint");
> +        error_reportf_err(local_err, "Error setting vhost-scsi endpoint");
>         vhost_scsi_common_stop(vsc);
>     }
> 
> diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
> index a7fa8e8df2..d368171e28 100644
> --- a/hw/scsi/vhost-user-scsi.c
> +++ b/hw/scsi/vhost-user-scsi.c
> @@ -43,12 +43,12 @@ enum VhostUserProtocolFeature {
>     VHOST_USER_PROTOCOL_F_RESET_DEVICE = 13,
> };
> 
> -static int vhost_user_scsi_start(VHostUserSCSI *s)
> +static int vhost_user_scsi_start(VHostUserSCSI *s, Error **errp)
> {
>     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
>     int ret;
> 
> -    ret = vhost_scsi_common_start(vsc);
> +    ret = vhost_scsi_common_start(vsc, errp);
>     s->started_vu = (ret < 0 ? false : true);
> 
>     return ret;
> @@ -73,6 +73,7 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
>     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
>     VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(dev);
>     bool should_start = virtio_device_should_start(vdev, status);
> +    Error *local_err = NULL;
>     int ret;
> 
>     if (!s->connected) {
> @@ -84,9 +85,10 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
>     }
> 
>     if (should_start) {
> -        ret = vhost_user_scsi_start(s);
> +        ret = vhost_user_scsi_start(s, &local_err);
>         if (ret < 0) {
> -            error_report("unable to start vhost-user-scsi: %s", strerror(-ret));
> +            error_reportf_err(local_err, "unable to start vhost-user-scsi: %s",
> +                              strerror(-ret));
>             qemu_chr_fe_disconnect(&vs->conf.chardev);
>         }
>     } else {
> @@ -139,7 +141,7 @@ static void vhost_user_scsi_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>      * Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
>      * vhost here instead of waiting for .set_status().
>      */
> -    ret = vhost_user_scsi_start(s);
> +    ret = vhost_user_scsi_start(s, &local_err);
>     if (ret < 0) {
>         error_reportf_err(local_err, "vhost-user-scsi: vhost start failed: ");
>         qemu_chr_fe_disconnect(&vs->conf.chardev);
> @@ -184,7 +186,7 @@ static int vhost_user_scsi_connect(DeviceState *dev, Error **errp)
> 
>     /* restore vhost state */
>     if (virtio_device_started(vdev, vdev->status)) {
> -        ret = vhost_user_scsi_start(s);
> +        ret = vhost_user_scsi_start(s, errp);
>         if (ret < 0) {
>             return ret;
>         }
> diff --git a/include/hw/virtio/vhost-scsi-common.h b/include/hw/virtio/vhost-scsi-common.h
> index 18f115527c..c5d2c09455 100644
> --- a/include/hw/virtio/vhost-scsi-common.h
> +++ b/include/hw/virtio/vhost-scsi-common.h
> @@ -39,7 +39,7 @@ struct VHostSCSICommon {
>     struct vhost_inflight *inflight;
> };
> 
> -int vhost_scsi_common_start(VHostSCSICommon *vsc);
> +int vhost_scsi_common_start(VHostSCSICommon *vsc, Error **errp);
> void vhost_scsi_common_stop(VHostSCSICommon *vsc);
> char *vhost_scsi_common_get_fw_dev_path(FWPathProvider *p, BusState *bus,
>                                         DeviceState *dev);
> -- 
> 2.41.0
> 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] vhost-user: fix lost reconnect
  2023-08-14 12:11   ` Raphael Norwitz
@ 2023-08-17  6:40     ` Li Feng
  2023-08-22  0:38       ` Raphael Norwitz
  0 siblings, 1 reply; 27+ messages in thread
From: Li Feng @ 2023-08-17  6:40 UTC (permalink / raw)
  To: Raphael Norwitz
  Cc: Michael S. Tsirkin, Paolo Bonzini, Fam Zheng,
	open list:All patches CC here


> 2023年8月14日 下午8:11,Raphael Norwitz <raphael.norwitz@nutanix.com> 写道:
> 
> Why can’t we rather fix this by adding a “event_cb” param to vhost_user_async_close and then call qemu_chr_fe_set_handlers in vhost_user_async_close_bh()?
> 
> Even if calling vhost_dev_cleanup() twice is safe today I worry future changes may easily stumble over the reconnect case and introduce crashes or double frees.
> 
I think add a new event_cb is not good enough. ‘qemu_chr_fe_set_handlers’ has been called in vhost_user_async_close, and will be called in event->cb, so why need add a
new event_cb?
 
For avoiding to call the vhost_dev_cleanup() twice, add a ‘inited’ in struct vhost-dev to mark if it’s inited like this:

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index e2f6ffb446..edc80c0231 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1502,6 +1502,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
         goto fail_busyloop;
     }

+    hdev->inited = true;
     return 0;

 fail_busyloop:
@@ -1520,6 +1521,10 @@ void vhost_dev_cleanup(struct vhost_dev *hdev)
 {
     int i;

+    if (!hdev->inited) {
+        return;
+    }
+    hdev->inited = false;
     trace_vhost_dev_cleanup(hdev);

     for (i = 0; i < hdev->nvqs; ++i) {
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index ca3131b1af..74b1aec960 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -123,6 +123,7 @@ struct vhost_dev {
     /* @started: is the vhost device started? */
     bool started;
     bool log_enabled;
+    bool inited;
     uint64_t log_size;
     Error *migration_blocker;
     const VhostOps *vhost_ops;

Thanks.

> 
>> On Aug 4, 2023, at 1:29 AM, Li Feng <fengli@smartx.com> wrote:
>> 
>> When the vhost-user is reconnecting to the backend, and if the vhost-user fails
>> at the get_features in vhost_dev_init(), then the reconnect will fail
>> and it will not be retriggered forever.
>> 
>> The reason is:
>> When the vhost-user fail at get_features, the vhost_dev_cleanup will be called
>> immediately.
>> 
>> vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'.
>> 
>> The reconnect path is:
>> vhost_user_blk_event
>>  vhost_user_async_close(.. vhost_user_blk_disconnect ..)
>>    qemu_chr_fe_set_handlers <----- clear the notifier callback
>>      schedule vhost_user_async_close_bh
>> 
>> The vhost->vdev is null, so the vhost_user_blk_disconnect will not be
>> called, then the event fd callback will not be reinstalled.
>> 
>> With this patch, the vhost_user_blk_disconnect will call the
>> vhost_dev_cleanup() again, it's safe.
>> 
>> All vhost-user devices have this issue, including vhost-user-blk/scsi.
>> 
>> Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling")
>> 
>> Signed-off-by: Li Feng <fengli@smartx.com>
>> ---
>> hw/virtio/vhost-user.c | 10 +---------
>> 1 file changed, 1 insertion(+), 9 deletions(-)
>> 
>> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>> index 8dcf049d42..697b403fe2 100644
>> --- a/hw/virtio/vhost-user.c
>> +++ b/hw/virtio/vhost-user.c
>> @@ -2648,16 +2648,8 @@ typedef struct {
>> static void vhost_user_async_close_bh(void *opaque)
>> {
>>    VhostAsyncCallback *data = opaque;
>> -    struct vhost_dev *vhost = data->vhost;
>> 
>> -    /*
>> -     * If the vhost_dev has been cleared in the meantime there is
>> -     * nothing left to do as some other path has completed the
>> -     * cleanup.
>> -     */
>> -    if (vhost->vdev) {
>> -        data->cb(data->dev);
>> -    }
>> +    data->cb(data->dev);
>> 
>>    g_free(data);
>> }
>> -- 
>> 2.41.0
>> 
> 



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] vhost: Add Error parameter to vhost_scsi_common_start()
  2023-08-14 12:11   ` Raphael Norwitz
@ 2023-08-17  6:49     ` Li Feng
  2023-08-21 12:09       ` Markus Armbruster
  0 siblings, 1 reply; 27+ messages in thread
From: Li Feng @ 2023-08-17  6:49 UTC (permalink / raw)
  To: Raphael Norwitz
  Cc: Michael S. Tsirkin, Paolo Bonzini, Fam Zheng,
	open list:All patches CC here

[-- Attachment #1: Type: text/plain, Size: 8018 bytes --]



> 2023年8月14日 下午8:11,Raphael Norwitz <raphael.norwitz@nutanix.com> 写道:
> 
> Thanks for the cleanup! A few comments.
> 
>> On Aug 4, 2023, at 1:29 AM, Li Feng <fengli@smartx.com> wrote:
>> 
>> Add a Error parameter to report the real error, like vhost-user-blk.
>> 
>> Signed-off-by: Li Feng <fengli@smartx.com>
>> ---
>> hw/scsi/vhost-scsi-common.c           | 17 ++++++++++-------
>> hw/scsi/vhost-scsi.c                  |  5 +++--
>> hw/scsi/vhost-user-scsi.c             | 14 ++++++++------
>> include/hw/virtio/vhost-scsi-common.h |  2 +-
>> 4 files changed, 22 insertions(+), 16 deletions(-)
>> 
>> diff --git a/hw/scsi/vhost-scsi-common.c b/hw/scsi/vhost-scsi-common.c
>> index a61cd0e907..392587dfb5 100644
>> --- a/hw/scsi/vhost-scsi-common.c
>> +++ b/hw/scsi/vhost-scsi-common.c
>> @@ -16,6 +16,7 @@
>> */
>> 
>> #include "qemu/osdep.h"
>> +#include "qapi/error.h"
>> #include "qemu/error-report.h"
>> #include "qemu/module.h"
>> #include "hw/virtio/vhost.h"
>> @@ -25,7 +26,7 @@
>> #include "hw/virtio/virtio-access.h"
>> #include "hw/fw-path-provider.h"
>> 
>> -int vhost_scsi_common_start(VHostSCSICommon *vsc)
>> +int vhost_scsi_common_start(VHostSCSICommon *vsc, Error **errp)
>> {
>>    int ret, i;
>>    VirtIODevice *vdev = VIRTIO_DEVICE(vsc);
>> @@ -35,18 +36,19 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
>>    VirtIOSCSICommon *vs = (VirtIOSCSICommon *)vsc;
>> 
>>    if (!k->set_guest_notifiers) {
>> -        error_report("binding does not support guest notifiers");
>> +        error_setg(errp, "binding does not support guest notifiers");
>>        return -ENOSYS;
>>    }
>> 
>>    ret = vhost_dev_enable_notifiers(&vsc->dev, vdev);
>>    if (ret < 0) {
>> +        error_setg_errno(errp, -ret, "Error enabling host notifiers");
>>        return ret;
>>    }
>> 
>>    ret = k->set_guest_notifiers(qbus->parent, vsc->dev.nvqs, true);
>>    if (ret < 0) {
>> -        error_report("Error binding guest notifier");
>> +        error_setg_errno(errp, -ret, "Error binding guest notifier");
>>        goto err_host_notifiers;
>>    }
>> 
>> @@ -54,7 +56,7 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
>> 
>>    ret = vhost_dev_prepare_inflight(&vsc->dev, vdev);
>>    if (ret < 0) {
>> -        error_report("Error setting inflight format: %d", -ret);
> 
> Curious why you’re adding the error value to the string. Isn’t it redundant since we pass it in as the second param?
> 
>> +        error_setg_errno(errp, -ret, "Error setting inflight format: %d", -ret);

I don’t understand. Here I put the error message in errp, where is redundant?

>>        goto err_guest_notifiers;
>>    }
>> 
>> @@ -64,21 +66,22 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
>>                                        vs->conf.virtqueue_size,
>>                                        vsc->inflight);
>>            if (ret < 0) {
>> -                error_report("Error getting inflight: %d", -ret);
> 
> Ditto
> 
>> +                error_setg_errno(errp, -ret, "Error getting inflight: %d",
>> +                                 -ret);
>>                goto err_guest_notifiers;
>>            }
>>        }
>> 
>>        ret = vhost_dev_set_inflight(&vsc->dev, vsc->inflight);
>>        if (ret < 0) {
>> -            error_report("Error setting inflight: %d", -ret);
>> +            error_setg_errno(errp, -ret, "Error setting inflight: %d", -ret);
>>            goto err_guest_notifiers;
>>        }
>>    }
>> 
>>    ret = vhost_dev_start(&vsc->dev, vdev, true);
>>    if (ret < 0) {
>> -        error_report("Error start vhost dev");
> 
> “Error starting vhost dev”?
ACK.

> 
>> +        error_setg_errno(errp, -ret, "Error start vhost dev");
>>        goto err_guest_notifiers;
>>    }
>> 
>> diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
>> index 443f67daa4..01a3ab4277 100644
>> --- a/hw/scsi/vhost-scsi.c
>> +++ b/hw/scsi/vhost-scsi.c
>> @@ -75,6 +75,7 @@ static int vhost_scsi_start(VHostSCSI *s)
>>    int ret, abi_version;
>>    VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
>>    const VhostOps *vhost_ops = vsc->dev.vhost_ops;
>> +    Error *local_err = NULL;
>> 
>>    ret = vhost_ops->vhost_scsi_get_abi_version(&vsc->dev, &abi_version);
>>    if (ret < 0) {
>> @@ -88,14 +89,14 @@ static int vhost_scsi_start(VHostSCSI *s)
>>        return -ENOSYS;
>>    }
>> 
>> -    ret = vhost_scsi_common_start(vsc);
>> +    ret = vhost_scsi_common_start(vsc, &local_err);
>>    if (ret < 0) {
>>        return ret;
>>    }
>> 
>>    ret = vhost_scsi_set_endpoint(s);
>>    if (ret < 0) {
>> -        error_report("Error setting vhost-scsi endpoint");
>> +        error_reportf_err(local_err, "Error setting vhost-scsi endpoint");
>>        vhost_scsi_common_stop(vsc);
>>    }
>> 
>> diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
>> index a7fa8e8df2..d368171e28 100644
>> --- a/hw/scsi/vhost-user-scsi.c
>> +++ b/hw/scsi/vhost-user-scsi.c
>> @@ -43,12 +43,12 @@ enum VhostUserProtocolFeature {
>>    VHOST_USER_PROTOCOL_F_RESET_DEVICE = 13,
>> };
>> 
>> -static int vhost_user_scsi_start(VHostUserSCSI *s)
>> +static int vhost_user_scsi_start(VHostUserSCSI *s, Error **errp)
>> {
>>    VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
>>    int ret;
>> 
>> -    ret = vhost_scsi_common_start(vsc);
>> +    ret = vhost_scsi_common_start(vsc, errp);
>>    s->started_vu = (ret < 0 ? false : true);
>> 
>>    return ret;
>> @@ -73,6 +73,7 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
>>    VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
>>    VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(dev);
>>    bool should_start = virtio_device_should_start(vdev, status);
>> +    Error *local_err = NULL;
>>    int ret;
>> 
>>    if (!s->connected) {
>> @@ -84,9 +85,10 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
>>    }
>> 
>>    if (should_start) {
>> -        ret = vhost_user_scsi_start(s);
>> +        ret = vhost_user_scsi_start(s, &local_err);
>>        if (ret < 0) {
>> -            error_report("unable to start vhost-user-scsi: %s", strerror(-ret));
>> +            error_reportf_err(local_err, "unable to start vhost-user-scsi: %s",
>> +                              strerror(-ret));
>>            qemu_chr_fe_disconnect(&vs->conf.chardev);
>>        }
>>    } else {
>> @@ -139,7 +141,7 @@ static void vhost_user_scsi_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>>     * Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
>>     * vhost here instead of waiting for .set_status().
>>     */
>> -    ret = vhost_user_scsi_start(s);
>> +    ret = vhost_user_scsi_start(s, &local_err);
>>    if (ret < 0) {
>>        error_reportf_err(local_err, "vhost-user-scsi: vhost start failed: ");
>>        qemu_chr_fe_disconnect(&vs->conf.chardev);
>> @@ -184,7 +186,7 @@ static int vhost_user_scsi_connect(DeviceState *dev, Error **errp)
>> 
>>    /* restore vhost state */
>>    if (virtio_device_started(vdev, vdev->status)) {
>> -        ret = vhost_user_scsi_start(s);
>> +        ret = vhost_user_scsi_start(s, errp);
>>        if (ret < 0) {
>>            return ret;
>>        }
>> diff --git a/include/hw/virtio/vhost-scsi-common.h b/include/hw/virtio/vhost-scsi-common.h
>> index 18f115527c..c5d2c09455 100644
>> --- a/include/hw/virtio/vhost-scsi-common.h
>> +++ b/include/hw/virtio/vhost-scsi-common.h
>> @@ -39,7 +39,7 @@ struct VHostSCSICommon {
>>    struct vhost_inflight *inflight;
>> };
>> 
>> -int vhost_scsi_common_start(VHostSCSICommon *vsc);
>> +int vhost_scsi_common_start(VHostSCSICommon *vsc, Error **errp);
>> void vhost_scsi_common_stop(VHostSCSICommon *vsc);
>> char *vhost_scsi_common_get_fw_dev_path(FWPathProvider *p, BusState *bus,
>>                                        DeviceState *dev);
>> -- 
>> 2.41.0


[-- Attachment #2: Type: text/html, Size: 19216 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] vhost: Add Error parameter to vhost_scsi_common_start()
  2023-08-17  6:49     ` Li Feng
@ 2023-08-21 12:09       ` Markus Armbruster
  2023-08-22  4:47         ` Li Feng
  0 siblings, 1 reply; 27+ messages in thread
From: Markus Armbruster @ 2023-08-21 12:09 UTC (permalink / raw)
  To: Li Feng
  Cc: Raphael Norwitz, Michael S. Tsirkin, Paolo Bonzini, Fam Zheng,
	open list:All patches CC here

Li Feng <fengli@smartx.com> writes:

>> 2023年8月14日 下午8:11,Raphael Norwitz <raphael.norwitz@nutanix.com> 写道:
>> 
>> Thanks for the cleanup! A few comments.
>> 
>>> On Aug 4, 2023, at 1:29 AM, Li Feng <fengli@smartx.com> wrote:
>>> 
>>> Add a Error parameter to report the real error, like vhost-user-blk.
>>> 
>>> Signed-off-by: Li Feng <fengli@smartx.com>
>>> ---
>>> hw/scsi/vhost-scsi-common.c           | 17 ++++++++++-------
>>> hw/scsi/vhost-scsi.c                  |  5 +++--
>>> hw/scsi/vhost-user-scsi.c             | 14 ++++++++------
>>> include/hw/virtio/vhost-scsi-common.h |  2 +-
>>> 4 files changed, 22 insertions(+), 16 deletions(-)
>>> 
>>> diff --git a/hw/scsi/vhost-scsi-common.c b/hw/scsi/vhost-scsi-common.c
>>> index a61cd0e907..392587dfb5 100644
>>> --- a/hw/scsi/vhost-scsi-common.c
>>> +++ b/hw/scsi/vhost-scsi-common.c
>>> @@ -16,6 +16,7 @@
>>> */
>>> 
>>> #include "qemu/osdep.h"
>>> +#include "qapi/error.h"
>>> #include "qemu/error-report.h"
>>> #include "qemu/module.h"
>>> #include "hw/virtio/vhost.h"
>>> @@ -25,7 +26,7 @@
>>> #include "hw/virtio/virtio-access.h"
>>> #include "hw/fw-path-provider.h"
>>> 
>>> -int vhost_scsi_common_start(VHostSCSICommon *vsc)
>>> +int vhost_scsi_common_start(VHostSCSICommon *vsc, Error **errp)
>>> {
>>>    int ret, i;
>>>    VirtIODevice *vdev = VIRTIO_DEVICE(vsc);
>>> @@ -35,18 +36,19 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
>>>    VirtIOSCSICommon *vs = (VirtIOSCSICommon *)vsc;
>>> 
>>>    if (!k->set_guest_notifiers) {
>>> -        error_report("binding does not support guest notifiers");
>>> +        error_setg(errp, "binding does not support guest notifiers");
>>>        return -ENOSYS;
>>>    }
>>> 
>>>    ret = vhost_dev_enable_notifiers(&vsc->dev, vdev);
>>>    if (ret < 0) {
>>> +        error_setg_errno(errp, -ret, "Error enabling host notifiers");
>>>        return ret;
>>>    }
>>> 
>>>    ret = k->set_guest_notifiers(qbus->parent, vsc->dev.nvqs, true);
>>>    if (ret < 0) {
>>> -        error_report("Error binding guest notifier");
>>> +        error_setg_errno(errp, -ret, "Error binding guest notifier");
>>>        goto err_host_notifiers;
>>>    }
>>> 
>>> @@ -54,7 +56,7 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
>>> 
>>>    ret = vhost_dev_prepare_inflight(&vsc->dev, vdev);
>>>    if (ret < 0) {
>>> -        error_report("Error setting inflight format: %d", -ret);
>> 
>> Curious why you’re adding the error value to the string. Isn’t it redundant since we pass it in as the second param?
>> 
>>> +        error_setg_errno(errp, -ret, "Error setting inflight format: %d", -ret);
>
> I don’t understand. Here I put the error message in errp, where is redundant?

The error message will come out like

    Error setting inflight format: 22: Invalid argument

You need to drop ": %d".

Two remarks:

1. The #1 reason for bad error messages is neglecting to *test* them.

2. Printing errno as a number is needlessly unfriendly to users.

[...]



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] vhost-user: fix lost reconnect
  2023-08-17  6:40     ` Li Feng
@ 2023-08-22  0:38       ` Raphael Norwitz
  2023-08-22  4:49         ` Li Feng
  0 siblings, 1 reply; 27+ messages in thread
From: Raphael Norwitz @ 2023-08-22  0:38 UTC (permalink / raw)
  To: Li Feng
  Cc: Michael S. Tsirkin, Paolo Bonzini, Fam Zheng,
	open list:All patches CC here


> On Aug 17, 2023, at 2:40 AM, Li Feng <fengli@smartx.com> wrote:
> 
> 
>> 2023年8月14日 下午8:11,Raphael Norwitz <raphael.norwitz@nutanix.com> 写道:
>> 
>> Why can’t we rather fix this by adding a “event_cb” param to vhost_user_async_close and then call qemu_chr_fe_set_handlers in vhost_user_async_close_bh()?
>> 
>> Even if calling vhost_dev_cleanup() twice is safe today I worry future changes may easily stumble over the reconnect case and introduce crashes or double frees.
>> 
> I think add a new event_cb is not good enough. ‘qemu_chr_fe_set_handlers’ has been called in vhost_user_async_close, and will be called in event->cb, so why need add a
> new event_cb?
> 

I’m suggesting calling the data->event_cb instead of the data->cb if we hit the error case where vhost->vdev is NULL. Something like:

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 8dcf049d42..edf1dccd44 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -2648,6 +2648,10 @@ typedef struct {
 static void vhost_user_async_close_bh(void *opaque)
 {
     VhostAsyncCallback *data = opaque;
+
+    VirtIODevice *vdev = VIRTIO_DEVICE(data->dev);
+    VHostUserBlk *s = VHOST_USER_BLK(vdev);
+
     struct vhost_dev *vhost = data->vhost;
 
     /*
@@ -2657,6 +2661,9 @@ static void vhost_user_async_close_bh(void *opaque)
      */
     if (vhost->vdev) {
         data->cb(data->dev);
+    } else if (data->event_cb) {
+        qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, data->event_cb,
+                                 NULL, data->dev, NULL, true);
     }
 
     g_free(data);

data->event_cb would be vhost_user_blk_event().

I think that makes the error path a lot easier to reason about and more future proof.

> For avoiding to call the vhost_dev_cleanup() twice, add a ‘inited’ in struct vhost-dev to mark if it’s inited like this:
> 

This is better than the original, but let me know what you think of my alternative.

> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index e2f6ffb446..edc80c0231 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -1502,6 +1502,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
>         goto fail_busyloop;
>     }
> 
> +    hdev->inited = true;
>     return 0;
> 
> fail_busyloop:
> @@ -1520,6 +1521,10 @@ void vhost_dev_cleanup(struct vhost_dev *hdev)
> {
>     int i;
> 
> +    if (!hdev->inited) {
> +        return;
> +    }
> +    hdev->inited = false;
>     trace_vhost_dev_cleanup(hdev);
> 
>     for (i = 0; i < hdev->nvqs; ++i) {
> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> index ca3131b1af..74b1aec960 100644
> --- a/include/hw/virtio/vhost.h
> +++ b/include/hw/virtio/vhost.h
> @@ -123,6 +123,7 @@ struct vhost_dev {
>     /* @started: is the vhost device started? */
>     bool started;
>     bool log_enabled;
> +    bool inited;
>     uint64_t log_size;
>     Error *migration_blocker;
>     const VhostOps *vhost_ops;
> 
> Thanks.
> 
>> 
>>> On Aug 4, 2023, at 1:29 AM, Li Feng <fengli@smartx.com> wrote:
>>> 
>>> When the vhost-user is reconnecting to the backend, and if the vhost-user fails
>>> at the get_features in vhost_dev_init(), then the reconnect will fail
>>> and it will not be retriggered forever.
>>> 
>>> The reason is:
>>> When the vhost-user fail at get_features, the vhost_dev_cleanup will be called
>>> immediately.
>>> 
>>> vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'.
>>> 
>>> The reconnect path is:
>>> vhost_user_blk_event
>>> vhost_user_async_close(.. vhost_user_blk_disconnect ..)
>>>   qemu_chr_fe_set_handlers <----- clear the notifier callback
>>>     schedule vhost_user_async_close_bh
>>> 
>>> The vhost->vdev is null, so the vhost_user_blk_disconnect will not be
>>> called, then the event fd callback will not be reinstalled.
>>> 
>>> With this patch, the vhost_user_blk_disconnect will call the
>>> vhost_dev_cleanup() again, it's safe.
>>> 
>>> All vhost-user devices have this issue, including vhost-user-blk/scsi.
>>> 
>>> Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling")
>>> 
>>> Signed-off-by: Li Feng <fengli@smartx.com>
>>> ---
>>> hw/virtio/vhost-user.c | 10 +---------
>>> 1 file changed, 1 insertion(+), 9 deletions(-)
>>> 
>>> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>>> index 8dcf049d42..697b403fe2 100644
>>> --- a/hw/virtio/vhost-user.c
>>> +++ b/hw/virtio/vhost-user.c
>>> @@ -2648,16 +2648,8 @@ typedef struct {
>>> static void vhost_user_async_close_bh(void *opaque)
>>> {
>>>   VhostAsyncCallback *data = opaque;
>>> -    struct vhost_dev *vhost = data->vhost;
>>> 
>>> -    /*
>>> -     * If the vhost_dev has been cleared in the meantime there is
>>> -     * nothing left to do as some other path has completed the
>>> -     * cleanup.
>>> -     */
>>> -    if (vhost->vdev) {
>>> -        data->cb(data->dev);
>>> -    }
>>> +    data->cb(data->dev);
>>> 
>>>   g_free(data);
>>> }
>>> -- 
>>> 2.41.0
>>> 
>> 
> 


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] vhost: Add Error parameter to vhost_scsi_common_start()
  2023-08-21 12:09       ` Markus Armbruster
@ 2023-08-22  4:47         ` Li Feng
  0 siblings, 0 replies; 27+ messages in thread
From: Li Feng @ 2023-08-22  4:47 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Raphael Norwitz, Michael S. Tsirkin, Paolo Bonzini, Fam Zheng,
	open list:All patches CC here

[-- Attachment #1: Type: text/plain, Size: 2896 bytes --]

On 21 Aug 2023, at 8:09 PM, Markus Armbruster <armbru@redhat.com> wrote:

Li Feng <fengli@smartx.com> writes:

2023年8月14日 下午8:11,Raphael Norwitz <raphael.norwitz@nutanix.com> 写道:

Thanks for the cleanup! A few comments.

On Aug 4, 2023, at 1:29 AM, Li Feng <fengli@smartx.com> wrote:

Add a Error parameter to report the real error, like vhost-user-blk.

Signed-off-by: Li Feng <fengli@smartx.com>
---
hw/scsi/vhost-scsi-common.c           | 17 ++++++++++-------
hw/scsi/vhost-scsi.c                  |  5 +++--
hw/scsi/vhost-user-scsi.c             | 14 ++++++++------
include/hw/virtio/vhost-scsi-common.h |  2 +-
4 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/hw/scsi/vhost-scsi-common.c b/hw/scsi/vhost-scsi-common.c
index a61cd0e907..392587dfb5 100644
--- a/hw/scsi/vhost-scsi-common.c
+++ b/hw/scsi/vhost-scsi-common.c
@@ -16,6 +16,7 @@
*/

#include "qemu/osdep.h"
+#include "qapi/error.h"
#include "qemu/error-report.h"
#include "qemu/module.h"
#include "hw/virtio/vhost.h"
@@ -25,7 +26,7 @@
#include "hw/virtio/virtio-access.h"
#include "hw/fw-path-provider.h"

-int vhost_scsi_common_start(VHostSCSICommon *vsc)
+int vhost_scsi_common_start(VHostSCSICommon *vsc, Error **errp)
{
int ret, i;
VirtIODevice *vdev = VIRTIO_DEVICE(vsc);
@@ -35,18 +36,19 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
VirtIOSCSICommon *vs = (VirtIOSCSICommon *)vsc;

if (!k->set_guest_notifiers) {
-        error_report("binding does not support guest notifiers");
+        error_setg(errp, "binding does not support guest notifiers");
    return -ENOSYS;
}

ret = vhost_dev_enable_notifiers(&vsc->dev, vdev);
if (ret < 0) {
+        error_setg_errno(errp, -ret, "Error enabling host notifiers");
    return ret;
}

ret = k->set_guest_notifiers(qbus->parent, vsc->dev.nvqs, true);
if (ret < 0) {
-        error_report("Error binding guest notifier");
+        error_setg_errno(errp, -ret, "Error binding guest notifier");
    goto err_host_notifiers;
}

@@ -54,7 +56,7 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)

ret = vhost_dev_prepare_inflight(&vsc->dev, vdev);
if (ret < 0) {
-        error_report("Error setting inflight format: %d", -ret);


Curious why you’re adding the error value to the string. Isn’t it redundant
since we pass it in as the second param?

+        error_setg_errno(errp, -ret, "Error setting inflight format: %d",
-ret);


I don’t understand. Here I put the error message in errp, where is
redundant?


The error message will come out like

 Error setting inflight format: 22: Invalid argument

You need to drop ": %d".

Two remarks:

1. The #1 reason for bad error messages is neglecting to *test* them.

2. Printing errno as a number is needlessly unfriendly to users.

[...]

Understood! Very thanks, I will fix it in the v2.

[-- Attachment #2: Type: text/html, Size: 3977 bytes --]

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] vhost-user: fix lost reconnect
  2023-08-22  0:38       ` Raphael Norwitz
@ 2023-08-22  4:49         ` Li Feng
  2023-08-22 10:17           ` Raphael Norwitz
  0 siblings, 1 reply; 27+ messages in thread
From: Li Feng @ 2023-08-22  4:49 UTC (permalink / raw)
  To: Raphael Norwitz
  Cc: Michael S. Tsirkin, Paolo Bonzini, Fam Zheng,
	open list:All patches CC here

[-- Attachment #1: Type: text/plain, Size: 5114 bytes --]

On 22 Aug 2023, at 8:38 AM, Raphael Norwitz <raphael.norwitz@nutanix.com>
wrote:


On Aug 17, 2023, at 2:40 AM, Li Feng <fengli@smartx.com> wrote:


2023年8月14日 下午8:11,Raphael Norwitz <raphael.norwitz@nutanix.com> 写道:

Why can’t we rather fix this by adding a “event_cb” param to
vhost_user_async_close and then call qemu_chr_fe_set_handlers in
vhost_user_async_close_bh()?

Even if calling vhost_dev_cleanup() twice is safe today I worry future
changes may easily stumble over the reconnect case and introduce crashes or
double frees.

I think add a new event_cb is not good enough. ‘qemu_chr_fe_set_handlers’
has been called in vhost_user_async_close, and will be called in event->cb,
so why need add a
new event_cb?


I’m suggesting calling the data->event_cb instead of the data->cb if we hit
the error case where vhost->vdev is NULL. Something like:

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 8dcf049d42..edf1dccd44 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -2648,6 +2648,10 @@ typedef struct {
static void vhost_user_async_close_bh(void *opaque)
{
    VhostAsyncCallback *data = opaque;
+
+    VirtIODevice *vdev = VIRTIO_DEVICE(data->dev);
+    VHostUserBlk *s = VHOST_USER_BLK(vdev);
+
    struct vhost_dev *vhost = data->vhost;

    /*
@@ -2657,6 +2661,9 @@ static void vhost_user_async_close_bh(void *opaque)
     */
    if (vhost->vdev) {
        data->cb(data->dev);
+    } else if (data->event_cb) {
+        qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, data->event_cb,
+                                 NULL, data->dev, NULL, true);
    }

    g_free(data);

data->event_cb would be vhost_user_blk_event().

I think that makes the error path a lot easier to reason about and more
future proof.

For avoiding to call the vhost_dev_cleanup() twice, add a ‘inited’ in
struct vhost-dev to mark if it’s inited like this:


This is better than the original, but let me know what you think of my
alternative.


The vhost_user_async_close_bh() is a common function in vhost-user.c, and
vhost_user_async_close() is used by vhost-user-scsi/blk/gpio,
However, in your patch it’s limited to VhostUserBlk, so I think my fix is
more reasonable.

Thanks,
LI


diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index e2f6ffb446..edc80c0231 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1502,6 +1502,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void
*opaque,
       goto fail_busyloop;
   }

+    hdev->inited = true;
   return 0;

fail_busyloop:
@@ -1520,6 +1521,10 @@ void vhost_dev_cleanup(struct vhost_dev *hdev)
{
   int i;

+    if (!hdev->inited) {
+        return;
+    }
+    hdev->inited = false;
   trace_vhost_dev_cleanup(hdev);

   for (i = 0; i < hdev->nvqs; ++i) {
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index ca3131b1af..74b1aec960 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -123,6 +123,7 @@ struct vhost_dev {
   /* @started: is the vhost device started? */
   bool started;
   bool log_enabled;
+    bool inited;
   uint64_t log_size;
   Error *migration_blocker;
   const VhostOps *vhost_ops;

Thanks.


On Aug 4, 2023, at 1:29 AM, Li Feng <fengli@smartx.com> wrote:

When the vhost-user is reconnecting to the backend, and if the vhost-user
fails
at the get_features in vhost_dev_init(), then the reconnect will fail
and it will not be retriggered forever.

The reason is:
When the vhost-user fail at get_features, the vhost_dev_cleanup will be
called
immediately.

vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'.

The reconnect path is:
vhost_user_blk_event
vhost_user_async_close(.. vhost_user_blk_disconnect ..)
 qemu_chr_fe_set_handlers <----- clear the notifier callback
   schedule vhost_user_async_close_bh

The vhost->vdev is null, so the vhost_user_blk_disconnect will not be
called, then the event fd callback will not be reinstalled.

With this patch, the vhost_user_blk_disconnect will call the
vhost_dev_cleanup() again, it's safe.

All vhost-user devices have this issue, including vhost-user-blk/scsi.

Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling")

Signed-off-by: Li Feng <fengli@smartx.com>
---
hw/virtio/vhost-user.c | 10 +---------
1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 8dcf049d42..697b403fe2 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -2648,16 +2648,8 @@ typedef struct {
static void vhost_user_async_close_bh(void *opaque)
{
 VhostAsyncCallback *data = opaque;
-    struct vhost_dev *vhost = data->vhost;

-    /*
-     * If the vhost_dev has been cleared in the meantime there is
-     * nothing left to do as some other path has completed the
-     * cleanup.
-     */
-    if (vhost->vdev) {
-        data->cb(data->dev);
-    }
+    data->cb(data->dev);

 g_free(data);
}
-- 
2.41.0

[-- Attachment #2: Type: text/html, Size: 23268 bytes --]

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] vhost-user: fix lost reconnect
  2023-08-22  4:49         ` Li Feng
@ 2023-08-22 10:17           ` Raphael Norwitz
  2023-08-24  7:27             ` Li Feng
  0 siblings, 1 reply; 27+ messages in thread
From: Raphael Norwitz @ 2023-08-22 10:17 UTC (permalink / raw)
  To: Li Feng
  Cc: Michael S. Tsirkin, Paolo Bonzini, Fam Zheng,
	open list:All patches CC here



> On Aug 22, 2023, at 12:49 AM, Li Feng <fengli@smartx.com> wrote:
> 
> 
> 
>> On 22 Aug 2023, at 8:38 AM, Raphael Norwitz <raphael.norwitz@nutanix.com> wrote:
>> 
>>> 
>>> On Aug 17, 2023, at 2:40 AM, Li Feng <fengli@smartx.com> wrote:
>>> 
>>> 
>>>> 2023年8月14日 下午8:11,Raphael Norwitz <raphael.norwitz@nutanix.com> 写道:
>>>> 
>>>> Why can’t we rather fix this by adding a “event_cb” param to vhost_user_async_close and then call qemu_chr_fe_set_handlers in vhost_user_async_close_bh()?
>>>> 
>>>> Even if calling vhost_dev_cleanup() twice is safe today I worry future changes may easily stumble over the reconnect case and introduce crashes or double frees.
>>>> 
>>> I think add a new event_cb is not good enough. ‘qemu_chr_fe_set_handlers’ has been called in vhost_user_async_close, and will be called in event->cb, so why need add a
>>> new event_cb?
>>> 
>> 
>> I’m suggesting calling the data->event_cb instead of the data->cb if we hit the error case where vhost->vdev is NULL. Something like:
>> 
>> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>> index 8dcf049d42..edf1dccd44 100644
>> --- a/hw/virtio/vhost-user.c
>> +++ b/hw/virtio/vhost-user.c
>> @@ -2648,6 +2648,10 @@ typedef struct {
>> static void vhost_user_async_close_bh(void *opaque)
>> {
>>     VhostAsyncCallback *data = opaque;
>> +
>> +    VirtIODevice *vdev = VIRTIO_DEVICE(data->dev);
>> +    VHostUserBlk *s = VHOST_USER_BLK(vdev);
>> +
>>     struct vhost_dev *vhost = data->vhost;
>> 
>>     /*
>> @@ -2657,6 +2661,9 @@ static void vhost_user_async_close_bh(void *opaque)
>>      */
>>     if (vhost->vdev) {
>>         data->cb(data->dev);
>> +    } else if (data->event_cb) {
>> +        qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, data->event_cb,
>> +                                 NULL, data->dev, NULL, true);
>>     }
>> 
>>     g_free(data);
>> 
>> data->event_cb would be vhost_user_blk_event().
>> 
>> I think that makes the error path a lot easier to reason about and more future proof.
>> 
>>> For avoiding to call the vhost_dev_cleanup() twice, add a ‘inited’ in struct vhost-dev to mark if it’s inited like this:
>>> 
>> 
>> This is better than the original, but let me know what you think of my alternative.
> 
> The vhost_user_async_close_bh() is a common function in vhost-user.c, and vhost_user_async_close() is used by vhost-user-scsi/blk/gpio, 
> However, in your patch it’s limited to VhostUserBlk, so I think my fix is more reasonable.

I did not write out the full patch. 

vhost-user-scsi/blk/gpio would each provide their own ->event_cb, just like they each provide a different ->cb today. Looking at it again, data has the chardev, so no need to use VIRTO_DEVICE() or VHOST_USER_BLK().

The fix generalizes to all device types.

> 
> Thanks,
> LI
>> 
>>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>>> index e2f6ffb446..edc80c0231 100644
>>> --- a/hw/virtio/vhost.c
>>> +++ b/hw/virtio/vhost.c
>>> @@ -1502,6 +1502,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
>>>        goto fail_busyloop;
>>>    }
>>> 
>>> +    hdev->inited = true;
>>>    return 0;
>>> 
>>> fail_busyloop:
>>> @@ -1520,6 +1521,10 @@ void vhost_dev_cleanup(struct vhost_dev *hdev)
>>> {
>>>    int i;
>>> 
>>> +    if (!hdev->inited) {
>>> +        return;
>>> +    }
>>> +    hdev->inited = false;
>>>    trace_vhost_dev_cleanup(hdev);
>>> 
>>>    for (i = 0; i < hdev->nvqs; ++i) {
>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>>> index ca3131b1af..74b1aec960 100644
>>> --- a/include/hw/virtio/vhost.h
>>> +++ b/include/hw/virtio/vhost.h
>>> @@ -123,6 +123,7 @@ struct vhost_dev {
>>>    /* @started: is the vhost device started? */
>>>    bool started;
>>>    bool log_enabled;
>>> +    bool inited;
>>>    uint64_t log_size;
>>>    Error *migration_blocker;
>>>    const VhostOps *vhost_ops;
>>> 
>>> Thanks.
>>> 
>>>> 
>>>>> On Aug 4, 2023, at 1:29 AM, Li Feng <fengli@smartx.com> wrote:
>>>>> 
>>>>> When the vhost-user is reconnecting to the backend, and if the vhost-user fails
>>>>> at the get_features in vhost_dev_init(), then the reconnect will fail
>>>>> and it will not be retriggered forever.
>>>>> 
>>>>> The reason is:
>>>>> When the vhost-user fail at get_features, the vhost_dev_cleanup will be called
>>>>> immediately.
>>>>> 
>>>>> vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'.
>>>>> 
>>>>> The reconnect path is:
>>>>> vhost_user_blk_event
>>>>> vhost_user_async_close(.. vhost_user_blk_disconnect ..)
>>>>>  qemu_chr_fe_set_handlers <----- clear the notifier callback
>>>>>    schedule vhost_user_async_close_bh
>>>>> 
>>>>> The vhost->vdev is null, so the vhost_user_blk_disconnect will not be
>>>>> called, then the event fd callback will not be reinstalled.
>>>>> 
>>>>> With this patch, the vhost_user_blk_disconnect will call the
>>>>> vhost_dev_cleanup() again, it's safe.
>>>>> 
>>>>> All vhost-user devices have this issue, including vhost-user-blk/scsi.
>>>>> 
>>>>> Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling")
>>>>> 
>>>>> Signed-off-by: Li Feng <fengli@smartx.com>
>>>>> ---
>>>>> hw/virtio/vhost-user.c | 10 +---------
>>>>> 1 file changed, 1 insertion(+), 9 deletions(-)
>>>>> 
>>>>> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>>>>> index 8dcf049d42..697b403fe2 100644
>>>>> --- a/hw/virtio/vhost-user.c
>>>>> +++ b/hw/virtio/vhost-user.c
>>>>> @@ -2648,16 +2648,8 @@ typedef struct {
>>>>> static void vhost_user_async_close_bh(void *opaque)
>>>>> {
>>>>>  VhostAsyncCallback *data = opaque;
>>>>> -    struct vhost_dev *vhost = data->vhost;
>>>>> 
>>>>> -    /*
>>>>> -     * If the vhost_dev has been cleared in the meantime there is
>>>>> -     * nothing left to do as some other path has completed the
>>>>> -     * cleanup.
>>>>> -     */
>>>>> -    if (vhost->vdev) {
>>>>> -        data->cb(data->dev);
>>>>> -    }
>>>>> +    data->cb(data->dev);
>>>>> 
>>>>>  g_free(data);
>>>>> }
>>>>> -- 
>>>>> 2.41.0


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] vhost-user: fix lost reconnect
  2023-08-22 10:17           ` Raphael Norwitz
@ 2023-08-24  7:27             ` Li Feng
  0 siblings, 0 replies; 27+ messages in thread
From: Li Feng @ 2023-08-24  7:27 UTC (permalink / raw)
  To: Raphael Norwitz
  Cc: Michael S. Tsirkin, Paolo Bonzini, Fam Zheng,
	open list:All patches CC here

[-- Attachment #1: Type: text/plain, Size: 6450 bytes --]



> On 22 Aug 2023, at 6:17 PM, Raphael Norwitz <raphael.norwitz@nutanix.com> wrote:
> 
> 
> 
>> On Aug 22, 2023, at 12:49 AM, Li Feng <fengli@smartx.com> wrote:
>> 
>> 
>> 
>>> On 22 Aug 2023, at 8:38 AM, Raphael Norwitz <raphael.norwitz@nutanix.com> wrote:
>>> 
>>>> 
>>>> On Aug 17, 2023, at 2:40 AM, Li Feng <fengli@smartx.com> wrote:
>>>> 
>>>> 
>>>>> 2023年8月14日 下午8:11,Raphael Norwitz <raphael.norwitz@nutanix.com> 写道:
>>>>> 
>>>>> Why can’t we rather fix this by adding a “event_cb” param to vhost_user_async_close and then call qemu_chr_fe_set_handlers in vhost_user_async_close_bh()?
>>>>> 
>>>>> Even if calling vhost_dev_cleanup() twice is safe today I worry future changes may easily stumble over the reconnect case and introduce crashes or double frees.
>>>>> 
>>>> I think add a new event_cb is not good enough. ‘qemu_chr_fe_set_handlers’ has been called in vhost_user_async_close, and will be called in event->cb, so why need add a
>>>> new event_cb?
>>>> 
>>> 
>>> I’m suggesting calling the data->event_cb instead of the data->cb if we hit the error case where vhost->vdev is NULL. Something like:
>>> 
>>> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>>> index 8dcf049d42..edf1dccd44 100644
>>> --- a/hw/virtio/vhost-user.c
>>> +++ b/hw/virtio/vhost-user.c
>>> @@ -2648,6 +2648,10 @@ typedef struct {
>>> static void vhost_user_async_close_bh(void *opaque)
>>> {
>>>    VhostAsyncCallback *data = opaque;
>>> +
>>> +    VirtIODevice *vdev = VIRTIO_DEVICE(data->dev);
>>> +    VHostUserBlk *s = VHOST_USER_BLK(vdev);
>>> +
>>>    struct vhost_dev *vhost = data->vhost;
>>> 
>>>    /*
>>> @@ -2657,6 +2661,9 @@ static void vhost_user_async_close_bh(void *opaque)
>>>     */
>>>    if (vhost->vdev) {
>>>        data->cb(data->dev);
>>> +    } else if (data->event_cb) {
>>> +        qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, data->event_cb,
>>> +                                 NULL, data->dev, NULL, true);
>>>    }
>>> 
>>>    g_free(data);
>>> 
>>> data->event_cb would be vhost_user_blk_event().
>>> 
>>> I think that makes the error path a lot easier to reason about and more future proof.
>>> 
>>>> For avoiding to call the vhost_dev_cleanup() twice, add a ‘inited’ in struct vhost-dev to mark if it’s inited like this:
>>>> 
>>> 
>>> This is better than the original, but let me know what you think of my alternative.
>> 
>> The vhost_user_async_close_bh() is a common function in vhost-user.c, and vhost_user_async_close() is used by vhost-user-scsi/blk/gpio, 
>> However, in your patch it’s limited to VhostUserBlk, so I think my fix is more reasonable.
> 
> I did not write out the full patch. 
> 
> vhost-user-scsi/blk/gpio would each provide their own ->event_cb, just like they each provide a different ->cb today. Looking at it again, data has the chardev, so no need to use VIRTO_DEVICE() or VHOST_USER_BLK().
> 
> The fix generalizes to all device types.
OK, I will change it in V2.
> 
>> 
>> Thanks,
>> LI
>>> 
>>>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>>>> index e2f6ffb446..edc80c0231 100644
>>>> --- a/hw/virtio/vhost.c
>>>> +++ b/hw/virtio/vhost.c
>>>> @@ -1502,6 +1502,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
>>>>       goto fail_busyloop;
>>>>   }
>>>> 
>>>> +    hdev->inited = true;
>>>>   return 0;
>>>> 
>>>> fail_busyloop:
>>>> @@ -1520,6 +1521,10 @@ void vhost_dev_cleanup(struct vhost_dev *hdev)
>>>> {
>>>>   int i;
>>>> 
>>>> +    if (!hdev->inited) {
>>>> +        return;
>>>> +    }
>>>> +    hdev->inited = false;
>>>>   trace_vhost_dev_cleanup(hdev);
>>>> 
>>>>   for (i = 0; i < hdev->nvqs; ++i) {
>>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>>>> index ca3131b1af..74b1aec960 100644
>>>> --- a/include/hw/virtio/vhost.h
>>>> +++ b/include/hw/virtio/vhost.h
>>>> @@ -123,6 +123,7 @@ struct vhost_dev {
>>>>   /* @started: is the vhost device started? */
>>>>   bool started;
>>>>   bool log_enabled;
>>>> +    bool inited;
>>>>   uint64_t log_size;
>>>>   Error *migration_blocker;
>>>>   const VhostOps *vhost_ops;
>>>> 
>>>> Thanks.
>>>> 
>>>>> 
>>>>>> On Aug 4, 2023, at 1:29 AM, Li Feng <fengli@smartx.com> wrote:
>>>>>> 
>>>>>> When the vhost-user is reconnecting to the backend, and if the vhost-user fails
>>>>>> at the get_features in vhost_dev_init(), then the reconnect will fail
>>>>>> and it will not be retriggered forever.
>>>>>> 
>>>>>> The reason is:
>>>>>> When the vhost-user fail at get_features, the vhost_dev_cleanup will be called
>>>>>> immediately.
>>>>>> 
>>>>>> vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'.
>>>>>> 
>>>>>> The reconnect path is:
>>>>>> vhost_user_blk_event
>>>>>> vhost_user_async_close(.. vhost_user_blk_disconnect ..)
>>>>>> qemu_chr_fe_set_handlers <----- clear the notifier callback
>>>>>>   schedule vhost_user_async_close_bh
>>>>>> 
>>>>>> The vhost->vdev is null, so the vhost_user_blk_disconnect will not be
>>>>>> called, then the event fd callback will not be reinstalled.
>>>>>> 
>>>>>> With this patch, the vhost_user_blk_disconnect will call the
>>>>>> vhost_dev_cleanup() again, it's safe.
>>>>>> 
>>>>>> All vhost-user devices have this issue, including vhost-user-blk/scsi.
>>>>>> 
>>>>>> Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling")
>>>>>> 
>>>>>> Signed-off-by: Li Feng <fengli@smartx.com>
>>>>>> ---
>>>>>> hw/virtio/vhost-user.c | 10 +---------
>>>>>> 1 file changed, 1 insertion(+), 9 deletions(-)
>>>>>> 
>>>>>> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>>>>>> index 8dcf049d42..697b403fe2 100644
>>>>>> --- a/hw/virtio/vhost-user.c
>>>>>> +++ b/hw/virtio/vhost-user.c
>>>>>> @@ -2648,16 +2648,8 @@ typedef struct {
>>>>>> static void vhost_user_async_close_bh(void *opaque)
>>>>>> {
>>>>>> VhostAsyncCallback *data = opaque;
>>>>>> -    struct vhost_dev *vhost = data->vhost;
>>>>>> 
>>>>>> -    /*
>>>>>> -     * If the vhost_dev has been cleared in the meantime there is
>>>>>> -     * nothing left to do as some other path has completed the
>>>>>> -     * cleanup.
>>>>>> -     */
>>>>>> -    if (vhost->vdev) {
>>>>>> -        data->cb(data->dev);
>>>>>> -    }
>>>>>> +    data->cb(data->dev);
>>>>>> 
>>>>>> g_free(data);
>>>>>> }
>>>>>> -- 
>>>>>> 2.41.0


[-- Attachment #2: Type: text/html, Size: 12354 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 0/2] Fix vhost reconnect issues
  2023-08-04  5:29 [PATCH 0/2] Fix vhost reconnect issues Li Feng
  2023-08-04  5:29 ` [PATCH 1/2] vhost-user: fix lost reconnect Li Feng
  2023-08-04  5:29 ` [PATCH 2/2] vhost: Add Error parameter to vhost_scsi_common_start() Li Feng
@ 2023-08-24  7:41 ` Li Feng
  2023-08-24  7:41   ` [PATCH v2 1/2] vhost-user: Fix lost reconnect Li Feng
  2023-08-24  7:41   ` [PATCH v2 2/2] vhost: Add Error parameter to vhost_scsi_common_start() Li Feng
  2023-08-30  4:57 ` [PATCH v3 0/2] Fix vhost reconnect issues Li Feng
  3 siblings, 2 replies; 27+ messages in thread
From: Li Feng @ 2023-08-24  7:41 UTC (permalink / raw)
  To: Raphael Norwitz, Michael S. Tsirkin, Kevin Wolf, Hanna Reitz,
	Paolo Bonzini, Fam Zheng, Alex Bennée, Viresh Kumar,
	open list:Block layer core, open list:All patches CC here
  Cc: Li Feng

The patchset fixes the regression issue of vhost reconnect.
It's a serious bug that the vhost-user will lose the reconnect forever.

The 2nd patch enhances the error handle of vhost-user-scsi.

This patchset's parent commit is:
https://lore.kernel.org/all/20230731121018.2856310-1-fengli@smartx.com/

Changes for v2:
- Add a event_cb in VhostAsyncCallback to be called when dev is NULL;
- Fix the error report message.

Li Feng (2):
  vhost-user: Fix lost reconnect
  vhost: Add Error parameter to vhost_scsi_common_start()

 hw/block/vhost-user-blk.c             |  2 +-
 hw/scsi/vhost-scsi-common.c           | 16 +++++++++-------
 hw/scsi/vhost-scsi.c                  |  5 +++--
 hw/scsi/vhost-user-scsi.c             | 17 ++++++++++-------
 hw/virtio/vhost-user-gpio.c           |  2 +-
 hw/virtio/vhost-user.c                | 10 ++++++++--
 include/hw/virtio/vhost-scsi-common.h |  2 +-
 include/hw/virtio/vhost-user.h        |  4 +++-
 8 files changed, 36 insertions(+), 22 deletions(-)

-- 
2.41.0



^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 1/2] vhost-user: Fix lost reconnect
  2023-08-24  7:41 ` [PATCH v2 0/2] Fix vhost reconnect issues Li Feng
@ 2023-08-24  7:41   ` Li Feng
  2023-08-29 22:11     ` Raphael Norwitz
  2023-08-24  7:41   ` [PATCH v2 2/2] vhost: Add Error parameter to vhost_scsi_common_start() Li Feng
  1 sibling, 1 reply; 27+ messages in thread
From: Li Feng @ 2023-08-24  7:41 UTC (permalink / raw)
  To: Michael S. Tsirkin, Raphael Norwitz, Kevin Wolf, Hanna Reitz,
	Paolo Bonzini, Fam Zheng, Alex Bennée, Viresh Kumar,
	open list:Block layer core, open list:All patches CC here
  Cc: Li Feng

When the vhost-user is reconnecting to the backend, and if the vhost-user fails
at the get_features in vhost_dev_init(), then the reconnect will fail
and it will not be retriggered forever.

The reason is:
When the vhost-user fails at get_features, the vhost_dev_cleanup will be called
immediately.

vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'.

The reconnect path is:
vhost_user_blk_event
   vhost_user_async_close(.. vhost_user_blk_disconnect ..)
     qemu_chr_fe_set_handlers <----- clear the notifier callback
       schedule vhost_user_async_close_bh

The vhost->vdev is null, so the vhost_user_blk_disconnect will not be
called, then the event fd callback will not be reinstalled.

All vhost-user devices have this issue, including vhost-user-blk/scsi.

With this patch, if the vdev->vdev is null, the fd callback will still
be reinstalled.

Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling")

Signed-off-by: Li Feng <fengli@smartx.com>
---
 hw/block/vhost-user-blk.c      |  2 +-
 hw/scsi/vhost-user-scsi.c      |  3 ++-
 hw/virtio/vhost-user-gpio.c    |  2 +-
 hw/virtio/vhost-user.c         | 10 ++++++++--
 include/hw/virtio/vhost-user.h |  4 +++-
 5 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index 3c69fa47d5..95c758200d 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -391,7 +391,7 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
     case CHR_EVENT_CLOSED:
         /* defer close until later to avoid circular close */
         vhost_user_async_close(dev, &s->chardev, &s->dev,
-                               vhost_user_blk_disconnect);
+                               vhost_user_blk_disconnect, vhost_user_blk_event);
         break;
     case CHR_EVENT_BREAK:
     case CHR_EVENT_MUX_IN:
diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
index a7fa8e8df2..e931df9f5b 100644
--- a/hw/scsi/vhost-user-scsi.c
+++ b/hw/scsi/vhost-user-scsi.c
@@ -236,7 +236,8 @@ static void vhost_user_scsi_event(void *opaque, QEMUChrEvent event)
     case CHR_EVENT_CLOSED:
         /* defer close until later to avoid circular close */
         vhost_user_async_close(dev, &vs->conf.chardev, &vsc->dev,
-                               vhost_user_scsi_disconnect);
+                               vhost_user_scsi_disconnect,
+                               vhost_user_scsi_event);
         break;
     case CHR_EVENT_BREAK:
     case CHR_EVENT_MUX_IN:
diff --git a/hw/virtio/vhost-user-gpio.c b/hw/virtio/vhost-user-gpio.c
index d9979aa5db..04c2cc79f4 100644
--- a/hw/virtio/vhost-user-gpio.c
+++ b/hw/virtio/vhost-user-gpio.c
@@ -283,7 +283,7 @@ static void vu_gpio_event(void *opaque, QEMUChrEvent event)
     case CHR_EVENT_CLOSED:
         /* defer close until later to avoid circular close */
         vhost_user_async_close(dev, &gpio->chardev, &gpio->vhost_dev,
-                               vu_gpio_disconnect);
+                               vu_gpio_disconnect, vu_gpio_event);
         break;
     case CHR_EVENT_BREAK:
     case CHR_EVENT_MUX_IN:
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 8dcf049d42..9540766dd3 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -2643,6 +2643,7 @@ typedef struct {
     DeviceState *dev;
     CharBackend *cd;
     struct vhost_dev *vhost;
+    IOEventHandler *event_cb;
 } VhostAsyncCallback;
 
 static void vhost_user_async_close_bh(void *opaque)
@@ -2657,7 +2658,10 @@ static void vhost_user_async_close_bh(void *opaque)
      */
     if (vhost->vdev) {
         data->cb(data->dev);
-    }
+    } else if (data->event_cb) {
+        qemu_chr_fe_set_handlers(data->cd, NULL, NULL, data->event_cb,
+                                 NULL, data->dev, NULL, true);
+   }
 
     g_free(data);
 }
@@ -2669,7 +2673,9 @@ static void vhost_user_async_close_bh(void *opaque)
  */
 void vhost_user_async_close(DeviceState *d,
                             CharBackend *chardev, struct vhost_dev *vhost,
-                            vu_async_close_fn cb)
+                            vu_async_close_fn cb,
+                            IOEventHandler *event_cb
+                            )
 {
     if (!runstate_check(RUN_STATE_SHUTDOWN)) {
         /*
diff --git a/include/hw/virtio/vhost-user.h b/include/hw/virtio/vhost-user.h
index 191216a74f..5fdc711d4e 100644
--- a/include/hw/virtio/vhost-user.h
+++ b/include/hw/virtio/vhost-user.h
@@ -84,6 +84,8 @@ typedef void (*vu_async_close_fn)(DeviceState *cb);
 
 void vhost_user_async_close(DeviceState *d,
                             CharBackend *chardev, struct vhost_dev *vhost,
-                            vu_async_close_fn cb);
+                            vu_async_close_fn cb,
+                            IOEventHandler *event_cb
+                            );
 
 #endif
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 2/2] vhost: Add Error parameter to vhost_scsi_common_start()
  2023-08-24  7:41 ` [PATCH v2 0/2] Fix vhost reconnect issues Li Feng
  2023-08-24  7:41   ` [PATCH v2 1/2] vhost-user: Fix lost reconnect Li Feng
@ 2023-08-24  7:41   ` Li Feng
  2023-08-29 22:20     ` Raphael Norwitz
  1 sibling, 1 reply; 27+ messages in thread
From: Li Feng @ 2023-08-24  7:41 UTC (permalink / raw)
  To: Michael S. Tsirkin, Raphael Norwitz, Kevin Wolf, Hanna Reitz,
	Paolo Bonzini, Fam Zheng, Alex Bennée, Viresh Kumar,
	open list:Block layer core, open list:All patches CC here
  Cc: Li Feng

Add a Error parameter to report the real error, like vhost-user-blk.

Signed-off-by: Li Feng <fengli@smartx.com>
---
 hw/scsi/vhost-scsi-common.c           | 16 +++++++++-------
 hw/scsi/vhost-scsi.c                  |  5 +++--
 hw/scsi/vhost-user-scsi.c             | 14 ++++++++------
 include/hw/virtio/vhost-scsi-common.h |  2 +-
 4 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/hw/scsi/vhost-scsi-common.c b/hw/scsi/vhost-scsi-common.c
index a61cd0e907..4c8637045d 100644
--- a/hw/scsi/vhost-scsi-common.c
+++ b/hw/scsi/vhost-scsi-common.c
@@ -16,6 +16,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qapi/error.h"
 #include "qemu/error-report.h"
 #include "qemu/module.h"
 #include "hw/virtio/vhost.h"
@@ -25,7 +26,7 @@
 #include "hw/virtio/virtio-access.h"
 #include "hw/fw-path-provider.h"
 
-int vhost_scsi_common_start(VHostSCSICommon *vsc)
+int vhost_scsi_common_start(VHostSCSICommon *vsc, Error **errp)
 {
     int ret, i;
     VirtIODevice *vdev = VIRTIO_DEVICE(vsc);
@@ -35,18 +36,19 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
     VirtIOSCSICommon *vs = (VirtIOSCSICommon *)vsc;
 
     if (!k->set_guest_notifiers) {
-        error_report("binding does not support guest notifiers");
+        error_setg(errp, "binding does not support guest notifiers");
         return -ENOSYS;
     }
 
     ret = vhost_dev_enable_notifiers(&vsc->dev, vdev);
     if (ret < 0) {
+        error_setg_errno(errp, -ret, "Error enabling host notifiers");
         return ret;
     }
 
     ret = k->set_guest_notifiers(qbus->parent, vsc->dev.nvqs, true);
     if (ret < 0) {
-        error_report("Error binding guest notifier");
+        error_setg_errno(errp, -ret, "Error binding guest notifier");
         goto err_host_notifiers;
     }
 
@@ -54,7 +56,7 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
 
     ret = vhost_dev_prepare_inflight(&vsc->dev, vdev);
     if (ret < 0) {
-        error_report("Error setting inflight format: %d", -ret);
+        error_setg_errno(errp, -ret, "Error setting inflight format");
         goto err_guest_notifiers;
     }
 
@@ -64,21 +66,21 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
                                         vs->conf.virtqueue_size,
                                         vsc->inflight);
             if (ret < 0) {
-                error_report("Error getting inflight: %d", -ret);
+                error_setg_errno(errp, -ret, "Error getting inflight");
                 goto err_guest_notifiers;
             }
         }
 
         ret = vhost_dev_set_inflight(&vsc->dev, vsc->inflight);
         if (ret < 0) {
-            error_report("Error setting inflight: %d", -ret);
+            error_setg_errno(errp, -ret, "Error setting inflight");
             goto err_guest_notifiers;
         }
     }
 
     ret = vhost_dev_start(&vsc->dev, vdev, true);
     if (ret < 0) {
-        error_report("Error start vhost dev");
+        error_setg_errno(errp, -ret, "Error starting vhost dev");
         goto err_guest_notifiers;
     }
 
diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
index 443f67daa4..01a3ab4277 100644
--- a/hw/scsi/vhost-scsi.c
+++ b/hw/scsi/vhost-scsi.c
@@ -75,6 +75,7 @@ static int vhost_scsi_start(VHostSCSI *s)
     int ret, abi_version;
     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
     const VhostOps *vhost_ops = vsc->dev.vhost_ops;
+    Error *local_err = NULL;
 
     ret = vhost_ops->vhost_scsi_get_abi_version(&vsc->dev, &abi_version);
     if (ret < 0) {
@@ -88,14 +89,14 @@ static int vhost_scsi_start(VHostSCSI *s)
         return -ENOSYS;
     }
 
-    ret = vhost_scsi_common_start(vsc);
+    ret = vhost_scsi_common_start(vsc, &local_err);
     if (ret < 0) {
         return ret;
     }
 
     ret = vhost_scsi_set_endpoint(s);
     if (ret < 0) {
-        error_report("Error setting vhost-scsi endpoint");
+        error_reportf_err(local_err, "Error setting vhost-scsi endpoint");
         vhost_scsi_common_stop(vsc);
     }
 
diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
index e931df9f5b..62fc98bb1c 100644
--- a/hw/scsi/vhost-user-scsi.c
+++ b/hw/scsi/vhost-user-scsi.c
@@ -43,12 +43,12 @@ enum VhostUserProtocolFeature {
     VHOST_USER_PROTOCOL_F_RESET_DEVICE = 13,
 };
 
-static int vhost_user_scsi_start(VHostUserSCSI *s)
+static int vhost_user_scsi_start(VHostUserSCSI *s, Error **errp)
 {
     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
     int ret;
 
-    ret = vhost_scsi_common_start(vsc);
+    ret = vhost_scsi_common_start(vsc, errp);
     s->started_vu = (ret < 0 ? false : true);
 
     return ret;
@@ -73,6 +73,7 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
     VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(dev);
     bool should_start = virtio_device_should_start(vdev, status);
+    Error *local_err = NULL;
     int ret;
 
     if (!s->connected) {
@@ -84,9 +85,10 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
     }
 
     if (should_start) {
-        ret = vhost_user_scsi_start(s);
+        ret = vhost_user_scsi_start(s, &local_err);
         if (ret < 0) {
-            error_report("unable to start vhost-user-scsi: %s", strerror(-ret));
+            error_reportf_err(local_err, "unable to start vhost-user-scsi: %s",
+                              strerror(-ret));
             qemu_chr_fe_disconnect(&vs->conf.chardev);
         }
     } else {
@@ -139,7 +141,7 @@ static void vhost_user_scsi_handle_output(VirtIODevice *vdev, VirtQueue *vq)
      * Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
      * vhost here instead of waiting for .set_status().
      */
-    ret = vhost_user_scsi_start(s);
+    ret = vhost_user_scsi_start(s, &local_err);
     if (ret < 0) {
         error_reportf_err(local_err, "vhost-user-scsi: vhost start failed: ");
         qemu_chr_fe_disconnect(&vs->conf.chardev);
@@ -184,7 +186,7 @@ static int vhost_user_scsi_connect(DeviceState *dev, Error **errp)
 
     /* restore vhost state */
     if (virtio_device_started(vdev, vdev->status)) {
-        ret = vhost_user_scsi_start(s);
+        ret = vhost_user_scsi_start(s, errp);
         if (ret < 0) {
             return ret;
         }
diff --git a/include/hw/virtio/vhost-scsi-common.h b/include/hw/virtio/vhost-scsi-common.h
index 18f115527c..c5d2c09455 100644
--- a/include/hw/virtio/vhost-scsi-common.h
+++ b/include/hw/virtio/vhost-scsi-common.h
@@ -39,7 +39,7 @@ struct VHostSCSICommon {
     struct vhost_inflight *inflight;
 };
 
-int vhost_scsi_common_start(VHostSCSICommon *vsc);
+int vhost_scsi_common_start(VHostSCSICommon *vsc, Error **errp);
 void vhost_scsi_common_stop(VHostSCSICommon *vsc);
 char *vhost_scsi_common_get_fw_dev_path(FWPathProvider *p, BusState *bus,
                                         DeviceState *dev);
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 1/2] vhost-user: Fix lost reconnect
  2023-08-24  7:41   ` [PATCH v2 1/2] vhost-user: Fix lost reconnect Li Feng
@ 2023-08-29 22:11     ` Raphael Norwitz
  2023-08-30  4:51       ` Li Feng
  0 siblings, 1 reply; 27+ messages in thread
From: Raphael Norwitz @ 2023-08-29 22:11 UTC (permalink / raw)
  To: Li Feng
  Cc: Michael S. Tsirkin, Kevin Wolf, Hanna Reitz, Paolo Bonzini,
	Fam Zheng, Alex Bennée, Viresh Kumar,
	open list:Block layer core, open list:All patches CC here



> On Aug 24, 2023, at 3:41 AM, Li Feng <fengli@smartx.com> wrote:
> 
> When the vhost-user is reconnecting to the backend, and if the vhost-user fails
> at the get_features in vhost_dev_init(), then the reconnect will fail
> and it will not be retriggered forever.
> 
> The reason is:
> When the vhost-user fails at get_features, the vhost_dev_cleanup will be called
> immediately.
> 
> vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'.
> 
> The reconnect path is:
> vhost_user_blk_event
>   vhost_user_async_close(.. vhost_user_blk_disconnect ..)
>     qemu_chr_fe_set_handlers <----- clear the notifier callback
>       schedule vhost_user_async_close_bh
> 
> The vhost->vdev is null, so the vhost_user_blk_disconnect will not be
> called, then the event fd callback will not be reinstalled.
> 
> All vhost-user devices have this issue, including vhost-user-blk/scsi.
> 
> With this patch, if the vdev->vdev is null, the fd callback will still
> be reinstalled.
> 
> Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling")
> 

A couple of NITs, otherwise LGTM

Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>

> Signed-off-by: Li Feng <fengli@smartx.com>
> ---
> hw/block/vhost-user-blk.c      |  2 +-
> hw/scsi/vhost-user-scsi.c      |  3 ++-
> hw/virtio/vhost-user-gpio.c    |  2 +-
> hw/virtio/vhost-user.c         | 10 ++++++++--
> include/hw/virtio/vhost-user.h |  4 +++-
> 5 files changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> index 3c69fa47d5..95c758200d 100644
> --- a/hw/block/vhost-user-blk.c
> +++ b/hw/block/vhost-user-blk.c
> @@ -391,7 +391,7 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
>     case CHR_EVENT_CLOSED:
>         /* defer close until later to avoid circular close */
>         vhost_user_async_close(dev, &s->chardev, &s->dev,
> -                               vhost_user_blk_disconnect);
> +                               vhost_user_blk_disconnect, vhost_user_blk_event);
>         break;
>     case CHR_EVENT_BREAK:
>     case CHR_EVENT_MUX_IN:
> diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
> index a7fa8e8df2..e931df9f5b 100644
> --- a/hw/scsi/vhost-user-scsi.c
> +++ b/hw/scsi/vhost-user-scsi.c
> @@ -236,7 +236,8 @@ static void vhost_user_scsi_event(void *opaque, QEMUChrEvent event)
>     case CHR_EVENT_CLOSED:
>         /* defer close until later to avoid circular close */
>         vhost_user_async_close(dev, &vs->conf.chardev, &vsc->dev,
> -                               vhost_user_scsi_disconnect);
> +                               vhost_user_scsi_disconnect,
> +                               vhost_user_scsi_event);
>         break;
>     case CHR_EVENT_BREAK:
>     case CHR_EVENT_MUX_IN:
> diff --git a/hw/virtio/vhost-user-gpio.c b/hw/virtio/vhost-user-gpio.c
> index d9979aa5db..04c2cc79f4 100644
> --- a/hw/virtio/vhost-user-gpio.c
> +++ b/hw/virtio/vhost-user-gpio.c
> @@ -283,7 +283,7 @@ static void vu_gpio_event(void *opaque, QEMUChrEvent event)
>     case CHR_EVENT_CLOSED:
>         /* defer close until later to avoid circular close */
>         vhost_user_async_close(dev, &gpio->chardev, &gpio->vhost_dev,
> -                               vu_gpio_disconnect);
> +                               vu_gpio_disconnect, vu_gpio_event);
>         break;
>     case CHR_EVENT_BREAK:
>     case CHR_EVENT_MUX_IN:
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 8dcf049d42..9540766dd3 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -2643,6 +2643,7 @@ typedef struct {
>     DeviceState *dev;
>     CharBackend *cd;
>     struct vhost_dev *vhost;
> +    IOEventHandler *event_cb;
> } VhostAsyncCallback;
> 
> static void vhost_user_async_close_bh(void *opaque)
> @@ -2657,7 +2658,10 @@ static void vhost_user_async_close_bh(void *opaque)
>      */
>     if (vhost->vdev) {
>         data->cb(data->dev);
> -    }
> +    } else if (data->event_cb) {
> +        qemu_chr_fe_set_handlers(data->cd, NULL, NULL, data->event_cb,
> +                                 NULL, data->dev, NULL, true);
> +   }
> 
>     g_free(data);
> }
> @@ -2669,7 +2673,9 @@ static void vhost_user_async_close_bh(void *opaque)
>  */
> void vhost_user_async_close(DeviceState *d,
>                             CharBackend *chardev, struct vhost_dev *vhost,
> -                            vu_async_close_fn cb)
> +                            vu_async_close_fn cb,
> +                            IOEventHandler *event_cb

Nit: why the newline before the closing parenthesis?

> +                            )
> {
>     if (!runstate_check(RUN_STATE_SHUTDOWN)) {
>         /*
> diff --git a/include/hw/virtio/vhost-user.h b/include/hw/virtio/vhost-user.h
> index 191216a74f..5fdc711d4e 100644
> --- a/include/hw/virtio/vhost-user.h
> +++ b/include/hw/virtio/vhost-user.h
> @@ -84,6 +84,8 @@ typedef void (*vu_async_close_fn)(DeviceState *cb);
> 
> void vhost_user_async_close(DeviceState *d,
>                             CharBackend *chardev, struct vhost_dev *vhost,
> -                            vu_async_close_fn cb);
> +                            vu_async_close_fn cb,
> +                            IOEventHandler *event_cb

Nit: ditto - don’t think we need this newline before );

> +                            );
> 
> #endif
> -- 
> 2.41.0
> 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 2/2] vhost: Add Error parameter to vhost_scsi_common_start()
  2023-08-24  7:41   ` [PATCH v2 2/2] vhost: Add Error parameter to vhost_scsi_common_start() Li Feng
@ 2023-08-29 22:20     ` Raphael Norwitz
  0 siblings, 0 replies; 27+ messages in thread
From: Raphael Norwitz @ 2023-08-29 22:20 UTC (permalink / raw)
  To: Li Feng
  Cc: Michael S. Tsirkin, Kevin Wolf, Hanna Reitz, Paolo Bonzini,
	Fam Zheng, Alex Bennée, Viresh Kumar,
	open list:Block layer core, open list:All patches CC here



> On Aug 24, 2023, at 3:41 AM, Li Feng <fengli@smartx.com> wrote:
> 
> Add a Error parameter to report the real error, like vhost-user-blk.
> 
> Signed-off-by: Li Feng <fengli@smartx.com>

Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>

> ---
> hw/scsi/vhost-scsi-common.c           | 16 +++++++++-------
> hw/scsi/vhost-scsi.c                  |  5 +++--
> hw/scsi/vhost-user-scsi.c             | 14 ++++++++------
> include/hw/virtio/vhost-scsi-common.h |  2 +-
> 4 files changed, 21 insertions(+), 16 deletions(-)
> 
> diff --git a/hw/scsi/vhost-scsi-common.c b/hw/scsi/vhost-scsi-common.c
> index a61cd0e907..4c8637045d 100644
> --- a/hw/scsi/vhost-scsi-common.c
> +++ b/hw/scsi/vhost-scsi-common.c
> @@ -16,6 +16,7 @@
>  */
> 
> #include "qemu/osdep.h"
> +#include "qapi/error.h"
> #include "qemu/error-report.h"
> #include "qemu/module.h"
> #include "hw/virtio/vhost.h"
> @@ -25,7 +26,7 @@
> #include "hw/virtio/virtio-access.h"
> #include "hw/fw-path-provider.h"
> 
> -int vhost_scsi_common_start(VHostSCSICommon *vsc)
> +int vhost_scsi_common_start(VHostSCSICommon *vsc, Error **errp)
> {
>     int ret, i;
>     VirtIODevice *vdev = VIRTIO_DEVICE(vsc);
> @@ -35,18 +36,19 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
>     VirtIOSCSICommon *vs = (VirtIOSCSICommon *)vsc;
> 
>     if (!k->set_guest_notifiers) {
> -        error_report("binding does not support guest notifiers");
> +        error_setg(errp, "binding does not support guest notifiers");
>         return -ENOSYS;
>     }
> 
>     ret = vhost_dev_enable_notifiers(&vsc->dev, vdev);
>     if (ret < 0) {
> +        error_setg_errno(errp, -ret, "Error enabling host notifiers");
>         return ret;
>     }
> 
>     ret = k->set_guest_notifiers(qbus->parent, vsc->dev.nvqs, true);
>     if (ret < 0) {
> -        error_report("Error binding guest notifier");
> +        error_setg_errno(errp, -ret, "Error binding guest notifier");
>         goto err_host_notifiers;
>     }
> 
> @@ -54,7 +56,7 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
> 
>     ret = vhost_dev_prepare_inflight(&vsc->dev, vdev);
>     if (ret < 0) {
> -        error_report("Error setting inflight format: %d", -ret);
> +        error_setg_errno(errp, -ret, "Error setting inflight format");
>         goto err_guest_notifiers;
>     }
> 
> @@ -64,21 +66,21 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
>                                         vs->conf.virtqueue_size,
>                                         vsc->inflight);
>             if (ret < 0) {
> -                error_report("Error getting inflight: %d", -ret);
> +                error_setg_errno(errp, -ret, "Error getting inflight");
>                 goto err_guest_notifiers;
>             }
>         }
> 
>         ret = vhost_dev_set_inflight(&vsc->dev, vsc->inflight);
>         if (ret < 0) {
> -            error_report("Error setting inflight: %d", -ret);
> +            error_setg_errno(errp, -ret, "Error setting inflight");
>             goto err_guest_notifiers;
>         }
>     }
> 
>     ret = vhost_dev_start(&vsc->dev, vdev, true);
>     if (ret < 0) {
> -        error_report("Error start vhost dev");
> +        error_setg_errno(errp, -ret, "Error starting vhost dev");
>         goto err_guest_notifiers;
>     }
> 
> diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
> index 443f67daa4..01a3ab4277 100644
> --- a/hw/scsi/vhost-scsi.c
> +++ b/hw/scsi/vhost-scsi.c
> @@ -75,6 +75,7 @@ static int vhost_scsi_start(VHostSCSI *s)
>     int ret, abi_version;
>     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
>     const VhostOps *vhost_ops = vsc->dev.vhost_ops;
> +    Error *local_err = NULL;
> 
>     ret = vhost_ops->vhost_scsi_get_abi_version(&vsc->dev, &abi_version);
>     if (ret < 0) {
> @@ -88,14 +89,14 @@ static int vhost_scsi_start(VHostSCSI *s)
>         return -ENOSYS;
>     }
> 
> -    ret = vhost_scsi_common_start(vsc);
> +    ret = vhost_scsi_common_start(vsc, &local_err);
>     if (ret < 0) {
>         return ret;
>     }
> 
>     ret = vhost_scsi_set_endpoint(s);
>     if (ret < 0) {
> -        error_report("Error setting vhost-scsi endpoint");
> +        error_reportf_err(local_err, "Error setting vhost-scsi endpoint");
>         vhost_scsi_common_stop(vsc);
>     }
> 
> diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
> index e931df9f5b..62fc98bb1c 100644
> --- a/hw/scsi/vhost-user-scsi.c
> +++ b/hw/scsi/vhost-user-scsi.c
> @@ -43,12 +43,12 @@ enum VhostUserProtocolFeature {
>     VHOST_USER_PROTOCOL_F_RESET_DEVICE = 13,
> };
> 
> -static int vhost_user_scsi_start(VHostUserSCSI *s)
> +static int vhost_user_scsi_start(VHostUserSCSI *s, Error **errp)
> {
>     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
>     int ret;
> 
> -    ret = vhost_scsi_common_start(vsc);
> +    ret = vhost_scsi_common_start(vsc, errp);
>     s->started_vu = (ret < 0 ? false : true);
> 
>     return ret;
> @@ -73,6 +73,7 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
>     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
>     VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(dev);
>     bool should_start = virtio_device_should_start(vdev, status);
> +    Error *local_err = NULL;
>     int ret;
> 
>     if (!s->connected) {
> @@ -84,9 +85,10 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
>     }
> 
>     if (should_start) {
> -        ret = vhost_user_scsi_start(s);
> +        ret = vhost_user_scsi_start(s, &local_err);
>         if (ret < 0) {
> -            error_report("unable to start vhost-user-scsi: %s", strerror(-ret));
> +            error_reportf_err(local_err, "unable to start vhost-user-scsi: %s",
> +                              strerror(-ret));
>             qemu_chr_fe_disconnect(&vs->conf.chardev);
>         }
>     } else {
> @@ -139,7 +141,7 @@ static void vhost_user_scsi_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>      * Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
>      * vhost here instead of waiting for .set_status().
>      */
> -    ret = vhost_user_scsi_start(s);
> +    ret = vhost_user_scsi_start(s, &local_err);
>     if (ret < 0) {
>         error_reportf_err(local_err, "vhost-user-scsi: vhost start failed: ");
>         qemu_chr_fe_disconnect(&vs->conf.chardev);
> @@ -184,7 +186,7 @@ static int vhost_user_scsi_connect(DeviceState *dev, Error **errp)
> 
>     /* restore vhost state */
>     if (virtio_device_started(vdev, vdev->status)) {
> -        ret = vhost_user_scsi_start(s);
> +        ret = vhost_user_scsi_start(s, errp);
>         if (ret < 0) {
>             return ret;
>         }
> diff --git a/include/hw/virtio/vhost-scsi-common.h b/include/hw/virtio/vhost-scsi-common.h
> index 18f115527c..c5d2c09455 100644
> --- a/include/hw/virtio/vhost-scsi-common.h
> +++ b/include/hw/virtio/vhost-scsi-common.h
> @@ -39,7 +39,7 @@ struct VHostSCSICommon {
>     struct vhost_inflight *inflight;
> };
> 
> -int vhost_scsi_common_start(VHostSCSICommon *vsc);
> +int vhost_scsi_common_start(VHostSCSICommon *vsc, Error **errp);
> void vhost_scsi_common_stop(VHostSCSICommon *vsc);
> char *vhost_scsi_common_get_fw_dev_path(FWPathProvider *p, BusState *bus,
>                                         DeviceState *dev);
> -- 
> 2.41.0
> 



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 1/2] vhost-user: Fix lost reconnect
  2023-08-29 22:11     ` Raphael Norwitz
@ 2023-08-30  4:51       ` Li Feng
  0 siblings, 0 replies; 27+ messages in thread
From: Li Feng @ 2023-08-30  4:51 UTC (permalink / raw)
  To: Raphael Norwitz
  Cc: Michael S. Tsirkin, Kevin Wolf, Hanna Reitz, Paolo Bonzini,
	Fam Zheng, Alex Bennée, Viresh Kumar,
	open list:Block layer core, open list:All patches CC here

[-- Attachment #1: Type: text/plain, Size: 5797 bytes --]



> On 30 Aug 2023, at 6:11 AM, Raphael Norwitz <raphael.norwitz@nutanix.com> wrote:
> 
> 
> 
>> On Aug 24, 2023, at 3:41 AM, Li Feng <fengli@smartx.com> wrote:
>> 
>> When the vhost-user is reconnecting to the backend, and if the vhost-user fails
>> at the get_features in vhost_dev_init(), then the reconnect will fail
>> and it will not be retriggered forever.
>> 
>> The reason is:
>> When the vhost-user fails at get_features, the vhost_dev_cleanup will be called
>> immediately.
>> 
>> vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'.
>> 
>> The reconnect path is:
>> vhost_user_blk_event
>>  vhost_user_async_close(.. vhost_user_blk_disconnect ..)
>>    qemu_chr_fe_set_handlers <----- clear the notifier callback
>>      schedule vhost_user_async_close_bh
>> 
>> The vhost->vdev is null, so the vhost_user_blk_disconnect will not be
>> called, then the event fd callback will not be reinstalled.
>> 
>> All vhost-user devices have this issue, including vhost-user-blk/scsi.
>> 
>> With this patch, if the vdev->vdev is null, the fd callback will still
>> be reinstalled.
>> 
>> Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling")
>> 
> 
> A couple of NITs, otherwise LGTM
> 
> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com <mailto:raphael.norwitz@nutanix.com>>
> 
>> Signed-off-by: Li Feng <fengli@smartx.com>
>> ---
>> hw/block/vhost-user-blk.c      |  2 +-
>> hw/scsi/vhost-user-scsi.c      |  3 ++-
>> hw/virtio/vhost-user-gpio.c    |  2 +-
>> hw/virtio/vhost-user.c         | 10 ++++++++--
>> include/hw/virtio/vhost-user.h |  4 +++-
>> 5 files changed, 15 insertions(+), 6 deletions(-)
>> 
>> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
>> index 3c69fa47d5..95c758200d 100644
>> --- a/hw/block/vhost-user-blk.c
>> +++ b/hw/block/vhost-user-blk.c
>> @@ -391,7 +391,7 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
>>    case CHR_EVENT_CLOSED:
>>        /* defer close until later to avoid circular close */
>>        vhost_user_async_close(dev, &s->chardev, &s->dev,
>> -                               vhost_user_blk_disconnect);
>> +                               vhost_user_blk_disconnect, vhost_user_blk_event);
>>        break;
>>    case CHR_EVENT_BREAK:
>>    case CHR_EVENT_MUX_IN:
>> diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
>> index a7fa8e8df2..e931df9f5b 100644
>> --- a/hw/scsi/vhost-user-scsi.c
>> +++ b/hw/scsi/vhost-user-scsi.c
>> @@ -236,7 +236,8 @@ static void vhost_user_scsi_event(void *opaque, QEMUChrEvent event)
>>    case CHR_EVENT_CLOSED:
>>        /* defer close until later to avoid circular close */
>>        vhost_user_async_close(dev, &vs->conf.chardev, &vsc->dev,
>> -                               vhost_user_scsi_disconnect);
>> +                               vhost_user_scsi_disconnect,
>> +                               vhost_user_scsi_event);
>>        break;
>>    case CHR_EVENT_BREAK:
>>    case CHR_EVENT_MUX_IN:
>> diff --git a/hw/virtio/vhost-user-gpio.c b/hw/virtio/vhost-user-gpio.c
>> index d9979aa5db..04c2cc79f4 100644
>> --- a/hw/virtio/vhost-user-gpio.c
>> +++ b/hw/virtio/vhost-user-gpio.c
>> @@ -283,7 +283,7 @@ static void vu_gpio_event(void *opaque, QEMUChrEvent event)
>>    case CHR_EVENT_CLOSED:
>>        /* defer close until later to avoid circular close */
>>        vhost_user_async_close(dev, &gpio->chardev, &gpio->vhost_dev,
>> -                               vu_gpio_disconnect);
>> +                               vu_gpio_disconnect, vu_gpio_event);
>>        break;
>>    case CHR_EVENT_BREAK:
>>    case CHR_EVENT_MUX_IN:
>> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>> index 8dcf049d42..9540766dd3 100644
>> --- a/hw/virtio/vhost-user.c
>> +++ b/hw/virtio/vhost-user.c
>> @@ -2643,6 +2643,7 @@ typedef struct {
>>    DeviceState *dev;
>>    CharBackend *cd;
>>    struct vhost_dev *vhost;
>> +    IOEventHandler *event_cb;
>> } VhostAsyncCallback;
>> 
>> static void vhost_user_async_close_bh(void *opaque)
>> @@ -2657,7 +2658,10 @@ static void vhost_user_async_close_bh(void *opaque)
>>     */
>>    if (vhost->vdev) {
>>        data->cb(data->dev);
>> -    }
>> +    } else if (data->event_cb) {
>> +        qemu_chr_fe_set_handlers(data->cd, NULL, NULL, data->event_cb,
>> +                                 NULL, data->dev, NULL, true);
>> +   }
>> 
>>    g_free(data);
>> }
>> @@ -2669,7 +2673,9 @@ static void vhost_user_async_close_bh(void *opaque)
>> */
>> void vhost_user_async_close(DeviceState *d,
>>                            CharBackend *chardev, struct vhost_dev *vhost,
>> -                            vu_async_close_fn cb)
>> +                            vu_async_close_fn cb,
>> +                            IOEventHandler *event_cb
> 
> Nit: why the newline before the closing parenthesis?
Acked.

> 
>> +                            )
>> {
>>    if (!runstate_check(RUN_STATE_SHUTDOWN)) {
>>        /*
>> diff --git a/include/hw/virtio/vhost-user.h b/include/hw/virtio/vhost-user.h
>> index 191216a74f..5fdc711d4e 100644
>> --- a/include/hw/virtio/vhost-user.h
>> +++ b/include/hw/virtio/vhost-user.h
>> @@ -84,6 +84,8 @@ typedef void (*vu_async_close_fn)(DeviceState *cb);
>> 
>> void vhost_user_async_close(DeviceState *d,
>>                            CharBackend *chardev, struct vhost_dev *vhost,
>> -                            vu_async_close_fn cb);
>> +                            vu_async_close_fn cb,
>> +                            IOEventHandler *event_cb
> 
> Nit: ditto - don’t think we need this newline before );
Acked.
> 
>> +                            );
>> 
>> #endif
>> -- 
>> 2.41.0


[-- Attachment #2: Type: text/html, Size: 17379 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v3 0/2] Fix vhost reconnect issues
  2023-08-04  5:29 [PATCH 0/2] Fix vhost reconnect issues Li Feng
                   ` (2 preceding siblings ...)
  2023-08-24  7:41 ` [PATCH v2 0/2] Fix vhost reconnect issues Li Feng
@ 2023-08-30  4:57 ` Li Feng
  2023-08-30  4:57   ` [PATCH v3 1/2] vhost-user: Fix lost reconnect Li Feng
  2023-08-30  4:57   ` [PATCH v3 2/2] vhost: Add Error parameter to vhost_scsi_common_start() Li Feng
  3 siblings, 2 replies; 27+ messages in thread
From: Li Feng @ 2023-08-30  4:57 UTC (permalink / raw)
  To: Raphael Norwitz, Michael S. Tsirkin, Kevin Wolf, Hanna Reitz,
	Paolo Bonzini, Fam Zheng, Alex Bennée, Viresh Kumar,
	open list:Block layer core, open list:All patches CC here
  Cc: Li Feng

The patchset fixes the regression issue of vhost reconnect.
It's a serious bug that the vhost-user will lose the reconnect forever.

The 2nd patch enhances the error handle of vhost-user-scsi.

This patchset's parent commit is:
https://lore.kernel.org/all/20230731121018.2856310-1-fengli@smartx.com/

Changes for v3:
- Fix the code style.

Changes for v2:
- Add a event_cb in VhostAsyncCallback to be called when dev is NULL;
- Fix the error report message.

Li Feng (2):
  vhost-user: Fix lost reconnect
  vhost: Add Error parameter to vhost_scsi_common_start()

 hw/block/vhost-user-blk.c             |  2 +-
 hw/scsi/vhost-scsi-common.c           | 16 +++++++++-------
 hw/scsi/vhost-scsi.c                  |  5 +++--
 hw/scsi/vhost-user-scsi.c             | 17 ++++++++++-------
 hw/virtio/vhost-user-gpio.c           |  2 +-
 hw/virtio/vhost-user.c                |  9 +++++++--
 include/hw/virtio/vhost-scsi-common.h |  2 +-
 include/hw/virtio/vhost-user.h        |  3 ++-
 8 files changed, 34 insertions(+), 22 deletions(-)

-- 
2.41.0



^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v3 1/2] vhost-user: Fix lost reconnect
  2023-08-30  4:57 ` [PATCH v3 0/2] Fix vhost reconnect issues Li Feng
@ 2023-08-30  4:57   ` Li Feng
  2023-08-30  8:47     ` Raphael Norwitz
  2023-08-30  4:57   ` [PATCH v3 2/2] vhost: Add Error parameter to vhost_scsi_common_start() Li Feng
  1 sibling, 1 reply; 27+ messages in thread
From: Li Feng @ 2023-08-30  4:57 UTC (permalink / raw)
  To: Michael S. Tsirkin, Raphael Norwitz, Kevin Wolf, Hanna Reitz,
	Paolo Bonzini, Fam Zheng, Alex Bennée, Viresh Kumar,
	open list:Block layer core, open list:All patches CC here
  Cc: Li Feng

When the vhost-user is reconnecting to the backend, and if the vhost-user fails
at the get_features in vhost_dev_init(), then the reconnect will fail
and it will not be retriggered forever.

The reason is:
When the vhost-user fails at get_features, the vhost_dev_cleanup will be called
immediately.

vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'.

The reconnect path is:
vhost_user_blk_event
   vhost_user_async_close(.. vhost_user_blk_disconnect ..)
     qemu_chr_fe_set_handlers <----- clear the notifier callback
       schedule vhost_user_async_close_bh

The vhost->vdev is null, so the vhost_user_blk_disconnect will not be
called, then the event fd callback will not be reinstalled.

All vhost-user devices have this issue, including vhost-user-blk/scsi.

With this patch, if the vdev->vdev is null, the fd callback will still
be reinstalled.

Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling")

Signed-off-by: Li Feng <fengli@smartx.com>
---
 hw/block/vhost-user-blk.c      | 2 +-
 hw/scsi/vhost-user-scsi.c      | 3 ++-
 hw/virtio/vhost-user-gpio.c    | 2 +-
 hw/virtio/vhost-user.c         | 9 +++++++--
 include/hw/virtio/vhost-user.h | 3 ++-
 5 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index 3c69fa47d5..95c758200d 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -391,7 +391,7 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
     case CHR_EVENT_CLOSED:
         /* defer close until later to avoid circular close */
         vhost_user_async_close(dev, &s->chardev, &s->dev,
-                               vhost_user_blk_disconnect);
+                               vhost_user_blk_disconnect, vhost_user_blk_event);
         break;
     case CHR_EVENT_BREAK:
     case CHR_EVENT_MUX_IN:
diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
index a7fa8e8df2..e931df9f5b 100644
--- a/hw/scsi/vhost-user-scsi.c
+++ b/hw/scsi/vhost-user-scsi.c
@@ -236,7 +236,8 @@ static void vhost_user_scsi_event(void *opaque, QEMUChrEvent event)
     case CHR_EVENT_CLOSED:
         /* defer close until later to avoid circular close */
         vhost_user_async_close(dev, &vs->conf.chardev, &vsc->dev,
-                               vhost_user_scsi_disconnect);
+                               vhost_user_scsi_disconnect,
+                               vhost_user_scsi_event);
         break;
     case CHR_EVENT_BREAK:
     case CHR_EVENT_MUX_IN:
diff --git a/hw/virtio/vhost-user-gpio.c b/hw/virtio/vhost-user-gpio.c
index d9979aa5db..04c2cc79f4 100644
--- a/hw/virtio/vhost-user-gpio.c
+++ b/hw/virtio/vhost-user-gpio.c
@@ -283,7 +283,7 @@ static void vu_gpio_event(void *opaque, QEMUChrEvent event)
     case CHR_EVENT_CLOSED:
         /* defer close until later to avoid circular close */
         vhost_user_async_close(dev, &gpio->chardev, &gpio->vhost_dev,
-                               vu_gpio_disconnect);
+                               vu_gpio_disconnect, vu_gpio_event);
         break;
     case CHR_EVENT_BREAK:
     case CHR_EVENT_MUX_IN:
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 8dcf049d42..12c4a41f30 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -2643,6 +2643,7 @@ typedef struct {
     DeviceState *dev;
     CharBackend *cd;
     struct vhost_dev *vhost;
+    IOEventHandler *event_cb;
 } VhostAsyncCallback;
 
 static void vhost_user_async_close_bh(void *opaque)
@@ -2657,7 +2658,10 @@ static void vhost_user_async_close_bh(void *opaque)
      */
     if (vhost->vdev) {
         data->cb(data->dev);
-    }
+    } else if (data->event_cb) {
+        qemu_chr_fe_set_handlers(data->cd, NULL, NULL, data->event_cb,
+                                 NULL, data->dev, NULL, true);
+   }
 
     g_free(data);
 }
@@ -2669,7 +2673,8 @@ static void vhost_user_async_close_bh(void *opaque)
  */
 void vhost_user_async_close(DeviceState *d,
                             CharBackend *chardev, struct vhost_dev *vhost,
-                            vu_async_close_fn cb)
+                            vu_async_close_fn cb,
+                            IOEventHandler *event_cb)
 {
     if (!runstate_check(RUN_STATE_SHUTDOWN)) {
         /*
diff --git a/include/hw/virtio/vhost-user.h b/include/hw/virtio/vhost-user.h
index 191216a74f..649e9dd54f 100644
--- a/include/hw/virtio/vhost-user.h
+++ b/include/hw/virtio/vhost-user.h
@@ -84,6 +84,7 @@ typedef void (*vu_async_close_fn)(DeviceState *cb);
 
 void vhost_user_async_close(DeviceState *d,
                             CharBackend *chardev, struct vhost_dev *vhost,
-                            vu_async_close_fn cb);
+                            vu_async_close_fn cb,
+                            IOEventHandler *event_cb);
 
 #endif
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v3 2/2] vhost: Add Error parameter to vhost_scsi_common_start()
  2023-08-30  4:57 ` [PATCH v3 0/2] Fix vhost reconnect issues Li Feng
  2023-08-30  4:57   ` [PATCH v3 1/2] vhost-user: Fix lost reconnect Li Feng
@ 2023-08-30  4:57   ` Li Feng
  2023-08-30  8:51     ` Raphael Norwitz
  2023-09-01 12:00     ` Markus Armbruster
  1 sibling, 2 replies; 27+ messages in thread
From: Li Feng @ 2023-08-30  4:57 UTC (permalink / raw)
  To: Michael S. Tsirkin, Raphael Norwitz, Kevin Wolf, Hanna Reitz,
	Paolo Bonzini, Fam Zheng, Alex Bennée, Viresh Kumar,
	open list:Block layer core, open list:All patches CC here
  Cc: Li Feng

Add a Error parameter to report the real error, like vhost-user-blk.

Signed-off-by: Li Feng <fengli@smartx.com>
---
 hw/scsi/vhost-scsi-common.c           | 16 +++++++++-------
 hw/scsi/vhost-scsi.c                  |  5 +++--
 hw/scsi/vhost-user-scsi.c             | 14 ++++++++------
 include/hw/virtio/vhost-scsi-common.h |  2 +-
 4 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/hw/scsi/vhost-scsi-common.c b/hw/scsi/vhost-scsi-common.c
index a61cd0e907..4c8637045d 100644
--- a/hw/scsi/vhost-scsi-common.c
+++ b/hw/scsi/vhost-scsi-common.c
@@ -16,6 +16,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qapi/error.h"
 #include "qemu/error-report.h"
 #include "qemu/module.h"
 #include "hw/virtio/vhost.h"
@@ -25,7 +26,7 @@
 #include "hw/virtio/virtio-access.h"
 #include "hw/fw-path-provider.h"
 
-int vhost_scsi_common_start(VHostSCSICommon *vsc)
+int vhost_scsi_common_start(VHostSCSICommon *vsc, Error **errp)
 {
     int ret, i;
     VirtIODevice *vdev = VIRTIO_DEVICE(vsc);
@@ -35,18 +36,19 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
     VirtIOSCSICommon *vs = (VirtIOSCSICommon *)vsc;
 
     if (!k->set_guest_notifiers) {
-        error_report("binding does not support guest notifiers");
+        error_setg(errp, "binding does not support guest notifiers");
         return -ENOSYS;
     }
 
     ret = vhost_dev_enable_notifiers(&vsc->dev, vdev);
     if (ret < 0) {
+        error_setg_errno(errp, -ret, "Error enabling host notifiers");
         return ret;
     }
 
     ret = k->set_guest_notifiers(qbus->parent, vsc->dev.nvqs, true);
     if (ret < 0) {
-        error_report("Error binding guest notifier");
+        error_setg_errno(errp, -ret, "Error binding guest notifier");
         goto err_host_notifiers;
     }
 
@@ -54,7 +56,7 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
 
     ret = vhost_dev_prepare_inflight(&vsc->dev, vdev);
     if (ret < 0) {
-        error_report("Error setting inflight format: %d", -ret);
+        error_setg_errno(errp, -ret, "Error setting inflight format");
         goto err_guest_notifiers;
     }
 
@@ -64,21 +66,21 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
                                         vs->conf.virtqueue_size,
                                         vsc->inflight);
             if (ret < 0) {
-                error_report("Error getting inflight: %d", -ret);
+                error_setg_errno(errp, -ret, "Error getting inflight");
                 goto err_guest_notifiers;
             }
         }
 
         ret = vhost_dev_set_inflight(&vsc->dev, vsc->inflight);
         if (ret < 0) {
-            error_report("Error setting inflight: %d", -ret);
+            error_setg_errno(errp, -ret, "Error setting inflight");
             goto err_guest_notifiers;
         }
     }
 
     ret = vhost_dev_start(&vsc->dev, vdev, true);
     if (ret < 0) {
-        error_report("Error start vhost dev");
+        error_setg_errno(errp, -ret, "Error starting vhost dev");
         goto err_guest_notifiers;
     }
 
diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
index 443f67daa4..01a3ab4277 100644
--- a/hw/scsi/vhost-scsi.c
+++ b/hw/scsi/vhost-scsi.c
@@ -75,6 +75,7 @@ static int vhost_scsi_start(VHostSCSI *s)
     int ret, abi_version;
     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
     const VhostOps *vhost_ops = vsc->dev.vhost_ops;
+    Error *local_err = NULL;
 
     ret = vhost_ops->vhost_scsi_get_abi_version(&vsc->dev, &abi_version);
     if (ret < 0) {
@@ -88,14 +89,14 @@ static int vhost_scsi_start(VHostSCSI *s)
         return -ENOSYS;
     }
 
-    ret = vhost_scsi_common_start(vsc);
+    ret = vhost_scsi_common_start(vsc, &local_err);
     if (ret < 0) {
         return ret;
     }
 
     ret = vhost_scsi_set_endpoint(s);
     if (ret < 0) {
-        error_report("Error setting vhost-scsi endpoint");
+        error_reportf_err(local_err, "Error setting vhost-scsi endpoint");
         vhost_scsi_common_stop(vsc);
     }
 
diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
index e931df9f5b..62fc98bb1c 100644
--- a/hw/scsi/vhost-user-scsi.c
+++ b/hw/scsi/vhost-user-scsi.c
@@ -43,12 +43,12 @@ enum VhostUserProtocolFeature {
     VHOST_USER_PROTOCOL_F_RESET_DEVICE = 13,
 };
 
-static int vhost_user_scsi_start(VHostUserSCSI *s)
+static int vhost_user_scsi_start(VHostUserSCSI *s, Error **errp)
 {
     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
     int ret;
 
-    ret = vhost_scsi_common_start(vsc);
+    ret = vhost_scsi_common_start(vsc, errp);
     s->started_vu = (ret < 0 ? false : true);
 
     return ret;
@@ -73,6 +73,7 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
     VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(dev);
     bool should_start = virtio_device_should_start(vdev, status);
+    Error *local_err = NULL;
     int ret;
 
     if (!s->connected) {
@@ -84,9 +85,10 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
     }
 
     if (should_start) {
-        ret = vhost_user_scsi_start(s);
+        ret = vhost_user_scsi_start(s, &local_err);
         if (ret < 0) {
-            error_report("unable to start vhost-user-scsi: %s", strerror(-ret));
+            error_reportf_err(local_err, "unable to start vhost-user-scsi: %s",
+                              strerror(-ret));
             qemu_chr_fe_disconnect(&vs->conf.chardev);
         }
     } else {
@@ -139,7 +141,7 @@ static void vhost_user_scsi_handle_output(VirtIODevice *vdev, VirtQueue *vq)
      * Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
      * vhost here instead of waiting for .set_status().
      */
-    ret = vhost_user_scsi_start(s);
+    ret = vhost_user_scsi_start(s, &local_err);
     if (ret < 0) {
         error_reportf_err(local_err, "vhost-user-scsi: vhost start failed: ");
         qemu_chr_fe_disconnect(&vs->conf.chardev);
@@ -184,7 +186,7 @@ static int vhost_user_scsi_connect(DeviceState *dev, Error **errp)
 
     /* restore vhost state */
     if (virtio_device_started(vdev, vdev->status)) {
-        ret = vhost_user_scsi_start(s);
+        ret = vhost_user_scsi_start(s, errp);
         if (ret < 0) {
             return ret;
         }
diff --git a/include/hw/virtio/vhost-scsi-common.h b/include/hw/virtio/vhost-scsi-common.h
index 18f115527c..c5d2c09455 100644
--- a/include/hw/virtio/vhost-scsi-common.h
+++ b/include/hw/virtio/vhost-scsi-common.h
@@ -39,7 +39,7 @@ struct VHostSCSICommon {
     struct vhost_inflight *inflight;
 };
 
-int vhost_scsi_common_start(VHostSCSICommon *vsc);
+int vhost_scsi_common_start(VHostSCSICommon *vsc, Error **errp);
 void vhost_scsi_common_stop(VHostSCSICommon *vsc);
 char *vhost_scsi_common_get_fw_dev_path(FWPathProvider *p, BusState *bus,
                                         DeviceState *dev);
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 1/2] vhost-user: Fix lost reconnect
  2023-08-30  4:57   ` [PATCH v3 1/2] vhost-user: Fix lost reconnect Li Feng
@ 2023-08-30  8:47     ` Raphael Norwitz
  0 siblings, 0 replies; 27+ messages in thread
From: Raphael Norwitz @ 2023-08-30  8:47 UTC (permalink / raw)
  To: Li Feng
  Cc: Michael S. Tsirkin, Kevin Wolf, Hanna Reitz, Paolo Bonzini,
	Fam Zheng, Alex Bennée, Viresh Kumar,
	open list:Block layer core, open list:All patches CC here



> On Aug 30, 2023, at 12:57 AM, Li Feng <fengli@smartx.com> wrote:
> 
> When the vhost-user is reconnecting to the backend, and if the vhost-user fails
> at the get_features in vhost_dev_init(), then the reconnect will fail
> and it will not be retriggered forever.
> 
> The reason is:
> When the vhost-user fails at get_features, the vhost_dev_cleanup will be called
> immediately.
> 
> vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'.
> 
> The reconnect path is:
> vhost_user_blk_event
>   vhost_user_async_close(.. vhost_user_blk_disconnect ..)
>     qemu_chr_fe_set_handlers <----- clear the notifier callback
>       schedule vhost_user_async_close_bh
> 
> The vhost->vdev is null, so the vhost_user_blk_disconnect will not be
> called, then the event fd callback will not be reinstalled.
> 
> All vhost-user devices have this issue, including vhost-user-blk/scsi.
> 
> With this patch, if the vdev->vdev is null, the fd callback will still
> be reinstalled.
> 
> Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling")
> 

Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>

> Signed-off-by: Li Feng <fengli@smartx.com>
> ---
> hw/block/vhost-user-blk.c      | 2 +-
> hw/scsi/vhost-user-scsi.c      | 3 ++-
> hw/virtio/vhost-user-gpio.c    | 2 +-
> hw/virtio/vhost-user.c         | 9 +++++++--
> include/hw/virtio/vhost-user.h | 3 ++-
> 5 files changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> index 3c69fa47d5..95c758200d 100644
> --- a/hw/block/vhost-user-blk.c
> +++ b/hw/block/vhost-user-blk.c
> @@ -391,7 +391,7 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
>     case CHR_EVENT_CLOSED:
>         /* defer close until later to avoid circular close */
>         vhost_user_async_close(dev, &s->chardev, &s->dev,
> -                               vhost_user_blk_disconnect);
> +                               vhost_user_blk_disconnect, vhost_user_blk_event);
>         break;
>     case CHR_EVENT_BREAK:
>     case CHR_EVENT_MUX_IN:
> diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
> index a7fa8e8df2..e931df9f5b 100644
> --- a/hw/scsi/vhost-user-scsi.c
> +++ b/hw/scsi/vhost-user-scsi.c
> @@ -236,7 +236,8 @@ static void vhost_user_scsi_event(void *opaque, QEMUChrEvent event)
>     case CHR_EVENT_CLOSED:
>         /* defer close until later to avoid circular close */
>         vhost_user_async_close(dev, &vs->conf.chardev, &vsc->dev,
> -                               vhost_user_scsi_disconnect);
> +                               vhost_user_scsi_disconnect,
> +                               vhost_user_scsi_event);
>         break;
>     case CHR_EVENT_BREAK:
>     case CHR_EVENT_MUX_IN:
> diff --git a/hw/virtio/vhost-user-gpio.c b/hw/virtio/vhost-user-gpio.c
> index d9979aa5db..04c2cc79f4 100644
> --- a/hw/virtio/vhost-user-gpio.c
> +++ b/hw/virtio/vhost-user-gpio.c
> @@ -283,7 +283,7 @@ static void vu_gpio_event(void *opaque, QEMUChrEvent event)
>     case CHR_EVENT_CLOSED:
>         /* defer close until later to avoid circular close */
>         vhost_user_async_close(dev, &gpio->chardev, &gpio->vhost_dev,
> -                               vu_gpio_disconnect);
> +                               vu_gpio_disconnect, vu_gpio_event);
>         break;
>     case CHR_EVENT_BREAK:
>     case CHR_EVENT_MUX_IN:
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 8dcf049d42..12c4a41f30 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -2643,6 +2643,7 @@ typedef struct {
>     DeviceState *dev;
>     CharBackend *cd;
>     struct vhost_dev *vhost;
> +    IOEventHandler *event_cb;
> } VhostAsyncCallback;
> 
> static void vhost_user_async_close_bh(void *opaque)
> @@ -2657,7 +2658,10 @@ static void vhost_user_async_close_bh(void *opaque)
>      */
>     if (vhost->vdev) {
>         data->cb(data->dev);
> -    }
> +    } else if (data->event_cb) {
> +        qemu_chr_fe_set_handlers(data->cd, NULL, NULL, data->event_cb,
> +                                 NULL, data->dev, NULL, true);
> +   }
> 
>     g_free(data);
> }
> @@ -2669,7 +2673,8 @@ static void vhost_user_async_close_bh(void *opaque)
>  */
> void vhost_user_async_close(DeviceState *d,
>                             CharBackend *chardev, struct vhost_dev *vhost,
> -                            vu_async_close_fn cb)
> +                            vu_async_close_fn cb,
> +                            IOEventHandler *event_cb)
> {
>     if (!runstate_check(RUN_STATE_SHUTDOWN)) {
>         /*
> diff --git a/include/hw/virtio/vhost-user.h b/include/hw/virtio/vhost-user.h
> index 191216a74f..649e9dd54f 100644
> --- a/include/hw/virtio/vhost-user.h
> +++ b/include/hw/virtio/vhost-user.h
> @@ -84,6 +84,7 @@ typedef void (*vu_async_close_fn)(DeviceState *cb);
> 
> void vhost_user_async_close(DeviceState *d,
>                             CharBackend *chardev, struct vhost_dev *vhost,
> -                            vu_async_close_fn cb);
> +                            vu_async_close_fn cb,
> +                            IOEventHandler *event_cb);
> 
> #endif
> -- 
> 2.41.0
> 
> 



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 2/2] vhost: Add Error parameter to vhost_scsi_common_start()
  2023-08-30  4:57   ` [PATCH v3 2/2] vhost: Add Error parameter to vhost_scsi_common_start() Li Feng
@ 2023-08-30  8:51     ` Raphael Norwitz
  2023-09-01 12:00     ` Markus Armbruster
  1 sibling, 0 replies; 27+ messages in thread
From: Raphael Norwitz @ 2023-08-30  8:51 UTC (permalink / raw)
  To: Li Feng
  Cc: Michael S. Tsirkin, Kevin Wolf, Hanna Reitz, Paolo Bonzini,
	Fam Zheng, Alex Bennée, Viresh Kumar,
	open list:Block layer core, open list:All patches CC here



> On Aug 30, 2023, at 12:57 AM, Li Feng <fengli@smartx.com> wrote:
> 
> Add a Error parameter to report the real error, like vhost-user-blk.
> 

Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>

> Signed-off-by: Li Feng <fengli@smartx.com>
> ---
> hw/scsi/vhost-scsi-common.c           | 16 +++++++++-------
> hw/scsi/vhost-scsi.c                  |  5 +++--
> hw/scsi/vhost-user-scsi.c             | 14 ++++++++------
> include/hw/virtio/vhost-scsi-common.h |  2 +-
> 4 files changed, 21 insertions(+), 16 deletions(-)
> 
> diff --git a/hw/scsi/vhost-scsi-common.c b/hw/scsi/vhost-scsi-common.c
> index a61cd0e907..4c8637045d 100644
> --- a/hw/scsi/vhost-scsi-common.c
> +++ b/hw/scsi/vhost-scsi-common.c
> @@ -16,6 +16,7 @@
>  */
> 
> #include "qemu/osdep.h"
> +#include "qapi/error.h"
> #include "qemu/error-report.h"
> #include "qemu/module.h"
> #include "hw/virtio/vhost.h"
> @@ -25,7 +26,7 @@
> #include "hw/virtio/virtio-access.h"
> #include "hw/fw-path-provider.h"
> 
> -int vhost_scsi_common_start(VHostSCSICommon *vsc)
> +int vhost_scsi_common_start(VHostSCSICommon *vsc, Error **errp)
> {
>     int ret, i;
>     VirtIODevice *vdev = VIRTIO_DEVICE(vsc);
> @@ -35,18 +36,19 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
>     VirtIOSCSICommon *vs = (VirtIOSCSICommon *)vsc;
> 
>     if (!k->set_guest_notifiers) {
> -        error_report("binding does not support guest notifiers");
> +        error_setg(errp, "binding does not support guest notifiers");
>         return -ENOSYS;
>     }
> 
>     ret = vhost_dev_enable_notifiers(&vsc->dev, vdev);
>     if (ret < 0) {
> +        error_setg_errno(errp, -ret, "Error enabling host notifiers");
>         return ret;
>     }
> 
>     ret = k->set_guest_notifiers(qbus->parent, vsc->dev.nvqs, true);
>     if (ret < 0) {
> -        error_report("Error binding guest notifier");
> +        error_setg_errno(errp, -ret, "Error binding guest notifier");
>         goto err_host_notifiers;
>     }
> 
> @@ -54,7 +56,7 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
> 
>     ret = vhost_dev_prepare_inflight(&vsc->dev, vdev);
>     if (ret < 0) {
> -        error_report("Error setting inflight format: %d", -ret);
> +        error_setg_errno(errp, -ret, "Error setting inflight format");
>         goto err_guest_notifiers;
>     }
> 
> @@ -64,21 +66,21 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
>                                         vs->conf.virtqueue_size,
>                                         vsc->inflight);
>             if (ret < 0) {
> -                error_report("Error getting inflight: %d", -ret);
> +                error_setg_errno(errp, -ret, "Error getting inflight");
>                 goto err_guest_notifiers;
>             }
>         }
> 
>         ret = vhost_dev_set_inflight(&vsc->dev, vsc->inflight);
>         if (ret < 0) {
> -            error_report("Error setting inflight: %d", -ret);
> +            error_setg_errno(errp, -ret, "Error setting inflight");
>             goto err_guest_notifiers;
>         }
>     }
> 
>     ret = vhost_dev_start(&vsc->dev, vdev, true);
>     if (ret < 0) {
> -        error_report("Error start vhost dev");
> +        error_setg_errno(errp, -ret, "Error starting vhost dev");
>         goto err_guest_notifiers;
>     }
> 
> diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
> index 443f67daa4..01a3ab4277 100644
> --- a/hw/scsi/vhost-scsi.c
> +++ b/hw/scsi/vhost-scsi.c
> @@ -75,6 +75,7 @@ static int vhost_scsi_start(VHostSCSI *s)
>     int ret, abi_version;
>     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
>     const VhostOps *vhost_ops = vsc->dev.vhost_ops;
> +    Error *local_err = NULL;
> 
>     ret = vhost_ops->vhost_scsi_get_abi_version(&vsc->dev, &abi_version);
>     if (ret < 0) {
> @@ -88,14 +89,14 @@ static int vhost_scsi_start(VHostSCSI *s)
>         return -ENOSYS;
>     }
> 
> -    ret = vhost_scsi_common_start(vsc);
> +    ret = vhost_scsi_common_start(vsc, &local_err);
>     if (ret < 0) {
>         return ret;
>     }
> 
>     ret = vhost_scsi_set_endpoint(s);
>     if (ret < 0) {
> -        error_report("Error setting vhost-scsi endpoint");
> +        error_reportf_err(local_err, "Error setting vhost-scsi endpoint");
>         vhost_scsi_common_stop(vsc);
>     }
> 
> diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
> index e931df9f5b..62fc98bb1c 100644
> --- a/hw/scsi/vhost-user-scsi.c
> +++ b/hw/scsi/vhost-user-scsi.c
> @@ -43,12 +43,12 @@ enum VhostUserProtocolFeature {
>     VHOST_USER_PROTOCOL_F_RESET_DEVICE = 13,
> };
> 
> -static int vhost_user_scsi_start(VHostUserSCSI *s)
> +static int vhost_user_scsi_start(VHostUserSCSI *s, Error **errp)
> {
>     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
>     int ret;
> 
> -    ret = vhost_scsi_common_start(vsc);
> +    ret = vhost_scsi_common_start(vsc, errp);
>     s->started_vu = (ret < 0 ? false : true);
> 
>     return ret;
> @@ -73,6 +73,7 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
>     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
>     VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(dev);
>     bool should_start = virtio_device_should_start(vdev, status);
> +    Error *local_err = NULL;
>     int ret;
> 
>     if (!s->connected) {
> @@ -84,9 +85,10 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
>     }
> 
>     if (should_start) {
> -        ret = vhost_user_scsi_start(s);
> +        ret = vhost_user_scsi_start(s, &local_err);
>         if (ret < 0) {
> -            error_report("unable to start vhost-user-scsi: %s", strerror(-ret));
> +            error_reportf_err(local_err, "unable to start vhost-user-scsi: %s",
> +                              strerror(-ret));
>             qemu_chr_fe_disconnect(&vs->conf.chardev);
>         }
>     } else {
> @@ -139,7 +141,7 @@ static void vhost_user_scsi_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>      * Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
>      * vhost here instead of waiting for .set_status().
>      */
> -    ret = vhost_user_scsi_start(s);
> +    ret = vhost_user_scsi_start(s, &local_err);
>     if (ret < 0) {
>         error_reportf_err(local_err, "vhost-user-scsi: vhost start failed: ");
>         qemu_chr_fe_disconnect(&vs->conf.chardev);
> @@ -184,7 +186,7 @@ static int vhost_user_scsi_connect(DeviceState *dev, Error **errp)
> 
>     /* restore vhost state */
>     if (virtio_device_started(vdev, vdev->status)) {
> -        ret = vhost_user_scsi_start(s);
> +        ret = vhost_user_scsi_start(s, errp);
>         if (ret < 0) {
>             return ret;
>         }
> diff --git a/include/hw/virtio/vhost-scsi-common.h b/include/hw/virtio/vhost-scsi-common.h
> index 18f115527c..c5d2c09455 100644
> --- a/include/hw/virtio/vhost-scsi-common.h
> +++ b/include/hw/virtio/vhost-scsi-common.h
> @@ -39,7 +39,7 @@ struct VHostSCSICommon {
>     struct vhost_inflight *inflight;
> };
> 
> -int vhost_scsi_common_start(VHostSCSICommon *vsc);
> +int vhost_scsi_common_start(VHostSCSICommon *vsc, Error **errp);
> void vhost_scsi_common_stop(VHostSCSICommon *vsc);
> char *vhost_scsi_common_get_fw_dev_path(FWPathProvider *p, BusState *bus,
>                                         DeviceState *dev);
> -- 
> 2.41.0
> 



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 2/2] vhost: Add Error parameter to vhost_scsi_common_start()
  2023-08-30  4:57   ` [PATCH v3 2/2] vhost: Add Error parameter to vhost_scsi_common_start() Li Feng
  2023-08-30  8:51     ` Raphael Norwitz
@ 2023-09-01 12:00     ` Markus Armbruster
  2023-09-12  8:32       ` Li Feng
  1 sibling, 1 reply; 27+ messages in thread
From: Markus Armbruster @ 2023-09-01 12:00 UTC (permalink / raw)
  To: Li Feng
  Cc: Michael S. Tsirkin, Raphael Norwitz, Kevin Wolf, Hanna Reitz,
	Paolo Bonzini, Fam Zheng, Alex Bennée, Viresh Kumar,
	open list:Block layer core, open list:All patches CC here

Li Feng <fengli@smartx.com> writes:

> Add a Error parameter to report the real error, like vhost-user-blk.
>
> Signed-off-by: Li Feng <fengli@smartx.com>
> ---
>  hw/scsi/vhost-scsi-common.c           | 16 +++++++++-------
>  hw/scsi/vhost-scsi.c                  |  5 +++--
>  hw/scsi/vhost-user-scsi.c             | 14 ++++++++------
>  include/hw/virtio/vhost-scsi-common.h |  2 +-
>  4 files changed, 21 insertions(+), 16 deletions(-)
>
> diff --git a/hw/scsi/vhost-scsi-common.c b/hw/scsi/vhost-scsi-common.c
> index a61cd0e907..4c8637045d 100644
> --- a/hw/scsi/vhost-scsi-common.c
> +++ b/hw/scsi/vhost-scsi-common.c
> @@ -16,6 +16,7 @@
>   */
>  
>  #include "qemu/osdep.h"
> +#include "qapi/error.h"
>  #include "qemu/error-report.h"
>  #include "qemu/module.h"
>  #include "hw/virtio/vhost.h"
> @@ -25,7 +26,7 @@
>  #include "hw/virtio/virtio-access.h"
>  #include "hw/fw-path-provider.h"
>  
> -int vhost_scsi_common_start(VHostSCSICommon *vsc)
> +int vhost_scsi_common_start(VHostSCSICommon *vsc, Error **errp)
>  {
>      int ret, i;
>      VirtIODevice *vdev = VIRTIO_DEVICE(vsc);
> @@ -35,18 +36,19 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
>      VirtIOSCSICommon *vs = (VirtIOSCSICommon *)vsc;
>  
>      if (!k->set_guest_notifiers) {
> -        error_report("binding does not support guest notifiers");
> +        error_setg(errp, "binding does not support guest notifiers");
>          return -ENOSYS;
>      }
>  
>      ret = vhost_dev_enable_notifiers(&vsc->dev, vdev);
>      if (ret < 0) {
> +        error_setg_errno(errp, -ret, "Error enabling host notifiers");

Looks like the error is silent before your patch.  Correct?

>          return ret;
>      }
>  
>      ret = k->set_guest_notifiers(qbus->parent, vsc->dev.nvqs, true);
>      if (ret < 0) {
> -        error_report("Error binding guest notifier");
> +        error_setg_errno(errp, -ret, "Error binding guest notifier");

Error message now provides more detail.

>          goto err_host_notifiers;
>      }
>  
> @@ -54,7 +56,7 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
>  
>      ret = vhost_dev_prepare_inflight(&vsc->dev, vdev);
>      if (ret < 0) {
> -        error_report("Error setting inflight format: %d", -ret);
> +        error_setg_errno(errp, -ret, "Error setting inflight format");

Error message now shows errno in human-readable form.

>          goto err_guest_notifiers;
>      }
>  
> @@ -64,21 +66,21 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
>                                          vs->conf.virtqueue_size,
>                                          vsc->inflight);
>              if (ret < 0) {
> -                error_report("Error getting inflight: %d", -ret);
> +                error_setg_errno(errp, -ret, "Error getting inflight");

Likewise.

>                  goto err_guest_notifiers;
>              }
>          }
>  
>          ret = vhost_dev_set_inflight(&vsc->dev, vsc->inflight);
>          if (ret < 0) {
> -            error_report("Error setting inflight: %d", -ret);
> +            error_setg_errno(errp, -ret, "Error setting inflight");

Likewise.

>              goto err_guest_notifiers;
>          }
>      }
>  
>      ret = vhost_dev_start(&vsc->dev, vdev, true);
>      if (ret < 0) {
> -        error_report("Error start vhost dev");
> +        error_setg_errno(errp, -ret, "Error starting vhost dev");

Likewise.

>          goto err_guest_notifiers;
>      }
>  
> diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
> index 443f67daa4..01a3ab4277 100644
> --- a/hw/scsi/vhost-scsi.c
> +++ b/hw/scsi/vhost-scsi.c
> @@ -75,6 +75,7 @@ static int vhost_scsi_start(VHostSCSI *s)
>      int ret, abi_version;
>      VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
>      const VhostOps *vhost_ops = vsc->dev.vhost_ops;
> +    Error *local_err = NULL;
>  
>      ret = vhost_ops->vhost_scsi_get_abi_version(&vsc->dev, &abi_version);
>      if (ret < 0) {
> @@ -88,14 +89,14 @@ static int vhost_scsi_start(VHostSCSI *s)
>          return -ENOSYS;
>      }
>  
> -    ret = vhost_scsi_common_start(vsc);
> +    ret = vhost_scsi_common_start(vsc, &local_err);
>      if (ret < 0) {
>          return ret;
>      }
>  
>      ret = vhost_scsi_set_endpoint(s);
>      if (ret < 0) {
> -        error_report("Error setting vhost-scsi endpoint");
> +        error_reportf_err(local_err, "Error setting vhost-scsi endpoint");
>          vhost_scsi_common_stop(vsc);
>      }
>  
> diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
> index e931df9f5b..62fc98bb1c 100644
> --- a/hw/scsi/vhost-user-scsi.c
> +++ b/hw/scsi/vhost-user-scsi.c
> @@ -43,12 +43,12 @@ enum VhostUserProtocolFeature {
>      VHOST_USER_PROTOCOL_F_RESET_DEVICE = 13,
>  };
>  
> -static int vhost_user_scsi_start(VHostUserSCSI *s)
> +static int vhost_user_scsi_start(VHostUserSCSI *s, Error **errp)
>  {
>      VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
>      int ret;
>  
> -    ret = vhost_scsi_common_start(vsc);
> +    ret = vhost_scsi_common_start(vsc, errp);
>      s->started_vu = (ret < 0 ? false : true);
>  
>      return ret;
> @@ -73,6 +73,7 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
>      VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
>      VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(dev);
>      bool should_start = virtio_device_should_start(vdev, status);
> +    Error *local_err = NULL;
>      int ret;
>  
>      if (!s->connected) {
> @@ -84,9 +85,10 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
>      }
>  
>      if (should_start) {
> -        ret = vhost_user_scsi_start(s);
> +        ret = vhost_user_scsi_start(s, &local_err);
>          if (ret < 0) {
> -            error_report("unable to start vhost-user-scsi: %s", strerror(-ret));
> +            error_reportf_err(local_err, "unable to start vhost-user-scsi: %s",
> +                              strerror(-ret));
>              qemu_chr_fe_disconnect(&vs->conf.chardev);
>          }
>      } else {
> @@ -139,7 +141,7 @@ static void vhost_user_scsi_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>       * Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
>       * vhost here instead of waiting for .set_status().
>       */
> -    ret = vhost_user_scsi_start(s);
> +    ret = vhost_user_scsi_start(s, &local_err);
>      if (ret < 0) {
>          error_reportf_err(local_err, "vhost-user-scsi: vhost start failed: ");

The error_reportf_err() is wrong before the patch, as I just pointed out
in my review of "[PATCH v3 5/5] vhost-user-scsi: start vhost when guest
kicks".  It is correct afterwards.

>          qemu_chr_fe_disconnect(&vs->conf.chardev);
> @@ -184,7 +186,7 @@ static int vhost_user_scsi_connect(DeviceState *dev, Error **errp)
>  
>      /* restore vhost state */
>      if (virtio_device_started(vdev, vdev->status)) {
> -        ret = vhost_user_scsi_start(s);
> +        ret = vhost_user_scsi_start(s, errp);
>          if (ret < 0) {
>              return ret;
>          }

Wrong before the patch, as I just pointed out in my review of "[PATCH v3
4/5] vhost-user-scsi: support reconnect to backend".  Correct afterwards.

> diff --git a/include/hw/virtio/vhost-scsi-common.h b/include/hw/virtio/vhost-scsi-common.h
> index 18f115527c..c5d2c09455 100644
> --- a/include/hw/virtio/vhost-scsi-common.h
> +++ b/include/hw/virtio/vhost-scsi-common.h
> @@ -39,7 +39,7 @@ struct VHostSCSICommon {
>      struct vhost_inflight *inflight;
>  };
>  
> -int vhost_scsi_common_start(VHostSCSICommon *vsc);
> +int vhost_scsi_common_start(VHostSCSICommon *vsc, Error **errp);
>  void vhost_scsi_common_stop(VHostSCSICommon *vsc);
>  char *vhost_scsi_common_get_fw_dev_path(FWPathProvider *p, BusState *bus,
>                                          DeviceState *dev);

Please mention in the commit message that error messages improve, and
silent errors are now reported.



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 2/2] vhost: Add Error parameter to vhost_scsi_common_start()
  2023-09-01 12:00     ` Markus Armbruster
@ 2023-09-12  8:32       ` Li Feng
  2023-10-01 20:00         ` Michael S. Tsirkin
  0 siblings, 1 reply; 27+ messages in thread
From: Li Feng @ 2023-09-12  8:32 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Michael S. Tsirkin, Raphael Norwitz, Kevin Wolf, Hanna Reitz,
	Paolo Bonzini, Fam Zheng, Alex Bennée, Viresh Kumar,
	open list:Block layer core, open list:All patches CC here

[-- Attachment #1: Type: text/plain, Size: 8496 bytes --]



> On 1 Sep 2023, at 8:00 PM, Markus Armbruster <armbru@redhat.com> wrote:
> 
> Li Feng <fengli@smartx.com <mailto:fengli@smartx.com>> writes:
> 
>> Add a Error parameter to report the real error, like vhost-user-blk.
>> 
>> Signed-off-by: Li Feng <fengli@smartx.com>
>> ---
>> hw/scsi/vhost-scsi-common.c           | 16 +++++++++-------
>> hw/scsi/vhost-scsi.c                  |  5 +++--
>> hw/scsi/vhost-user-scsi.c             | 14 ++++++++------
>> include/hw/virtio/vhost-scsi-common.h |  2 +-
>> 4 files changed, 21 insertions(+), 16 deletions(-)
>> 
>> diff --git a/hw/scsi/vhost-scsi-common.c b/hw/scsi/vhost-scsi-common.c
>> index a61cd0e907..4c8637045d 100644
>> --- a/hw/scsi/vhost-scsi-common.c
>> +++ b/hw/scsi/vhost-scsi-common.c
>> @@ -16,6 +16,7 @@
>>  */
>> 
>> #include "qemu/osdep.h"
>> +#include "qapi/error.h"
>> #include "qemu/error-report.h"
>> #include "qemu/module.h"
>> #include "hw/virtio/vhost.h"
>> @@ -25,7 +26,7 @@
>> #include "hw/virtio/virtio-access.h"
>> #include "hw/fw-path-provider.h"
>> 
>> -int vhost_scsi_common_start(VHostSCSICommon *vsc)
>> +int vhost_scsi_common_start(VHostSCSICommon *vsc, Error **errp)
>> {
>>     int ret, i;
>>     VirtIODevice *vdev = VIRTIO_DEVICE(vsc);
>> @@ -35,18 +36,19 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
>>     VirtIOSCSICommon *vs = (VirtIOSCSICommon *)vsc;
>> 
>>     if (!k->set_guest_notifiers) {
>> -        error_report("binding does not support guest notifiers");
>> +        error_setg(errp, "binding does not support guest notifiers");
>>         return -ENOSYS;
>>     }
>> 
>>     ret = vhost_dev_enable_notifiers(&vsc->dev, vdev);
>>     if (ret < 0) {
>> +        error_setg_errno(errp, -ret, "Error enabling host notifiers");
> 
> Looks like the error is silent before your patch.  Correct?
Yes, before this patch, it’s a silent error.

> 
>>         return ret;
>>     }
>> 
>>     ret = k->set_guest_notifiers(qbus->parent, vsc->dev.nvqs, true);
>>     if (ret < 0) {
>> -        error_report("Error binding guest notifier");
>> +        error_setg_errno(errp, -ret, "Error binding guest notifier");
> 
> Error message now provides more detail.
Yes.

> 
>>         goto err_host_notifiers;
>>     }
>> 
>> @@ -54,7 +56,7 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
>> 
>>     ret = vhost_dev_prepare_inflight(&vsc->dev, vdev);
>>     if (ret < 0) {
>> -        error_report("Error setting inflight format: %d", -ret);
>> +        error_setg_errno(errp, -ret, "Error setting inflight format");
> 
> Error message now shows errno in human-readable form.
Yes.

> 
>>         goto err_guest_notifiers;
>>     }
>> 
>> @@ -64,21 +66,21 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
>>                                         vs->conf.virtqueue_size,
>>                                         vsc->inflight);
>>             if (ret < 0) {
>> -                error_report("Error getting inflight: %d", -ret);
>> +                error_setg_errno(errp, -ret, "Error getting inflight");
> 
> Likewise.
Yes.

> 
>>                 goto err_guest_notifiers;
>>             }
>>         }
>> 
>>         ret = vhost_dev_set_inflight(&vsc->dev, vsc->inflight);
>>         if (ret < 0) {
>> -            error_report("Error setting inflight: %d", -ret);
>> +            error_setg_errno(errp, -ret, "Error setting inflight");
> 
> Likewise.
Yes.

> 
>>             goto err_guest_notifiers;
>>         }
>>     }
>> 
>>     ret = vhost_dev_start(&vsc->dev, vdev, true);
>>     if (ret < 0) {
>> -        error_report("Error start vhost dev");
>> +        error_setg_errno(errp, -ret, "Error starting vhost dev");
> 
> Likewise.
Yes.

> 
>>         goto err_guest_notifiers;
>>     }
>> 
>> diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
>> index 443f67daa4..01a3ab4277 100644
>> --- a/hw/scsi/vhost-scsi.c
>> +++ b/hw/scsi/vhost-scsi.c
>> @@ -75,6 +75,7 @@ static int vhost_scsi_start(VHostSCSI *s)
>>     int ret, abi_version;
>>     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
>>     const VhostOps *vhost_ops = vsc->dev.vhost_ops;
>> +    Error *local_err = NULL;
>> 
>>     ret = vhost_ops->vhost_scsi_get_abi_version(&vsc->dev, &abi_version);
>>     if (ret < 0) {
>> @@ -88,14 +89,14 @@ static int vhost_scsi_start(VHostSCSI *s)
>>         return -ENOSYS;
>>     }
>> 
>> -    ret = vhost_scsi_common_start(vsc);
>> +    ret = vhost_scsi_common_start(vsc, &local_err);
>>     if (ret < 0) {
>>         return ret;
>>     }
>> 
>>     ret = vhost_scsi_set_endpoint(s);
>>     if (ret < 0) {
>> -        error_report("Error setting vhost-scsi endpoint");
>> +        error_reportf_err(local_err, "Error setting vhost-scsi endpoint");
>>         vhost_scsi_common_stop(vsc);
>>     }
>> 
>> diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
>> index e931df9f5b..62fc98bb1c 100644
>> --- a/hw/scsi/vhost-user-scsi.c
>> +++ b/hw/scsi/vhost-user-scsi.c
>> @@ -43,12 +43,12 @@ enum VhostUserProtocolFeature {
>>     VHOST_USER_PROTOCOL_F_RESET_DEVICE = 13,
>> };
>> 
>> -static int vhost_user_scsi_start(VHostUserSCSI *s)
>> +static int vhost_user_scsi_start(VHostUserSCSI *s, Error **errp)
>> {
>>     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
>>     int ret;
>> 
>> -    ret = vhost_scsi_common_start(vsc);
>> +    ret = vhost_scsi_common_start(vsc, errp);
>>     s->started_vu = (ret < 0 ? false : true);
>> 
>>     return ret;
>> @@ -73,6 +73,7 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
>>     VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
>>     VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(dev);
>>     bool should_start = virtio_device_should_start(vdev, status);
>> +    Error *local_err = NULL;
>>     int ret;
>> 
>>     if (!s->connected) {
>> @@ -84,9 +85,10 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, uint8_t status)
>>     }
>> 
>>     if (should_start) {
>> -        ret = vhost_user_scsi_start(s);
>> +        ret = vhost_user_scsi_start(s, &local_err);
>>         if (ret < 0) {
>> -            error_report("unable to start vhost-user-scsi: %s", strerror(-ret));
>> +            error_reportf_err(local_err, "unable to start vhost-user-scsi: %s",
>> +                              strerror(-ret));
>>             qemu_chr_fe_disconnect(&vs->conf.chardev);
>>         }
>>     } else {
>> @@ -139,7 +141,7 @@ static void vhost_user_scsi_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>>      * Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
>>      * vhost here instead of waiting for .set_status().
>>      */
>> -    ret = vhost_user_scsi_start(s);
>> +    ret = vhost_user_scsi_start(s, &local_err);
>>     if (ret < 0) {
>>         error_reportf_err(local_err, "vhost-user-scsi: vhost start failed: ");
> 
> The error_reportf_err() is wrong before the patch, as I just pointed out
> in my review of "[PATCH v3 5/5] vhost-user-scsi: start vhost when guest
> kicks".  It is correct afterwards.
Yes.

> 
>>         qemu_chr_fe_disconnect(&vs->conf.chardev);
>> @@ -184,7 +186,7 @@ static int vhost_user_scsi_connect(DeviceState *dev, Error **errp)
>> 
>>     /* restore vhost state */
>>     if (virtio_device_started(vdev, vdev->status)) {
>> -        ret = vhost_user_scsi_start(s);
>> +        ret = vhost_user_scsi_start(s, errp);
>>         if (ret < 0) {
>>             return ret;
>>         }
> 
> Wrong before the patch, as I just pointed out in my review of "[PATCH v3
> 4/5] vhost-user-scsi: support reconnect to backend".  Correct afterwards.
Yes.

> 
>> diff --git a/include/hw/virtio/vhost-scsi-common.h b/include/hw/virtio/vhost-scsi-common.h
>> index 18f115527c..c5d2c09455 100644
>> --- a/include/hw/virtio/vhost-scsi-common.h
>> +++ b/include/hw/virtio/vhost-scsi-common.h
>> @@ -39,7 +39,7 @@ struct VHostSCSICommon {
>>     struct vhost_inflight *inflight;
>> };
>> 
>> -int vhost_scsi_common_start(VHostSCSICommon *vsc);
>> +int vhost_scsi_common_start(VHostSCSICommon *vsc, Error **errp);
>> void vhost_scsi_common_stop(VHostSCSICommon *vsc);
>> char *vhost_scsi_common_get_fw_dev_path(FWPathProvider *p, BusState *bus,
>>                                         DeviceState *dev);
> 
> Please mention in the commit message that error messages improve, and
> silent errors are now reported.
Ack.



[-- Attachment #2: Type: text/html, Size: 31880 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 2/2] vhost: Add Error parameter to vhost_scsi_common_start()
  2023-09-12  8:32       ` Li Feng
@ 2023-10-01 20:00         ` Michael S. Tsirkin
  0 siblings, 0 replies; 27+ messages in thread
From: Michael S. Tsirkin @ 2023-10-01 20:00 UTC (permalink / raw)
  To: Li Feng
  Cc: Markus Armbruster, Raphael Norwitz, Kevin Wolf, Hanna Reitz,
	Paolo Bonzini, Fam Zheng, Alex Bennée, Viresh Kumar,
	open list:Block layer core, open list:All patches CC here

On Tue, Sep 12, 2023 at 04:32:59PM +0800, Li Feng wrote:
>     Please mention in the commit message that error messages improve, and
>     silent errors are now reported.
> 
> Ack.


Still waiting for v4 with the updated commit log.

-- 
MST



^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2023-10-01 20:01 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-04  5:29 [PATCH 0/2] Fix vhost reconnect issues Li Feng
2023-08-04  5:29 ` [PATCH 1/2] vhost-user: fix lost reconnect Li Feng
2023-08-14 12:11   ` Raphael Norwitz
2023-08-17  6:40     ` Li Feng
2023-08-22  0:38       ` Raphael Norwitz
2023-08-22  4:49         ` Li Feng
2023-08-22 10:17           ` Raphael Norwitz
2023-08-24  7:27             ` Li Feng
2023-08-04  5:29 ` [PATCH 2/2] vhost: Add Error parameter to vhost_scsi_common_start() Li Feng
2023-08-14 12:11   ` Raphael Norwitz
2023-08-17  6:49     ` Li Feng
2023-08-21 12:09       ` Markus Armbruster
2023-08-22  4:47         ` Li Feng
2023-08-24  7:41 ` [PATCH v2 0/2] Fix vhost reconnect issues Li Feng
2023-08-24  7:41   ` [PATCH v2 1/2] vhost-user: Fix lost reconnect Li Feng
2023-08-29 22:11     ` Raphael Norwitz
2023-08-30  4:51       ` Li Feng
2023-08-24  7:41   ` [PATCH v2 2/2] vhost: Add Error parameter to vhost_scsi_common_start() Li Feng
2023-08-29 22:20     ` Raphael Norwitz
2023-08-30  4:57 ` [PATCH v3 0/2] Fix vhost reconnect issues Li Feng
2023-08-30  4:57   ` [PATCH v3 1/2] vhost-user: Fix lost reconnect Li Feng
2023-08-30  8:47     ` Raphael Norwitz
2023-08-30  4:57   ` [PATCH v3 2/2] vhost: Add Error parameter to vhost_scsi_common_start() Li Feng
2023-08-30  8:51     ` Raphael Norwitz
2023-09-01 12:00     ` Markus Armbruster
2023-09-12  8:32       ` Li Feng
2023-10-01 20:00         ` Michael S. Tsirkin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).