kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/9] vhost-scsi: log write descriptors for live migration (and three bugfix)
@ 2025-04-03  6:29 Dongli Zhang
  2025-04-03  6:29 ` [PATCH v3 1/9] vhost-scsi: protect vq->log_used with vq->mutex Dongli Zhang
                   ` (8 more replies)
  0 siblings, 9 replies; 18+ messages in thread
From: Dongli Zhang @ 2025-04-03  6:29 UTC (permalink / raw)
  To: virtualization, kvm, netdev
  Cc: mst, jasowang, michael.christie, pbonzini, stefanha, eperezma,
	joao.m.martins, joe.jin, si-wei.liu, linux-kernel

The live migration with vhost-scsi has been enabled by QEMU commit
b3e89c941a85 ("vhost-scsi: Allow user to enable migration"), which
thoroughly explains the workflow that QEMU collaborates with vhost-scsi on
the live migration.

Although it logs dirty data for the used ring, it doesn't log any write
descriptor (VRING_DESC_F_WRITE).

In comparison, vhost-net logs write descriptors via vhost_log_write(). The
SPDK (vhost-user-scsi backend) also logs write descriptors via
vhost_log_req_desc().

As a result, there is likely data mismatch between memory and vhost-scsi
disk during the live migration.

1. Suppose there is high workload and high memory usage. Suppose some
systemd userspace pages are swapped out to the swap disk.

2. Upon request from systemd, the kernel reads some pages from the swap
disk to the memory via vhost-scsi.

3. Although those userspace pages' data are updated, they are not marked as
dirty by vhost-scsi (this is the bug). They are not going to migrate to the
target host during memory transfer iterations.

4. Suppose systemd doesn't write to those pages any longer. Those pages
never get the chance to be dirty or migrated any longer.

5. Once the guest VM is resumed on the target host, because of the lack of
those dirty pages' data, the systemd may run into abnormal status, i.e.,
there may be systemd segfault.

Log all write descriptors to fix the issue.

In addition, the patchset also fixes three bugs in vhost-scsi.

Changed since v1:
  - Rebase on top of most recent vhost changes.
  - Don't allocate log buffer during initialization. Allocate only once for
    each command. Don't free until not used any longer.
  - Add bugfix for vhost_scsi_send_status().
Changed since v2:
  - Document parameters of vhost_log_write().
  - Use (len == U64_MAX) to indicate whether log all pages, instead of
    introducing a new parameter.
  - Merge PATCH 6 and PATCH 7 from v2 as one patch, to Allocate for only
    once in submission path in runtime. Reclaim int
    VHOST_SET_FEATURES/VHOST_SCSI_SET_ENDPOINT.
  - Encapsulate the one-time on-demand per-cmd log buffer alloc/copy in a
    helper, as suggested by Mike.

Dongli Zhang (vhost-scsi bugfix):
  vhost-scsi: protect vq->log_used with vq->mutex
  vhost-scsi: Fix vhost_scsi_send_bad_target()
  vhost-scsi: Fix vhost_scsi_send_status()

Dongli Zhang (log descriptor, suggested by Joao Martins):
  vhost: modify vhost_log_write() for broader users
  vhost-scsi: adjust vhost_scsi_get_desc() to log vring descriptors
  vhost-scsi: log I/O queue write descriptors
  vhost-scsi: log control queue write descriptors
  vhost-scsi: log event queue write descriptors
  vhost: add WARNING if log_num is more than limit

 drivers/vhost/scsi.c  | 264 ++++++++++++++++++++++++++++++++++++++++-----
 drivers/vhost/vhost.c |  46 ++++++--
 2 files changed, 273 insertions(+), 37 deletions(-)


base-commit: ac34bd6a617c03cad0fbb61f189f7c4dafbbddfb
branch: remotes/origin/linux-next
tree: https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git

Thank you very much!

Dongli Zhang


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v3 1/9] vhost-scsi: protect vq->log_used with vq->mutex
  2025-04-03  6:29 [PATCH v3 0/9] vhost-scsi: log write descriptors for live migration (and three bugfix) Dongli Zhang
@ 2025-04-03  6:29 ` Dongli Zhang
  2025-04-03  6:29 ` [PATCH v3 2/9] vhost-scsi: Fix vhost_scsi_send_bad_target() Dongli Zhang
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Dongli Zhang @ 2025-04-03  6:29 UTC (permalink / raw)
  To: virtualization, kvm, netdev
  Cc: mst, jasowang, michael.christie, pbonzini, stefanha, eperezma,
	joao.m.martins, joe.jin, si-wei.liu, linux-kernel

The vhost-scsi completion path may access vq->log_base when vq->log_used is
already set to false.

    vhost-thread                       QEMU-thread

vhost_scsi_complete_cmd_work()
-> vhost_add_used()
   -> vhost_add_used_n()
      if (unlikely(vq->log_used))
                                      QEMU disables vq->log_used
                                      via VHOST_SET_VRING_ADDR.
                                      mutex_lock(&vq->mutex);
                                      vq->log_used = false now!
                                      mutex_unlock(&vq->mutex);

				      QEMU gfree(vq->log_base)
        log_used()
        -> log_write(vq->log_base)

Assuming the VMM is QEMU. The vq->log_base is from QEMU userpace and can be
reclaimed via gfree(). As a result, this causes invalid memory writes to
QEMU userspace.

The control queue path has the same issue.

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
---
Changed since v1:
  - Move lock to the begin and end of vhost_scsi_complete_cmd_work() as it
    is per-vq now. This reduces the number of mutex_lock().
  - Move this bugfix patch to before dirty log tracking patches.

 drivers/vhost/scsi.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index f6f5a7ac7894..f846f2aa7c87 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -627,6 +627,9 @@ static void vhost_scsi_complete_cmd_work(struct vhost_work *work)
 	int ret;
 
 	llnode = llist_del_all(&svq->completion_list);
+
+	mutex_lock(&svq->vq.mutex);
+
 	llist_for_each_entry_safe(cmd, t, llnode, tvc_completion_list) {
 		se_cmd = &cmd->tvc_se_cmd;
 
@@ -660,6 +663,8 @@ static void vhost_scsi_complete_cmd_work(struct vhost_work *work)
 		vhost_scsi_release_cmd_res(se_cmd);
 	}
 
+	mutex_unlock(&svq->vq.mutex);
+
 	if (signal)
 		vhost_signal(&svq->vs->dev, &svq->vq);
 }
@@ -1432,8 +1437,11 @@ static void vhost_scsi_tmf_resp_work(struct vhost_work *work)
 	else
 		resp_code = VIRTIO_SCSI_S_FUNCTION_REJECTED;
 
+	mutex_lock(&tmf->svq->vq.mutex);
 	vhost_scsi_send_tmf_resp(tmf->vhost, &tmf->svq->vq, tmf->in_iovs,
 				 tmf->vq_desc, &tmf->resp_iov, resp_code);
+	mutex_unlock(&tmf->svq->vq.mutex);
+
 	vhost_scsi_release_tmf_res(tmf);
 }
 
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 2/9] vhost-scsi: Fix vhost_scsi_send_bad_target()
  2025-04-03  6:29 [PATCH v3 0/9] vhost-scsi: log write descriptors for live migration (and three bugfix) Dongli Zhang
  2025-04-03  6:29 ` [PATCH v3 1/9] vhost-scsi: protect vq->log_used with vq->mutex Dongli Zhang
@ 2025-04-03  6:29 ` Dongli Zhang
  2025-04-03  6:29 ` [PATCH v3 3/9] vhost-scsi: Fix vhost_scsi_send_status() Dongli Zhang
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Dongli Zhang @ 2025-04-03  6:29 UTC (permalink / raw)
  To: virtualization, kvm, netdev
  Cc: mst, jasowang, michael.christie, pbonzini, stefanha, eperezma,
	joao.m.martins, joe.jin, si-wei.liu, linux-kernel

Although the support of VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 was
signaled by the commit 664ed90e621c ("vhost/scsi: Set
VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits"),
vhost_scsi_send_bad_target() still assumes the response in a single
descriptor.

In addition, although vhost_scsi_send_bad_target() is used by both I/O
queue and control queue, the response header is always
virtio_scsi_cmd_resp. It is required to use virtio_scsi_ctrl_tmf_resp or
virtio_scsi_ctrl_an_resp for control queue.

Fixes: 664ed90e621c ("vhost/scsi: Set VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits")
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
---
Changed since v1:
  - Move this bugfix patch to before dirty log tracking patches.

 drivers/vhost/scsi.c | 48 ++++++++++++++++++++++++++++++++++----------
 1 file changed, 37 insertions(+), 11 deletions(-)

diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index f846f2aa7c87..59d907b94c5e 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -1015,23 +1015,46 @@ vhost_scsi_send_status(struct vhost_scsi *vs, struct vhost_virtqueue *vq,
 		pr_err("Faulted on virtio_scsi_cmd_resp\n");
 }
 
+#define TYPE_IO_CMD    0
+#define TYPE_CTRL_TMF  1
+#define TYPE_CTRL_AN   2
+
 static void
 vhost_scsi_send_bad_target(struct vhost_scsi *vs,
 			   struct vhost_virtqueue *vq,
-			   int head, unsigned out)
+			   struct vhost_scsi_ctx *vc, int type)
 {
-	struct virtio_scsi_cmd_resp __user *resp;
-	struct virtio_scsi_cmd_resp rsp;
+	union {
+		struct virtio_scsi_cmd_resp cmd;
+		struct virtio_scsi_ctrl_tmf_resp tmf;
+		struct virtio_scsi_ctrl_an_resp an;
+	} rsp;
+	struct iov_iter iov_iter;
+	size_t rsp_size;
 	int ret;
 
 	memset(&rsp, 0, sizeof(rsp));
-	rsp.response = VIRTIO_SCSI_S_BAD_TARGET;
-	resp = vq->iov[out].iov_base;
-	ret = __copy_to_user(resp, &rsp, sizeof(rsp));
-	if (!ret)
-		vhost_add_used_and_signal(&vs->dev, vq, head, 0);
+
+	if (type == TYPE_IO_CMD) {
+		rsp_size = sizeof(struct virtio_scsi_cmd_resp);
+		rsp.cmd.response = VIRTIO_SCSI_S_BAD_TARGET;
+	} else if (type == TYPE_CTRL_TMF) {
+		rsp_size = sizeof(struct virtio_scsi_ctrl_tmf_resp);
+		rsp.tmf.response = VIRTIO_SCSI_S_BAD_TARGET;
+	} else {
+		rsp_size = sizeof(struct virtio_scsi_ctrl_an_resp);
+		rsp.an.response = VIRTIO_SCSI_S_BAD_TARGET;
+	}
+
+	iov_iter_init(&iov_iter, ITER_DEST, &vq->iov[vc->out], vc->in,
+		      rsp_size);
+
+	ret = copy_to_iter(&rsp, rsp_size, &iov_iter);
+
+	if (likely(ret == rsp_size))
+		vhost_add_used_and_signal(&vs->dev, vq, vc->head, 0);
 	else
-		pr_err("Faulted on virtio_scsi_cmd_resp\n");
+		pr_err("Faulted on virtio scsi type=%d\n", type);
 }
 
 static int
@@ -1395,7 +1418,7 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
 		if (ret == -ENXIO)
 			break;
 		else if (ret == -EIO)
-			vhost_scsi_send_bad_target(vs, vq, vc.head, vc.out);
+			vhost_scsi_send_bad_target(vs, vq, &vc, TYPE_IO_CMD);
 		else if (ret == -ENOMEM)
 			vhost_scsi_send_status(vs, vq, vc.head, vc.out,
 					       SAM_STAT_TASK_SET_FULL);
@@ -1631,7 +1654,10 @@ vhost_scsi_ctl_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
 		if (ret == -ENXIO)
 			break;
 		else if (ret == -EIO)
-			vhost_scsi_send_bad_target(vs, vq, vc.head, vc.out);
+			vhost_scsi_send_bad_target(vs, vq, &vc,
+						   v_req.type == VIRTIO_SCSI_T_TMF ?
+						   TYPE_CTRL_TMF :
+						   TYPE_CTRL_AN);
 	} while (likely(!vhost_exceeds_weight(vq, ++c, 0)));
 out:
 	mutex_unlock(&vq->mutex);
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 3/9] vhost-scsi: Fix vhost_scsi_send_status()
  2025-04-03  6:29 [PATCH v3 0/9] vhost-scsi: log write descriptors for live migration (and three bugfix) Dongli Zhang
  2025-04-03  6:29 ` [PATCH v3 1/9] vhost-scsi: protect vq->log_used with vq->mutex Dongli Zhang
  2025-04-03  6:29 ` [PATCH v3 2/9] vhost-scsi: Fix vhost_scsi_send_bad_target() Dongli Zhang
@ 2025-04-03  6:29 ` Dongli Zhang
  2025-04-03  6:29 ` [PATCH v3 4/9] vhost: modify vhost_log_write() for broader users Dongli Zhang
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Dongli Zhang @ 2025-04-03  6:29 UTC (permalink / raw)
  To: virtualization, kvm, netdev
  Cc: mst, jasowang, michael.christie, pbonzini, stefanha, eperezma,
	joao.m.martins, joe.jin, si-wei.liu, linux-kernel

Although the support of VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 was
signaled by the commit 664ed90e621c ("vhost/scsi: Set
VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits"),
vhost_scsi_send_bad_target() still assumes the response in a single
descriptor.

Similar issue in vhost_scsi_send_bad_target() has been fixed in previous
commit. In addition, similar issue for vhost_scsi_complete_cmd_work() has
been fixed by the commit 6dd88fd59da8 ("vhost-scsi: unbreak any layout for
response").

Fixes: 3ca51662f818 ("vhost-scsi: Add better resource allocation failure handling")
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
---
Changed since v1:
  - New patch to fix vhost_scsi_send_status().
Changed since v2:
  - Mention the commit 6dd88fd59da8 in the commit message.

 drivers/vhost/scsi.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 59d907b94c5e..26bcf3a7f70c 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -999,18 +999,22 @@ static void vhost_scsi_target_queue_cmd(struct vhost_scsi_nexus *nexus,
 
 static void
 vhost_scsi_send_status(struct vhost_scsi *vs, struct vhost_virtqueue *vq,
-		       int head, unsigned int out, u8 status)
+		       struct vhost_scsi_ctx *vc, u8 status)
 {
-	struct virtio_scsi_cmd_resp __user *resp;
 	struct virtio_scsi_cmd_resp rsp;
+	struct iov_iter iov_iter;
 	int ret;
 
 	memset(&rsp, 0, sizeof(rsp));
 	rsp.status = status;
-	resp = vq->iov[out].iov_base;
-	ret = __copy_to_user(resp, &rsp, sizeof(rsp));
-	if (!ret)
-		vhost_add_used_and_signal(&vs->dev, vq, head, 0);
+
+	iov_iter_init(&iov_iter, ITER_DEST, &vq->iov[vc->out], vc->in,
+		      sizeof(rsp));
+
+	ret = copy_to_iter(&rsp, sizeof(rsp), &iov_iter);
+
+	if (likely(ret == sizeof(rsp)))
+		vhost_add_used_and_signal(&vs->dev, vq, vc->head, 0);
 	else
 		pr_err("Faulted on virtio_scsi_cmd_resp\n");
 }
@@ -1420,7 +1424,7 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
 		else if (ret == -EIO)
 			vhost_scsi_send_bad_target(vs, vq, &vc, TYPE_IO_CMD);
 		else if (ret == -ENOMEM)
-			vhost_scsi_send_status(vs, vq, vc.head, vc.out,
+			vhost_scsi_send_status(vs, vq, &vc,
 					       SAM_STAT_TASK_SET_FULL);
 	} while (likely(!vhost_exceeds_weight(vq, ++c, 0)));
 out:
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 4/9] vhost: modify vhost_log_write() for broader users
  2025-04-03  6:29 [PATCH v3 0/9] vhost-scsi: log write descriptors for live migration (and three bugfix) Dongli Zhang
                   ` (2 preceding siblings ...)
  2025-04-03  6:29 ` [PATCH v3 3/9] vhost-scsi: Fix vhost_scsi_send_status() Dongli Zhang
@ 2025-04-03  6:29 ` Dongli Zhang
  2025-04-16  7:58   ` Dongli Zhang
  2025-04-21  3:08   ` Jason Wang
  2025-04-03  6:29 ` [PATCH v3 5/9] vhost-scsi: adjust vhost_scsi_get_desc() to log vring descriptors Dongli Zhang
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 18+ messages in thread
From: Dongli Zhang @ 2025-04-03  6:29 UTC (permalink / raw)
  To: virtualization, kvm, netdev
  Cc: mst, jasowang, michael.christie, pbonzini, stefanha, eperezma,
	joao.m.martins, joe.jin, si-wei.liu, linux-kernel

Currently, the only user of vhost_log_write() is vhost-net. The 'len'
argument prevents logging of pages that are not tainted by the RX path.

Adjustments are needed since more drivers (i.e. vhost-scsi) begin using
vhost_log_write(). So far vhost-net RX path may only partially use pages
shared via the last vring descriptor. Unlike vhost-net, vhost-scsi always
logs all pages shared via vring descriptors. To accommodate this,
use (len == U64_MAX) to indicate whether the driver would log all pages of
vring descriptors, or only pages that are tainted by the driver.

In addition, removes BUG().

Suggested-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
Changed since v2:
  - Document parameters of vhost_log_write().
  - Use (len == U64_MAX) to indicate whether log all pages, instead of
    introducing a new parameter.

 drivers/vhost/vhost.c | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 9ac25d08f473..494b3da5423a 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -2304,6 +2304,19 @@ static int log_used(struct vhost_virtqueue *vq, u64 used_offset, u64 len)
 	return 0;
 }
 
+/*
+ * vhost_log_write() - Log in dirty page bitmap
+ * @vq:      vhost virtqueue.
+ * @log:     Array of dirty memory in GPA.
+ * @log_num: Size of vhost_log arrary.
+ * @len:     The total length of memory buffer to log in the dirty bitmap.
+ *	     Some drivers may only partially use pages shared via the last
+ *	     vring descriptor (i.e. vhost-net RX buffer).
+ *	     Use (len == U64_MAX) to indicate the driver would log all
+ *           pages of vring descriptors.
+ * @iov:     Array of dirty memory in HVA.
+ * @count:   Size of iovec array.
+ */
 int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
 		    unsigned int log_num, u64 len, struct iovec *iov, int count)
 {
@@ -2327,15 +2340,14 @@ int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
 		r = log_write(vq->log_base, log[i].addr, l);
 		if (r < 0)
 			return r;
-		len -= l;
-		if (!len) {
-			if (vq->log_ctx)
-				eventfd_signal(vq->log_ctx);
-			return 0;
-		}
+
+		if (len != U64_MAX)
+			len -= l;
 	}
-	/* Length written exceeds what we have stored. This is a bug. */
-	BUG();
+
+	if (vq->log_ctx)
+		eventfd_signal(vq->log_ctx);
+
 	return 0;
 }
 EXPORT_SYMBOL_GPL(vhost_log_write);
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 5/9] vhost-scsi: adjust vhost_scsi_get_desc() to log vring descriptors
  2025-04-03  6:29 [PATCH v3 0/9] vhost-scsi: log write descriptors for live migration (and three bugfix) Dongli Zhang
                   ` (3 preceding siblings ...)
  2025-04-03  6:29 ` [PATCH v3 4/9] vhost: modify vhost_log_write() for broader users Dongli Zhang
@ 2025-04-03  6:29 ` Dongli Zhang
  2025-04-03  6:29 ` [PATCH v3 6/9] vhost-scsi: log I/O queue write descriptors Dongli Zhang
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Dongli Zhang @ 2025-04-03  6:29 UTC (permalink / raw)
  To: virtualization, kvm, netdev
  Cc: mst, jasowang, michael.christie, pbonzini, stefanha, eperezma,
	joao.m.martins, joe.jin, si-wei.liu, linux-kernel

Adjust vhost_scsi_get_desc() to facilitate logging of vring descriptors.

Add new arguments to allow passing the log buffer and length to
vhost_get_vq_desc().

In addition, reset 'log_num' since vhost_get_vq_desc() may reset it only
after certain condition checks.

Suggested-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/vhost/scsi.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 26bcf3a7f70c..3875967dee36 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -1063,13 +1063,17 @@ vhost_scsi_send_bad_target(struct vhost_scsi *vs,
 
 static int
 vhost_scsi_get_desc(struct vhost_scsi *vs, struct vhost_virtqueue *vq,
-		    struct vhost_scsi_ctx *vc)
+		    struct vhost_scsi_ctx *vc,
+		    struct vhost_log *log, unsigned int *log_num)
 {
 	int ret = -ENXIO;
 
+	if (likely(log_num))
+		*log_num = 0;
+
 	vc->head = vhost_get_vq_desc(vq, vq->iov,
 				     ARRAY_SIZE(vq->iov), &vc->out, &vc->in,
-				     NULL, NULL);
+				     log, log_num);
 
 	pr_debug("vhost_get_vq_desc: head: %d, out: %u in: %u\n",
 		 vc->head, vc->out, vc->in);
@@ -1237,7 +1241,7 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
 	vhost_disable_notify(&vs->dev, vq);
 
 	do {
-		ret = vhost_scsi_get_desc(vs, vq, &vc);
+		ret = vhost_scsi_get_desc(vs, vq, &vc, NULL, NULL);
 		if (ret)
 			goto err;
 
@@ -1581,7 +1585,7 @@ vhost_scsi_ctl_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
 	vhost_disable_notify(&vs->dev, vq);
 
 	do {
-		ret = vhost_scsi_get_desc(vs, vq, &vc);
+		ret = vhost_scsi_get_desc(vs, vq, &vc, NULL, NULL);
 		if (ret)
 			goto err;
 
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 6/9] vhost-scsi: log I/O queue write descriptors
  2025-04-03  6:29 [PATCH v3 0/9] vhost-scsi: log write descriptors for live migration (and three bugfix) Dongli Zhang
                   ` (4 preceding siblings ...)
  2025-04-03  6:29 ` [PATCH v3 5/9] vhost-scsi: adjust vhost_scsi_get_desc() to log vring descriptors Dongli Zhang
@ 2025-04-03  6:29 ` Dongli Zhang
  2025-04-06 21:41   ` Mike Christie
  2025-04-03  6:29 ` [PATCH v3 7/9] vhost-scsi: log control " Dongli Zhang
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 18+ messages in thread
From: Dongli Zhang @ 2025-04-03  6:29 UTC (permalink / raw)
  To: virtualization, kvm, netdev
  Cc: mst, jasowang, michael.christie, pbonzini, stefanha, eperezma,
	joao.m.martins, joe.jin, si-wei.liu, linux-kernel

Log write descriptors for the I/O queue, leveraging vhost_scsi_get_desc()
and vhost_get_vq_desc() to retrieve the array of write descriptors to
obtain the log buffer.

In addition, introduce a vhost-scsi specific function to log vring
descriptors. In this function, the 'partial' argument is set to false, and
the 'len' argument is set to 0, because vhost-scsi always logs all pages
shared by a vring descriptor. Add WARN_ON_ONCE() since vhost-scsi doesn't
support VIRTIO_F_ACCESS_PLATFORM.

The per-cmd log buffer is allocated on demand in the submission path after
VHOST_F_LOG_ALL is set. Return -ENOMEM on allocation failure, in order to
send SAM_STAT_TASK_SET_FULL to the guest.

It isn't reclaimed in the completion path. Instead, it is reclaimed when
VHOST_F_LOG_ALL is removed, or during VHOST_SCSI_SET_ENDPOINT when all
commands are destroyed.

Store the log buffer during the submission path and log it in the
completion path. Logging is also required in the error handling path of the
submission process.

Suggested-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
Changed since v1:
  - Don't allocate log buffer during initialization. Allocate during
  - VHOST_SET_FEATURES or VHOST_SCSI_SET_ENDPOINT.
  - Re-order if staments in vhost_scsi_log_write().
  - Log after vhost_scsi_send_status() as well.
Changed since v2:
  - Merge PATCH 6 and PATCH 7 from v2 as one patch.
  - Don't pre-allocate log buffer in
    VHOST_SET_FEATURES/VHOST_SCSI_SET_ENDPOINT. Allocate for only once in
    submission path in runtime. Reclaim int
    VHOST_SET_FEATURES/VHOST_SCSI_SET_ENDPOINT.
  - Encapsulate the one-time on-demand per-cmd log buffer alloc/copy in a
    helper, as suggested by Mike.

 drivers/vhost/scsi.c | 119 +++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 116 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 3875967dee36..af7b0ee42b6d 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -133,6 +133,11 @@ struct vhost_scsi_cmd {
 	struct se_cmd tvc_se_cmd;
 	/* Sense buffer that will be mapped into outgoing status */
 	unsigned char tvc_sense_buf[TRANSPORT_SENSE_BUFFER];
+	/*
+	 * Dirty write descriptors of this command.
+	 */
+	struct vhost_log *tvc_log;
+	unsigned int tvc_log_num;
 	/* Completed commands list, serviced from vhost worker thread */
 	struct llist_node tvc_completion_list;
 	/* Used to track inflight cmd */
@@ -362,6 +367,45 @@ static int vhost_scsi_check_prot_fabric_only(struct se_portal_group *se_tpg)
 	return tpg->tv_fabric_prot_type;
 }
 
+static int vhost_scsi_copy_cmd_log(struct vhost_virtqueue *vq,
+				   struct vhost_scsi_cmd *cmd,
+				   struct vhost_log *log,
+				   unsigned int log_num)
+{
+	if (!cmd->tvc_log)
+		cmd->tvc_log = kmalloc_array(vq->dev->iov_limit,
+					     sizeof(*cmd->tvc_log),
+					     GFP_KERNEL);
+
+	if (unlikely(!cmd->tvc_log)) {
+		vq_err(vq, "Failed to alloc tvc_log\n");
+		return -ENOMEM;
+	}
+
+	memcpy(cmd->tvc_log, log, sizeof(*cmd->tvc_log) * log_num);
+	cmd->tvc_log_num = log_num;
+
+	return 0;
+}
+
+static void vhost_scsi_log_write(struct vhost_virtqueue *vq,
+				 struct vhost_log *log,
+				 unsigned int log_num)
+{
+	if (likely(!vhost_has_feature(vq, VHOST_F_LOG_ALL)))
+		return;
+
+	if (likely(!log_num || !log))
+		return;
+
+	/*
+	 * vhost-scsi doesn't support VIRTIO_F_ACCESS_PLATFORM.
+	 * No requirement for vq->iotlb case.
+	 */
+	WARN_ON_ONCE(unlikely(vq->iotlb));
+	vhost_log_write(vq, log, log_num, U64_MAX, NULL, 0);
+}
+
 static void vhost_scsi_release_cmd_res(struct se_cmd *se_cmd)
 {
 	struct vhost_scsi_cmd *tv_cmd = container_of(se_cmd,
@@ -660,6 +704,9 @@ static void vhost_scsi_complete_cmd_work(struct vhost_work *work)
 		} else
 			pr_err("Faulted on virtio_scsi_cmd_resp\n");
 
+		vhost_scsi_log_write(cmd->tvc_vq, cmd->tvc_log,
+				     cmd->tvc_log_num);
+
 		vhost_scsi_release_cmd_res(se_cmd);
 	}
 
@@ -676,6 +723,7 @@ vhost_scsi_get_cmd(struct vhost_virtqueue *vq, u64 scsi_tag)
 					struct vhost_scsi_virtqueue, vq);
 	struct vhost_scsi_cmd *cmd;
 	struct scatterlist *sgl, *prot_sgl;
+	struct vhost_log *log;
 	int tag;
 
 	tag = sbitmap_get(&svq->scsi_tags);
@@ -687,9 +735,11 @@ vhost_scsi_get_cmd(struct vhost_virtqueue *vq, u64 scsi_tag)
 	cmd = &svq->scsi_cmds[tag];
 	sgl = cmd->sgl;
 	prot_sgl = cmd->prot_sgl;
+	log = cmd->tvc_log;
 	memset(cmd, 0, sizeof(*cmd));
 	cmd->sgl = sgl;
 	cmd->prot_sgl = prot_sgl;
+	cmd->tvc_log = log;
 	cmd->tvc_se_cmd.map_tag = tag;
 	cmd->inflight = vhost_scsi_get_inflight(vq);
 
@@ -1225,6 +1275,8 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
 	u8 task_attr;
 	bool t10_pi = vhost_has_feature(vq, VIRTIO_SCSI_F_T10_PI);
 	u8 *cdb;
+	struct vhost_log *vq_log;
+	unsigned int log_num;
 
 	mutex_lock(&vq->mutex);
 	/*
@@ -1240,8 +1292,11 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
 
 	vhost_disable_notify(&vs->dev, vq);
 
+	vq_log = unlikely(vhost_has_feature(vq, VHOST_F_LOG_ALL)) ?
+		vq->log : NULL;
+
 	do {
-		ret = vhost_scsi_get_desc(vs, vq, &vc, NULL, NULL);
+		ret = vhost_scsi_get_desc(vs, vq, &vc, vq_log, &log_num);
 		if (ret)
 			goto err;
 
@@ -1390,6 +1445,14 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
 			goto err;
 		}
 
+		if (unlikely(vq_log && log_num)) {
+			ret = vhost_scsi_copy_cmd_log(vq, cmd, vq_log, log_num);
+			if (unlikely(ret)) {
+				vhost_scsi_release_cmd_res(&cmd->tvc_se_cmd);
+				goto err;
+			}
+		}
+
 		pr_debug("vhost_scsi got command opcode: %#02x, lun: %d\n",
 			 cdb[0], lun);
 		pr_debug("cmd: %p exp_data_len: %d, prot_bytes: %d data_direction:"
@@ -1425,11 +1488,14 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
 		 */
 		if (ret == -ENXIO)
 			break;
-		else if (ret == -EIO)
+		else if (ret == -EIO) {
 			vhost_scsi_send_bad_target(vs, vq, &vc, TYPE_IO_CMD);
-		else if (ret == -ENOMEM)
+			vhost_scsi_log_write(vq, vq_log, log_num);
+		} else if (ret == -ENOMEM) {
 			vhost_scsi_send_status(vs, vq, &vc,
 					       SAM_STAT_TASK_SET_FULL);
+			vhost_scsi_log_write(vq, vq_log, log_num);
+		}
 	} while (likely(!vhost_exceeds_weight(vq, ++c, 0)));
 out:
 	mutex_unlock(&vq->mutex);
@@ -1760,6 +1826,24 @@ static void vhost_scsi_flush(struct vhost_scsi *vs)
 		wait_for_completion(&vs->old_inflight[i]->comp);
 }
 
+static void vhost_scsi_destroy_vq_log(struct vhost_virtqueue *vq)
+{
+	struct vhost_scsi_virtqueue *svq = container_of(vq,
+					struct vhost_scsi_virtqueue, vq);
+	struct vhost_scsi_cmd *tv_cmd;
+	unsigned int i;
+
+	if (!svq->scsi_cmds)
+		return;
+
+	for (i = 0; i < svq->max_cmds; i++) {
+		tv_cmd = &svq->scsi_cmds[i];
+		kfree(tv_cmd->tvc_log);
+		tv_cmd->tvc_log = NULL;
+		tv_cmd->tvc_log_num = 0;
+	}
+}
+
 static void vhost_scsi_destroy_vq_cmds(struct vhost_virtqueue *vq)
 {
 	struct vhost_scsi_virtqueue *svq = container_of(vq,
@@ -1779,6 +1863,7 @@ static void vhost_scsi_destroy_vq_cmds(struct vhost_virtqueue *vq)
 
 	sbitmap_free(&svq->scsi_tags);
 	kfree(svq->upages);
+	vhost_scsi_destroy_vq_log(vq);
 	kfree(svq->scsi_cmds);
 	svq->scsi_cmds = NULL;
 }
@@ -2088,6 +2173,7 @@ vhost_scsi_clear_endpoint(struct vhost_scsi *vs,
 static int vhost_scsi_set_features(struct vhost_scsi *vs, u64 features)
 {
 	struct vhost_virtqueue *vq;
+	bool is_log, was_log;
 	int i;
 
 	if (features & ~VHOST_SCSI_FEATURES)
@@ -2100,12 +2186,39 @@ static int vhost_scsi_set_features(struct vhost_scsi *vs, u64 features)
 		return -EFAULT;
 	}
 
+	if (!vs->dev.nvqs)
+		goto out;
+
+	is_log = features & (1 << VHOST_F_LOG_ALL);
+	/*
+	 * All VQs should have same feature.
+	 */
+	was_log = vhost_has_feature(&vs->vqs[0].vq, VHOST_F_LOG_ALL);
+
 	for (i = 0; i < vs->dev.nvqs; i++) {
 		vq = &vs->vqs[i].vq;
 		mutex_lock(&vq->mutex);
 		vq->acked_features = features;
 		mutex_unlock(&vq->mutex);
 	}
+
+	/*
+	 * If VHOST_F_LOG_ALL is removed, free tvc_log after
+	 * vq->acked_features is committed.
+	 */
+	if (!is_log && was_log) {
+		for (i = VHOST_SCSI_VQ_IO; i < vs->dev.nvqs; i++) {
+			if (!vs->vqs[i].scsi_cmds)
+				continue;
+
+			vq = &vs->vqs[i].vq;
+			mutex_lock(&vq->mutex);
+			vhost_scsi_destroy_vq_log(vq);
+			mutex_unlock(&vq->mutex);
+		}
+	}
+
+out:
 	mutex_unlock(&vs->dev.mutex);
 	return 0;
 }
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 7/9] vhost-scsi: log control queue write descriptors
  2025-04-03  6:29 [PATCH v3 0/9] vhost-scsi: log write descriptors for live migration (and three bugfix) Dongli Zhang
                   ` (5 preceding siblings ...)
  2025-04-03  6:29 ` [PATCH v3 6/9] vhost-scsi: log I/O queue write descriptors Dongli Zhang
@ 2025-04-03  6:29 ` Dongli Zhang
  2025-04-06 21:43   ` Mike Christie
  2025-04-03  6:29 ` [PATCH v3 8/9] vhost-scsi: log event " Dongli Zhang
  2025-04-03  6:29 ` [PATCH v3 9/9] vhost: add WARNING if log_num is more than limit Dongli Zhang
  8 siblings, 1 reply; 18+ messages in thread
From: Dongli Zhang @ 2025-04-03  6:29 UTC (permalink / raw)
  To: virtualization, kvm, netdev
  Cc: mst, jasowang, michael.christie, pbonzini, stefanha, eperezma,
	joao.m.martins, joe.jin, si-wei.liu, linux-kernel

Log write descriptors for the control queue, leveraging
vhost_scsi_get_desc() and vhost_get_vq_desc() to retrieve the array of
write descriptors to obtain the log buffer.

For Task Management Requests, similar to the I/O queue, store the log
buffer during the submission path and log it in the completion or error
handling path.

For Asynchronous Notifications, only the submission path is involved.

Suggested-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
Changed since v1:
  - Call kfree(tmf->tmf_log) unconditionally.
  - Return VIRTIO_SCSI_S_FUNCTION_REJECTED on log buffer allocation failure.

 drivers/vhost/scsi.c | 47 +++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 42 insertions(+), 5 deletions(-)

diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index af7b0ee42b6d..1a1a4cd95ace 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -263,6 +263,12 @@ struct vhost_scsi_tmf {
 	struct iovec resp_iov;
 	int in_iovs;
 	int vq_desc;
+
+	/*
+	 * Dirty write descriptors of this command.
+	 */
+	struct vhost_log *tmf_log;
+	unsigned int tmf_log_num;
 };
 
 /*
@@ -452,6 +458,10 @@ static void vhost_scsi_release_tmf_res(struct vhost_scsi_tmf *tmf)
 {
 	struct vhost_scsi_inflight *inflight = tmf->inflight;
 
+	/*
+	 * tmf->tmf_log is default NULL unless VHOST_F_LOG_ALL is set.
+	 */
+	kfree(tmf->tmf_log);
 	kfree(tmf);
 	vhost_scsi_put_inflight(inflight);
 }
@@ -1537,6 +1547,8 @@ static void vhost_scsi_tmf_resp_work(struct vhost_work *work)
 	mutex_lock(&tmf->svq->vq.mutex);
 	vhost_scsi_send_tmf_resp(tmf->vhost, &tmf->svq->vq, tmf->in_iovs,
 				 tmf->vq_desc, &tmf->resp_iov, resp_code);
+	vhost_scsi_log_write(&tmf->svq->vq, tmf->tmf_log,
+			     tmf->tmf_log_num);
 	mutex_unlock(&tmf->svq->vq.mutex);
 
 	vhost_scsi_release_tmf_res(tmf);
@@ -1560,7 +1572,8 @@ static void
 vhost_scsi_handle_tmf(struct vhost_scsi *vs, struct vhost_scsi_tpg *tpg,
 		      struct vhost_virtqueue *vq,
 		      struct virtio_scsi_ctrl_tmf_req *vtmf,
-		      struct vhost_scsi_ctx *vc)
+		      struct vhost_scsi_ctx *vc,
+		      struct vhost_log *log, unsigned int log_num)
 {
 	struct vhost_scsi_virtqueue *svq = container_of(vq,
 					struct vhost_scsi_virtqueue, vq);
@@ -1588,6 +1601,19 @@ vhost_scsi_handle_tmf(struct vhost_scsi *vs, struct vhost_scsi_tpg *tpg,
 	tmf->in_iovs = vc->in;
 	tmf->inflight = vhost_scsi_get_inflight(vq);
 
+	if (unlikely(log && log_num)) {
+		tmf->tmf_log = kmalloc_array(log_num, sizeof(*tmf->tmf_log),
+					     GFP_KERNEL);
+		if (tmf->tmf_log) {
+			memcpy(tmf->tmf_log, log, sizeof(*tmf->tmf_log) * log_num);
+			tmf->tmf_log_num = log_num;
+		} else {
+			pr_err("vhost_scsi tmf log allocation error\n");
+			vhost_scsi_release_tmf_res(tmf);
+			goto send_reject;
+		}
+	}
+
 	if (target_submit_tmr(&tmf->se_cmd, tpg->tpg_nexus->tvn_se_sess, NULL,
 			      vhost_buf_to_lun(vtmf->lun), NULL,
 			      TMR_LUN_RESET, GFP_KERNEL, 0,
@@ -1601,6 +1627,7 @@ vhost_scsi_handle_tmf(struct vhost_scsi *vs, struct vhost_scsi_tpg *tpg,
 send_reject:
 	vhost_scsi_send_tmf_resp(vs, vq, vc->in, vc->head, &vq->iov[vc->out],
 				 VIRTIO_SCSI_S_FUNCTION_REJECTED);
+	vhost_scsi_log_write(vq, log, log_num);
 }
 
 static void
@@ -1637,6 +1664,8 @@ vhost_scsi_ctl_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
 	struct vhost_scsi_ctx vc;
 	size_t typ_size;
 	int ret, c = 0;
+	struct vhost_log *vq_log;
+	unsigned int log_num;
 
 	mutex_lock(&vq->mutex);
 	/*
@@ -1650,8 +1679,11 @@ vhost_scsi_ctl_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
 
 	vhost_disable_notify(&vs->dev, vq);
 
+	vq_log = unlikely(vhost_has_feature(vq, VHOST_F_LOG_ALL)) ?
+		vq->log : NULL;
+
 	do {
-		ret = vhost_scsi_get_desc(vs, vq, &vc, NULL, NULL);
+		ret = vhost_scsi_get_desc(vs, vq, &vc, vq_log, &log_num);
 		if (ret)
 			goto err;
 
@@ -1715,9 +1747,12 @@ vhost_scsi_ctl_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
 			goto err;
 
 		if (v_req.type == VIRTIO_SCSI_T_TMF)
-			vhost_scsi_handle_tmf(vs, tpg, vq, &v_req.tmf, &vc);
-		else
+			vhost_scsi_handle_tmf(vs, tpg, vq, &v_req.tmf, &vc,
+					      vq_log, log_num);
+		else {
 			vhost_scsi_send_an_resp(vs, vq, &vc);
+			vhost_scsi_log_write(vq, vq_log, log_num);
+		}
 err:
 		/*
 		 * ENXIO:  No more requests, or read error, wait for next kick
@@ -1727,11 +1762,13 @@ vhost_scsi_ctl_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
 		 */
 		if (ret == -ENXIO)
 			break;
-		else if (ret == -EIO)
+		else if (ret == -EIO) {
 			vhost_scsi_send_bad_target(vs, vq, &vc,
 						   v_req.type == VIRTIO_SCSI_T_TMF ?
 						   TYPE_CTRL_TMF :
 						   TYPE_CTRL_AN);
+			vhost_scsi_log_write(vq, vq_log, log_num);
+		}
 	} while (likely(!vhost_exceeds_weight(vq, ++c, 0)));
 out:
 	mutex_unlock(&vq->mutex);
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 8/9] vhost-scsi: log event queue write descriptors
  2025-04-03  6:29 [PATCH v3 0/9] vhost-scsi: log write descriptors for live migration (and three bugfix) Dongli Zhang
                   ` (6 preceding siblings ...)
  2025-04-03  6:29 ` [PATCH v3 7/9] vhost-scsi: log control " Dongli Zhang
@ 2025-04-03  6:29 ` Dongli Zhang
  2025-04-03  6:29 ` [PATCH v3 9/9] vhost: add WARNING if log_num is more than limit Dongli Zhang
  8 siblings, 0 replies; 18+ messages in thread
From: Dongli Zhang @ 2025-04-03  6:29 UTC (permalink / raw)
  To: virtualization, kvm, netdev
  Cc: mst, jasowang, michael.christie, pbonzini, stefanha, eperezma,
	joao.m.martins, joe.jin, si-wei.liu, linux-kernel

Log write descriptors for the event queue, leveraging vhost_get_vq_desc()
to retrieve the array of write descriptors to obtain the log buffer.

There is only one path for event queue.

Suggested-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/vhost/scsi.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 1a1a4cd95ace..08e38f866a2c 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -571,6 +571,8 @@ vhost_scsi_do_evt_work(struct vhost_scsi *vs, struct vhost_scsi_evt *evt)
 	struct vhost_virtqueue *vq = &vs->vqs[VHOST_SCSI_VQ_EVT].vq;
 	struct virtio_scsi_event *event = &evt->event;
 	struct virtio_scsi_event __user *eventp;
+	struct vhost_log *vq_log;
+	unsigned int log_num;
 	unsigned out, in;
 	int head, ret;
 
@@ -581,9 +583,19 @@ vhost_scsi_do_evt_work(struct vhost_scsi *vs, struct vhost_scsi_evt *evt)
 
 again:
 	vhost_disable_notify(&vs->dev, vq);
+
+	vq_log = unlikely(vhost_has_feature(vq, VHOST_F_LOG_ALL)) ?
+		vq->log : NULL;
+
+	/*
+	 * Reset 'log_num' since vhost_get_vq_desc() may reset it only
+	 * after certain condition checks.
+	 */
+	log_num = 0;
+
 	head = vhost_get_vq_desc(vq, vq->iov,
 			ARRAY_SIZE(vq->iov), &out, &in,
-			NULL, NULL);
+			vq_log, &log_num);
 	if (head < 0) {
 		vs->vs_events_missed = true;
 		return;
@@ -613,6 +625,8 @@ vhost_scsi_do_evt_work(struct vhost_scsi *vs, struct vhost_scsi_evt *evt)
 		vhost_add_used_and_signal(&vs->dev, vq, head, 0);
 	else
 		vq_err(vq, "Faulted on vhost_scsi_send_event\n");
+
+	vhost_scsi_log_write(vq, vq_log, log_num);
 }
 
 static void vhost_scsi_complete_events(struct vhost_scsi *vs, bool drop)
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 9/9] vhost: add WARNING if log_num is more than limit
  2025-04-03  6:29 [PATCH v3 0/9] vhost-scsi: log write descriptors for live migration (and three bugfix) Dongli Zhang
                   ` (7 preceding siblings ...)
  2025-04-03  6:29 ` [PATCH v3 8/9] vhost-scsi: log event " Dongli Zhang
@ 2025-04-03  6:29 ` Dongli Zhang
  2025-04-14 16:32   ` Michael S. Tsirkin
  8 siblings, 1 reply; 18+ messages in thread
From: Dongli Zhang @ 2025-04-03  6:29 UTC (permalink / raw)
  To: virtualization, kvm, netdev
  Cc: mst, jasowang, michael.christie, pbonzini, stefanha, eperezma,
	joao.m.martins, joe.jin, si-wei.liu, linux-kernel

Since long time ago, the only user of vq->log is vhost-net. The concern is
to add support for more devices (i.e. vhost-scsi or vsock) may reveals
unknown issue in the vhost API. Add a WARNING.

Suggested-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
 drivers/vhost/vhost.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 494b3da5423a..b7d51d569646 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -2559,6 +2559,15 @@ static int get_indirect(struct vhost_virtqueue *vq,
 		if (access == VHOST_ACCESS_WO) {
 			*in_num += ret;
 			if (unlikely(log && ret)) {
+				/*
+				 * Since long time ago, the only user of
+				 * vq->log is vhost-net. The concern is to
+				 * add support for more devices (i.e.
+				 * vhost-scsi or vsock) may reveals unknown
+				 * issue in the vhost API. Add a WARNING.
+				 */
+				WARN_ON_ONCE(*log_num >= vq->dev->iov_limit);
+
 				log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
 				log[*log_num].len = vhost32_to_cpu(vq, desc.len);
 				++*log_num;
@@ -2679,6 +2688,15 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
 			 * increment that count. */
 			*in_num += ret;
 			if (unlikely(log && ret)) {
+				/*
+				 * Since long time ago, the only user of
+				 * vq->log is vhost-net. The concern is to
+				 * add support for more devices (i.e.
+				 * vhost-scsi or vsock) may reveals unknown
+				 * issue in the vhost API. Add a WARNING.
+				 */
+				WARN_ON_ONCE(*log_num >= vq->dev->iov_limit);
+
 				log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
 				log[*log_num].len = vhost32_to_cpu(vq, desc.len);
 				++*log_num;
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 6/9] vhost-scsi: log I/O queue write descriptors
  2025-04-03  6:29 ` [PATCH v3 6/9] vhost-scsi: log I/O queue write descriptors Dongli Zhang
@ 2025-04-06 21:41   ` Mike Christie
  0 siblings, 0 replies; 18+ messages in thread
From: Mike Christie @ 2025-04-06 21:41 UTC (permalink / raw)
  To: Dongli Zhang, virtualization, kvm, netdev
  Cc: mst, jasowang, pbonzini, stefanha, eperezma, joao.m.martins,
	joe.jin, si-wei.liu, linux-kernel

On 4/3/25 1:29 AM, Dongli Zhang wrote:
> Log write descriptors for the I/O queue, leveraging vhost_scsi_get_desc()
> and vhost_get_vq_desc() to retrieve the array of write descriptors to
> obtain the log buffer.
> 
> In addition, introduce a vhost-scsi specific function to log vring
> descriptors. In this function, the 'partial' argument is set to false, and
> the 'len' argument is set to 0, because vhost-scsi always logs all pages
> shared by a vring descriptor. Add WARN_ON_ONCE() since vhost-scsi doesn't
> support VIRTIO_F_ACCESS_PLATFORM.
> 
> The per-cmd log buffer is allocated on demand in the submission path after
> VHOST_F_LOG_ALL is set. Return -ENOMEM on allocation failure, in order to
> send SAM_STAT_TASK_SET_FULL to the guest.
> 
> It isn't reclaimed in the completion path. Instead, it is reclaimed when
> VHOST_F_LOG_ALL is removed, or during VHOST_SCSI_SET_ENDPOINT when all
> commands are destroyed.
> 
> Store the log buffer during the submission path and log it in the
> completion path. Logging is also required in the error handling path of the
> submission process.
> 
> Suggested-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
> Changed since v1:
>   - Don't allocate log buffer during initialization. Allocate during
>   - VHOST_SET_FEATURES or VHOST_SCSI_SET_ENDPOINT.
>   - Re-order if staments in vhost_scsi_log_write().
>   - Log after vhost_scsi_send_status() as well.
> Changed since v2:
>   - Merge PATCH 6 and PATCH 7 from v2 as one patch.
>   - Don't pre-allocate log buffer in
>     VHOST_SET_FEATURES/VHOST_SCSI_SET_ENDPOINT. Allocate for only once in
>     submission path in runtime. Reclaim int
>     VHOST_SET_FEATURES/VHOST_SCSI_SET_ENDPOINT.
>   - Encapsulate the one-time on-demand per-cmd log buffer alloc/copy in a
>     helper, as suggested by Mike.
Reviewed-by: Mike Christie <michael.christie@oracle.com>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 7/9] vhost-scsi: log control queue write descriptors
  2025-04-03  6:29 ` [PATCH v3 7/9] vhost-scsi: log control " Dongli Zhang
@ 2025-04-06 21:43   ` Mike Christie
  0 siblings, 0 replies; 18+ messages in thread
From: Mike Christie @ 2025-04-06 21:43 UTC (permalink / raw)
  To: Dongli Zhang, virtualization, kvm, netdev
  Cc: mst, jasowang, pbonzini, stefanha, eperezma, joao.m.martins,
	joe.jin, si-wei.liu, linux-kernel

On 4/3/25 1:29 AM, Dongli Zhang wrote:
> Log write descriptors for the control queue, leveraging
> vhost_scsi_get_desc() and vhost_get_vq_desc() to retrieve the array of
> write descriptors to obtain the log buffer.
> 
> For Task Management Requests, similar to the I/O queue, store the log
> buffer during the submission path and log it in the completion or error
> handling path.
> 
> For Asynchronous Notifications, only the submission path is involved.
> 
> Suggested-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
>

Reviewed-by: Mike Christie <michael.christie@oracle.com>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 9/9] vhost: add WARNING if log_num is more than limit
  2025-04-03  6:29 ` [PATCH v3 9/9] vhost: add WARNING if log_num is more than limit Dongli Zhang
@ 2025-04-14 16:32   ` Michael S. Tsirkin
  2025-04-14 16:52     ` Dongli Zhang
  0 siblings, 1 reply; 18+ messages in thread
From: Michael S. Tsirkin @ 2025-04-14 16:32 UTC (permalink / raw)
  To: Dongli Zhang
  Cc: virtualization, kvm, netdev, jasowang, michael.christie, pbonzini,
	stefanha, eperezma, joao.m.martins, joe.jin, si-wei.liu,
	linux-kernel

On Wed, Apr 02, 2025 at 11:29:54PM -0700, Dongli Zhang wrote:
> Since long time ago, the only user of vq->log is vhost-net. The concern is
> to add support for more devices (i.e. vhost-scsi or vsock) may reveals
> unknown issue in the vhost API. Add a WARNING.
> 
> Suggested-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>


Userspace can trigger this I think, this is a problem since
people run with reboot on warn.
Pls grammar issues in comments... I don't think so.

> ---
>  drivers/vhost/vhost.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 494b3da5423a..b7d51d569646 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -2559,6 +2559,15 @@ static int get_indirect(struct vhost_virtqueue *vq,
>  		if (access == VHOST_ACCESS_WO) {
>  			*in_num += ret;
>  			if (unlikely(log && ret)) {
> +				/*
> +				 * Since long time ago, the only user of
> +				 * vq->log is vhost-net. The concern is to
> +				 * add support for more devices (i.e.
> +				 * vhost-scsi or vsock) may reveals unknown
> +				 * issue in the vhost API. Add a WARNING.
> +				 */
> +				WARN_ON_ONCE(*log_num >= vq->dev->iov_limit);
> +
>  				log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
>  				log[*log_num].len = vhost32_to_cpu(vq, desc.len);
>  				++*log_num;
> @@ -2679,6 +2688,15 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
>  			 * increment that count. */
>  			*in_num += ret;
>  			if (unlikely(log && ret)) {
> +				/*
> +				 * Since long time ago, the only user of
> +				 * vq->log is vhost-net. The concern is to
> +				 * add support for more devices (i.e.
> +				 * vhost-scsi or vsock) may reveals unknown
> +				 * issue in the vhost API. Add a WARNING.
> +				 */
> +				WARN_ON_ONCE(*log_num >= vq->dev->iov_limit);
> +
>  				log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
>  				log[*log_num].len = vhost32_to_cpu(vq, desc.len);
>  				++*log_num;
> -- 
> 2.39.3


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 9/9] vhost: add WARNING if log_num is more than limit
  2025-04-14 16:32   ` Michael S. Tsirkin
@ 2025-04-14 16:52     ` Dongli Zhang
  2025-04-14 18:39       ` Michael S. Tsirkin
  0 siblings, 1 reply; 18+ messages in thread
From: Dongli Zhang @ 2025-04-14 16:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtualization, kvm, netdev, jasowang, michael.christie, pbonzini,
	stefanha, eperezma, joao.m.martins, joe.jin, si-wei.liu,
	linux-kernel

Hi Michael,

On 4/14/25 9:32 AM, Michael S. Tsirkin wrote:
> On Wed, Apr 02, 2025 at 11:29:54PM -0700, Dongli Zhang wrote:
>> Since long time ago, the only user of vq->log is vhost-net. The concern is
>> to add support for more devices (i.e. vhost-scsi or vsock) may reveals
>> unknown issue in the vhost API. Add a WARNING.
>>
>> Suggested-by: Joao Martins <joao.m.martins@oracle.com>
>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> 
> 
> Userspace can trigger this I think, this is a problem since
> people run with reboot on warn.

I think it will be a severe kernel bug (page fault) if userspace can trigger this.

If (*log_num >= vq->dev->iov_limit), the next line will lead to an out-of-bound
memory access:

    log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);

I could not propose a case to trigger the WARNING from userspace. Would you mind
helping explain if that can happen?

> Pls grammar issues in comments... I don't think so.

I did an analysis of code and so far I could not identify any case to trigger
(*log_num >= vq->dev->iov_limit).

The objective of the patch is to add a WARNING to double confirm the case won't
happen.

Regarding "I don't think so", would you mean we don't need this patch/WARNING
because the code is robust enough?

Thank you very much!

Dongli Zhang

> 
>> ---
>>  drivers/vhost/vhost.c | 18 ++++++++++++++++++
>>  1 file changed, 18 insertions(+)
>>
>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>> index 494b3da5423a..b7d51d569646 100644
>> --- a/drivers/vhost/vhost.c
>> +++ b/drivers/vhost/vhost.c
>> @@ -2559,6 +2559,15 @@ static int get_indirect(struct vhost_virtqueue *vq,
>>  		if (access == VHOST_ACCESS_WO) {
>>  			*in_num += ret;
>>  			if (unlikely(log && ret)) {
>> +				/*
>> +				 * Since long time ago, the only user of
>> +				 * vq->log is vhost-net. The concern is to
>> +				 * add support for more devices (i.e.
>> +				 * vhost-scsi or vsock) may reveals unknown
>> +				 * issue in the vhost API. Add a WARNING.
>> +				 */
>> +				WARN_ON_ONCE(*log_num >= vq->dev->iov_limit);
>> +
>>  				log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
>>  				log[*log_num].len = vhost32_to_cpu(vq, desc.len);
>>  				++*log_num;
>> @@ -2679,6 +2688,15 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
>>  			 * increment that count. */
>>  			*in_num += ret;
>>  			if (unlikely(log && ret)) {
>> +				/*
>> +				 * Since long time ago, the only user of
>> +				 * vq->log is vhost-net. The concern is to
>> +				 * add support for more devices (i.e.
>> +				 * vhost-scsi or vsock) may reveals unknown
>> +				 * issue in the vhost API. Add a WARNING.
>> +				 */
>> +				WARN_ON_ONCE(*log_num >= vq->dev->iov_limit);
>> +
>>  				log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
>>  				log[*log_num].len = vhost32_to_cpu(vq, desc.len);
>>  				++*log_num;
>> -- 
>> 2.39.3
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 9/9] vhost: add WARNING if log_num is more than limit
  2025-04-14 16:52     ` Dongli Zhang
@ 2025-04-14 18:39       ` Michael S. Tsirkin
  2025-04-14 20:52         ` Dongli Zhang
  0 siblings, 1 reply; 18+ messages in thread
From: Michael S. Tsirkin @ 2025-04-14 18:39 UTC (permalink / raw)
  To: Dongli Zhang
  Cc: virtualization, kvm, netdev, jasowang, michael.christie, pbonzini,
	stefanha, eperezma, joao.m.martins, joe.jin, si-wei.liu,
	linux-kernel

On Mon, Apr 14, 2025 at 09:52:04AM -0700, Dongli Zhang wrote:
> Hi Michael,
> 
> On 4/14/25 9:32 AM, Michael S. Tsirkin wrote:
> > On Wed, Apr 02, 2025 at 11:29:54PM -0700, Dongli Zhang wrote:
> >> Since long time ago, the only user of vq->log is vhost-net. The concern is
> >> to add support for more devices (i.e. vhost-scsi or vsock) may reveals
> >> unknown issue in the vhost API. Add a WARNING.
> >>
> >> Suggested-by: Joao Martins <joao.m.martins@oracle.com>
> >> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> > 
> > 
> > Userspace can trigger this I think, this is a problem since
> > people run with reboot on warn.
> 
> I think it will be a severe kernel bug (page fault) if userspace can trigger this.
> 
> If (*log_num >= vq->dev->iov_limit), the next line will lead to an out-of-bound
> memory access:
> 
>     log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
> 
> I could not propose a case to trigger the WARNING from userspace. Would you mind
> helping explain if that can happen?

Oh I see. the commit log made me think this is an actual issue,
not a debugging aid just in case.


> > Pls grammar issues in comments... I don't think so.
> 
> I did an analysis of code and so far I could not identify any case to trigger
> (*log_num >= vq->dev->iov_limit).
> 
> The objective of the patch is to add a WARNING to double confirm the case won't
> happen.
> 
> Regarding "I don't think so", would you mean we don't need this patch/WARNING
> because the code is robust enough?
> 
> Thank you very much!
> 
> Dongli Zhang


Let me clarify the comment is misleading.
All it has to say is:

	/* Let's make sure we are not out of bounds. */
	BUG_ON(*log_num >= vq->dev->iov_limit);

at the same time, this is unnecessary pointer chasing
on critical path, and I don't much like it that we are
making an assumption about array size here.

If you strongly want to do it, you must document it near
get_indirect: 
@log - array of size at least vq->dev->iov_limit


> > 
> >> ---
> >>  drivers/vhost/vhost.c | 18 ++++++++++++++++++
> >>  1 file changed, 18 insertions(+)
> >>
> >> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> >> index 494b3da5423a..b7d51d569646 100644
> >> --- a/drivers/vhost/vhost.c
> >> +++ b/drivers/vhost/vhost.c
> >> @@ -2559,6 +2559,15 @@ static int get_indirect(struct vhost_virtqueue *vq,
> >>  		if (access == VHOST_ACCESS_WO) {
> >>  			*in_num += ret;
> >>  			if (unlikely(log && ret)) {
> >> +				/*
> >> +				 * Since long time ago, the only user of
> >> +				 * vq->log is vhost-net. The concern is to
> >> +				 * add support for more devices (i.e.
> >> +				 * vhost-scsi or vsock) may reveals unknown
> >> +				 * issue in the vhost API. Add a WARNING.
> >> +				 */
> >> +				WARN_ON_ONCE(*log_num >= vq->dev->iov_limit);
> >> +
> >>  				log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
> >>  				log[*log_num].len = vhost32_to_cpu(vq, desc.len);
> >>  				++*log_num;
> >> @@ -2679,6 +2688,15 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
> >>  			 * increment that count. */
> >>  			*in_num += ret;
> >>  			if (unlikely(log && ret)) {
> >> +				/*
> >> +				 * Since long time ago, the only user of
> >> +				 * vq->log is vhost-net. The concern is to
> >> +				 * add support for more devices (i.e.
> >> +				 * vhost-scsi or vsock) may reveals unknown
> >> +				 * issue in the vhost API. Add a WARNING.
> >> +				 */
> >> +				WARN_ON_ONCE(*log_num >= vq->dev->iov_limit);
> >> +
> >>  				log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
> >>  				log[*log_num].len = vhost32_to_cpu(vq, desc.len);
> >>  				++*log_num;
> >> -- 
> >> 2.39.3
> > 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 9/9] vhost: add WARNING if log_num is more than limit
  2025-04-14 18:39       ` Michael S. Tsirkin
@ 2025-04-14 20:52         ` Dongli Zhang
  0 siblings, 0 replies; 18+ messages in thread
From: Dongli Zhang @ 2025-04-14 20:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtualization, kvm, netdev, jasowang, michael.christie, pbonzini,
	stefanha, eperezma, joao.m.martins, joe.jin, si-wei.liu,
	linux-kernel

Hi Michael,

On 4/14/25 11:39 AM, Michael S. Tsirkin wrote:
> On Mon, Apr 14, 2025 at 09:52:04AM -0700, Dongli Zhang wrote:
>> Hi Michael,
>>
>> On 4/14/25 9:32 AM, Michael S. Tsirkin wrote:
>>> On Wed, Apr 02, 2025 at 11:29:54PM -0700, Dongli Zhang wrote:
>>>> Since long time ago, the only user of vq->log is vhost-net. The concern is
>>>> to add support for more devices (i.e. vhost-scsi or vsock) may reveals
>>>> unknown issue in the vhost API. Add a WARNING.
>>>>
>>>> Suggested-by: Joao Martins <joao.m.martins@oracle.com>
>>>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
>>>
>>>
>>> Userspace can trigger this I think, this is a problem since
>>> people run with reboot on warn.
>>
>> I think it will be a severe kernel bug (page fault) if userspace can trigger this.
>>
>> If (*log_num >= vq->dev->iov_limit), the next line will lead to an out-of-bound
>> memory access:
>>
>>     log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
>>
>> I could not propose a case to trigger the WARNING from userspace. Would you mind
>> helping explain if that can happen?
> 
> Oh I see. the commit log made me think this is an actual issue,
> not a debugging aid just in case.
> 
> 
>>> Pls grammar issues in comments... I don't think so.
>>
>> I did an analysis of code and so far I could not identify any case to trigger
>> (*log_num >= vq->dev->iov_limit).
>>
>> The objective of the patch is to add a WARNING to double confirm the case won't
>> happen.
>>
>> Regarding "I don't think so", would you mean we don't need this patch/WARNING
>> because the code is robust enough?
>>
>> Thank you very much!
>>
>> Dongli Zhang
> 
> 
> Let me clarify the comment is misleading.
> All it has to say is:
> 
> 	/* Let's make sure we are not out of bounds. */
> 	BUG_ON(*log_num >= vq->dev->iov_limit);

This is a critical path only during Live Migration is in progress. So far I
didn't encounter any issue for either vhost-net or vhost-scsi. That's why I used
WARN_ON_ONCE() instead of a BUG_ON().

I agree with your point on "unnecessary pointer chasing on critical path", I am
going to remove this patch in the next version.

Thank you very much for feedback and suggestion!

Dongli Zhang

> 
> at the same time, this is unnecessary pointer chasing
> on critical path, and I don't much like it that we are
> making an assumption about array size here.
> 
> If you strongly want to do it, you must document it near
> get_indirect: 
> @log - array of size at least vq->dev->iov_limit
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 4/9] vhost: modify vhost_log_write() for broader users
  2025-04-03  6:29 ` [PATCH v3 4/9] vhost: modify vhost_log_write() for broader users Dongli Zhang
@ 2025-04-16  7:58   ` Dongli Zhang
  2025-04-21  3:08   ` Jason Wang
  1 sibling, 0 replies; 18+ messages in thread
From: Dongli Zhang @ 2025-04-16  7:58 UTC (permalink / raw)
  To: virtualization, kvm, netdev, jasowang
  Cc: mst, michael.christie, pbonzini, stefanha, eperezma,
	joao.m.martins, joe.jin, si-wei.liu, linux-kernel

Hi Jason,

Would you mind helping confirm if it resolves your concern in v2?

Thank you very much!

Dongli Zhang

On 4/2/25 11:29 PM, Dongli Zhang wrote:
> Currently, the only user of vhost_log_write() is vhost-net. The 'len'
> argument prevents logging of pages that are not tainted by the RX path.
> 
> Adjustments are needed since more drivers (i.e. vhost-scsi) begin using
> vhost_log_write(). So far vhost-net RX path may only partially use pages
> shared via the last vring descriptor. Unlike vhost-net, vhost-scsi always
> logs all pages shared via vring descriptors. To accommodate this,
> use (len == U64_MAX) to indicate whether the driver would log all pages of
> vring descriptors, or only pages that are tainted by the driver.
> 
> In addition, removes BUG().
> 
> Suggested-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
> Changed since v2:
>   - Document parameters of vhost_log_write().
>   - Use (len == U64_MAX) to indicate whether log all pages, instead of
>     introducing a new parameter.
> 
>  drivers/vhost/vhost.c | 28 ++++++++++++++++++++--------
>  1 file changed, 20 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 9ac25d08f473..494b3da5423a 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -2304,6 +2304,19 @@ static int log_used(struct vhost_virtqueue *vq, u64 used_offset, u64 len)
>  	return 0;
>  }
>  
> +/*
> + * vhost_log_write() - Log in dirty page bitmap
> + * @vq:      vhost virtqueue.
> + * @log:     Array of dirty memory in GPA.
> + * @log_num: Size of vhost_log arrary.
> + * @len:     The total length of memory buffer to log in the dirty bitmap.
> + *	     Some drivers may only partially use pages shared via the last
> + *	     vring descriptor (i.e. vhost-net RX buffer).
> + *	     Use (len == U64_MAX) to indicate the driver would log all
> + *           pages of vring descriptors.
> + * @iov:     Array of dirty memory in HVA.
> + * @count:   Size of iovec array.
> + */
>  int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
>  		    unsigned int log_num, u64 len, struct iovec *iov, int count)
>  {
> @@ -2327,15 +2340,14 @@ int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
>  		r = log_write(vq->log_base, log[i].addr, l);
>  		if (r < 0)
>  			return r;
> -		len -= l;
> -		if (!len) {
> -			if (vq->log_ctx)
> -				eventfd_signal(vq->log_ctx);
> -			return 0;
> -		}
> +
> +		if (len != U64_MAX)
> +			len -= l;
>  	}
> -	/* Length written exceeds what we have stored. This is a bug. */
> -	BUG();
> +
> +	if (vq->log_ctx)
> +		eventfd_signal(vq->log_ctx);
> +
>  	return 0;
>  }
>  EXPORT_SYMBOL_GPL(vhost_log_write);


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 4/9] vhost: modify vhost_log_write() for broader users
  2025-04-03  6:29 ` [PATCH v3 4/9] vhost: modify vhost_log_write() for broader users Dongli Zhang
  2025-04-16  7:58   ` Dongli Zhang
@ 2025-04-21  3:08   ` Jason Wang
  1 sibling, 0 replies; 18+ messages in thread
From: Jason Wang @ 2025-04-21  3:08 UTC (permalink / raw)
  To: Dongli Zhang
  Cc: virtualization, kvm, netdev, mst, michael.christie, pbonzini,
	stefanha, eperezma, joao.m.martins, joe.jin, si-wei.liu,
	linux-kernel

On Thu, Apr 3, 2025 at 2:32 PM Dongli Zhang <dongli.zhang@oracle.com> wrote:
>
> Currently, the only user of vhost_log_write() is vhost-net. The 'len'
> argument prevents logging of pages that are not tainted by the RX path.
>
> Adjustments are needed since more drivers (i.e. vhost-scsi) begin using
> vhost_log_write(). So far vhost-net RX path may only partially use pages
> shared via the last vring descriptor. Unlike vhost-net, vhost-scsi always
> logs all pages shared via vring descriptors. To accommodate this,
> use (len == U64_MAX) to indicate whether the driver would log all pages of
> vring descriptors, or only pages that are tainted by the driver.
>
> In addition, removes BUG().
>
> Suggested-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>

Acked-by: Jason Wang <jasowang@redhat.com>

Thanks


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2025-04-21  3:08 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-03  6:29 [PATCH v3 0/9] vhost-scsi: log write descriptors for live migration (and three bugfix) Dongli Zhang
2025-04-03  6:29 ` [PATCH v3 1/9] vhost-scsi: protect vq->log_used with vq->mutex Dongli Zhang
2025-04-03  6:29 ` [PATCH v3 2/9] vhost-scsi: Fix vhost_scsi_send_bad_target() Dongli Zhang
2025-04-03  6:29 ` [PATCH v3 3/9] vhost-scsi: Fix vhost_scsi_send_status() Dongli Zhang
2025-04-03  6:29 ` [PATCH v3 4/9] vhost: modify vhost_log_write() for broader users Dongli Zhang
2025-04-16  7:58   ` Dongli Zhang
2025-04-21  3:08   ` Jason Wang
2025-04-03  6:29 ` [PATCH v3 5/9] vhost-scsi: adjust vhost_scsi_get_desc() to log vring descriptors Dongli Zhang
2025-04-03  6:29 ` [PATCH v3 6/9] vhost-scsi: log I/O queue write descriptors Dongli Zhang
2025-04-06 21:41   ` Mike Christie
2025-04-03  6:29 ` [PATCH v3 7/9] vhost-scsi: log control " Dongli Zhang
2025-04-06 21:43   ` Mike Christie
2025-04-03  6:29 ` [PATCH v3 8/9] vhost-scsi: log event " Dongli Zhang
2025-04-03  6:29 ` [PATCH v3 9/9] vhost: add WARNING if log_num is more than limit Dongli Zhang
2025-04-14 16:32   ` Michael S. Tsirkin
2025-04-14 16:52     ` Dongli Zhang
2025-04-14 18:39       ` Michael S. Tsirkin
2025-04-14 20:52         ` Dongli Zhang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).