* [PATCH v2 00/10] vhost-scsi: log write descriptors for live migration (and three bugfix)
@ 2025-03-17 23:55 Dongli Zhang
2025-03-17 23:55 ` [PATCH v2 01/10] vhost-scsi: protect vq->log_used with vq->mutex Dongli Zhang
` (9 more replies)
0 siblings, 10 replies; 28+ messages in thread
From: Dongli Zhang @ 2025-03-17 23:55 UTC (permalink / raw)
To: virtualization, kvm, netdev
Cc: mst, jasowang, michael.christie, pbonzini, stefanha, eperezma,
joao.m.martins, joe.jin, si-wei.liu, linux-kernel
The live migration with vhost-scsi has been enabled by QEMU commit
b3e89c941a85 ("vhost-scsi: Allow user to enable migration"), which
thoroughly explains the workflow that QEMU collaborates with vhost-scsi on
the live migration.
Although it logs dirty data for the used ring, it doesn't log any write
descriptor (VRING_DESC_F_WRITE).
In comparison, vhost-net logs write descriptors via vhost_log_write(). The
SPDK (vhost-user-scsi backend) also logs write descriptors via
vhost_log_req_desc().
As a result, there is likely data mismatch between memory and vhost-scsi
disk during the live migration.
1. Suppose there is high workload and high memory usage. Suppose some
systemd userspace pages are swapped out to the swap disk.
2. Upon request from systemd, the kernel reads some pages from the swap
disk to the memory via vhost-scsi.
3. Although those userspace pages' data are updated, they are not marked as
dirty by vhost-scsi (this is the bug). They are not going to migrate to the
target host during memory transfer iterations.
4. Suppose systemd doesn't write to those pages any longer. Those pages
never get the chance to be dirty or migrated any longer.
5. Once the guest VM is resumed on the target host, because of the lack of
those dirty pages' data, the systemd may run into abnormal status, i.e.,
there may be systemd segfault.
Log all write descriptors to fix the issue.
In addition, the patchset also fixes three bugs in vhost-scsi.
Changed since v1:
- Rebase on top of most recent vhost changes.
- Don't allocate log buffer during initialization. Allocate during
VHOST_SET_FEATURES or VHOST_SCSI_SET_ENDPOINT.
- Add bugfix for vhost_scsi_send_status().
Dongli Zhang (vhost-scsi bugfix):
vhost-scsi: protect vq->log_used with vq->mutex
vhost-scsi: Fix vhost_scsi_send_bad_target()
vhost-scsi: Fix vhost_scsi_send_status()
Dongli Zhang (log descriptor, suggested by Joao Martins):
vhost: modify vhost_log_write() for broader users
vhost-scsi: adjust vhost_scsi_get_desc() to log vring descriptors
vhost-scsi: cache log buffer in I/O queue vhost_scsi_cmd
vhost-scsi: log I/O queue write descriptors
vhost-scsi: log control queue write descriptors
vhost-scsi: log event queue write descriptors
vhost: add WARNING if log_num is more than limit
drivers/vhost/net.c | 2 +-
drivers/vhost/scsi.c | 314 ++++++++++++++++++++++++++++++++++++++++-----
drivers/vhost/vhost.c | 46 +++++--
drivers/vhost/vhost.h | 2 +-
4 files changed, 322 insertions(+), 42 deletions(-)
base-commit: 9d8960672d63db4b3b04542f5622748b345c637a
branch: remotes/origin/linux-next
tree: https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git
Thank you very much!
Dongli Zhang
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH v2 01/10] vhost-scsi: protect vq->log_used with vq->mutex
2025-03-17 23:55 [PATCH v2 00/10] vhost-scsi: log write descriptors for live migration (and three bugfix) Dongli Zhang
@ 2025-03-17 23:55 ` Dongli Zhang
2025-03-18 0:47 ` Jason Wang
2025-03-26 22:22 ` Mike Christie
2025-03-17 23:55 ` [PATCH v2 02/10] vhost-scsi: Fix vhost_scsi_send_bad_target() Dongli Zhang
` (8 subsequent siblings)
9 siblings, 2 replies; 28+ messages in thread
From: Dongli Zhang @ 2025-03-17 23:55 UTC (permalink / raw)
To: virtualization, kvm, netdev
Cc: mst, jasowang, michael.christie, pbonzini, stefanha, eperezma,
joao.m.martins, joe.jin, si-wei.liu, linux-kernel
The vhost-scsi completion path may access vq->log_base when vq->log_used is
already set to false.
vhost-thread QEMU-thread
vhost_scsi_complete_cmd_work()
-> vhost_add_used()
-> vhost_add_used_n()
if (unlikely(vq->log_used))
QEMU disables vq->log_used
via VHOST_SET_VRING_ADDR.
mutex_lock(&vq->mutex);
vq->log_used = false now!
mutex_unlock(&vq->mutex);
QEMU gfree(vq->log_base)
log_used()
-> log_write(vq->log_base)
Assuming the VMM is QEMU. The vq->log_base is from QEMU userpace and can be
reclaimed via gfree(). As a result, this causes invalid memory writes to
QEMU userspace.
The control queue path has the same issue.
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
Changed since v1:
- Move lock to the begin and end of vhost_scsi_complete_cmd_work() as it
is per-vq now. This reduces the number of mutex_lock().
- Move this bugfix patch to before dirty log tracking patches.
drivers/vhost/scsi.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index f6f5a7ac7894..f846f2aa7c87 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -627,6 +627,9 @@ static void vhost_scsi_complete_cmd_work(struct vhost_work *work)
int ret;
llnode = llist_del_all(&svq->completion_list);
+
+ mutex_lock(&svq->vq.mutex);
+
llist_for_each_entry_safe(cmd, t, llnode, tvc_completion_list) {
se_cmd = &cmd->tvc_se_cmd;
@@ -660,6 +663,8 @@ static void vhost_scsi_complete_cmd_work(struct vhost_work *work)
vhost_scsi_release_cmd_res(se_cmd);
}
+ mutex_unlock(&svq->vq.mutex);
+
if (signal)
vhost_signal(&svq->vs->dev, &svq->vq);
}
@@ -1432,8 +1437,11 @@ static void vhost_scsi_tmf_resp_work(struct vhost_work *work)
else
resp_code = VIRTIO_SCSI_S_FUNCTION_REJECTED;
+ mutex_lock(&tmf->svq->vq.mutex);
vhost_scsi_send_tmf_resp(tmf->vhost, &tmf->svq->vq, tmf->in_iovs,
tmf->vq_desc, &tmf->resp_iov, resp_code);
+ mutex_unlock(&tmf->svq->vq.mutex);
+
vhost_scsi_release_tmf_res(tmf);
}
--
2.39.3
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v2 02/10] vhost-scsi: Fix vhost_scsi_send_bad_target()
2025-03-17 23:55 [PATCH v2 00/10] vhost-scsi: log write descriptors for live migration (and three bugfix) Dongli Zhang
2025-03-17 23:55 ` [PATCH v2 01/10] vhost-scsi: protect vq->log_used with vq->mutex Dongli Zhang
@ 2025-03-17 23:55 ` Dongli Zhang
2025-03-18 0:48 ` Jason Wang
2025-03-26 22:25 ` Mike Christie
2025-03-17 23:55 ` [PATCH v2 03/10] vhost-scsi: Fix vhost_scsi_send_status() Dongli Zhang
` (7 subsequent siblings)
9 siblings, 2 replies; 28+ messages in thread
From: Dongli Zhang @ 2025-03-17 23:55 UTC (permalink / raw)
To: virtualization, kvm, netdev
Cc: mst, jasowang, michael.christie, pbonzini, stefanha, eperezma,
joao.m.martins, joe.jin, si-wei.liu, linux-kernel
Although the support of VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 was
signaled by the commit 664ed90e621c ("vhost/scsi: Set
VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits"),
vhost_scsi_send_bad_target() still assumes the response in a single
descriptor.
In addition, although vhost_scsi_send_bad_target() is used by both I/O
queue and control queue, the response header is always
virtio_scsi_cmd_resp. It is required to use virtio_scsi_ctrl_tmf_resp or
virtio_scsi_ctrl_an_resp for control queue.
Fixes: 664ed90e621c ("vhost/scsi: Set VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits")
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
Changed since v1:
- Move this bugfix patch to before dirty log tracking patches.
drivers/vhost/scsi.c | 48 ++++++++++++++++++++++++++++++++++----------
1 file changed, 37 insertions(+), 11 deletions(-)
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index f846f2aa7c87..59d907b94c5e 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -1015,23 +1015,46 @@ vhost_scsi_send_status(struct vhost_scsi *vs, struct vhost_virtqueue *vq,
pr_err("Faulted on virtio_scsi_cmd_resp\n");
}
+#define TYPE_IO_CMD 0
+#define TYPE_CTRL_TMF 1
+#define TYPE_CTRL_AN 2
+
static void
vhost_scsi_send_bad_target(struct vhost_scsi *vs,
struct vhost_virtqueue *vq,
- int head, unsigned out)
+ struct vhost_scsi_ctx *vc, int type)
{
- struct virtio_scsi_cmd_resp __user *resp;
- struct virtio_scsi_cmd_resp rsp;
+ union {
+ struct virtio_scsi_cmd_resp cmd;
+ struct virtio_scsi_ctrl_tmf_resp tmf;
+ struct virtio_scsi_ctrl_an_resp an;
+ } rsp;
+ struct iov_iter iov_iter;
+ size_t rsp_size;
int ret;
memset(&rsp, 0, sizeof(rsp));
- rsp.response = VIRTIO_SCSI_S_BAD_TARGET;
- resp = vq->iov[out].iov_base;
- ret = __copy_to_user(resp, &rsp, sizeof(rsp));
- if (!ret)
- vhost_add_used_and_signal(&vs->dev, vq, head, 0);
+
+ if (type == TYPE_IO_CMD) {
+ rsp_size = sizeof(struct virtio_scsi_cmd_resp);
+ rsp.cmd.response = VIRTIO_SCSI_S_BAD_TARGET;
+ } else if (type == TYPE_CTRL_TMF) {
+ rsp_size = sizeof(struct virtio_scsi_ctrl_tmf_resp);
+ rsp.tmf.response = VIRTIO_SCSI_S_BAD_TARGET;
+ } else {
+ rsp_size = sizeof(struct virtio_scsi_ctrl_an_resp);
+ rsp.an.response = VIRTIO_SCSI_S_BAD_TARGET;
+ }
+
+ iov_iter_init(&iov_iter, ITER_DEST, &vq->iov[vc->out], vc->in,
+ rsp_size);
+
+ ret = copy_to_iter(&rsp, rsp_size, &iov_iter);
+
+ if (likely(ret == rsp_size))
+ vhost_add_used_and_signal(&vs->dev, vq, vc->head, 0);
else
- pr_err("Faulted on virtio_scsi_cmd_resp\n");
+ pr_err("Faulted on virtio scsi type=%d\n", type);
}
static int
@@ -1395,7 +1418,7 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
if (ret == -ENXIO)
break;
else if (ret == -EIO)
- vhost_scsi_send_bad_target(vs, vq, vc.head, vc.out);
+ vhost_scsi_send_bad_target(vs, vq, &vc, TYPE_IO_CMD);
else if (ret == -ENOMEM)
vhost_scsi_send_status(vs, vq, vc.head, vc.out,
SAM_STAT_TASK_SET_FULL);
@@ -1631,7 +1654,10 @@ vhost_scsi_ctl_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
if (ret == -ENXIO)
break;
else if (ret == -EIO)
- vhost_scsi_send_bad_target(vs, vq, vc.head, vc.out);
+ vhost_scsi_send_bad_target(vs, vq, &vc,
+ v_req.type == VIRTIO_SCSI_T_TMF ?
+ TYPE_CTRL_TMF :
+ TYPE_CTRL_AN);
} while (likely(!vhost_exceeds_weight(vq, ++c, 0)));
out:
mutex_unlock(&vq->mutex);
--
2.39.3
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v2 03/10] vhost-scsi: Fix vhost_scsi_send_status()
2025-03-17 23:55 [PATCH v2 00/10] vhost-scsi: log write descriptors for live migration (and three bugfix) Dongli Zhang
2025-03-17 23:55 ` [PATCH v2 01/10] vhost-scsi: protect vq->log_used with vq->mutex Dongli Zhang
2025-03-17 23:55 ` [PATCH v2 02/10] vhost-scsi: Fix vhost_scsi_send_bad_target() Dongli Zhang
@ 2025-03-17 23:55 ` Dongli Zhang
2025-03-18 0:48 ` Jason Wang
2025-03-26 22:27 ` Mike Christie
2025-03-17 23:55 ` [PATCH v2 04/10] vhost: modify vhost_log_write() for broader users Dongli Zhang
` (6 subsequent siblings)
9 siblings, 2 replies; 28+ messages in thread
From: Dongli Zhang @ 2025-03-17 23:55 UTC (permalink / raw)
To: virtualization, kvm, netdev
Cc: mst, jasowang, michael.christie, pbonzini, stefanha, eperezma,
joao.m.martins, joe.jin, si-wei.liu, linux-kernel
Although the support of VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 was
signaled by the commit 664ed90e621c ("vhost/scsi: Set
VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits"),
vhost_scsi_send_bad_target() still assumes the response in a single
descriptor.
Similar issue in vhost_scsi_send_bad_target() has been fixed in previous
commit.
Fixes: 3ca51662f818 ("vhost-scsi: Add better resource allocation failure handling")
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
Changed since v1:
- New patch to fix vhost_scsi_send_status().
drivers/vhost/scsi.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 59d907b94c5e..26bcf3a7f70c 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -999,18 +999,22 @@ static void vhost_scsi_target_queue_cmd(struct vhost_scsi_nexus *nexus,
static void
vhost_scsi_send_status(struct vhost_scsi *vs, struct vhost_virtqueue *vq,
- int head, unsigned int out, u8 status)
+ struct vhost_scsi_ctx *vc, u8 status)
{
- struct virtio_scsi_cmd_resp __user *resp;
struct virtio_scsi_cmd_resp rsp;
+ struct iov_iter iov_iter;
int ret;
memset(&rsp, 0, sizeof(rsp));
rsp.status = status;
- resp = vq->iov[out].iov_base;
- ret = __copy_to_user(resp, &rsp, sizeof(rsp));
- if (!ret)
- vhost_add_used_and_signal(&vs->dev, vq, head, 0);
+
+ iov_iter_init(&iov_iter, ITER_DEST, &vq->iov[vc->out], vc->in,
+ sizeof(rsp));
+
+ ret = copy_to_iter(&rsp, sizeof(rsp), &iov_iter);
+
+ if (likely(ret == sizeof(rsp)))
+ vhost_add_used_and_signal(&vs->dev, vq, vc->head, 0);
else
pr_err("Faulted on virtio_scsi_cmd_resp\n");
}
@@ -1420,7 +1424,7 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
else if (ret == -EIO)
vhost_scsi_send_bad_target(vs, vq, &vc, TYPE_IO_CMD);
else if (ret == -ENOMEM)
- vhost_scsi_send_status(vs, vq, vc.head, vc.out,
+ vhost_scsi_send_status(vs, vq, &vc,
SAM_STAT_TASK_SET_FULL);
} while (likely(!vhost_exceeds_weight(vq, ++c, 0)));
out:
--
2.39.3
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v2 04/10] vhost: modify vhost_log_write() for broader users
2025-03-17 23:55 [PATCH v2 00/10] vhost-scsi: log write descriptors for live migration (and three bugfix) Dongli Zhang
` (2 preceding siblings ...)
2025-03-17 23:55 ` [PATCH v2 03/10] vhost-scsi: Fix vhost_scsi_send_status() Dongli Zhang
@ 2025-03-17 23:55 ` Dongli Zhang
2025-03-18 1:12 ` Jason Wang
2025-03-17 23:55 ` [PATCH v2 05/10] vhost-scsi: adjust vhost_scsi_get_desc() to log vring descriptors Dongli Zhang
` (5 subsequent siblings)
9 siblings, 1 reply; 28+ messages in thread
From: Dongli Zhang @ 2025-03-17 23:55 UTC (permalink / raw)
To: virtualization, kvm, netdev
Cc: mst, jasowang, michael.christie, pbonzini, stefanha, eperezma,
joao.m.martins, joe.jin, si-wei.liu, linux-kernel
Currently, the only user of vhost_log_write() is vhost-net. The 'len'
argument prevents logging of pages that are not tainted by the RX path.
Adjustments are needed since more drivers (i.e. vhost-scsi) begin using
vhost_log_write(). So far vhost-net RX path may only partially use pages
shared by the last vring descriptor. Unlike vhost-net, vhost-scsi always
logs all pages shared via vring descriptors. To accommodate this, a new
argument 'partial' is introduced. This argument works alongside 'len' to
indicate whether the driver should log all pages of a vring descriptor, or
only pages that are tainted by the driver.
In addition, removes BUG().
Suggested-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
drivers/vhost/net.c | 2 +-
drivers/vhost/vhost.c | 28 +++++++++++++++++-----------
drivers/vhost/vhost.h | 2 +-
3 files changed, 19 insertions(+), 13 deletions(-)
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index b9b9e9d40951..0e5d82bfde76 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1219,7 +1219,7 @@ static void handle_rx(struct vhost_net *net)
if (nvq->done_idx > VHOST_NET_BATCH)
vhost_net_signal_used(nvq);
if (unlikely(vq_log))
- vhost_log_write(vq, vq_log, log, vhost_len,
+ vhost_log_write(vq, vq_log, log, vhost_len, true,
vq->iov, in);
total_len += vhost_len;
} while (likely(!vhost_exceeds_weight(vq, ++recv_pkts, total_len)));
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 9ac25d08f473..db3b30aba940 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -2304,8 +2304,14 @@ static int log_used(struct vhost_virtqueue *vq, u64 used_offset, u64 len)
return 0;
}
-int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
- unsigned int log_num, u64 len, struct iovec *iov, int count)
+/*
+ * 'len' is used only when 'partial' is true, to indicate whether the
+ * entire length of each descriptor is logged.
+ */
+int vhost_log_write(struct vhost_virtqueue *vq,
+ struct vhost_log *log, unsigned int log_num,
+ u64 len, bool partial,
+ struct iovec *iov, int count)
{
int i, r;
@@ -2323,19 +2329,19 @@ int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
}
for (i = 0; i < log_num; ++i) {
- u64 l = min(log[i].len, len);
+ u64 l = partial ? min(log[i].len, len) : log[i].len;
+
r = log_write(vq->log_base, log[i].addr, l);
if (r < 0)
return r;
- len -= l;
- if (!len) {
- if (vq->log_ctx)
- eventfd_signal(vq->log_ctx);
- return 0;
- }
+
+ if (partial)
+ len -= l;
}
- /* Length written exceeds what we have stored. This is a bug. */
- BUG();
+
+ if (vq->log_ctx)
+ eventfd_signal(vq->log_ctx);
+
return 0;
}
EXPORT_SYMBOL_GPL(vhost_log_write);
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index bb75a292d50c..5de5941988fe 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -224,7 +224,7 @@ bool vhost_vq_avail_empty(struct vhost_dev *, struct vhost_virtqueue *);
bool vhost_enable_notify(struct vhost_dev *, struct vhost_virtqueue *);
int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
- unsigned int log_num, u64 len,
+ unsigned int log_num, u64 len, bool partial,
struct iovec *iov, int count);
int vq_meta_prefetch(struct vhost_virtqueue *vq);
--
2.39.3
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v2 05/10] vhost-scsi: adjust vhost_scsi_get_desc() to log vring descriptors
2025-03-17 23:55 [PATCH v2 00/10] vhost-scsi: log write descriptors for live migration (and three bugfix) Dongli Zhang
` (3 preceding siblings ...)
2025-03-17 23:55 ` [PATCH v2 04/10] vhost: modify vhost_log_write() for broader users Dongli Zhang
@ 2025-03-17 23:55 ` Dongli Zhang
2025-03-26 23:21 ` Mike Christie
2025-03-17 23:55 ` [PATCH v2 06/10] vhost-scsi: cache log buffer in I/O queue vhost_scsi_cmd Dongli Zhang
` (4 subsequent siblings)
9 siblings, 1 reply; 28+ messages in thread
From: Dongli Zhang @ 2025-03-17 23:55 UTC (permalink / raw)
To: virtualization, kvm, netdev
Cc: mst, jasowang, michael.christie, pbonzini, stefanha, eperezma,
joao.m.martins, joe.jin, si-wei.liu, linux-kernel
Adjust vhost_scsi_get_desc() to facilitate logging of vring descriptors.
Add new arguments to allow passing the log buffer and length to
vhost_get_vq_desc().
In addition, reset 'log_num' since vhost_get_vq_desc() may reset it only
after certain condition checks.
Suggested-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
drivers/vhost/scsi.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 26bcf3a7f70c..3875967dee36 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -1063,13 +1063,17 @@ vhost_scsi_send_bad_target(struct vhost_scsi *vs,
static int
vhost_scsi_get_desc(struct vhost_scsi *vs, struct vhost_virtqueue *vq,
- struct vhost_scsi_ctx *vc)
+ struct vhost_scsi_ctx *vc,
+ struct vhost_log *log, unsigned int *log_num)
{
int ret = -ENXIO;
+ if (likely(log_num))
+ *log_num = 0;
+
vc->head = vhost_get_vq_desc(vq, vq->iov,
ARRAY_SIZE(vq->iov), &vc->out, &vc->in,
- NULL, NULL);
+ log, log_num);
pr_debug("vhost_get_vq_desc: head: %d, out: %u in: %u\n",
vc->head, vc->out, vc->in);
@@ -1237,7 +1241,7 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
vhost_disable_notify(&vs->dev, vq);
do {
- ret = vhost_scsi_get_desc(vs, vq, &vc);
+ ret = vhost_scsi_get_desc(vs, vq, &vc, NULL, NULL);
if (ret)
goto err;
@@ -1581,7 +1585,7 @@ vhost_scsi_ctl_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
vhost_disable_notify(&vs->dev, vq);
do {
- ret = vhost_scsi_get_desc(vs, vq, &vc);
+ ret = vhost_scsi_get_desc(vs, vq, &vc, NULL, NULL);
if (ret)
goto err;
--
2.39.3
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v2 06/10] vhost-scsi: cache log buffer in I/O queue vhost_scsi_cmd
2025-03-17 23:55 [PATCH v2 00/10] vhost-scsi: log write descriptors for live migration (and three bugfix) Dongli Zhang
` (4 preceding siblings ...)
2025-03-17 23:55 ` [PATCH v2 05/10] vhost-scsi: adjust vhost_scsi_get_desc() to log vring descriptors Dongli Zhang
@ 2025-03-17 23:55 ` Dongli Zhang
2025-03-18 0:04 ` Dongli Zhang
2025-03-17 23:55 ` [PATCH v2 07/10] vhost-scsi: log I/O queue write descriptors Dongli Zhang
` (3 subsequent siblings)
9 siblings, 1 reply; 28+ messages in thread
From: Dongli Zhang @ 2025-03-17 23:55 UTC (permalink / raw)
To: virtualization, kvm, netdev
Cc: mst, jasowang, michael.christie, pbonzini, stefanha, eperezma,
joao.m.martins, joe.jin, si-wei.liu, linux-kernel
The vhost-scsi I/O queue uses vhost_scsi_cmd. Allocate the log buffer
during vhost_scsi_cmd allocation or when VHOST_F_LOG_ALL is set. Free the
log buffer when vhost_scsi_cmd is reclaimed or when VHOST_F_LOG_ALL is
removed.
Fail vhost_scsi_set_endpoint or vhost_scsi_set_features() on allocation
failure.
The cached log buffer will be uses in upcoming patches to log write
descriptors for the I/O queue. The core idea is to cache the log in the
per-command log buffer in the submission path, and use them to log write
descriptors in the completion path.
As a reminder, currently QEMU's vhost-scsi VHOST_SET_FEATURES handler
doesn't process the failure gracefully. Instead, it crashes immediately on
failure from VHOST_SET_FEATURES.
Suggested-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
Changed since v1:
- Don't allocate log buffer during initialization. Allocate during
VHOST_SET_FEATURES or VHOST_SCSI_SET_ENDPOINT.
drivers/vhost/scsi.c | 126 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 126 insertions(+)
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 3875967dee36..1b7211a55562 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -133,6 +133,11 @@ struct vhost_scsi_cmd {
struct se_cmd tvc_se_cmd;
/* Sense buffer that will be mapped into outgoing status */
unsigned char tvc_sense_buf[TRANSPORT_SENSE_BUFFER];
+ /*
+ * Dirty write descriptors of this command.
+ */
+ struct vhost_log *tvc_log;
+ unsigned int tvc_log_num;
/* Completed commands list, serviced from vhost worker thread */
struct llist_node tvc_completion_list;
/* Used to track inflight cmd */
@@ -676,6 +681,7 @@ vhost_scsi_get_cmd(struct vhost_virtqueue *vq, u64 scsi_tag)
struct vhost_scsi_virtqueue, vq);
struct vhost_scsi_cmd *cmd;
struct scatterlist *sgl, *prot_sgl;
+ struct vhost_log *log;
int tag;
tag = sbitmap_get(&svq->scsi_tags);
@@ -687,9 +693,11 @@ vhost_scsi_get_cmd(struct vhost_virtqueue *vq, u64 scsi_tag)
cmd = &svq->scsi_cmds[tag];
sgl = cmd->sgl;
prot_sgl = cmd->prot_sgl;
+ log = cmd->tvc_log;
memset(cmd, 0, sizeof(*cmd));
cmd->sgl = sgl;
cmd->prot_sgl = prot_sgl;
+ cmd->tvc_log = log;
cmd->tvc_se_cmd.map_tag = tag;
cmd->inflight = vhost_scsi_get_inflight(vq);
@@ -1760,6 +1768,55 @@ static void vhost_scsi_flush(struct vhost_scsi *vs)
wait_for_completion(&vs->old_inflight[i]->comp);
}
+static void vhost_scsi_destroy_vq_log(struct vhost_virtqueue *vq)
+{
+ struct vhost_scsi_virtqueue *svq = container_of(vq,
+ struct vhost_scsi_virtqueue, vq);
+ struct vhost_scsi_cmd *tv_cmd;
+ unsigned int i;
+
+ if (!svq->scsi_cmds)
+ return;
+
+ for (i = 0; i < svq->max_cmds; i++) {
+ tv_cmd = &svq->scsi_cmds[i];
+ kfree(tv_cmd->tvc_log);
+ tv_cmd->tvc_log = NULL;
+ tv_cmd->tvc_log_num = 0;
+ }
+}
+
+static int vhost_scsi_setup_vq_log(struct vhost_virtqueue *vq)
+{
+ struct vhost_scsi_virtqueue *svq = container_of(vq,
+ struct vhost_scsi_virtqueue, vq);
+ struct vhost_scsi_cmd *tv_cmd;
+ unsigned int i;
+
+ if (!svq->scsi_cmds)
+ return 0;
+
+ for (i = 0; i < svq->max_cmds; i++) {
+ tv_cmd = &svq->scsi_cmds[i];
+ WARN_ON_ONCE(unlikely(tv_cmd->tvc_log ||
+ tv_cmd->tvc_log_num));
+ tv_cmd->tvc_log_num = 0;
+ tv_cmd->tvc_log = kcalloc(vq->dev->iov_limit,
+ sizeof(struct vhost_log),
+ GFP_KERNEL);
+ if (!tv_cmd->tvc_log) {
+ pr_err("Unable to allocate tv_cmd->tvc_log\n");
+ goto err;
+ }
+ }
+
+ return 0;
+
+err:
+ vhost_scsi_destroy_vq_log(vq);
+ return -ENOMEM;
+}
+
static void vhost_scsi_destroy_vq_cmds(struct vhost_virtqueue *vq)
{
struct vhost_scsi_virtqueue *svq = container_of(vq,
@@ -1779,6 +1836,7 @@ static void vhost_scsi_destroy_vq_cmds(struct vhost_virtqueue *vq)
sbitmap_free(&svq->scsi_tags);
kfree(svq->upages);
+ vhost_scsi_destroy_vq_log(vq);
kfree(svq->scsi_cmds);
svq->scsi_cmds = NULL;
}
@@ -1834,6 +1892,11 @@ static int vhost_scsi_setup_vq_cmds(struct vhost_virtqueue *vq, int max_cmds)
}
}
}
+
+ if (vhost_has_feature(vq, VHOST_F_LOG_ALL) &&
+ vhost_scsi_setup_vq_log(vq))
+ goto out;
+
return 0;
out:
vhost_scsi_destroy_vq_cmds(vq);
@@ -2088,6 +2151,8 @@ vhost_scsi_clear_endpoint(struct vhost_scsi *vs,
static int vhost_scsi_set_features(struct vhost_scsi *vs, u64 features)
{
struct vhost_virtqueue *vq;
+ bool is_log, was_log;
+ int ret;
int i;
if (features & ~VHOST_SCSI_FEATURES)
@@ -2100,14 +2165,75 @@ static int vhost_scsi_set_features(struct vhost_scsi *vs, u64 features)
return -EFAULT;
}
+ if (!vs->dev.nvqs)
+ goto out;
+
+ is_log = features & (1 << VHOST_F_LOG_ALL);
+ /*
+ * All VQs should have same feature.
+ */
+ was_log = vhost_has_feature(&vs->vqs[0].vq, VHOST_F_LOG_ALL);
+
+ /*
+ * If VHOST_F_LOG_ALL is going to be added, allocate tvc_log before
+ * vq->acked_features is committed.
+ * Return -ENOMEM on allocation failure.
+ */
+ if (is_log && !was_log) {
+ for (i = VHOST_SCSI_VQ_IO; i < vs->dev.nvqs; i++) {
+ if (!vs->vqs[i].scsi_cmds)
+ continue;
+
+ vq = &vs->vqs[i].vq;
+
+ mutex_lock(&vq->mutex);
+ ret = vhost_scsi_setup_vq_log(vq);
+ mutex_unlock(&vq->mutex);
+
+ if (ret)
+ goto destroy_cmd_log;
+ }
+ }
+
for (i = 0; i < vs->dev.nvqs; i++) {
vq = &vs->vqs[i].vq;
mutex_lock(&vq->mutex);
vq->acked_features = features;
mutex_unlock(&vq->mutex);
}
+
+ /*
+ * If VHOST_F_LOG_ALL is removed, free tvc_log after
+ * vq->acked_features is committed.
+ */
+ if (!is_log && was_log) {
+ for (i = VHOST_SCSI_VQ_IO; i < vs->dev.nvqs; i++) {
+ if (!vs->vqs[i].scsi_cmds)
+ continue;
+
+ vq = &vs->vqs[i].vq;
+ mutex_lock(&vq->mutex);
+ vhost_scsi_destroy_vq_log(vq);
+ mutex_unlock(&vq->mutex);
+ }
+ }
+
+out:
mutex_unlock(&vs->dev.mutex);
return 0;
+
+destroy_cmd_log:
+ for (i--; i >= VHOST_SCSI_VQ_IO; i--) {
+ if (!vs->vqs[i].scsi_cmds)
+ continue;
+
+ vq = &vs->vqs[i].vq;
+ mutex_lock(&vq->mutex);
+ vhost_scsi_destroy_vq_log(vq);
+ mutex_unlock(&vq->mutex);
+ }
+ mutex_unlock(&vs->dev.mutex);
+ return -ENOMEM;
}
static int vhost_scsi_open(struct inode *inode, struct file *f)
--
2.39.3
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v2 07/10] vhost-scsi: log I/O queue write descriptors
2025-03-17 23:55 [PATCH v2 00/10] vhost-scsi: log write descriptors for live migration (and three bugfix) Dongli Zhang
` (5 preceding siblings ...)
2025-03-17 23:55 ` [PATCH v2 06/10] vhost-scsi: cache log buffer in I/O queue vhost_scsi_cmd Dongli Zhang
@ 2025-03-17 23:55 ` Dongli Zhang
2025-03-17 23:55 ` [PATCH v2 08/10] vhost-scsi: log control " Dongli Zhang
` (2 subsequent siblings)
9 siblings, 0 replies; 28+ messages in thread
From: Dongli Zhang @ 2025-03-17 23:55 UTC (permalink / raw)
To: virtualization, kvm, netdev
Cc: mst, jasowang, michael.christie, pbonzini, stefanha, eperezma,
joao.m.martins, joe.jin, si-wei.liu, linux-kernel
Log write descriptors for the I/O queue, leveraging vhost_scsi_get_desc()
and vhost_get_vq_desc() to retrieve the array of write descriptors to
obtain the log buffer.
In addition, introduce a vhost-scsi specific function to log vring
descriptors. In this function, the 'partial' argument is set to false, and
the 'len' argument is set to 0, because vhost-scsi always logs all pages
shared by a vring descriptor. Add WARN_ON_ONCE() since vhost-scsi doesn't
support VIRTIO_F_ACCESS_PLATFORM.
Store the log buffer during the submission path and log it in the
completion path. Logging is also required in the error handling path of the
submission process.
Suggested-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
Changed since v1:
- Re-order if staments in vhost_scsi_log_write().
- Log after vhost_scsi_send_status() as well.
drivers/vhost/scsi.c | 43 ++++++++++++++++++++++++++++++++++++++++---
1 file changed, 40 insertions(+), 3 deletions(-)
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 1b7211a55562..8a1b0a19fe58 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -367,6 +367,24 @@ static int vhost_scsi_check_prot_fabric_only(struct se_portal_group *se_tpg)
return tpg->tv_fabric_prot_type;
}
+static void vhost_scsi_log_write(struct vhost_virtqueue *vq,
+ struct vhost_log *log,
+ unsigned int log_num)
+{
+ if (likely(!vhost_has_feature(vq, VHOST_F_LOG_ALL)))
+ return;
+
+ if (likely(!log_num || !log))
+ return;
+
+ /*
+ * vhost-scsi doesn't support VIRTIO_F_ACCESS_PLATFORM.
+ * No requirement for vq->iotlb case.
+ */
+ WARN_ON_ONCE(unlikely(vq->iotlb));
+ vhost_log_write(vq, log, log_num, 0, false, NULL, 0);
+}
+
static void vhost_scsi_release_cmd_res(struct se_cmd *se_cmd)
{
struct vhost_scsi_cmd *tv_cmd = container_of(se_cmd,
@@ -665,6 +683,9 @@ static void vhost_scsi_complete_cmd_work(struct vhost_work *work)
} else
pr_err("Faulted on virtio_scsi_cmd_resp\n");
+ vhost_scsi_log_write(cmd->tvc_vq, cmd->tvc_log,
+ cmd->tvc_log_num);
+
vhost_scsi_release_cmd_res(se_cmd);
}
@@ -1233,6 +1254,8 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
u8 task_attr;
bool t10_pi = vhost_has_feature(vq, VIRTIO_SCSI_F_T10_PI);
u8 *cdb;
+ struct vhost_log *vq_log;
+ unsigned int log_num;
mutex_lock(&vq->mutex);
/*
@@ -1248,8 +1271,11 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
vhost_disable_notify(&vs->dev, vq);
+ vq_log = unlikely(vhost_has_feature(vq, VHOST_F_LOG_ALL)) ?
+ vq->log : NULL;
+
do {
- ret = vhost_scsi_get_desc(vs, vq, &vc, NULL, NULL);
+ ret = vhost_scsi_get_desc(vs, vq, &vc, vq_log, &log_num);
if (ret)
goto err;
@@ -1398,6 +1424,14 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
goto err;
}
+ if (unlikely(vq_log && log_num)) {
+ /*
+ * cmd->tvc_log depends on VHOST_F_LOG_ALL.
+ */
+ memcpy(cmd->tvc_log, vq->log, sizeof(*cmd->tvc_log) * log_num);
+ cmd->tvc_log_num = log_num;
+ }
+
pr_debug("vhost_scsi got command opcode: %#02x, lun: %d\n",
cdb[0], lun);
pr_debug("cmd: %p exp_data_len: %d, prot_bytes: %d data_direction:"
@@ -1433,11 +1467,14 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
*/
if (ret == -ENXIO)
break;
- else if (ret == -EIO)
+ else if (ret == -EIO) {
vhost_scsi_send_bad_target(vs, vq, &vc, TYPE_IO_CMD);
- else if (ret == -ENOMEM)
+ vhost_scsi_log_write(vq, vq_log, log_num);
+ } else if (ret == -ENOMEM) {
vhost_scsi_send_status(vs, vq, &vc,
SAM_STAT_TASK_SET_FULL);
+ vhost_scsi_log_write(vq, vq_log, log_num);
+ }
} while (likely(!vhost_exceeds_weight(vq, ++c, 0)));
out:
mutex_unlock(&vq->mutex);
--
2.39.3
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v2 08/10] vhost-scsi: log control queue write descriptors
2025-03-17 23:55 [PATCH v2 00/10] vhost-scsi: log write descriptors for live migration (and three bugfix) Dongli Zhang
` (6 preceding siblings ...)
2025-03-17 23:55 ` [PATCH v2 07/10] vhost-scsi: log I/O queue write descriptors Dongli Zhang
@ 2025-03-17 23:55 ` Dongli Zhang
2025-03-17 23:55 ` [PATCH v2 09/10] vhost-scsi: log event " Dongli Zhang
2025-03-17 23:55 ` [PATCH v2 10/10] vhost: add WARNING if log_num is more than limit Dongli Zhang
9 siblings, 0 replies; 28+ messages in thread
From: Dongli Zhang @ 2025-03-17 23:55 UTC (permalink / raw)
To: virtualization, kvm, netdev
Cc: mst, jasowang, michael.christie, pbonzini, stefanha, eperezma,
joao.m.martins, joe.jin, si-wei.liu, linux-kernel
Log write descriptors for the control queue, leveraging
vhost_scsi_get_desc() and vhost_get_vq_desc() to retrieve the array of
write descriptors to obtain the log buffer.
For Task Management Requests, similar to the I/O queue, store the log
buffer during the submission path and log it in the completion or error
handling path.
For Asynchronous Notifications, only the submission path is involved.
Suggested-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
Changed since v1:
- Call kfree(tmf->tmf_log) unconditionally.
- Return VIRTIO_SCSI_S_FUNCTION_REJECTED on log buffer allocation failure.
drivers/vhost/scsi.c | 47 +++++++++++++++++++++++++++++++++++++++-----
1 file changed, 42 insertions(+), 5 deletions(-)
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 8a1b0a19fe58..3cdc5c2fa60e 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -263,6 +263,12 @@ struct vhost_scsi_tmf {
struct iovec resp_iov;
int in_iovs;
int vq_desc;
+
+ /*
+ * Dirty write descriptors of this command.
+ */
+ struct vhost_log *tmf_log;
+ unsigned int tmf_log_num;
};
/*
@@ -431,6 +437,10 @@ static void vhost_scsi_release_tmf_res(struct vhost_scsi_tmf *tmf)
{
struct vhost_scsi_inflight *inflight = tmf->inflight;
+ /*
+ * tmf->tmf_log is default NULL unless VHOST_F_LOG_ALL is set.
+ */
+ kfree(tmf->tmf_log);
kfree(tmf);
vhost_scsi_put_inflight(inflight);
}
@@ -1516,6 +1526,8 @@ static void vhost_scsi_tmf_resp_work(struct vhost_work *work)
mutex_lock(&tmf->svq->vq.mutex);
vhost_scsi_send_tmf_resp(tmf->vhost, &tmf->svq->vq, tmf->in_iovs,
tmf->vq_desc, &tmf->resp_iov, resp_code);
+ vhost_scsi_log_write(&tmf->svq->vq, tmf->tmf_log,
+ tmf->tmf_log_num);
mutex_unlock(&tmf->svq->vq.mutex);
vhost_scsi_release_tmf_res(tmf);
@@ -1539,7 +1551,8 @@ static void
vhost_scsi_handle_tmf(struct vhost_scsi *vs, struct vhost_scsi_tpg *tpg,
struct vhost_virtqueue *vq,
struct virtio_scsi_ctrl_tmf_req *vtmf,
- struct vhost_scsi_ctx *vc)
+ struct vhost_scsi_ctx *vc,
+ struct vhost_log *log, unsigned int log_num)
{
struct vhost_scsi_virtqueue *svq = container_of(vq,
struct vhost_scsi_virtqueue, vq);
@@ -1567,6 +1580,19 @@ vhost_scsi_handle_tmf(struct vhost_scsi *vs, struct vhost_scsi_tpg *tpg,
tmf->in_iovs = vc->in;
tmf->inflight = vhost_scsi_get_inflight(vq);
+ if (unlikely(log && log_num)) {
+ tmf->tmf_log = kmalloc_array(log_num, sizeof(*tmf->tmf_log),
+ GFP_KERNEL);
+ if (tmf->tmf_log) {
+ memcpy(tmf->tmf_log, log, sizeof(*tmf->tmf_log) * log_num);
+ tmf->tmf_log_num = log_num;
+ } else {
+ pr_err("vhost_scsi tmf log allocation error\n");
+ vhost_scsi_release_tmf_res(tmf);
+ goto send_reject;
+ }
+ }
+
if (target_submit_tmr(&tmf->se_cmd, tpg->tpg_nexus->tvn_se_sess, NULL,
vhost_buf_to_lun(vtmf->lun), NULL,
TMR_LUN_RESET, GFP_KERNEL, 0,
@@ -1580,6 +1606,7 @@ vhost_scsi_handle_tmf(struct vhost_scsi *vs, struct vhost_scsi_tpg *tpg,
send_reject:
vhost_scsi_send_tmf_resp(vs, vq, vc->in, vc->head, &vq->iov[vc->out],
VIRTIO_SCSI_S_FUNCTION_REJECTED);
+ vhost_scsi_log_write(vq, log, log_num);
}
static void
@@ -1616,6 +1643,8 @@ vhost_scsi_ctl_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
struct vhost_scsi_ctx vc;
size_t typ_size;
int ret, c = 0;
+ struct vhost_log *vq_log;
+ unsigned int log_num;
mutex_lock(&vq->mutex);
/*
@@ -1629,8 +1658,11 @@ vhost_scsi_ctl_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
vhost_disable_notify(&vs->dev, vq);
+ vq_log = unlikely(vhost_has_feature(vq, VHOST_F_LOG_ALL)) ?
+ vq->log : NULL;
+
do {
- ret = vhost_scsi_get_desc(vs, vq, &vc, NULL, NULL);
+ ret = vhost_scsi_get_desc(vs, vq, &vc, vq_log, &log_num);
if (ret)
goto err;
@@ -1694,9 +1726,12 @@ vhost_scsi_ctl_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
goto err;
if (v_req.type == VIRTIO_SCSI_T_TMF)
- vhost_scsi_handle_tmf(vs, tpg, vq, &v_req.tmf, &vc);
- else
+ vhost_scsi_handle_tmf(vs, tpg, vq, &v_req.tmf, &vc,
+ vq_log, log_num);
+ else {
vhost_scsi_send_an_resp(vs, vq, &vc);
+ vhost_scsi_log_write(vq, vq_log, log_num);
+ }
err:
/*
* ENXIO: No more requests, or read error, wait for next kick
@@ -1706,11 +1741,13 @@ vhost_scsi_ctl_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
*/
if (ret == -ENXIO)
break;
- else if (ret == -EIO)
+ else if (ret == -EIO) {
vhost_scsi_send_bad_target(vs, vq, &vc,
v_req.type == VIRTIO_SCSI_T_TMF ?
TYPE_CTRL_TMF :
TYPE_CTRL_AN);
+ vhost_scsi_log_write(vq, vq_log, log_num);
+ }
} while (likely(!vhost_exceeds_weight(vq, ++c, 0)));
out:
mutex_unlock(&vq->mutex);
--
2.39.3
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v2 09/10] vhost-scsi: log event queue write descriptors
2025-03-17 23:55 [PATCH v2 00/10] vhost-scsi: log write descriptors for live migration (and three bugfix) Dongli Zhang
` (7 preceding siblings ...)
2025-03-17 23:55 ` [PATCH v2 08/10] vhost-scsi: log control " Dongli Zhang
@ 2025-03-17 23:55 ` Dongli Zhang
2025-03-26 23:50 ` Mike Christie
2025-03-26 23:51 ` Mike Christie
2025-03-17 23:55 ` [PATCH v2 10/10] vhost: add WARNING if log_num is more than limit Dongli Zhang
9 siblings, 2 replies; 28+ messages in thread
From: Dongli Zhang @ 2025-03-17 23:55 UTC (permalink / raw)
To: virtualization, kvm, netdev
Cc: mst, jasowang, michael.christie, pbonzini, stefanha, eperezma,
joao.m.martins, joe.jin, si-wei.liu, linux-kernel
Log write descriptors for the event queue, leveraging vhost_get_vq_desc()
to retrieve the array of write descriptors to obtain the log buffer.
There is only one path for event queue.
Suggested-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
drivers/vhost/scsi.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 3cdc5c2fa60e..525610bbabc9 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -550,6 +550,8 @@ vhost_scsi_do_evt_work(struct vhost_scsi *vs, struct vhost_scsi_evt *evt)
struct vhost_virtqueue *vq = &vs->vqs[VHOST_SCSI_VQ_EVT].vq;
struct virtio_scsi_event *event = &evt->event;
struct virtio_scsi_event __user *eventp;
+ struct vhost_log *vq_log;
+ unsigned int log_num;
unsigned out, in;
int head, ret;
@@ -560,9 +562,19 @@ vhost_scsi_do_evt_work(struct vhost_scsi *vs, struct vhost_scsi_evt *evt)
again:
vhost_disable_notify(&vs->dev, vq);
+
+ vq_log = unlikely(vhost_has_feature(vq, VHOST_F_LOG_ALL)) ?
+ vq->log : NULL;
+
+ /*
+ * Reset 'log_num' since vhost_get_vq_desc() may reset it only
+ * after certain condition checks.
+ */
+ log_num = 0;
+
head = vhost_get_vq_desc(vq, vq->iov,
ARRAY_SIZE(vq->iov), &out, &in,
- NULL, NULL);
+ vq_log, &log_num);
if (head < 0) {
vs->vs_events_missed = true;
return;
@@ -592,6 +604,8 @@ vhost_scsi_do_evt_work(struct vhost_scsi *vs, struct vhost_scsi_evt *evt)
vhost_add_used_and_signal(&vs->dev, vq, head, 0);
else
vq_err(vq, "Faulted on vhost_scsi_send_event\n");
+
+ vhost_scsi_log_write(vq, vq_log, log_num);
}
static void vhost_scsi_complete_events(struct vhost_scsi *vs, bool drop)
--
2.39.3
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v2 10/10] vhost: add WARNING if log_num is more than limit
2025-03-17 23:55 [PATCH v2 00/10] vhost-scsi: log write descriptors for live migration (and three bugfix) Dongli Zhang
` (8 preceding siblings ...)
2025-03-17 23:55 ` [PATCH v2 09/10] vhost-scsi: log event " Dongli Zhang
@ 2025-03-17 23:55 ` Dongli Zhang
9 siblings, 0 replies; 28+ messages in thread
From: Dongli Zhang @ 2025-03-17 23:55 UTC (permalink / raw)
To: virtualization, kvm, netdev
Cc: mst, jasowang, michael.christie, pbonzini, stefanha, eperezma,
joao.m.martins, joe.jin, si-wei.liu, linux-kernel
Since long time ago, the only user of vq->log is vhost-net. The concern is
to add support for more devices (i.e. vhost-scsi or vsock) may reveals
unknown issue in the vhost API. Add a WARNING.
Suggested-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
drivers/vhost/vhost.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index db3b30aba940..8368370b40f7 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -2553,6 +2553,15 @@ static int get_indirect(struct vhost_virtqueue *vq,
if (access == VHOST_ACCESS_WO) {
*in_num += ret;
if (unlikely(log && ret)) {
+ /*
+ * Since long time ago, the only user of
+ * vq->log is vhost-net. The concern is to
+ * add support for more devices (i.e.
+ * vhost-scsi or vsock) may reveals unknown
+ * issue in the vhost API. Add a WARNING.
+ */
+ WARN_ON_ONCE(*log_num >= vq->dev->iov_limit);
+
log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
log[*log_num].len = vhost32_to_cpu(vq, desc.len);
++*log_num;
@@ -2673,6 +2682,15 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
* increment that count. */
*in_num += ret;
if (unlikely(log && ret)) {
+ /*
+ * Since long time ago, the only user of
+ * vq->log is vhost-net. The concern is to
+ * add support for more devices (i.e.
+ * vhost-scsi or vsock) may reveals unknown
+ * issue in the vhost API. Add a WARNING.
+ */
+ WARN_ON_ONCE(*log_num >= vq->dev->iov_limit);
+
log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
log[*log_num].len = vhost32_to_cpu(vq, desc.len);
++*log_num;
--
2.39.3
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH v2 06/10] vhost-scsi: cache log buffer in I/O queue vhost_scsi_cmd
2025-03-17 23:55 ` [PATCH v2 06/10] vhost-scsi: cache log buffer in I/O queue vhost_scsi_cmd Dongli Zhang
@ 2025-03-18 0:04 ` Dongli Zhang
2025-03-26 23:37 ` Mike Christie
0 siblings, 1 reply; 28+ messages in thread
From: Dongli Zhang @ 2025-03-18 0:04 UTC (permalink / raw)
To: virtualization, kvm, netdev, michael.christie
Cc: mst, jasowang, pbonzini, stefanha, eperezma, joao.m.martins,
joe.jin, si-wei.liu, linux-kernel
Hi Mike,
On 3/17/25 4:55 PM, Dongli Zhang wrote:
> The vhost-scsi I/O queue uses vhost_scsi_cmd. Allocate the log buffer
> during vhost_scsi_cmd allocation or when VHOST_F_LOG_ALL is set. Free the
> log buffer when vhost_scsi_cmd is reclaimed or when VHOST_F_LOG_ALL is
> removed.
>
> Fail vhost_scsi_set_endpoint or vhost_scsi_set_features() on allocation
> failure.
>
> The cached log buffer will be uses in upcoming patches to log write
> descriptors for the I/O queue. The core idea is to cache the log in the
> per-command log buffer in the submission path, and use them to log write
> descriptors in the completion path.
>
> As a reminder, currently QEMU's vhost-scsi VHOST_SET_FEATURES handler
> doesn't process the failure gracefully. Instead, it crashes immediately on
> failure from VHOST_SET_FEATURES.
>
We have discussed the allocation of log buffer at:
https://lore.kernel.org/all/b058d4c6-f8cf-456b-aa60-8a8ccedb277e@oracle.com/
This patchset allocate and free log buffer during
VHOST_SET_FEATURES/VHOST_SCSI_SET_ENDPOINT.
Unfortunately, QEMU's VHOST_SET_FEATURES handler may crash QEMU if there is
error from VHOST_SET_FEATURES (i.e. -ENOMEM).
From user's perspective, it is better to never start VHOST_F_LOG_ALL if
memory isn't enough. However, that requires change at QEMU side.
I have another implementation by combining PATCH 06 and PATCH 07 in this
patchset.
The core idea is to allocate only once for each cmd. In addition, don't
free log buffer unless VHOST_F_LOG_ALL is removed, or during
VHOST_SCSI_SET_ENDPOINT when all commands are destroyed.
It returns SAM_STAT_TASK_SET_FULL to guest when allocation is failed.
-------------------------------
[PATCH v2 6/9] vhost-scsi: log I/O queue write descriptors
Log write descriptors for the I/O queue, leveraging vhost_scsi_get_desc()
and vhost_get_vq_desc() to retrieve the array of write descriptors to
obtain the log buffer.
In addition, introduce a vhost-scsi specific function to log vring
descriptors. In this function, the 'partial' argument is set to false, and
the 'len' argument is set to 0, because vhost-scsi always logs all pages
shared by a vring descriptor. Add WARN_ON_ONCE() since vhost-scsi doesn't
support VIRTIO_F_ACCESS_PLATFORM.
The per-cmd log buffer is allocated on demand in the submission path after
VHOST_F_LOG_ALL is set. Return -ENOMEM on allocation failure, in order to
send SAM_STAT_TASK_SET_FULL to the guest.
It isn't reclaimed in the completion path. Instead, it is reclaimed when
VHOST_F_LOG_ALL is removed, or during VHOST_SCSI_SET_ENDPOINT when all
commands are destroyed.
Store the log buffer during the submission path and log it in the
completion path. Logging is also required in the error handling path of the
submission process.
Suggested-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
Changed since v1:
- Don't allocate log buffer during initialization. Allocate only once for
each command. Don't free until not used any longer.
- Re-order if staments in vhost_scsi_log_write().
- Log after vhost_scsi_send_status() as well.
drivers/vhost/scsi.c | 108 +++++++++++++++++++++++++++++++++++++++++--
1 file changed, 105 insertions(+), 3 deletions(-)
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 3875967dee36..6ae4d7de9a5f 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -133,6 +133,11 @@ struct vhost_scsi_cmd {
struct se_cmd tvc_se_cmd;
/* Sense buffer that will be mapped into outgoing status */
unsigned char tvc_sense_buf[TRANSPORT_SENSE_BUFFER];
+ /*
+ * Dirty write descriptors of this command.
+ */
+ struct vhost_log *tvc_log;
+ unsigned int tvc_log_num;
/* Completed commands list, serviced from vhost worker thread */
struct llist_node tvc_completion_list;
/* Used to track inflight cmd */
@@ -362,6 +367,24 @@ static int vhost_scsi_check_prot_fabric_only(struct se_portal_group *se_tpg)
return tpg->tv_fabric_prot_type;
}
+static void vhost_scsi_log_write(struct vhost_virtqueue *vq,
+ struct vhost_log *log,
+ unsigned int log_num)
+{
+ if (likely(!vhost_has_feature(vq, VHOST_F_LOG_ALL)))
+ return;
+
+ if (likely(!log_num || !log))
+ return;
+
+ /*
+ * vhost-scsi doesn't support VIRTIO_F_ACCESS_PLATFORM.
+ * No requirement for vq->iotlb case.
+ */
+ WARN_ON_ONCE(unlikely(vq->iotlb));
+ vhost_log_write(vq, log, log_num, 0, false, NULL, 0);
+}
+
static void vhost_scsi_release_cmd_res(struct se_cmd *se_cmd)
{
struct vhost_scsi_cmd *tv_cmd = container_of(se_cmd,
@@ -660,6 +683,9 @@ static void vhost_scsi_complete_cmd_work(struct vhost_work *work)
} else
pr_err("Faulted on virtio_scsi_cmd_resp\n");
+ vhost_scsi_log_write(cmd->tvc_vq, cmd->tvc_log,
+ cmd->tvc_log_num);
+
vhost_scsi_release_cmd_res(se_cmd);
}
@@ -676,6 +702,7 @@ vhost_scsi_get_cmd(struct vhost_virtqueue *vq, u64 scsi_tag)
struct vhost_scsi_virtqueue, vq);
struct vhost_scsi_cmd *cmd;
struct scatterlist *sgl, *prot_sgl;
+ struct vhost_log *log;
int tag;
tag = sbitmap_get(&svq->scsi_tags);
@@ -687,9 +714,11 @@ vhost_scsi_get_cmd(struct vhost_virtqueue *vq, u64 scsi_tag)
cmd = &svq->scsi_cmds[tag];
sgl = cmd->sgl;
prot_sgl = cmd->prot_sgl;
+ log = cmd->tvc_log;
memset(cmd, 0, sizeof(*cmd));
cmd->sgl = sgl;
cmd->prot_sgl = prot_sgl;
+ cmd->tvc_log = log;
cmd->tvc_se_cmd.map_tag = tag;
cmd->inflight = vhost_scsi_get_inflight(vq);
@@ -1225,6 +1254,8 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
u8 task_attr;
bool t10_pi = vhost_has_feature(vq, VIRTIO_SCSI_F_T10_PI);
u8 *cdb;
+ struct vhost_log *vq_log;
+ unsigned int log_num;
mutex_lock(&vq->mutex);
/*
@@ -1240,8 +1271,11 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
vhost_disable_notify(&vs->dev, vq);
+ vq_log = unlikely(vhost_has_feature(vq, VHOST_F_LOG_ALL)) ?
+ vq->log : NULL;
+
do {
- ret = vhost_scsi_get_desc(vs, vq, &vc, NULL, NULL);
+ ret = vhost_scsi_get_desc(vs, vq, &vc, vq_log, &log_num);
if (ret)
goto err;
@@ -1390,6 +1424,24 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
goto err;
}
+ if (unlikely(vq_log && log_num)) {
+ if (!cmd->tvc_log)
+ cmd->tvc_log = kmalloc_array(vq->dev->iov_limit,
+ sizeof(*cmd->tvc_log),
+ GFP_KERNEL);
+
+ if (likely(cmd->tvc_log)) {
+ memcpy(cmd->tvc_log, vq->log,
+ sizeof(*cmd->tvc_log) * log_num);
+ cmd->tvc_log_num = log_num;
+ } else {
+ ret = -ENOMEM;
+ vq_err(vq, "Failed to alloc tvc_log\n");
+ vhost_scsi_release_cmd_res(&cmd->tvc_se_cmd);
+ goto err;
+ }
+ }
+
pr_debug("vhost_scsi got command opcode: %#02x, lun: %d\n",
cdb[0], lun);
pr_debug("cmd: %p exp_data_len: %d, prot_bytes: %d data_direction:"
@@ -1425,11 +1477,14 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
*/
if (ret == -ENXIO)
break;
- else if (ret == -EIO)
+ else if (ret == -EIO) {
vhost_scsi_send_bad_target(vs, vq, &vc, TYPE_IO_CMD);
- else if (ret == -ENOMEM)
+ vhost_scsi_log_write(vq, vq_log, log_num);
+ } else if (ret == -ENOMEM) {
vhost_scsi_send_status(vs, vq, &vc,
SAM_STAT_TASK_SET_FULL);
+ vhost_scsi_log_write(vq, vq_log, log_num);
+ }
} while (likely(!vhost_exceeds_weight(vq, ++c, 0)));
out:
mutex_unlock(&vq->mutex);
@@ -1760,6 +1815,24 @@ static void vhost_scsi_flush(struct vhost_scsi *vs)
wait_for_completion(&vs->old_inflight[i]->comp);
}
+static void vhost_scsi_destroy_vq_log(struct vhost_virtqueue *vq)
+{
+ struct vhost_scsi_virtqueue *svq = container_of(vq,
+ struct vhost_scsi_virtqueue, vq);
+ struct vhost_scsi_cmd *tv_cmd;
+ unsigned int i;
+
+ if (!svq->scsi_cmds)
+ return;
+
+ for (i = 0; i < svq->max_cmds; i++) {
+ tv_cmd = &svq->scsi_cmds[i];
+ kfree(tv_cmd->tvc_log);
+ tv_cmd->tvc_log = NULL;
+ tv_cmd->tvc_log_num = 0;
+ }
+}
+
static void vhost_scsi_destroy_vq_cmds(struct vhost_virtqueue *vq)
{
struct vhost_scsi_virtqueue *svq = container_of(vq,
@@ -1779,6 +1852,7 @@ static void vhost_scsi_destroy_vq_cmds(struct vhost_virtqueue *vq)
sbitmap_free(&svq->scsi_tags);
kfree(svq->upages);
+ vhost_scsi_destroy_vq_log(vq);
kfree(svq->scsi_cmds);
svq->scsi_cmds = NULL;
}
@@ -2088,6 +2162,7 @@ vhost_scsi_clear_endpoint(struct vhost_scsi *vs,
static int vhost_scsi_set_features(struct vhost_scsi *vs, u64 features)
{
struct vhost_virtqueue *vq;
+ bool is_log, was_log;
int i;
if (features & ~VHOST_SCSI_FEATURES)
@@ -2100,12 +2175,39 @@ static int vhost_scsi_set_features(struct vhost_scsi *vs, u64 features)
return -EFAULT;
}
+ if (!vs->dev.nvqs)
+ goto out;
+
+ is_log = features & (1 << VHOST_F_LOG_ALL);
+ /*
+ * All VQs should have same feature.
+ */
+ was_log = vhost_has_feature(&vs->vqs[0].vq, VHOST_F_LOG_ALL);
+
for (i = 0; i < vs->dev.nvqs; i++) {
vq = &vs->vqs[i].vq;
mutex_lock(&vq->mutex);
vq->acked_features = features;
mutex_unlock(&vq->mutex);
}
+
+ /*
+ * If VHOST_F_LOG_ALL is removed, free tvc_log after
+ * vq->acked_features is committed.
+ */
+ if (!is_log && was_log) {
+ for (i = VHOST_SCSI_VQ_IO; i < vs->dev.nvqs; i++) {
+ if (!vs->vqs[i].scsi_cmds)
+ continue;
+
+ vq = &vs->vqs[i].vq;
+ mutex_lock(&vq->mutex);
+ vhost_scsi_destroy_vq_log(vq);
+ mutex_unlock(&vq->mutex);
+ }
+ }
+
+out:
mutex_unlock(&vs->dev.mutex);
return 0;
}
--
2.39.3
-------------------------------
Thank you very much!
Dongli Zhang
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH v2 01/10] vhost-scsi: protect vq->log_used with vq->mutex
2025-03-17 23:55 ` [PATCH v2 01/10] vhost-scsi: protect vq->log_used with vq->mutex Dongli Zhang
@ 2025-03-18 0:47 ` Jason Wang
2025-03-26 22:22 ` Mike Christie
1 sibling, 0 replies; 28+ messages in thread
From: Jason Wang @ 2025-03-18 0:47 UTC (permalink / raw)
To: Dongli Zhang
Cc: virtualization, kvm, netdev, mst, michael.christie, pbonzini,
stefanha, eperezma, joao.m.martins, joe.jin, si-wei.liu,
linux-kernel
On Tue, Mar 18, 2025 at 7:51 AM Dongli Zhang <dongli.zhang@oracle.com> wrote:
>
> The vhost-scsi completion path may access vq->log_base when vq->log_used is
> already set to false.
>
> vhost-thread QEMU-thread
>
> vhost_scsi_complete_cmd_work()
> -> vhost_add_used()
> -> vhost_add_used_n()
> if (unlikely(vq->log_used))
> QEMU disables vq->log_used
> via VHOST_SET_VRING_ADDR.
> mutex_lock(&vq->mutex);
> vq->log_used = false now!
> mutex_unlock(&vq->mutex);
>
> QEMU gfree(vq->log_base)
> log_used()
> -> log_write(vq->log_base)
>
> Assuming the VMM is QEMU. The vq->log_base is from QEMU userpace and can be
> reclaimed via gfree(). As a result, this causes invalid memory writes to
> QEMU userspace.
>
> The control queue path has the same issue.
>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
Acked-by: Jason Wang <jasowang@redhat.com>
Thanks
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v2 02/10] vhost-scsi: Fix vhost_scsi_send_bad_target()
2025-03-17 23:55 ` [PATCH v2 02/10] vhost-scsi: Fix vhost_scsi_send_bad_target() Dongli Zhang
@ 2025-03-18 0:48 ` Jason Wang
2025-03-26 22:25 ` Mike Christie
1 sibling, 0 replies; 28+ messages in thread
From: Jason Wang @ 2025-03-18 0:48 UTC (permalink / raw)
To: Dongli Zhang
Cc: virtualization, kvm, netdev, mst, michael.christie, pbonzini,
stefanha, eperezma, joao.m.martins, joe.jin, si-wei.liu,
linux-kernel
On Tue, Mar 18, 2025 at 7:52 AM Dongli Zhang <dongli.zhang@oracle.com> wrote:
>
> Although the support of VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 was
> signaled by the commit 664ed90e621c ("vhost/scsi: Set
> VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits"),
> vhost_scsi_send_bad_target() still assumes the response in a single
> descriptor.
>
> In addition, although vhost_scsi_send_bad_target() is used by both I/O
> queue and control queue, the response header is always
> virtio_scsi_cmd_resp. It is required to use virtio_scsi_ctrl_tmf_resp or
> virtio_scsi_ctrl_an_resp for control queue.
>
> Fixes: 664ed90e621c ("vhost/scsi: Set VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits")
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
> Changed since v1:
> - Move this bugfix patch to before dirty log tracking patches.
>
> drivers/vhost/scsi.c | 48 ++++++++++++++++++++++++++++++++++----------
> 1 file changed, 37 insertions(+), 11 deletions(-)
>
Acked-by: Jason Wang <jasowang@redhat.com>
Thanks
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v2 03/10] vhost-scsi: Fix vhost_scsi_send_status()
2025-03-17 23:55 ` [PATCH v2 03/10] vhost-scsi: Fix vhost_scsi_send_status() Dongli Zhang
@ 2025-03-18 0:48 ` Jason Wang
2025-03-19 17:54 ` Dongli Zhang
2025-03-26 22:27 ` Mike Christie
1 sibling, 1 reply; 28+ messages in thread
From: Jason Wang @ 2025-03-18 0:48 UTC (permalink / raw)
To: Dongli Zhang
Cc: virtualization, kvm, netdev, mst, michael.christie, pbonzini,
stefanha, eperezma, joao.m.martins, joe.jin, si-wei.liu,
linux-kernel
On Tue, Mar 18, 2025 at 7:52 AM Dongli Zhang <dongli.zhang@oracle.com> wrote:
>
> Although the support of VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 was
> signaled by the commit 664ed90e621c ("vhost/scsi: Set
> VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits"),
> vhost_scsi_send_bad_target() still assumes the response in a single
> descriptor.
>
> Similar issue in vhost_scsi_send_bad_target() has been fixed in previous
> commit.
>
> Fixes: 3ca51662f818 ("vhost-scsi: Add better resource allocation failure handling")
And
6dd88fd59da84631b5fe5c8176931c38cfa3b265 ("vhost-scsi: unbreak any
layout for response")
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
> Changed since v1:
> - New patch to fix vhost_scsi_send_status().
>
> drivers/vhost/scsi.c | 18 +++++++++++-------
> 1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
> index 59d907b94c5e..26bcf3a7f70c 100644
> --- a/drivers/vhost/scsi.c
> +++ b/drivers/vhost/scsi.c
> @@ -999,18 +999,22 @@ static void vhost_scsi_target_queue_cmd(struct vhost_scsi_nexus *nexus,
>
> static void
> vhost_scsi_send_status(struct vhost_scsi *vs, struct vhost_virtqueue *vq,
> - int head, unsigned int out, u8 status)
> + struct vhost_scsi_ctx *vc, u8 status)
> {
> - struct virtio_scsi_cmd_resp __user *resp;
> struct virtio_scsi_cmd_resp rsp;
> + struct iov_iter iov_iter;
> int ret;
>
> memset(&rsp, 0, sizeof(rsp));
> rsp.status = status;
> - resp = vq->iov[out].iov_base;
> - ret = __copy_to_user(resp, &rsp, sizeof(rsp));
> - if (!ret)
> - vhost_add_used_and_signal(&vs->dev, vq, head, 0);
> +
> + iov_iter_init(&iov_iter, ITER_DEST, &vq->iov[vc->out], vc->in,
> + sizeof(rsp));
> +
> + ret = copy_to_iter(&rsp, sizeof(rsp), &iov_iter);
> +
> + if (likely(ret == sizeof(rsp)))
> + vhost_add_used_and_signal(&vs->dev, vq, vc->head, 0);
> else
> pr_err("Faulted on virtio_scsi_cmd_resp\n");
> }
> @@ -1420,7 +1424,7 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
> else if (ret == -EIO)
> vhost_scsi_send_bad_target(vs, vq, &vc, TYPE_IO_CMD);
> else if (ret == -ENOMEM)
> - vhost_scsi_send_status(vs, vq, vc.head, vc.out,
> + vhost_scsi_send_status(vs, vq, &vc,
> SAM_STAT_TASK_SET_FULL);
> } while (likely(!vhost_exceeds_weight(vq, ++c, 0)));
> out:
> --
> 2.39.3
>
Acked-by: Jason Wang <jasowang@redhat.com>
Thanks
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v2 04/10] vhost: modify vhost_log_write() for broader users
2025-03-17 23:55 ` [PATCH v2 04/10] vhost: modify vhost_log_write() for broader users Dongli Zhang
@ 2025-03-18 1:12 ` Jason Wang
2025-03-19 16:38 ` Dongli Zhang
0 siblings, 1 reply; 28+ messages in thread
From: Jason Wang @ 2025-03-18 1:12 UTC (permalink / raw)
To: Dongli Zhang
Cc: virtualization, kvm, netdev, mst, michael.christie, pbonzini,
stefanha, eperezma, joao.m.martins, joe.jin, si-wei.liu,
linux-kernel
On Tue, Mar 18, 2025 at 7:51 AM Dongli Zhang <dongli.zhang@oracle.com> wrote:
>
> Currently, the only user of vhost_log_write() is vhost-net. The 'len'
> argument prevents logging of pages that are not tainted by the RX path.
>
> Adjustments are needed since more drivers (i.e. vhost-scsi) begin using
> vhost_log_write(). So far vhost-net RX path may only partially use pages
> shared by the last vring descriptor. Unlike vhost-net, vhost-scsi always
> logs all pages shared via vring descriptors. To accommodate this, a new
> argument 'partial' is introduced. This argument works alongside 'len' to
> indicate whether the driver should log all pages of a vring descriptor, or
> only pages that are tainted by the driver.
>
> In addition, removes BUG().
>
> Suggested-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
> drivers/vhost/net.c | 2 +-
> drivers/vhost/vhost.c | 28 +++++++++++++++++-----------
> drivers/vhost/vhost.h | 2 +-
> 3 files changed, 19 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index b9b9e9d40951..0e5d82bfde76 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -1219,7 +1219,7 @@ static void handle_rx(struct vhost_net *net)
> if (nvq->done_idx > VHOST_NET_BATCH)
> vhost_net_signal_used(nvq);
> if (unlikely(vq_log))
> - vhost_log_write(vq, vq_log, log, vhost_len,
> + vhost_log_write(vq, vq_log, log, vhost_len, true,
> vq->iov, in);
> total_len += vhost_len;
> } while (likely(!vhost_exceeds_weight(vq, ++recv_pkts, total_len)));
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 9ac25d08f473..db3b30aba940 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -2304,8 +2304,14 @@ static int log_used(struct vhost_virtqueue *vq, u64 used_offset, u64 len)
> return 0;
> }
>
> -int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
> - unsigned int log_num, u64 len, struct iovec *iov, int count)
> +/*
> + * 'len' is used only when 'partial' is true, to indicate whether the
> + * entire length of each descriptor is logged.
> + */
While at it, let's document all the parameters here.
> +int vhost_log_write(struct vhost_virtqueue *vq,
> + struct vhost_log *log, unsigned int log_num,
> + u64 len, bool partial,
> + struct iovec *iov, int count)
> {
> int i, r;
>
> @@ -2323,19 +2329,19 @@ int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
> }
>
> for (i = 0; i < log_num; ++i) {
> - u64 l = min(log[i].len, len);
> + u64 l = partial ? min(log[i].len, len) : log[i].len;
> +
> r = log_write(vq->log_base, log[i].addr, l);
> if (r < 0)
> return r;
> - len -= l;
> - if (!len) {
> - if (vq->log_ctx)
> - eventfd_signal(vq->log_ctx);
> - return 0;
> - }
> +
> + if (partial)
> + len -= l;
I wonder if it's simpler to just tweak the caller to call with the
correct len (or probably U64_MAX) in this case?
> }
> - /* Length written exceeds what we have stored. This is a bug. */
> - BUG();
> +
> + if (vq->log_ctx)
> + eventfd_signal(vq->log_ctx);
> +
> return 0;
> }
> EXPORT_SYMBOL_GPL(vhost_log_write);
> diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
> index bb75a292d50c..5de5941988fe 100644
> --- a/drivers/vhost/vhost.h
> +++ b/drivers/vhost/vhost.h
> @@ -224,7 +224,7 @@ bool vhost_vq_avail_empty(struct vhost_dev *, struct vhost_virtqueue *);
> bool vhost_enable_notify(struct vhost_dev *, struct vhost_virtqueue *);
>
> int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
> - unsigned int log_num, u64 len,
> + unsigned int log_num, u64 len, bool partial,
> struct iovec *iov, int count);
> int vq_meta_prefetch(struct vhost_virtqueue *vq);
>
> --
> 2.39.3
>
Thanks
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v2 04/10] vhost: modify vhost_log_write() for broader users
2025-03-18 1:12 ` Jason Wang
@ 2025-03-19 16:38 ` Dongli Zhang
2025-03-20 1:41 ` Jason Wang
0 siblings, 1 reply; 28+ messages in thread
From: Dongli Zhang @ 2025-03-19 16:38 UTC (permalink / raw)
To: Jason Wang
Cc: virtualization, kvm, netdev, mst, michael.christie, pbonzini,
stefanha, eperezma, joao.m.martins, joe.jin, si-wei.liu,
linux-kernel
Hi Jason,
On 3/17/25 6:12 PM, Jason Wang wrote:
> On Tue, Mar 18, 2025 at 7:51 AM Dongli Zhang <dongli.zhang@oracle.com> wrote:
>>
>> Currently, the only user of vhost_log_write() is vhost-net. The 'len'
>> argument prevents logging of pages that are not tainted by the RX path.
>>
>> Adjustments are needed since more drivers (i.e. vhost-scsi) begin using
>> vhost_log_write(). So far vhost-net RX path may only partially use pages
>> shared by the last vring descriptor. Unlike vhost-net, vhost-scsi always
>> logs all pages shared via vring descriptors. To accommodate this, a new
>> argument 'partial' is introduced. This argument works alongside 'len' to
>> indicate whether the driver should log all pages of a vring descriptor, or
>> only pages that are tainted by the driver.
>>
>> In addition, removes BUG().
>>
>> Suggested-by: Joao Martins <joao.m.martins@oracle.com>
>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
>> ---
>> drivers/vhost/net.c | 2 +-
>> drivers/vhost/vhost.c | 28 +++++++++++++++++-----------
>> drivers/vhost/vhost.h | 2 +-
>> 3 files changed, 19 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>> index b9b9e9d40951..0e5d82bfde76 100644
>> --- a/drivers/vhost/net.c
>> +++ b/drivers/vhost/net.c
>> @@ -1219,7 +1219,7 @@ static void handle_rx(struct vhost_net *net)
>> if (nvq->done_idx > VHOST_NET_BATCH)
>> vhost_net_signal_used(nvq);
>> if (unlikely(vq_log))
>> - vhost_log_write(vq, vq_log, log, vhost_len,
>> + vhost_log_write(vq, vq_log, log, vhost_len, true,
>> vq->iov, in);
>> total_len += vhost_len;
>> } while (likely(!vhost_exceeds_weight(vq, ++recv_pkts, total_len)));
>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>> index 9ac25d08f473..db3b30aba940 100644
>> --- a/drivers/vhost/vhost.c
>> +++ b/drivers/vhost/vhost.c
>> @@ -2304,8 +2304,14 @@ static int log_used(struct vhost_virtqueue *vq, u64 used_offset, u64 len)
>> return 0;
>> }
>>
>> -int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
>> - unsigned int log_num, u64 len, struct iovec *iov, int count)
>> +/*
>> + * 'len' is used only when 'partial' is true, to indicate whether the
>> + * entire length of each descriptor is logged.
>> + */
>
> While at it, let's document all the parameters here.
Sure.
>
>> +int vhost_log_write(struct vhost_virtqueue *vq,
>> + struct vhost_log *log, unsigned int log_num,
>> + u64 len, bool partial,
>> + struct iovec *iov, int count)
>> {
>> int i, r;
>>
>> @@ -2323,19 +2329,19 @@ int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
>> }
>>
>> for (i = 0; i < log_num; ++i) {
>> - u64 l = min(log[i].len, len);
>> + u64 l = partial ? min(log[i].len, len) : log[i].len;
>> +
>> r = log_write(vq->log_base, log[i].addr, l);
>> if (r < 0)
>> return r;
>> - len -= l;
>> - if (!len) {
>> - if (vq->log_ctx)
>> - eventfd_signal(vq->log_ctx);
>> - return 0;
>> - }
>> +
>> + if (partial)
>> + len -= l;
>
> I wonder if it's simpler to just tweak the caller to call with the
> correct len (or probably U64_MAX) in this case?
To "tweak the caller to call with the correct len" may need to sum the length
of all log[i].
Regarding U64_MAX, would you like something below? That is, only use 'len'
when it isn't U64_MAX.
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 9ac25d08f473..5b49de05e752 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -2327,15 +2327,14 @@ int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
r = log_write(vq->log_base, log[i].addr, l);
if (r < 0)
return r;
- len -= l;
- if (!len) {
- if (vq->log_ctx)
- eventfd_signal(vq->log_ctx);
- return 0;
- }
+
+ if (len != U64_MAX) ---> It is impossible to have len = U64_MAX from vhost-net
+ len -= l; How about keeping those two lines?
}
- /* Length written exceeds what we have stored. This is a bug. */
- BUG();
+
+ if (vq->log_ctx)
+ eventfd_signal(vq->log_ctx);
+
return 0;
}
EXPORT_SYMBOL_GPL(vhost_log_write);
Thank you very much!
Dongli Zhang
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH v2 03/10] vhost-scsi: Fix vhost_scsi_send_status()
2025-03-18 0:48 ` Jason Wang
@ 2025-03-19 17:54 ` Dongli Zhang
2025-03-20 1:42 ` Jason Wang
0 siblings, 1 reply; 28+ messages in thread
From: Dongli Zhang @ 2025-03-19 17:54 UTC (permalink / raw)
To: Jason Wang
Cc: virtualization, kvm, netdev, mst, michael.christie, pbonzini,
stefanha, eperezma, joao.m.martins, joe.jin, si-wei.liu,
linux-kernel
Hi Jason,
On 3/17/25 5:48 PM, Jason Wang wrote:
> On Tue, Mar 18, 2025 at 7:52 AM Dongli Zhang <dongli.zhang@oracle.com> wrote:
>>
>> Although the support of VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 was
>> signaled by the commit 664ed90e621c ("vhost/scsi: Set
>> VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits"),
>> vhost_scsi_send_bad_target() still assumes the response in a single
>> descriptor.
>>
>> Similar issue in vhost_scsi_send_bad_target() has been fixed in previous
>> commit.
>>
>> Fixes: 3ca51662f818 ("vhost-scsi: Add better resource allocation failure handling")
>
> And
>
> 6dd88fd59da84631b5fe5c8176931c38cfa3b265 ("vhost-scsi: unbreak any
> layout for response")
>
Would suggest add the commit to Fixes?
vhost_scsi_send_status() has been introduced by the most recent patch.
It isn't related to that commit. That commit is to fix
vhost_scsi_complete_cmd_work()
Or would you suggest mention it as part of "Similar issue has been fixed in
previous commit."?
Thank you very much!
Dongli Zhang
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v2 04/10] vhost: modify vhost_log_write() for broader users
2025-03-19 16:38 ` Dongli Zhang
@ 2025-03-20 1:41 ` Jason Wang
0 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2025-03-20 1:41 UTC (permalink / raw)
To: Dongli Zhang
Cc: virtualization, kvm, netdev, mst, michael.christie, pbonzini,
stefanha, eperezma, joao.m.martins, joe.jin, si-wei.liu,
linux-kernel
On Thu, Mar 20, 2025 at 12:38 AM Dongli Zhang <dongli.zhang@oracle.com> wrote:
>
> Hi Jason,
>
> On 3/17/25 6:12 PM, Jason Wang wrote:
> > On Tue, Mar 18, 2025 at 7:51 AM Dongli Zhang <dongli.zhang@oracle.com> wrote:
> >>
> >> Currently, the only user of vhost_log_write() is vhost-net. The 'len'
> >> argument prevents logging of pages that are not tainted by the RX path.
> >>
> >> Adjustments are needed since more drivers (i.e. vhost-scsi) begin using
> >> vhost_log_write(). So far vhost-net RX path may only partially use pages
> >> shared by the last vring descriptor. Unlike vhost-net, vhost-scsi always
> >> logs all pages shared via vring descriptors. To accommodate this, a new
> >> argument 'partial' is introduced. This argument works alongside 'len' to
> >> indicate whether the driver should log all pages of a vring descriptor, or
> >> only pages that are tainted by the driver.
> >>
> >> In addition, removes BUG().
> >>
> >> Suggested-by: Joao Martins <joao.m.martins@oracle.com>
> >> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> >> ---
> >> drivers/vhost/net.c | 2 +-
> >> drivers/vhost/vhost.c | 28 +++++++++++++++++-----------
> >> drivers/vhost/vhost.h | 2 +-
> >> 3 files changed, 19 insertions(+), 13 deletions(-)
> >>
> >> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> >> index b9b9e9d40951..0e5d82bfde76 100644
> >> --- a/drivers/vhost/net.c
> >> +++ b/drivers/vhost/net.c
> >> @@ -1219,7 +1219,7 @@ static void handle_rx(struct vhost_net *net)
> >> if (nvq->done_idx > VHOST_NET_BATCH)
> >> vhost_net_signal_used(nvq);
> >> if (unlikely(vq_log))
> >> - vhost_log_write(vq, vq_log, log, vhost_len,
> >> + vhost_log_write(vq, vq_log, log, vhost_len, true,
> >> vq->iov, in);
> >> total_len += vhost_len;
> >> } while (likely(!vhost_exceeds_weight(vq, ++recv_pkts, total_len)));
> >> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> >> index 9ac25d08f473..db3b30aba940 100644
> >> --- a/drivers/vhost/vhost.c
> >> +++ b/drivers/vhost/vhost.c
> >> @@ -2304,8 +2304,14 @@ static int log_used(struct vhost_virtqueue *vq, u64 used_offset, u64 len)
> >> return 0;
> >> }
> >>
> >> -int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
> >> - unsigned int log_num, u64 len, struct iovec *iov, int count)
> >> +/*
> >> + * 'len' is used only when 'partial' is true, to indicate whether the
> >> + * entire length of each descriptor is logged.
> >> + */
> >
> > While at it, let's document all the parameters here.
>
> Sure.
>
> >
> >> +int vhost_log_write(struct vhost_virtqueue *vq,
> >> + struct vhost_log *log, unsigned int log_num,
> >> + u64 len, bool partial,
> >> + struct iovec *iov, int count)
> >> {
> >> int i, r;
> >>
> >> @@ -2323,19 +2329,19 @@ int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
> >> }
> >>
> >> for (i = 0; i < log_num; ++i) {
> >> - u64 l = min(log[i].len, len);
> >> + u64 l = partial ? min(log[i].len, len) : log[i].len;
> >> +
> >> r = log_write(vq->log_base, log[i].addr, l);
> >> if (r < 0)
> >> return r;
> >> - len -= l;
> >> - if (!len) {
> >> - if (vq->log_ctx)
> >> - eventfd_signal(vq->log_ctx);
> >> - return 0;
> >> - }
> >> +
> >> + if (partial)
> >> + len -= l;
> >
> > I wonder if it's simpler to just tweak the caller to call with the
> > correct len (or probably U64_MAX) in this case?
>
> To "tweak the caller to call with the correct len" may need to sum the length
> of all log[i].
>
> Regarding U64_MAX, would you like something below? That is, only use 'len'
> when it isn't U64_MAX.
>
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 9ac25d08f473..5b49de05e752 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -2327,15 +2327,14 @@ int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
> r = log_write(vq->log_base, log[i].addr, l);
> if (r < 0)
> return r;
> - len -= l;
> - if (!len) {
> - if (vq->log_ctx)
> - eventfd_signal(vq->log_ctx);
> - return 0;
> - }
> +
> + if (len != U64_MAX) ---> It is impossible to have len = U64_MAX from vhost-net
> + len -= l; How about keeping those two lines?
> }
> - /* Length written exceeds what we have stored. This is a bug. */
> - BUG();
> +
> + if (vq->log_ctx)
> + eventfd_signal(vq->log_ctx);
> +
> return 0;
> }
> EXPORT_SYMBOL_GPL(vhost_log_write);
Something like this.
Thanks
>
>
> Thank you very much!
>
> Dongli Zhang
>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v2 03/10] vhost-scsi: Fix vhost_scsi_send_status()
2025-03-19 17:54 ` Dongli Zhang
@ 2025-03-20 1:42 ` Jason Wang
0 siblings, 0 replies; 28+ messages in thread
From: Jason Wang @ 2025-03-20 1:42 UTC (permalink / raw)
To: Dongli Zhang
Cc: virtualization, kvm, netdev, mst, michael.christie, pbonzini,
stefanha, eperezma, joao.m.martins, joe.jin, si-wei.liu,
linux-kernel
On Thu, Mar 20, 2025 at 1:54 AM Dongli Zhang <dongli.zhang@oracle.com> wrote:
>
> Hi Jason,
>
> On 3/17/25 5:48 PM, Jason Wang wrote:
> > On Tue, Mar 18, 2025 at 7:52 AM Dongli Zhang <dongli.zhang@oracle.com> wrote:
> >>
> >> Although the support of VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 was
> >> signaled by the commit 664ed90e621c ("vhost/scsi: Set
> >> VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits"),
> >> vhost_scsi_send_bad_target() still assumes the response in a single
> >> descriptor.
> >>
> >> Similar issue in vhost_scsi_send_bad_target() has been fixed in previous
> >> commit.
> >>
> >> Fixes: 3ca51662f818 ("vhost-scsi: Add better resource allocation failure handling")
> >
> > And
> >
> > 6dd88fd59da84631b5fe5c8176931c38cfa3b265 ("vhost-scsi: unbreak any
> > layout for response")
> >
>
> Would suggest add the commit to Fixes?
>
> vhost_scsi_send_status() has been introduced by the most recent patch.
>
> It isn't related to that commit. That commit is to fix
> vhost_scsi_complete_cmd_work()
>
> Or would you suggest mention it as part of "Similar issue has been fixed in
> previous commit."?
I'm fine with either, they are all fixes for any header layout anyhow.
Thanks
>
> Thank you very much!
>
> Dongli Zhang
>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v2 01/10] vhost-scsi: protect vq->log_used with vq->mutex
2025-03-17 23:55 ` [PATCH v2 01/10] vhost-scsi: protect vq->log_used with vq->mutex Dongli Zhang
2025-03-18 0:47 ` Jason Wang
@ 2025-03-26 22:22 ` Mike Christie
1 sibling, 0 replies; 28+ messages in thread
From: Mike Christie @ 2025-03-26 22:22 UTC (permalink / raw)
To: Dongli Zhang, virtualization, kvm, netdev
Cc: mst, jasowang, pbonzini, stefanha, eperezma, joao.m.martins,
joe.jin, si-wei.liu, linux-kernel
On 3/17/25 6:55 PM, Dongli Zhang wrote:
> The vhost-scsi completion path may access vq->log_base when vq->log_used is
> already set to false.
>
> vhost-thread QEMU-thread
>
> vhost_scsi_complete_cmd_work()
> -> vhost_add_used()
> -> vhost_add_used_n()
> if (unlikely(vq->log_used))
> QEMU disables vq->log_used
> via VHOST_SET_VRING_ADDR.
> mutex_lock(&vq->mutex);
> vq->log_used = false now!
> mutex_unlock(&vq->mutex);
>
> QEMU gfree(vq->log_base)
> log_used()
> -> log_write(vq->log_base)
>
> Assuming the VMM is QEMU. The vq->log_base is from QEMU userpace and can be
> reclaimed via gfree(). As a result, this causes invalid memory writes to
> QEMU userspace.
>
> The control queue path has the same issue.
>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
Reviewed-by: Mike Christie <michael.christie@oracle.com>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v2 02/10] vhost-scsi: Fix vhost_scsi_send_bad_target()
2025-03-17 23:55 ` [PATCH v2 02/10] vhost-scsi: Fix vhost_scsi_send_bad_target() Dongli Zhang
2025-03-18 0:48 ` Jason Wang
@ 2025-03-26 22:25 ` Mike Christie
1 sibling, 0 replies; 28+ messages in thread
From: Mike Christie @ 2025-03-26 22:25 UTC (permalink / raw)
To: Dongli Zhang, virtualization, kvm, netdev
Cc: mst, jasowang, pbonzini, stefanha, eperezma, joao.m.martins,
joe.jin, si-wei.liu, linux-kernel
On 3/17/25 6:55 PM, Dongli Zhang wrote:
> Although the support of VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 was
> signaled by the commit 664ed90e621c ("vhost/scsi: Set
> VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits"),
> vhost_scsi_send_bad_target() still assumes the response in a single
> descriptor.
>
> In addition, although vhost_scsi_send_bad_target() is used by both I/O
> queue and control queue, the response header is always
> virtio_scsi_cmd_resp. It is required to use virtio_scsi_ctrl_tmf_resp or
> virtio_scsi_ctrl_an_resp for control queue.
>
> Fixes: 664ed90e621c ("vhost/scsi: Set VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits")
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v2 03/10] vhost-scsi: Fix vhost_scsi_send_status()
2025-03-17 23:55 ` [PATCH v2 03/10] vhost-scsi: Fix vhost_scsi_send_status() Dongli Zhang
2025-03-18 0:48 ` Jason Wang
@ 2025-03-26 22:27 ` Mike Christie
1 sibling, 0 replies; 28+ messages in thread
From: Mike Christie @ 2025-03-26 22:27 UTC (permalink / raw)
To: Dongli Zhang, virtualization, kvm, netdev
Cc: mst, jasowang, pbonzini, stefanha, eperezma, joao.m.martins,
joe.jin, si-wei.liu, linux-kernel
On 3/17/25 6:55 PM, Dongli Zhang wrote:
> Although the support of VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 was
> signaled by the commit 664ed90e621c ("vhost/scsi: Set
> VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits"),
> vhost_scsi_send_bad_target() still assumes the response in a single
> descriptor.
>
> Similar issue in vhost_scsi_send_bad_target() has been fixed in previous
> commit.
>
> Fixes: 3ca51662f818 ("vhost-scsi: Add better resource allocation failure handling")
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v2 05/10] vhost-scsi: adjust vhost_scsi_get_desc() to log vring descriptors
2025-03-17 23:55 ` [PATCH v2 05/10] vhost-scsi: adjust vhost_scsi_get_desc() to log vring descriptors Dongli Zhang
@ 2025-03-26 23:21 ` Mike Christie
0 siblings, 0 replies; 28+ messages in thread
From: Mike Christie @ 2025-03-26 23:21 UTC (permalink / raw)
To: Dongli Zhang, virtualization, kvm, netdev
Cc: mst, jasowang, pbonzini, stefanha, eperezma, joao.m.martins,
joe.jin, si-wei.liu, linux-kernel
On 3/17/25 6:55 PM, Dongli Zhang wrote:
> Adjust vhost_scsi_get_desc() to facilitate logging of vring descriptors.
>
> Add new arguments to allow passing the log buffer and length to
> vhost_get_vq_desc().
>
> In addition, reset 'log_num' since vhost_get_vq_desc() may reset it only
> after certain condition checks.
>
> Suggested-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v2 06/10] vhost-scsi: cache log buffer in I/O queue vhost_scsi_cmd
2025-03-18 0:04 ` Dongli Zhang
@ 2025-03-26 23:37 ` Mike Christie
2025-03-27 20:27 ` Dongli Zhang
0 siblings, 1 reply; 28+ messages in thread
From: Mike Christie @ 2025-03-26 23:37 UTC (permalink / raw)
To: Dongli Zhang, virtualization, kvm, netdev
Cc: mst, jasowang, pbonzini, stefanha, eperezma, joao.m.martins,
joe.jin, si-wei.liu, linux-kernel
On 3/17/25 7:04 PM, Dongli Zhang wrote:
> @@ -1390,6 +1424,24 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
> goto err;
> }
>
> + if (unlikely(vq_log && log_num)) {
> + if (!cmd->tvc_log)
> + cmd->tvc_log = kmalloc_array(vq->dev->iov_limit,
> + sizeof(*cmd->tvc_log),
> + GFP_KERNEL);
> +
> + if (likely(cmd->tvc_log)) {
> + memcpy(cmd->tvc_log, vq->log,> + sizeof(*cmd->tvc_log) * log_num);
> + cmd->tvc_log_num = log_num;
Hey Dongli, this approach seems ok.
Could you just move this to a function?
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v2 09/10] vhost-scsi: log event queue write descriptors
2025-03-17 23:55 ` [PATCH v2 09/10] vhost-scsi: log event " Dongli Zhang
@ 2025-03-26 23:50 ` Mike Christie
2025-03-26 23:51 ` Mike Christie
1 sibling, 0 replies; 28+ messages in thread
From: Mike Christie @ 2025-03-26 23:50 UTC (permalink / raw)
To: Dongli Zhang, virtualization, kvm, netdev
Cc: mst, jasowang, pbonzini, stefanha, eperezma, joao.m.martins,
joe.jin, si-wei.liu, linux-kernel
On 3/17/25 6:55 PM, Dongli Zhang wrote:
> Log write descriptors for the event queue, leveraging vhost_get_vq_desc()
> to retrieve the array of write descriptors to obtain the log buffer.
>
> There is only one path for event queue.
>
> Suggested-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v2 09/10] vhost-scsi: log event queue write descriptors
2025-03-17 23:55 ` [PATCH v2 09/10] vhost-scsi: log event " Dongli Zhang
2025-03-26 23:50 ` Mike Christie
@ 2025-03-26 23:51 ` Mike Christie
1 sibling, 0 replies; 28+ messages in thread
From: Mike Christie @ 2025-03-26 23:51 UTC (permalink / raw)
To: Dongli Zhang, virtualization, kvm, netdev
Cc: mst, jasowang, pbonzini, stefanha, eperezma, joao.m.martins,
joe.jin, si-wei.liu, linux-kernel
On 3/17/25 6:55 PM, Dongli Zhang wrote:
> Log write descriptors for the event queue, leveraging vhost_get_vq_desc()
> to retrieve the array of write descriptors to obtain the log buffer.
>
> There is only one path for event queue.
>
> Suggested-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v2 06/10] vhost-scsi: cache log buffer in I/O queue vhost_scsi_cmd
2025-03-26 23:37 ` Mike Christie
@ 2025-03-27 20:27 ` Dongli Zhang
0 siblings, 0 replies; 28+ messages in thread
From: Dongli Zhang @ 2025-03-27 20:27 UTC (permalink / raw)
To: Mike Christie, virtualization, kvm, netdev
Cc: mst, jasowang, pbonzini, stefanha, eperezma, joao.m.martins,
joe.jin, si-wei.liu, linux-kernel
Hi Mike,
On 3/26/25 4:37 PM, Mike Christie wrote:
> On 3/17/25 7:04 PM, Dongli Zhang wrote:
>> @@ -1390,6 +1424,24 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
>> goto err;
>> }
>>
>> + if (unlikely(vq_log && log_num)) {
>> + if (!cmd->tvc_log)
>> + cmd->tvc_log = kmalloc_array(vq->dev->iov_limit,
>> + sizeof(*cmd->tvc_log),
>> + GFP_KERNEL);
>> +
>> + if (likely(cmd->tvc_log)) {
>> + memcpy(cmd->tvc_log, vq->log,> + sizeof(*cmd->tvc_log) * log_num);
>> + cmd->tvc_log_num = log_num;
>
>
> Hey Dongli, this approach seems ok.
>
> Could you just move this to a function?
Sure. I may re-send v3 following this approach (allocate on-demand in runtime),
and encapsulate above code into a function.
Thank you very much!
Dongli Zhang
^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2025-03-27 20:27 UTC | newest]
Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-17 23:55 [PATCH v2 00/10] vhost-scsi: log write descriptors for live migration (and three bugfix) Dongli Zhang
2025-03-17 23:55 ` [PATCH v2 01/10] vhost-scsi: protect vq->log_used with vq->mutex Dongli Zhang
2025-03-18 0:47 ` Jason Wang
2025-03-26 22:22 ` Mike Christie
2025-03-17 23:55 ` [PATCH v2 02/10] vhost-scsi: Fix vhost_scsi_send_bad_target() Dongli Zhang
2025-03-18 0:48 ` Jason Wang
2025-03-26 22:25 ` Mike Christie
2025-03-17 23:55 ` [PATCH v2 03/10] vhost-scsi: Fix vhost_scsi_send_status() Dongli Zhang
2025-03-18 0:48 ` Jason Wang
2025-03-19 17:54 ` Dongli Zhang
2025-03-20 1:42 ` Jason Wang
2025-03-26 22:27 ` Mike Christie
2025-03-17 23:55 ` [PATCH v2 04/10] vhost: modify vhost_log_write() for broader users Dongli Zhang
2025-03-18 1:12 ` Jason Wang
2025-03-19 16:38 ` Dongli Zhang
2025-03-20 1:41 ` Jason Wang
2025-03-17 23:55 ` [PATCH v2 05/10] vhost-scsi: adjust vhost_scsi_get_desc() to log vring descriptors Dongli Zhang
2025-03-26 23:21 ` Mike Christie
2025-03-17 23:55 ` [PATCH v2 06/10] vhost-scsi: cache log buffer in I/O queue vhost_scsi_cmd Dongli Zhang
2025-03-18 0:04 ` Dongli Zhang
2025-03-26 23:37 ` Mike Christie
2025-03-27 20:27 ` Dongli Zhang
2025-03-17 23:55 ` [PATCH v2 07/10] vhost-scsi: log I/O queue write descriptors Dongli Zhang
2025-03-17 23:55 ` [PATCH v2 08/10] vhost-scsi: log control " Dongli Zhang
2025-03-17 23:55 ` [PATCH v2 09/10] vhost-scsi: log event " Dongli Zhang
2025-03-26 23:50 ` Mike Christie
2025-03-26 23:51 ` Mike Christie
2025-03-17 23:55 ` [PATCH v2 10/10] vhost: add WARNING if log_num is more than limit Dongli Zhang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).