* [PATCH 0/4] Add Split indirect descriptors for SVQ
@ 2025-12-04  7:37 Wafer Xie
  2025-12-04  7:37 ` [PATCH 1/4] vhost: add data structure of virtio indirect descriptors in SVQ Wafer Xie
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Wafer Xie @ 2025-12-04  7:37 UTC (permalink / raw)
  To: mst, eperezma, jasowang, qemu-devel; +Cc: leiyang, sgarzare, angus.chen, wafer

This patch series adds support for VIRTIO split indirect descriptors.
The feature is VIRTIO_RING_F_INDIRECT_DESC.

Eugenio has submitted a patch, "vhost: accept indirect descriptors in shadow virtqueue":
https://lists.nongnu.org/archive/html/qemu-devel/2025-12/msg00056.html
That patch must be applied before this series.
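
For reference, a minimal sketch of the split indirect layout this series
emits (field layout per the virtio spec's struct vring_desc; endianness
handling omitted, names illustrative and not code from the series):

    #include <stdint.h>

    /* Field layout per the virtio spec's struct vring_desc. */
    struct vring_desc_sketch {
        uint64_t addr;
        uint32_t len;
        uint16_t flags;   /* VRING_DESC_F_INDIRECT == 4 */
        uint16_t next;
    };

    /*
     * Point ring slot 'head' at a separately allocated table of 'num'
     * descriptors at device-visible address 'table_iova'.  The chain
     * (NEXT flags and next indices) lives inside the table, so the
     * whole request consumes a single slot in the main ring.
     */
    static void ring_set_indirect(struct vring_desc_sketch *ring,
                                  uint16_t head, uint64_t table_iova,
                                  uint16_t num)
    {
        ring[head].addr  = table_iova;
        ring[head].len   = num * sizeof(struct vring_desc_sketch);
        ring[head].flags = 4; /* VRING_DESC_F_INDIRECT */
    }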


Wafer Xie (4):
  vhost: add data structure of virtio indirect descriptors in SVQ
  vdpa: implement the interfaces alloc/free for indirect desc
  vhost: support the virtio indirect desc of available ring
  vhost: support the virtio indirect desc of used ring

 hw/virtio/vhost-shadow-virtqueue.c | 224 +++++++++++++++++++++++++++--
 hw/virtio/vhost-shadow-virtqueue.h |  57 +++++++-
 hw/virtio/vhost-vdpa.c             | 130 ++++++++++++++++-
 3 files changed, 395 insertions(+), 16 deletions(-)

-- 
2.34.1




* [PATCH 1/4] vhost: add data structure of virtio indirect descriptors in SVQ
  2025-12-04  7:37 [PATCH 0/4] Add Split indirect descriptors for SVQ Wafer Xie
@ 2025-12-04  7:37 ` Wafer Xie
  2025-12-04  7:37 ` [PATCH 2/4] vdpa: implement the interfaces alloc/free for indirect desc Wafer Xie
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Wafer Xie @ 2025-12-04  7:37 UTC (permalink / raw)
  To: mst, eperezma, jasowang, qemu-devel; +Cc: leiyang, sgarzare, angus.chen, wafer

Store the indirect descriptor table pointer, IOVA and size in the
SVQDescState structure.

Signed-off-by: Wafer Xie <wafer@jaguarmicro.com>
---
 hw/virtio/vhost-shadow-virtqueue.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index 9c273739d6..2b36df4dd5 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -23,6 +23,14 @@ typedef struct SVQDescState {
      * guest's
      */
     unsigned int ndescs;
+
+    /*
+     * Indirect descriptor table for this descriptor chain.
+     * NULL if not using indirect descriptors.
+     */
+    vring_desc_t *indirect_desc;
+    hwaddr indirect_iova;
+    size_t indirect_size;
 } SVQDescState;
 
 typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
-- 
2.34.1




* [PATCH 2/4] vdpa: implement the interfaces alloc/free for indirect desc
  2025-12-04  7:37 [PATCH 0/4] Add Split indirect descriptors for SVQ Wafer Xie
  2025-12-04  7:37 ` [PATCH 1/4] vhost: add data structure of virtio indirect descriptors in SVQ Wafer Xie
@ 2025-12-04  7:37 ` Wafer Xie
  2025-12-04  8:41   ` Eugenio Perez Martin
  2025-12-04  7:37 ` [PATCH 3/4] vhost: support the virtio indirect desc of available ring Wafer Xie
  2025-12-04  7:56 ` [PATCH 0/4] Add Split indirect descriptors for SVQ Jason Wang
  3 siblings, 1 reply; 9+ messages in thread
From: Wafer Xie @ 2025-12-04  7:37 UTC (permalink / raw)
  To: mst, eperezma, jasowang, qemu-devel; +Cc: leiyang, sgarzare, angus.chen, wafer

Dynamically allocate and free the indirect descriptor table.

Signed-off-by: Wafer Xie <wafer@jaguarmicro.com>
---
 hw/virtio/vhost-shadow-virtqueue.c |   5 +-
 hw/virtio/vhost-shadow-virtqueue.h |  49 ++++++++++-
 hw/virtio/vhost-vdpa.c             | 130 ++++++++++++++++++++++++++++-
 3 files changed, 180 insertions(+), 4 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 2481d49345..ecc3245138 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -756,15 +756,18 @@ void vhost_svq_stop(VhostShadowVirtqueue *svq)
  *
  * @ops: SVQ owner callbacks
  * @ops_opaque: ops opaque pointer
+ * @indirect_ops: Indirect descriptor operations
  */
 VhostShadowVirtqueue *vhost_svq_new(const VhostShadowVirtqueueOps *ops,
-                                    void *ops_opaque)
+                                    void *ops_opaque,
+                                    SVQIndirectOps *indirect_ops)
 {
     VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
 
     event_notifier_init_fd(&svq->svq_kick, VHOST_FILE_UNBIND);
     svq->ops = ops;
     svq->ops_opaque = ops_opaque;
+    svq->indirect_ops = indirect_ops;
     return svq;
 }
 
diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index 2b36df4dd5..c5d654eae8 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -54,6 +54,49 @@ typedef struct VhostShadowVirtqueueOps {
     VirtQueueAvailCallback avail_handler;
 } VhostShadowVirtqueueOps;
 
+/**
+ * Callback to allocate indirect descriptor table.
+ *
+ * @svq: Shadow virtqueue
+ * @num: Number of descriptors
+ * @desc: Output pointer to allocated descriptor table
+ * @iova: Output IOVA of the allocated table
+ * @size: Output size of the allocated region
+ * @opaque: Opaque data (vhost_vdpa pointer)
+ *
+ * Returns true on success, false on failure.
+ */
+typedef bool (*SVQIndirectDescAlloc)(VhostShadowVirtqueue *svq,
+                                     size_t num,
+                                     vring_desc_t **desc,
+                                     hwaddr *iova,
+                                     size_t *size,
+                                     void *opaque);
+
+/**
+ * Callback to free indirect descriptor table.
+ *
+ * @svq: Shadow virtqueue
+ * @desc: Descriptor table to free
+ * @iova: IOVA of the table
+ * @size: Size of the allocated region
+ * @opaque: Opaque data (vhost_vdpa pointer)
+ */
+typedef void (*SVQIndirectDescFree)(VhostShadowVirtqueue *svq,
+                                    vring_desc_t *desc,
+                                    hwaddr iova,
+                                    size_t size,
+                                    void *opaque);
+
+/**
+ * Indirect descriptor operations and context
+ */
+typedef struct SVQIndirectOps {
+    SVQIndirectDescAlloc alloc;
+    SVQIndirectDescFree free;
+    void *opaque;  /* Pointer to struct vhost_vdpa */
+} SVQIndirectOps;
+
 /* Shadow virtqueue to relay notifications */
 typedef struct VhostShadowVirtqueue {
     /* Shadow vring */
@@ -104,6 +147,9 @@ typedef struct VhostShadowVirtqueue {
     /* Caller callbacks opaque */
     void *ops_opaque;
 
+    /* Indirect descriptor operations */
+    SVQIndirectOps *indirect_ops;
+
     /* Next head to expose to the device */
     uint16_t shadow_avail_idx;
 
@@ -143,7 +189,8 @@ void vhost_svq_start(VhostShadowVirtqueue *svq, VirtIODevice *vdev,
 void vhost_svq_stop(VhostShadowVirtqueue *svq);
 
 VhostShadowVirtqueue *vhost_svq_new(const VhostShadowVirtqueueOps *ops,
-                                    void *ops_opaque);
+                                    void *ops_opaque,
+                                    SVQIndirectOps *indirect_ops);
 
 void vhost_svq_free(gpointer vq);
 G_DEFINE_AUTOPTR_CLEANUP_FUNC(VhostShadowVirtqueue, vhost_svq_free);
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 7061b6e1a3..1719993f52 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -584,15 +584,130 @@ static int vhost_vdpa_get_dev_features(struct vhost_dev *dev,
     return ret;
 }
 
+/**
+ * Allocate indirect descriptor table for SVQ
+ *
+ * @svq: Shadow virtqueue
+ * @num: Number of descriptors needed
+ * @desc: Output pointer to the allocated table
+ * @iova: Output IOVA for the table
+ * @size: Output size of the allocated region
+ * @opaque: Opaque pointer (vhost_vdpa instance)
+ *
+ * Returns true on success, false on failure.
+ */
+static bool vhost_vdpa_svq_indirect_desc_alloc(VhostShadowVirtqueue *svq,
+                                                size_t num,
+                                                vring_desc_t **desc,
+                                                hwaddr *iova,
+                                                size_t *size,
+                                                void *opaque)
+{
+    struct vhost_vdpa *v = opaque;
+    size_t desc_size = sizeof(vring_desc_t) * num;
+    size_t alloc_size = ROUND_UP(desc_size, qemu_real_host_page_size());
+    DMAMap needle = {
+        .size = alloc_size - 1,
+        .perm = IOMMU_RO,
+    };
+    vring_desc_t *indirect_desc;
+    int r;
+
+    indirect_desc = mmap(NULL, alloc_size, PROT_READ | PROT_WRITE,
+                         MAP_SHARED | MAP_ANONYMOUS, -1, 0);
+    if (indirect_desc == MAP_FAILED) {
+        error_report("Cannot allocate indirect descriptor table");
+        return false;
+    }
+
+    r = vhost_iova_tree_map_alloc(v->shared->iova_tree, &needle,
+                                  (hwaddr)(uintptr_t)indirect_desc);
+    if (unlikely(r != IOVA_OK)) {
+        error_report("Cannot allocate iova for indirect descriptors (%d)", r);
+        goto err_iova_alloc;
+    }
+
+    r = vhost_vdpa_dma_map(v->shared, v->address_space_id, needle.iova,
+                           alloc_size, indirect_desc, true);
+    if (unlikely(r != 0)) {
+        error_report("Cannot map indirect descriptors to device: %s (%d)",
+                     g_strerror(-r), -r);
+        goto err_dma_map;
+    }
+
+    *desc = indirect_desc;
+    *iova = needle.iova;
+    *size = alloc_size;
+    return true;
+
+err_dma_map:
+    vhost_iova_tree_remove(v->shared->iova_tree, needle);
+err_iova_alloc:
+    munmap(indirect_desc, alloc_size);
+    return false;
+}
+
+/**
+ * Free indirect descriptor table for SVQ
+ *
+ * @svq: Shadow virtqueue
+ * @desc: Descriptor table to free
+ * @iova: IOVA of the table
+ * @size: Size of the allocated region
+ * @opaque: Opaque pointer (vhost_vdpa instance)
+ */
+static void vhost_vdpa_svq_indirect_desc_free(VhostShadowVirtqueue *svq,
+                                               vring_desc_t *desc,
+                                               hwaddr iova,
+                                               size_t size,
+                                               void *opaque)
+{
+    struct vhost_vdpa *v = opaque;
+    const DMAMap needle = {
+        .translated_addr = (hwaddr)(uintptr_t)desc,
+    };
+    const DMAMap *result;
+    int r;
+
+    /* Find the mapping in the IOVA tree by HVA */
+    result = vhost_iova_tree_find_iova(v->shared->iova_tree, &needle);
+    if (unlikely(!result)) {
+        error_report("Cannot find indirect descriptor mapping to unmap: "
+                     "iova=0x%" HWADDR_PRIx " hva=0x%" HWADDR_PRIx " size=%zu",
+                     iova, needle.translated_addr, size);
+        return;
+    }
+
+    r = vhost_vdpa_dma_unmap(v->shared, v->address_space_id, result->iova,
+                             size);
+    if (unlikely(r != 0)) {
+        error_report("Cannot unmap indirect descriptors: "
+                     "iova=0x%" HWADDR_PRIx " size=%zu: %s (%d)",
+                     result->iova, size, g_strerror(-r), -r);
+    }
+
+    vhost_iova_tree_remove(v->shared->iova_tree, *result);
+
+    munmap(desc, size);
+}
+
 static void vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v)
 {
     g_autoptr(GPtrArray) shadow_vqs = NULL;
+    SVQIndirectOps *indirect_ops;
+
+    /* Create indirect descriptor ops */
+    indirect_ops = g_new0(SVQIndirectOps, 1);
+    indirect_ops->alloc = vhost_vdpa_svq_indirect_desc_alloc;
+    indirect_ops->free = vhost_vdpa_svq_indirect_desc_free;
+    indirect_ops->opaque = v;
 
     shadow_vqs = g_ptr_array_new_full(hdev->nvqs, vhost_svq_free);
     for (unsigned n = 0; n < hdev->nvqs; ++n) {
         VhostShadowVirtqueue *svq;
 
-        svq = vhost_svq_new(v->shadow_vq_ops, v->shadow_vq_ops_opaque);
+        svq = vhost_svq_new(v->shadow_vq_ops, v->shadow_vq_ops_opaque,
+                            indirect_ops);
         g_ptr_array_add(shadow_vqs, svq);
     }
 
@@ -789,10 +904,21 @@ static void vhost_vdpa_svq_cleanup(struct vhost_dev *dev)
 {
     struct vhost_vdpa *v = dev->opaque;
     size_t idx;
+    SVQIndirectOps *indirect_ops = NULL;
 
     for (idx = 0; idx < v->shadow_vqs->len; ++idx) {
-        vhost_svq_stop(g_ptr_array_index(v->shadow_vqs, idx));
+        VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, idx);
+
+        /* Save indirect_ops pointer from first SVQ (all share the same one) */
+        if (idx == 0 && svq->indirect_ops) {
+            indirect_ops = svq->indirect_ops;
+        }
+
+        vhost_svq_stop(svq);
     }
+
+    g_free(indirect_ops);
+
     g_ptr_array_free(v->shadow_vqs, true);
 }
 
-- 
2.34.1




* [PATCH 3/4] vhost: support the virtio indirect desc of available ring
  2025-12-04  7:37 [PATCH 0/4] Add Split indirect descriptors for SVQ Wafer Xie
  2025-12-04  7:37 ` [PATCH 1/4] vhost: add data structure of virtio indirect descriptors in SVQ Wafer Xie
  2025-12-04  7:37 ` [PATCH 2/4] vdpa: implement the interfaces alloc/free for indirect desc Wafer Xie
@ 2025-12-04  7:37 ` Wafer Xie
  2025-12-04  7:56 ` [PATCH 0/4] Add Split indirect descriptors for SVQ Jason Wang
  3 siblings, 0 replies; 9+ messages in thread
From: Wafer Xie @ 2025-12-04  7:37 UTC (permalink / raw)
  To: mst, eperezma, jasowang, qemu-devel; +Cc: leiyang, sgarzare, angus.chen, wafer

Implement the insertion of available buffers in the
indirect descriptor area of split shadow virtqueues.

Signed-off-by: Wafer Xie <wafer@jaguarmicro.com>
---
 hw/virtio/vhost-shadow-virtqueue.c | 183 +++++++++++++++++++++++++++--
 1 file changed, 175 insertions(+), 8 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index ecc3245138..94ad5c3a57 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -189,6 +189,126 @@ static bool vhost_svq_vring_write_descs(VhostShadowVirtqueue *svq, hwaddr *sg,
     return true;
 }
 
+/**
+ * Write descriptors to indirect descriptor table
+ *
+ * @svq: The shadow virtqueue
+ * @sg: Cache for hwaddr
+ * @iovec: The iovec from the guest
+ * @num: iovec length
+ * @addr: Descriptors' GPAs, if backed by guest memory
+ * @indirect_desc: The indirect descriptor table
+ * @start_idx: Starting index in the indirect descriptor table
+ * @total_descs: Total number of descriptors in the table
+ * @write: True if they are writeable descriptors
+ *
+ * Return true if success, false otherwise and print error.
+ */
+static bool vhost_svq_vring_write_indirect_descs(VhostShadowVirtqueue *svq,
+                                                  hwaddr *sg,
+                                                  const struct iovec *iovec,
+                                                  size_t num,
+                                                  const hwaddr *addr,
+                                                  vring_desc_t *indirect_desc,
+                                                  size_t start_idx,
+                                                  size_t total_descs,
+                                                  bool write)
+{
+    bool ok;
+    uint16_t flags = write ? cpu_to_le16(VRING_DESC_F_WRITE) : 0;
+
+    if (num == 0) {
+        return true;
+    }
+
+    ok = vhost_svq_translate_addr(svq, sg, iovec, num, addr);
+    if (unlikely(!ok)) {
+        return false;
+    }
+
+    for (size_t n = 0; n < num; n++) {
+        size_t idx = start_idx + n;
+        indirect_desc[idx].addr = cpu_to_le64(sg[n]);
+        indirect_desc[idx].len = cpu_to_le32(iovec[n].iov_len);
+        indirect_desc[idx].flags = flags;
+        if (idx + 1 < total_descs) {
+            indirect_desc[idx].flags |= cpu_to_le16(VRING_DESC_F_NEXT);
+            indirect_desc[idx].next = cpu_to_le16(idx + 1);
+        }
+    }
+
+    return true;
+}
+
+/**
+ * Add descriptors to SVQ vring using indirect descriptors
+ *
+ * @svq: The shadow virtqueue
+ * @out_sg: The out iovec from the guest
+ * @out_num: The out iovec length
+ * @out_addr: The out descriptors' GPAs
+ * @in_sg: The in iovec from the guest
+ * @in_num: The in iovec length
+ * @in_addr: The in descriptors' GPAs
+ * @sgs: Cache for hwaddr
+ * @total_descs: Total number of descriptors (out_num + in_num)
+ * @indirect_desc: Pre-allocated indirect descriptor table
+ * @indirect_iova: IOVA of the indirect descriptor table
+ * @indirect_size: Size of the indirect descriptor table
+ *
+ * Return true if success, false otherwise and print error.
+ */
+static bool vhost_svq_add_split_indirect(VhostShadowVirtqueue *svq,
+                                         const struct iovec *out_sg,
+                                         size_t out_num,
+                                         const hwaddr *out_addr,
+                                         const struct iovec *in_sg,
+                                         size_t in_num,
+                                         const hwaddr *in_addr,
+                                         hwaddr *sgs, size_t total_descs,
+                                         vring_desc_t *indirect_desc,
+                                         hwaddr indirect_iova,
+                                         size_t indirect_size)
+{
+    bool ok;
+
+    /* Populate indirect descriptor table for out descriptors */
+    ok = vhost_svq_vring_write_indirect_descs(svq, sgs, out_sg, out_num,
+                                               out_addr, indirect_desc,
+                                               0, total_descs, false);
+    if (unlikely(!ok)) {
+        svq->indirect_ops->free(svq, indirect_desc, indirect_iova,
+                                indirect_size, svq->indirect_ops->opaque);
+        return false;
+    }
+
+    /* Populate indirect descriptor table for in descriptors */
+    ok = vhost_svq_vring_write_indirect_descs(svq, sgs, in_sg, in_num,
+                                               in_addr, indirect_desc,
+                                               out_num, total_descs, true);
+    if (unlikely(!ok)) {
+        svq->indirect_ops->free(svq, indirect_desc, indirect_iova,
+                                indirect_size, svq->indirect_ops->opaque);
+        return false;
+    }
+
+    /* Add a single descriptor pointing to the indirect table */
+    svq->vring.desc[svq->free_head].addr = cpu_to_le64(indirect_iova);
+    svq->vring.desc[svq->free_head].len =
+            cpu_to_le32(total_descs * sizeof(vring_desc_t));
+    svq->vring.desc[svq->free_head].flags = cpu_to_le16(VRING_DESC_F_INDIRECT);
+
+    /* Store indirect descriptor info in desc_state */
+    svq->desc_state[svq->free_head].indirect_desc = indirect_desc;
+    svq->desc_state[svq->free_head].indirect_iova = indirect_iova;
+    svq->desc_state[svq->free_head].indirect_size = indirect_size;
+
+    /* Move free_head forward */
+    svq->free_head = svq->desc_next[svq->free_head];
+
+    return true;
+}
+
 static bool vhost_svq_add_split(VhostShadowVirtqueue *svq,
                                 const struct iovec *out_sg, size_t out_num,
                                 const hwaddr *out_addr,
@@ -199,6 +319,8 @@ static bool vhost_svq_add_split(VhostShadowVirtqueue *svq,
     vring_avail_t *avail = svq->vring.avail;
     bool ok;
     g_autofree hwaddr *sgs = g_new(hwaddr, MAX(out_num, in_num));
+    size_t total_descs = out_num + in_num;
+    bool use_indirect;
 
     *head = svq->free_head;
 
@@ -209,16 +331,61 @@ static bool vhost_svq_add_split(VhostShadowVirtqueue *svq,
         return false;
     }
 
-    ok = vhost_svq_vring_write_descs(svq, sgs, out_sg, out_num, out_addr,
-                                     in_num > 0, false);
-    if (unlikely(!ok)) {
-        return false;
+    /*
+     * Use indirect descriptors if:
+     * 1. Feature is negotiated
+     * 2. Total descriptors > 1
+     * 3. Callback is available
+     */
+    use_indirect = virtio_vdev_has_feature(svq->vdev,
+                                       VIRTIO_RING_F_INDIRECT_DESC) &&
+                   total_descs > 1 &&
+                   svq->indirect_ops &&
+                   svq->indirect_ops->alloc;
+
+    if (use_indirect) {
+        vring_desc_t *indirect_desc = NULL;
+        hwaddr indirect_iova = 0;
+        size_t indirect_size = 0;
+
+        /* Try to allocate indirect descriptor table */
+        ok = svq->indirect_ops->alloc(svq, total_descs,
+                                      &indirect_desc, &indirect_iova,
+                                      &indirect_size,
+                                      svq->indirect_ops->opaque);
+        if (ok) {
+            /* Allocation succeeded, use indirect mode */
+            ok = vhost_svq_add_split_indirect(svq, out_sg, out_num, out_addr,
+                                              in_sg, in_num, in_addr,
+                                              sgs, total_descs,
+                                              indirect_desc, indirect_iova,
+                                              indirect_size);
+            if (unlikely(!ok)) {
+                /* Failed to populate indirect descriptors, free and fallback */
+                svq->indirect_ops->free(svq, indirect_desc, indirect_iova,
+                                        indirect_size,
+                                        svq->indirect_ops->opaque);
+                return false;
+            }
+        } else {
+            /* Allocation failed, fallback to direct mode */
+            use_indirect = false;
+        }
     }
 
-    ok = vhost_svq_vring_write_descs(svq, sgs, in_sg, in_num, in_addr, false,
-                                     true);
-    if (unlikely(!ok)) {
-        return false;
+    if (!use_indirect) {
+        /* Use direct descriptors */
+        ok = vhost_svq_vring_write_descs(svq, sgs, out_sg, out_num, out_addr,
+                                         in_num > 0, false);
+        if (unlikely(!ok)) {
+            return false;
+        }
+
+        ok = vhost_svq_vring_write_descs(svq, sgs, in_sg, in_num, in_addr,
+                                         false, true);
+        if (unlikely(!ok)) {
+            return false;
+        }
     }
 
     /*
-- 
2.34.1




* Re: [PATCH 0/4] Add Split indirect descriptors for SVQ
  2025-12-04  7:37 [PATCH 0/4] Add Split indirect descriptors for SVQ Wafer Xie
                   ` (2 preceding siblings ...)
  2025-12-04  7:37 ` [PATCH 3/4] vhost: support the virtio indirect desc of available ring Wafer Xie
@ 2025-12-04  7:56 ` Jason Wang
  2025-12-05  1:30   ` Wafer
  3 siblings, 1 reply; 9+ messages in thread
From: Jason Wang @ 2025-12-04  7:56 UTC (permalink / raw)
  To: Wafer Xie; +Cc: mst, eperezma, qemu-devel, leiyang, sgarzare, angus.chen

On Thu, Dec 4, 2025 at 3:38 PM Wafer Xie <wafer@jaguarmicro.com> wrote:
>
> This patch series adds support for VIRTIO split indirect descriptors.
> The feature is VIRTIO_RING_F_INDIRECT_DESC.
>
> Eugenio has submitted a patch, "vhost: accept indirect descriptors in shadow virtqueue":
> https://lists.nongnu.org/archive/html/qemu-devel/2025-12/msg00056.html
> That patch must be applied before this series.

I may have missed something, but SVQ can read indirect descriptors even
though it doesn't use them itself; that's sufficient for it to claim
support for indirect descriptors. That's what Eugenio said in that
series.

This series could be an optimization on top?

Thanks

>
>
> Wafer Xie (4):
>   vhost: add data structure of virtio indirect descriptors in SVQ
>   vdpa: implement the interfaces alloc/free for indirect desc
>   vhost: support the virtio indirect desc of available ring
>   vhost: support the virtio indirect desc of used ring
>
>  hw/virtio/vhost-shadow-virtqueue.c | 224 +++++++++++++++++++++++++++--
>  hw/virtio/vhost-shadow-virtqueue.h |  57 +++++++-
>  hw/virtio/vhost-vdpa.c             | 130 ++++++++++++++++-
>  3 files changed, 395 insertions(+), 16 deletions(-)
>
> --
> 2.34.1
>




* Re: [PATCH 2/4] vdpa: implement the interfaces alloc/free for indirect desc
  2025-12-04  7:37 ` [PATCH 2/4] vdpa: implement the interfaces alloc/free for indirect desc Wafer Xie
@ 2025-12-04  8:41   ` Eugenio Perez Martin
  2025-12-08 11:27     ` Wafer
  0 siblings, 1 reply; 9+ messages in thread
From: Eugenio Perez Martin @ 2025-12-04  8:41 UTC (permalink / raw)
  To: Wafer Xie; +Cc: mst, jasowang, qemu-devel, leiyang, sgarzare, angus.chen

On Thu, Dec 4, 2025 at 8:39 AM Wafer Xie <wafer@jaguarmicro.com> wrote:
>
> Dynamically allocate and free the indirect descriptor table.
>
> Signed-off-by: Wafer Xie <wafer@jaguarmicro.com>
> ---
>  hw/virtio/vhost-shadow-virtqueue.c |   5 +-
>  hw/virtio/vhost-shadow-virtqueue.h |  49 ++++++++++-
>  hw/virtio/vhost-vdpa.c             | 130 ++++++++++++++++++++++++++++-
>  3 files changed, 180 insertions(+), 4 deletions(-)
>
> diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
> index 2481d49345..ecc3245138 100644
> --- a/hw/virtio/vhost-shadow-virtqueue.c
> +++ b/hw/virtio/vhost-shadow-virtqueue.c
> @@ -756,15 +756,18 @@ void vhost_svq_stop(VhostShadowVirtqueue *svq)
>   *
>   * @ops: SVQ owner callbacks
>   * @ops_opaque: ops opaque pointer
> + * @indirect_ops: Indirect descriptor operations
>   */
>  VhostShadowVirtqueue *vhost_svq_new(const VhostShadowVirtqueueOps *ops,
> -                                    void *ops_opaque)
> +                                    void *ops_opaque,
> +                                    SVQIndirectOps *indirect_ops)
>  {
>      VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
>
>      event_notifier_init_fd(&svq->svq_kick, VHOST_FILE_UNBIND);
>      svq->ops = ops;
>      svq->ops_opaque = ops_opaque;
> +    svq->indirect_ops = indirect_ops;
>      return svq;
>  }
>
> diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
> index 2b36df4dd5..c5d654eae8 100644
> --- a/hw/virtio/vhost-shadow-virtqueue.h
> +++ b/hw/virtio/vhost-shadow-virtqueue.h
> @@ -54,6 +54,49 @@ typedef struct VhostShadowVirtqueueOps {
>      VirtQueueAvailCallback avail_handler;
>  } VhostShadowVirtqueueOps;
>
> +/**
> + * Callback to allocate indirect descriptor table.
> + *
> + * @svq: Shadow virtqueue
> + * @num: Number of descriptors
> + * @desc: Output pointer to allocated descriptor table
> + * @iova: Output IOVA of the allocated table
> + * @size: Output size of the allocated region
> + * @opaque: Opaque data (vhost_vdpa pointer)
> + *
> + * Returns true on success, false on failure.
> + */
> +typedef bool (*SVQIndirectDescAlloc)(VhostShadowVirtqueue *svq,
> +                                     size_t num,
> +                                     vring_desc_t **desc,
> +                                     hwaddr *iova,
> +                                     size_t *size,
> +                                     void *opaque);
> +
> +/**
> + * Callback to free indirect descriptor table.
> + *
> + * @svq: Shadow virtqueue
> + * @desc: Descriptor table to free
> + * @iova: IOVA of the table
> + * @size: Size of the allocated region
> + * @opaque: Opaque data (vhost_vdpa pointer)
> + */
> +typedef void (*SVQIndirectDescFree)(VhostShadowVirtqueue *svq,
> +                                    vring_desc_t *desc,
> +                                    hwaddr iova,
> +                                    size_t size,
> +                                    void *opaque);
> +
> +/**
> + * Indirect descriptor operations and context
> + */
> +typedef struct SVQIndirectOps {
> +    SVQIndirectDescAlloc alloc;
> +    SVQIndirectDescFree free;
> +    void *opaque;  /* Pointer to struct vhost_vdpa */
> +} SVQIndirectOps;
> +
>  /* Shadow virtqueue to relay notifications */
>  typedef struct VhostShadowVirtqueue {
>      /* Shadow vring */
> @@ -104,6 +147,9 @@ typedef struct VhostShadowVirtqueue {
>      /* Caller callbacks opaque */
>      void *ops_opaque;
>
> +    /* Indirect descriptor operations */
> +    SVQIndirectOps *indirect_ops;
> +
>      /* Next head to expose to the device */
>      uint16_t shadow_avail_idx;
>
> @@ -143,7 +189,8 @@ void vhost_svq_start(VhostShadowVirtqueue *svq, VirtIODevice *vdev,
>  void vhost_svq_stop(VhostShadowVirtqueue *svq);
>
>  VhostShadowVirtqueue *vhost_svq_new(const VhostShadowVirtqueueOps *ops,
> -                                    void *ops_opaque);
> +                                    void *ops_opaque,
> +                                    SVQIndirectOps *indirect_ops);
>
>  void vhost_svq_free(gpointer vq);
>  G_DEFINE_AUTOPTR_CLEANUP_FUNC(VhostShadowVirtqueue, vhost_svq_free);
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index 7061b6e1a3..1719993f52 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -584,15 +584,130 @@ static int vhost_vdpa_get_dev_features(struct vhost_dev *dev,
>      return ret;
>  }
>
> +/**
> + * Allocate indirect descriptor table for SVQ
> + *
> + * @svq: Shadow virtqueue
> + * @num: Number of descriptors needed
> + * @desc: Output pointer to the allocated table
> + * @iova: Output IOVA for the table
> + * @size: Output size of the allocated region
> + * @opaque: Opaque pointer (vhost_vdpa instance)
> + *
> + * Returns true on success, false on failure.
> + */
> +static bool vhost_vdpa_svq_indirect_desc_alloc(VhostShadowVirtqueue *svq,
> +                                                size_t num,
> +                                                vring_desc_t **desc,
> +                                                hwaddr *iova,
> +                                                size_t *size,
> +                                                void *opaque)
> +{
> +    struct vhost_vdpa *v = opaque;
> +    size_t desc_size = sizeof(vring_desc_t) * num;
> +    size_t alloc_size = ROUND_UP(desc_size, qemu_real_host_page_size());
> +    DMAMap needle = {
> +        .size = alloc_size - 1,
> +        .perm = IOMMU_RO,
> +    };
> +    vring_desc_t *indirect_desc;
> +    int r;
> +
> +    indirect_desc = mmap(NULL, alloc_size, PROT_READ | PROT_WRITE,
> +                         MAP_SHARED | MAP_ANONYMOUS, -1, 0);
> +    if (indirect_desc == MAP_FAILED) {
> +        error_report("Cannot allocate indirect descriptor table");
> +        return false;
> +    }
> +
> +    r = vhost_iova_tree_map_alloc(v->shared->iova_tree, &needle,
> +                                  (hwaddr)(uintptr_t)indirect_desc);
> +    if (unlikely(r != IOVA_OK)) {
> +        error_report("Cannot allocate iova for indirect descriptors (%d)", r);
> +        goto err_iova_alloc;
> +    }
> +
> +    r = vhost_vdpa_dma_map(v->shared, v->address_space_id, needle.iova,
> +                           alloc_size, indirect_desc, true);

Thanks for moving this forward! But the map and unmap operations could
be heavy in vDPA. My idea was to use a statically allocated buffer for
the indirect table first, but then we need to agree on the right size
of it, etc.

Can you profile the impact of it versus converting them to chained
descriptors? If we find we can improve the performance then it would be
a great addition.
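
Something like this, just as a sketch (one table per ring slot, mapped
once at SVQ start so the datapath never calls map/unmap; the size and
all names here are hypothetical, to be agreed):

    #include <stddef.h>
    #include <stdint.h>

    #define DESCS_PER_TABLE 64   /* example size, to be agreed */

    struct indirect_pool {
        void    *tables;     /* vq_size * DESCS_PER_TABLE descriptors,
                                mmap'ed and DMA-mapped once at SVQ start */
        uint64_t iova;       /* device-visible address of tables[0] */
        size_t   desc_size;  /* sizeof(vring_desc_t) */
    };

    /*
     * Table for ring slot 'head': returns the HVA for QEMU to fill and
     * the IOVA to write into the INDIRECT descriptor.  No map/unmap on
     * the datapath.
     */
    static void *pool_table(const struct indirect_pool *p, uint16_t head,
                            uint64_t *iova)
    {
        size_t off = (size_t)head * DESCS_PER_TABLE * p->desc_size;
        *iova = p->iova + off;
        return (char *)p->tables + off;
    }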

> +    if (unlikely(r != 0)) {
> +        error_report("Cannot map indirect descriptors to device: %s (%d)",
> +                     g_strerror(-r), -r);
> +        goto err_dma_map;
> +    }
> +
> +    *desc = indirect_desc;
> +    *iova = needle.iova;
> +    *size = alloc_size;
> +    return true;
> +
> +err_dma_map:
> +    vhost_iova_tree_remove(v->shared->iova_tree, needle);
> +err_iova_alloc:
> +    munmap(indirect_desc, alloc_size);
> +    return false;
> +}
> +
> +/**
> + * Free indirect descriptor table for SVQ
> + *
> + * @svq: Shadow virtqueue
> + * @desc: Descriptor table to free
> + * @iova: IOVA of the table
> + * @size: Size of the allocated region
> + * @opaque: Opaque pointer (vhost_vdpa instance)
> + */
> +static void vhost_vdpa_svq_indirect_desc_free(VhostShadowVirtqueue *svq,
> +                                               vring_desc_t *desc,
> +                                               hwaddr iova,
> +                                               size_t size,
> +                                               void *opaque)
> +{
> +    struct vhost_vdpa *v = opaque;
> +    const DMAMap needle = {
> +        .translated_addr = (hwaddr)(uintptr_t)desc,
> +    };
> +    const DMAMap *result;
> +    int r;
> +
> +    /* Find the mapping in the IOVA tree by HVA */
> +    result = vhost_iova_tree_find_iova(v->shared->iova_tree, &needle);
> +    if (unlikely(!result)) {
> +        error_report("Cannot find indirect descriptor mapping to unmap: "
> +                     "iova=0x%" HWADDR_PRIx " hva=0x%" HWADDR_PRIx " size=%zu",
> +                     iova, needle.translated_addr, size);
> +        return;
> +    }
> +
> +    r = vhost_vdpa_dma_unmap(v->shared, v->address_space_id, result->iova,
> +                             size);
> +    if (unlikely(r != 0)) {
> +        error_report("Cannot unmap indirect descriptors: "
> +                     "iova=0x%" HWADDR_PRIx " size=%zu: %s (%d)",
> +                     result->iova, size, g_strerror(-r), -r);
> +    }
> +
> +    vhost_iova_tree_remove(v->shared->iova_tree, *result);
> +
> +    munmap(desc, size);
> +}
> +
>  static void vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v)
>  {
>      g_autoptr(GPtrArray) shadow_vqs = NULL;
> +    SVQIndirectOps *indirect_ops;
> +
> +    /* Create indirect descriptor ops */
> +    indirect_ops = g_new0(SVQIndirectOps, 1);
> +    indirect_ops->alloc = vhost_vdpa_svq_indirect_desc_alloc;
> +    indirect_ops->free = vhost_vdpa_svq_indirect_desc_free;
> +    indirect_ops->opaque = v;
>
>      shadow_vqs = g_ptr_array_new_full(hdev->nvqs, vhost_svq_free);
>      for (unsigned n = 0; n < hdev->nvqs; ++n) {
>          VhostShadowVirtqueue *svq;
>
> -        svq = vhost_svq_new(v->shadow_vq_ops, v->shadow_vq_ops_opaque);
> +        svq = vhost_svq_new(v->shadow_vq_ops, v->shadow_vq_ops_opaque,
> +                            indirect_ops);
>          g_ptr_array_add(shadow_vqs, svq);
>      }
>
> @@ -789,10 +904,21 @@ static void vhost_vdpa_svq_cleanup(struct vhost_dev *dev)
>  {
>      struct vhost_vdpa *v = dev->opaque;
>      size_t idx;
> +    SVQIndirectOps *indirect_ops = NULL;
>
>      for (idx = 0; idx < v->shadow_vqs->len; ++idx) {
> -        vhost_svq_stop(g_ptr_array_index(v->shadow_vqs, idx));
> +        VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, idx);
> +
> +        /* Save indirect_ops pointer from first SVQ (all share the same one) */

Why not put it in v->shared, then?

But I don't really get why they are callbacks. What's the issue with
if (indirect) { call to indirect functions } ?

> +        if (idx == 0 && svq->indirect_ops) {
> +            indirect_ops = svq->indirect_ops;
> +        }
> +
> +        vhost_svq_stop(svq);
>      }
> +
> +    g_free(indirect_ops);
> +
>      g_ptr_array_free(v->shadow_vqs, true);
>  }
>
> --
> 2.34.1
>




* RE: [PATCH 0/4] Add Split indirect descriptors for SVQ
  2025-12-04  7:56 ` [PATCH 0/4] Add Split indirect descriptors for SVQ Jason Wang
@ 2025-12-05  1:30   ` Wafer
  2025-12-05  1:33     ` Jason Wang
  0 siblings, 1 reply; 9+ messages in thread
From: Wafer @ 2025-12-05  1:30 UTC (permalink / raw)
  To: Jason Wang
  Cc: mst@redhat.com, eperezma@redhat.com, qemu-devel@nongnu.org,
	leiyang@redhat.com, sgarzare@redhat.com, Angus Chen



> -----Original Message-----
> From: Jason Wang <jasowang@redhat.com>
> Sent: December 4, 2025 15:56
> To: Wafer <wafer@jaguarmicro.com>
> Cc: mst@redhat.com; eperezma@redhat.com; qemu-devel@nongnu.org;
> leiyang@redhat.com; sgarzare@redhat.com; Angus Chen
> <angus.chen@jaguarmicro.com>
> Subject: Re: [PATCH 0/4] Add Split indirect descriptors for SVQ
> 
> On Thu, Dec 4, 2025 at 3:38 PM Wafer Xie <wafer@jaguarmicro.com> wrote:
> >
> > This patch series adds support for VIRTIO split indirect descriptors.
> > The feature is VIRTIO_RING_F_INDIRECT_DESC.
> >
> > Eugenio has submitted a patch, "vhost: accept indirect descriptors in
> > shadow virtqueue":
> > https://lists.nongnu.org/archive/html/qemu-devel/2025-12/msg00056.html
> > That patch must be applied before this series.
> 
> I may have missed something, but SVQ can read indirect descriptors even
> though it doesn't use them itself; that's sufficient for it to claim
> support for indirect descriptors. That's what Eugenio said in that series.
> 
> This series could be an optimization on top?
> 
> Thanks
> 

Thanks for your reply!
Right, the current svq implementation can read the indirect descriptors provided by the driver, but it cannot use indirect descriptors when interacting with the backend device.
However, if a backend device implements indirect descriptors and wants to use QEMU to validate its indirect descriptor support, then QEMU needs to fully support indirect descriptors in this path. This series is intended to provide that full support, so that such backend devices can be properly tested and validated via QEMU.

> >
> >
> > Wafer Xie (4):
> >   vhost: add data structure of virtio indirect descriptors in SVQ
> >   vdpa: implement the interfaces alloc/free for indirect desc
> >   vhost: support the virtio indirect desc of available ring
> >   vhost: support the virtio indirect desc of used ring
> >
> >  hw/virtio/vhost-shadow-virtqueue.c | 224
> > +++++++++++++++++++++++++++--  hw/virtio/vhost-shadow-virtqueue.h
> |  57 +++++++-
> >  hw/virtio/vhost-vdpa.c             | 130 ++++++++++++++++-
> >  3 files changed, 395 insertions(+), 16 deletions(-)
> >
> > --
> > 2.34.1
> >



* Re: [PATCH 0/4] Add Split indirect descriptors for SVQ
  2025-12-05  1:30   ` Wafer
@ 2025-12-05  1:33     ` Jason Wang
  0 siblings, 0 replies; 9+ messages in thread
From: Jason Wang @ 2025-12-05  1:33 UTC (permalink / raw)
  To: Wafer
  Cc: mst@redhat.com, eperezma@redhat.com, qemu-devel@nongnu.org,
	leiyang@redhat.com, sgarzare@redhat.com, Angus Chen

On Fri, Dec 5, 2025 at 9:30 AM Wafer <wafer@jaguarmicro.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: December 4, 2025 15:56
> > To: Wafer <wafer@jaguarmicro.com>
> > Cc: mst@redhat.com; eperezma@redhat.com; qemu-devel@nongnu.org;
> > leiyang@redhat.com; sgarzare@redhat.com; Angus Chen
> > <angus.chen@jaguarmicro.com>
> > Subject: Re: [PATCH 0/4] Add Split indirect descriptors for SVQ
> >
> > On Thu, Dec 4, 2025 at 3:38 PM Wafer Xie <wafer@jaguarmicro.com> wrote:
> > >
> > > This patch series adds support for VIRTIO split indirect descriptors.
> > > The feature is VIRTIO_RING_F_INDIRECT_DESC.
> > >
> > > Eugenio has submitted a patch, "vhost: accept indirect descriptors in
> > > shadow virtqueue":
> > > https://lists.nongnu.org/archive/html/qemu-devel/2025-12/msg00056.html
> > > That patch must be applied before this series.
> >
> > I may have missed something, but SVQ can read indirect descriptors even
> > though it doesn't use them itself; that's sufficient for it to claim
> > support for indirect descriptors. That's what Eugenio said in that series.
> >
> > This series could be an optimization on top?
> >
> > Thanks
> >
>
> Thanks for your reply!
> Right, the current svq implementation can read the indirect descriptors provided by the driver, but it cannot use indirect descriptors when interacting with the backend device.
> However, if a backend device implements indirect descriptors and wants to use QEMU to validate its indirect descriptor support, then QEMU needs to fully support indirect descriptors in this path. This series is intended to provide that full support, so that such backend devices can be properly tested and validated via QEMU.

Great, let's tweak the changelog for the next version like this.

Thanks

>
> > >
> > >
> > > Wafer Xie (4):
> > >   vhost: add data structure of virtio indirect descriptors in SVQ
> > >   vdpa: implement the interfaces alloc/free for indirect desc
> > >   vhost: support the virtio indirect desc of available ring
> > >   vhost: support the virtio indirect desc of used ring
> > >
> > >  hw/virtio/vhost-shadow-virtqueue.c | 224
> > > +++++++++++++++++++++++++++--  hw/virtio/vhost-shadow-virtqueue.h
> > |  57 +++++++-
> > >  hw/virtio/vhost-vdpa.c             | 130 ++++++++++++++++-
> > >  3 files changed, 395 insertions(+), 16 deletions(-)
> > >
> > > --
> > > 2.34.1
> > >
>




* RE: [PATCH 2/4] vdpa: implement the interfaces alloc/free for indirect desc
  2025-12-04  8:41   ` Eugenio Perez Martin
@ 2025-12-08 11:27     ` Wafer
  0 siblings, 0 replies; 9+ messages in thread
From: Wafer @ 2025-12-08 11:27 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: mst@redhat.com, jasowang@redhat.com, qemu-devel@nongnu.org,
	leiyang@redhat.com, sgarzare@redhat.com, Angus Chen



> -----Original Message-----
> From: Eugenio Perez Martin <eperezma@redhat.com>
> Sent: December 4, 2025 16:41
> To: Wafer <wafer@jaguarmicro.com>
> Cc: mst@redhat.com; jasowang@redhat.com; qemu-devel@nongnu.org;
> leiyang@redhat.com; sgarzare@redhat.com; Angus Chen
> <angus.chen@jaguarmicro.com>
> Subject: Re: [PATCH 2/4] vdpa: implement the interfaces alloc/free for
> indirect desc
> 
> On Thu, Dec 4, 2025 at 8:39 AM Wafer Xie <wafer@jaguarmicro.com> wrote:
> >
> > Dynamically allocate and free the indirect descriptor table.
> >
> > Signed-off-by: Wafer Xie <wafer@jaguarmicro.com>
> > ---
> >  hw/virtio/vhost-shadow-virtqueue.c |   5 +-
> >  hw/virtio/vhost-shadow-virtqueue.h |  49 ++++++++++-
> >  hw/virtio/vhost-vdpa.c             | 130 ++++++++++++++++++++++++++++-
> >  3 files changed, 180 insertions(+), 4 deletions(-)
> >
> > diff --git a/hw/virtio/vhost-shadow-virtqueue.c
> > b/hw/virtio/vhost-shadow-virtqueue.c
> > index 2481d49345..ecc3245138 100644
> > --- a/hw/virtio/vhost-shadow-virtqueue.c
> > +++ b/hw/virtio/vhost-shadow-virtqueue.c
> > @@ -756,15 +756,18 @@ void vhost_svq_stop(VhostShadowVirtqueue
> *svq)
> >   *
> >   * @ops: SVQ owner callbacks
> >   * @ops_opaque: ops opaque pointer
> > + * @indirect_ops: Indirect descriptor operations
> >   */
> >  VhostShadowVirtqueue *vhost_svq_new(const
> VhostShadowVirtqueueOps *ops,
> > -                                    void *ops_opaque)
> > +                                    void *ops_opaque,
> > +                                    SVQIndirectOps *indirect_ops)
> >  {
> >      VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
> >
> >      event_notifier_init_fd(&svq->svq_kick, VHOST_FILE_UNBIND);
> >      svq->ops = ops;
> >      svq->ops_opaque = ops_opaque;
> > +    svq->indirect_ops = indirect_ops;
> >      return svq;
> >  }
> >
> > diff --git a/hw/virtio/vhost-shadow-virtqueue.h
> > b/hw/virtio/vhost-shadow-virtqueue.h
> > index 2b36df4dd5..c5d654eae8 100644
> > --- a/hw/virtio/vhost-shadow-virtqueue.h
> > +++ b/hw/virtio/vhost-shadow-virtqueue.h
> > @@ -54,6 +54,49 @@ typedef struct VhostShadowVirtqueueOps {
> >      VirtQueueAvailCallback avail_handler;  } VhostShadowVirtqueueOps;
> >
> > +/**
> > + * Callback to allocate indirect descriptor table.
> > + *
> > + * @svq: Shadow virtqueue
> > + * @num: Number of descriptors
> > + * @desc: Output pointer to allocated descriptor table
> > + * @iova: Output IOVA of the allocated table
> > + * @size: Output size of the allocated region
> > + * @opaque: Opaque data (vhost_vdpa pointer)
> > + *
> > + * Returns true on success, false on failure.
> > + */
> > +typedef bool (*SVQIndirectDescAlloc)(VhostShadowVirtqueue *svq,
> > +                                     size_t num,
> > +                                     vring_desc_t **desc,
> > +                                     hwaddr *iova,
> > +                                     size_t *size,
> > +                                     void *opaque);
> > +
> > +/**
> > + * Callback to free indirect descriptor table.
> > + *
> > + * @svq: Shadow virtqueue
> > + * @desc: Descriptor table to free
> > + * @iova: IOVA of the table
> > + * @size: Size of the allocated region
> > + * @opaque: Opaque data (vhost_vdpa pointer)  */ typedef void
> > +(*SVQIndirectDescFree)(VhostShadowVirtqueue *svq,
> > +                                    vring_desc_t *desc,
> > +                                    hwaddr iova,
> > +                                    size_t size,
> > +                                    void *opaque);
> > +
> > +/**
> > + * Indirect descriptor operations and context  */ typedef struct
> > +SVQIndirectOps {
> > +    SVQIndirectDescAlloc alloc;
> > +    SVQIndirectDescFree free;
> > +    void *opaque;  /* Pointer to struct vhost_vdpa */ }
> > +SVQIndirectOps;
> > +
> >  /* Shadow virtqueue to relay notifications */  typedef struct
> > VhostShadowVirtqueue {
> >      /* Shadow vring */
> > @@ -104,6 +147,9 @@ typedef struct VhostShadowVirtqueue {
> >      /* Caller callbacks opaque */
> >      void *ops_opaque;
> >
> > +    /* Indirect descriptor operations */
> > +    SVQIndirectOps *indirect_ops;
> > +
> >      /* Next head to expose to the device */
> >      uint16_t shadow_avail_idx;
> >
> > @@ -143,7 +189,8 @@ void vhost_svq_start(VhostShadowVirtqueue *svq,
> > VirtIODevice *vdev,  void vhost_svq_stop(VhostShadowVirtqueue *svq);
> >
> >  VhostShadowVirtqueue *vhost_svq_new(const
> VhostShadowVirtqueueOps *ops,
> > -                                    void *ops_opaque);
> > +                                    void *ops_opaque,
> > +                                    SVQIndirectOps *indirect_ops);
> >
> >  void vhost_svq_free(gpointer vq);
> >  G_DEFINE_AUTOPTR_CLEANUP_FUNC(VhostShadowVirtqueue,
> vhost_svq_free);
> > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c index
> > 7061b6e1a3..1719993f52 100644
> > --- a/hw/virtio/vhost-vdpa.c
> > +++ b/hw/virtio/vhost-vdpa.c
> > @@ -584,15 +584,130 @@ static int vhost_vdpa_get_dev_features(struct
> vhost_dev *dev,
> >      return ret;
> >  }
> >
> > +/**
> > + * Allocate indirect descriptor table for SVQ
> > + *
> > + * @svq: Shadow virtqueue
> > + * @num: Number of descriptors needed
> > + * @desc: Output pointer to the allocated table
> > + * @iova: Output IOVA for the table
> > + * @size: Output size of the allocated region
> > + * @opaque: Opaque pointer (vhost_vdpa instance)
> > + *
> > + * Returns true on success, false on failure.
> > + */
> > +static bool vhost_vdpa_svq_indirect_desc_alloc(VhostShadowVirtqueue
> *svq,
> > +                                                size_t num,
> > +                                                vring_desc_t **desc,
> > +                                                hwaddr *iova,
> > +                                                size_t *size,
> > +                                                void *opaque) {
> > +    struct vhost_vdpa *v = opaque;
> > +    size_t desc_size = sizeof(vring_desc_t) * num;
> > +    size_t alloc_size = ROUND_UP(desc_size,
> qemu_real_host_page_size());
> > +    DMAMap needle = {
> > +        .size = alloc_size - 1,
> > +        .perm = IOMMU_RO,
> > +    };
> > +    vring_desc_t *indirect_desc;
> > +    int r;
> > +
> > +    indirect_desc = mmap(NULL, alloc_size, PROT_READ | PROT_WRITE,
> > +                         MAP_SHARED | MAP_ANONYMOUS, -1, 0);
> > +    if (indirect_desc == MAP_FAILED) {
> > +        error_report("Cannot allocate indirect descriptor table");
> > +        return false;
> > +    }
> > +
> > +    r = vhost_iova_tree_map_alloc(v->shared->iova_tree, &needle,
> > +                                  (hwaddr)(uintptr_t)indirect_desc);
> > +    if (unlikely(r != IOVA_OK)) {
> > +        error_report("Cannot allocate iova for indirect descriptors (%d)", r);
> > +        goto err_iova_alloc;
> > +    }
> > +
> > +    r = vhost_vdpa_dma_map(v->shared, v->address_space_id,
> needle.iova,
> > +                           alloc_size, indirect_desc, true);
> 
> Thanks for moving this forward! But the map and unmap operations could be
> heavy in vDPA. My idea was to use a statically allocated buffer for the
> indirect table first, but then we need to agree on the right size of it, etc.
> 
Yes, since virtio indirect descriptors require a contiguous descriptor array and the size is not fixed, using a pool doesn’t really fit. 
I’m planning to implement it using a dual-array approach instead. The updated patch will be ready soon.

> Can you profile the impact of it versus converting them to chained
> descriptors? If we find we can improve the performance then it would be a
> great addition.
> 
Regarding the performance impact and the possibility of switching to chained descriptors,
I will include benchmark results comparing chained and indirect descriptors in the next version of the patch.

> > +    if (unlikely(r != 0)) {
> > +        error_report("Cannot map indirect descriptors to device: %s (%d)",
> > +                     g_strerror(-r), -r);
> > +        goto err_dma_map;
> > +    }
> > +
> > +    *desc = indirect_desc;
> > +    *iova = needle.iova;
> > +    *size = alloc_size;
> > +    return true;
> > +
> > +err_dma_map:
> > +    vhost_iova_tree_remove(v->shared->iova_tree, needle);
> > +err_iova_alloc:
> > +    munmap(indirect_desc, alloc_size);
> > +    return false;
> > +}
> > +
> > +/**
> > + * Free indirect descriptor table for SVQ
> > + *
> > + * @svq: Shadow virtqueue
> > + * @desc: Descriptor table to free
> > + * @iova: IOVA of the table
> > + * @size: Size of the allocated region
> > + * @opaque: Opaque pointer (vhost_vdpa instance)  */ static void
> > +vhost_vdpa_svq_indirect_desc_free(VhostShadowVirtqueue *svq,
> > +                                               vring_desc_t *desc,
> > +                                               hwaddr iova,
> > +                                               size_t size,
> > +                                               void *opaque) {
> > +    struct vhost_vdpa *v = opaque;
> > +    const DMAMap needle = {
> > +        .translated_addr = (hwaddr)(uintptr_t)desc,
> > +    };
> > +    const DMAMap *result;
> > +    int r;
> > +
> > +    /* Find the mapping in the IOVA tree by HVA */
> > +    result = vhost_iova_tree_find_iova(v->shared->iova_tree, &needle);
> > +    if (unlikely(!result)) {
> > +        error_report("Cannot find indirect descriptor mapping to unmap: "
> > +                     "iova=0x%" HWADDR_PRIx " hva=0x%" HWADDR_PRIx "
> size=%zu",
> > +                     iova, needle.translated_addr, size);
> > +        return;
> > +    }
> > +
> > +    r = vhost_vdpa_dma_unmap(v->shared, v->address_space_id, result-
> >iova,
> > +                             size);
> > +    if (unlikely(r != 0)) {
> > +        error_report("Cannot unmap indirect descriptors: "
> > +                     "iova=0x%" HWADDR_PRIx " size=%zu: %s (%d)",
> > +                     result->iova, size, g_strerror(-r), -r);
> > +    }
> > +
> > +    vhost_iova_tree_remove(v->shared->iova_tree, *result);
> > +
> > +    munmap(desc, size);
> > +}
> > +
> >  static void vhost_vdpa_init_svq(struct vhost_dev *hdev, struct
> > vhost_vdpa *v)  {
> >      g_autoptr(GPtrArray) shadow_vqs = NULL;
> > +    SVQIndirectOps *indirect_ops;
> > +
> > +    /* Create indirect descriptor ops */
> > +    indirect_ops = g_new0(SVQIndirectOps, 1);
> > +    indirect_ops->alloc = vhost_vdpa_svq_indirect_desc_alloc;
> > +    indirect_ops->free = vhost_vdpa_svq_indirect_desc_free;
> > +    indirect_ops->opaque = v;
> >
> >      shadow_vqs = g_ptr_array_new_full(hdev->nvqs, vhost_svq_free);
> >      for (unsigned n = 0; n < hdev->nvqs; ++n) {
> >          VhostShadowVirtqueue *svq;
> >
> > -        svq = vhost_svq_new(v->shadow_vq_ops, v-
> >shadow_vq_ops_opaque);
> > +        svq = vhost_svq_new(v->shadow_vq_ops, v-
> >shadow_vq_ops_opaque,
> > +                            indirect_ops);
> >          g_ptr_array_add(shadow_vqs, svq);
> >      }
> >
> > @@ -789,10 +904,21 @@ static void vhost_vdpa_svq_cleanup(struct
> > vhost_dev *dev)  {
> >      struct vhost_vdpa *v = dev->opaque;
> >      size_t idx;
> > +    SVQIndirectOps *indirect_ops = NULL;
> >
> >      for (idx = 0; idx < v->shadow_vqs->len; ++idx) {
> > -        vhost_svq_stop(g_ptr_array_index(v->shadow_vqs, idx));
> > +        VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs,
> > + idx);
> > +
> > +        /* Save indirect_ops pointer from first SVQ (all share the
> > + same one) */
> 
> Why not put it in v->shared, then?
> 
> But I don't really get why they are callbacks. What's the issue with if (indirect)
> { call to indirect functions } ?
> 
The next version of the patch will use a statically allocated buffer, so no callbacks will be added.
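
Roughly, the datapath check then becomes state-based rather than
callback-based; a sketch only, with hypothetical names:

    #include <stdbool.h>
    #include <stddef.h>

    struct svq_sketch {
        void *indirect_tables;   /* NULL when indirect is unavailable */
    };

    /* Indirect pays off only for multi-descriptor requests, and only
     * when the pre-mapped tables exist; otherwise fall back to chained
     * descriptors in the main ring. */
    static bool svq_use_indirect(const struct svq_sketch *svq,
                                 size_t total_descs)
    {
        return svq->indirect_tables != NULL && total_descs > 1;
    }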

> > +        if (idx == 0 && svq->indirect_ops) {
> > +            indirect_ops = svq->indirect_ops;
> > +        }
> > +
> > +        vhost_svq_stop(svq);
> >      }
> > +
> > +    g_free(indirect_ops);
> > +
> >      g_ptr_array_free(v->shadow_vqs, true);  }
> >
> > --
> > 2.34.1
> >



end of thread

Thread overview: 9+ messages
2025-12-04  7:37 [PATCH 0/4] Add Split indirect descriptors for SVQ Wafer Xie
2025-12-04  7:37 ` [PATCH 1/4] vhost: add data structure of virtio indirect descriptors in SVQ Wafer Xie
2025-12-04  7:37 ` [PATCH 2/4] vdpa: implement the interfaces alloc/free for indirect desc Wafer Xie
2025-12-04  8:41   ` Eugenio Perez Martin
2025-12-08 11:27     ` Wafer
2025-12-04  7:37 ` [PATCH 3/4] vhost: support the virtio indirect desc of available ring Wafer Xie
2025-12-04  7:56 ` [PATCH 0/4] Add Split indirect descriptors for SVQ Jason Wang
2025-12-05  1:30   ` Wafer
2025-12-05  1:33     ` Jason Wang
