qemu-devel.nongnu.org archive mirror
* [RFC v2 0/7] VIRTIO-IOMMU/VFIO: Fix host iommu geometry handling for hotplugged devices
@ 2024-06-07 14:37 Eric Auger
  2024-06-07 14:37 ` [RFC v2 1/7] HostIOMMUDevice: Store the VFIO/VDPA agent Eric Auger
                   ` (6 more replies)
  0 siblings, 7 replies; 19+ messages in thread
From: Eric Auger @ 2024-06-07 14:37 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, mst,
	jean-philippe, imammedo, peter.maydell, clg, yanghliu,
	zhenzhong.duan
  Cc: alex.williamson, jasowang, pbonzini, berrange

This series is based on Zhenzhong HostIOMMUDevice:

[PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU
https://lore.kernel.org/all/20240605083043.317831-1-zhenzhong.duan@intel.com/

It conveys host IOVA reserved regions to the virtio-iommu using the
HostIOMMUDevice infrastructure. This replaces the usage of the
IOMMU MR ops, which fail to satisfy this need for hotplugged devices.

See below for additional background.

In [1] we attempted to fix a case where a VFIO-PCI device protected
with a virtio-iommu was assigned to an x86 guest. On x86 the physical
IOMMU may have an address width (gaw) of 39 or 48 bits whereas the
virtio-iommu used to expose a 64b address space by default.
Hence the guest was trying to use the full 64b space and we hit
DMA MAP failures. To work around this issue we managed to pass
usable IOVA regions (excluding the out-of-range space) from VFIO
to the virtio-iommu device. This was made feasible by introducing
a new IOMMU Memory Region callback dubbed iommu_set_iova_ranges().
The latter gets called when the IOMMU MR is enabled, which causes
vfio_listener_region_add() to be called.
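As an illustration of the mechanism, here is a standalone sketch (hypothetical helper, not QEMU's actual range_inverse_array() implementation) of how a set of usable IOVA ranges is turned into the complementary reserved ranges, e.g. for a physical IOMMU with a 39-bit gaw:

```c
#include <stdint.h>
#include <stddef.h>

/* Inclusive IOVA range, mirroring QEMU's Range (lower/upper bounds). */
typedef struct { uint64_t lob, upb; } IovaRange;

/*
 * Hypothetical sketch of the inversion: given 'n' sorted,
 * non-overlapping usable ranges within [low, high], fill 'out' with
 * the complementary (reserved) ranges. Returns the number of
 * reserved ranges written.
 */
static size_t invert_usable_ranges(const IovaRange *usable, size_t n,
                                   uint64_t low, uint64_t high,
                                   IovaRange *out)
{
    size_t count = 0;
    uint64_t next = low;

    for (size_t i = 0; i < n; i++) {
        if (usable[i].lob > next) {
            /* gap below this usable range is reserved */
            out[count].lob = next;
            out[count].upb = usable[i].lob - 1;
            count++;
        }
        if (usable[i].upb == high) {
            return count; /* usable space reaches the top, no tail gap */
        }
        next = usable[i].upb + 1;
    }
    /* everything above the last usable range is reserved */
    out[count].lob = next;
    out[count].upb = high;
    return count + 1;
}
```

With a single usable range covering a 39-bit address space, the sketch yields one reserved range [2^39, UINT64_MAX], which is exactly what the guest must be told not to use.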

However with VFIO-PCI hotplug this technique fails, due to the
race between the call to the callback in the memory listener's
region_add and the virtio-iommu probe request. Indeed the probe
request gets issued before the attach to the domain, so in that
case the usable regions are communicated after the probe request
and fail to be conveyed to the guest. To be honest, the problem
was hinted at by Jean-Philippe in [2] and I should have listened
to him more carefully and tested with hotplug :-(

Eventually, in QEMU 9.0, we introduced the aw-bits option, which
gives extra flexibility.

For coldplugged devices the technique works because we make sure all
the IOMMU MRs are enabled once machine init is done: 94df5b2180
("virtio-iommu: Fix 64kB host page size VFIO device assignment")
for the granule freeze. But I would be keen to get rid of this trick.

Using the IOMMU MR ops is impractical because it relies on the IOMMU
MR having been enabled and the corresponding vfio_listener_region_add()
having been executed. Instead, this series proposes to replace the usage
of this API with the recently introduced PCIIOMMUOps: ba7d12eb8c
("hw/pci: modify pci_setup_iommu() to set PCIIOMMUOps"). That way the
callback can be called earlier, once the usable IOVA regions have been
collected by VFIO, without the need for the IOMMU MR to be enabled.
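The ordering the series relies on can be modeled in a few lines. This is a hypothetical toy model (all names are invented), not QEMU code: with PCIIOMMUOps, the set_iommu_device() callback fires at device plug time, strictly before the guest's PROBE request, so host reserved regions are already known when the probe is answered:

```c
#include <stdbool.h>

/* Toy stand-in for the PCIIOMMUOps callback table. */
typedef struct {
    bool (*set_iommu_device)(void *viommu, int devfn);
} SketchPCIIOMMUOps;

/* Toy stand-in for the virtio-iommu state. */
typedef struct {
    bool resv_regions_ready;
    bool probe_done;
} SketchVirtIOIOMMU;

static bool sketch_set_iommu_device(void *opaque, int devfn)
{
    SketchVirtIOIOMMU *s = opaque;
    (void)devfn;
    s->resv_regions_ready = true; /* host IOVA ranges collected here */
    return true;
}

/* The probe only conveys reserved regions known at this point. */
static bool sketch_probe(SketchVirtIOIOMMU *s)
{
    s->probe_done = true;
    return s->resv_regions_ready;
}

static const SketchPCIIOMMUOps sketch_ops = {
    .set_iommu_device = sketch_set_iommu_device,
};

static bool sketch_hotplug_flow(SketchVirtIOIOMMU *s)
{
    /* plug: the PCIIOMMUOps callback fires first... */
    sketch_ops.set_iommu_device(s, 0);
    /* ...then the guest probes the endpoint */
    return sketch_probe(s);
}
```

With the old IOMMU MR ops path, the equivalent of set_iommu_device() would only run once the IOMMU MR is enabled (after the attach, hence after the probe), which is precisely the race described above.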

This series also removes the spurious message:
qemu-system-aarch64: warning: virtio-iommu-memory-region-7-0: Notified about new host reserved regions after probe

In the short term this may also be used for passing the page size
mask, which would allow getting rid of the hacky transient IOMMU
MR enablement mentioned above.

[1] [PATCH v4 00/12] VIRTIO-IOMMU/VFIO: Don't assume 64b IOVA space
    https://lore.kernel.org/all/20231019134651.842175-1-eric.auger@redhat.com/

[2] https://lore.kernel.org/all/20230929161547.GB2957297@myrica/

Extra Notes:
With this series, the reserved memory regions are communicated in time
for the virtio-iommu probe request to grab them. However this is not
sufficient. In some cases (mine included), I still see some DMA MAP
failures and the guest keeps using IOVA ranges outside the geometry of
the physical IOMMU. This is due to the fact that the VFIO-PCI device is
in the same iommu group as the PCIe root port. Normally the kernel's
iova_reserve_iommu_regions() (dma-iommu.c) is supposed to call
reserve_iova() for each reserved IOVA, which carves them out of the
allocator. But when iommu_dma_init_domain() gets called for the
hotplugged vfio-pci device, the iova domain is already allocated and
set up, and iova_reserve_iommu_regions() is not called again for the
vfio-pci device. So its reserved regions are not properly taken into
account.

This is not trivial to fix because theoretically the first attached
device could already have allocated IOVAs within the reserved regions
of the second device. Also, we are somehow hijacking the reserved
memory regions to model the geometry of the physical IOMMU, so I am
not sure any attempt to fix that upstream would be accepted. At the
moment one solution is to make sure assigned devices end up in a
singleton group. Another solution is to work on a different approach
where the gaw can be passed as an option to the virtio-iommu device,
similarly to what is done with the intel-iommu.

This series can be found at:
https://github.com/eauger/qemu/tree/iommufd_nesting_preq_v7_resv_regions_v2


Eric Auger (7):
  HostIOMMUDevice: Store the VFIO/VDPA agent
  virtio-iommu: Implement [set|unset]_iommu_device() callbacks
  HostIOMMUDevice: Introduce get_iova_ranges callback
  virtio-iommu: Compute host reserved regions
  virtio-iommu: Remove the implementation of iommu_set_iova_range
  hw/vfio: Remove memory_region_iommu_set_iova_ranges() call
  memory: Remove IOMMU MR iommu_set_iova_range API

 include/exec/memory.h              |  32 ---
 include/hw/virtio/virtio-iommu.h   |   9 +
 include/sysemu/host_iommu_device.h |   9 +
 hw/vfio/common.c                   |  10 -
 hw/vfio/container.c                |  15 ++
 hw/vfio/iommufd.c                  |  16 ++
 hw/virtio/virtio-iommu.c           | 305 +++++++++++++++++++----------
 system/memory.c                    |  13 --
 8 files changed, 253 insertions(+), 156 deletions(-)

-- 
2.41.0



^ permalink raw reply	[flat|nested] 19+ messages in thread

* [RFC v2 1/7] HostIOMMUDevice: Store the VFIO/VDPA agent
  2024-06-07 14:37 [RFC v2 0/7] VIRTIO-IOMMU/VFIO: Fix host iommu geometry handling for hotplugged devices Eric Auger
@ 2024-06-07 14:37 ` Eric Auger
  2024-06-11  3:22   ` Duan, Zhenzhong
  2024-06-07 14:37 ` [RFC v2 2/7] virtio-iommu: Implement [set|unset]_iommu_device() callbacks Eric Auger
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Eric Auger @ 2024-06-07 14:37 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, mst,
	jean-philippe, imammedo, peter.maydell, clg, yanghliu,
	zhenzhong.duan
  Cc: alex.williamson, jasowang, pbonzini, berrange

Store the agent device (VFIO or VDPA) in the host IOMMU device.
This will allow easy access to some of its resources.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 include/sysemu/host_iommu_device.h | 1 +
 hw/vfio/container.c                | 1 +
 hw/vfio/iommufd.c                  | 2 ++
 3 files changed, 4 insertions(+)

diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
index a57873958b..3e5f058e7b 100644
--- a/include/sysemu/host_iommu_device.h
+++ b/include/sysemu/host_iommu_device.h
@@ -34,6 +34,7 @@ struct HostIOMMUDevice {
     Object parent_obj;
 
     char *name;
+    void *agent; /* pointer to agent device, i.e. VFIO or VDPA device */
     HostIOMMUDeviceCaps caps;
 };
 
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 26e6f7fb4f..b728b978a2 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -1145,6 +1145,7 @@ static bool hiod_legacy_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
 
     hiod->name = g_strdup(vdev->name);
     hiod->caps.aw_bits = vfio_device_get_aw_bits(vdev);
+    hiod->agent = opaque;
 
     return true;
 }
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 409ed3dcc9..dbdae1adbb 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -631,6 +631,8 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
         struct iommu_hw_info_vtd vtd;
     } data;
 
+    hiod->agent = opaque;
+
     if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid,
                                          &type, &data, sizeof(data), errp)) {
         return false;
-- 
2.41.0




* [RFC v2 2/7] virtio-iommu: Implement [set|unset]_iommu_device() callbacks
  2024-06-07 14:37 [RFC v2 0/7] VIRTIO-IOMMU/VFIO: Fix host iommu geometry handling for hotplugged devices Eric Auger
  2024-06-07 14:37 ` [RFC v2 1/7] HostIOMMUDevice: Store the VFIO/VDPA agent Eric Auger
@ 2024-06-07 14:37 ` Eric Auger
  2024-06-11  2:38   ` Duan, Zhenzhong
  2024-06-07 14:37 ` [RFC v2 3/7] HostIOMMUDevice: Introduce get_iova_ranges callback Eric Auger
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Eric Auger @ 2024-06-07 14:37 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, mst,
	jean-philippe, imammedo, peter.maydell, clg, yanghliu,
	zhenzhong.duan
  Cc: alex.williamson, jasowang, pbonzini, berrange

Implement the PCIIOMMUOps [set|unset]_iommu_device() callbacks.
In set(), a VirtioHostIOMMUDevice is allocated which holds
a reference to the HostIOMMUDevice. This object is stored in a hash
table indexed by PCI BDF. The handle to the host IOMMU device
will allow retrieving information related to the physical IOMMU.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 include/hw/virtio/virtio-iommu.h |  9 ++++
 hw/virtio/virtio-iommu.c         | 87 ++++++++++++++++++++++++++++++++
 2 files changed, 96 insertions(+)

diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
index 83a52cc446..4f664ea0c4 100644
--- a/include/hw/virtio/virtio-iommu.h
+++ b/include/hw/virtio/virtio-iommu.h
@@ -45,6 +45,14 @@ typedef struct IOMMUDevice {
     bool probe_done;
 } IOMMUDevice;
 
+typedef struct VirtioHostIOMMUDevice {
+    void *viommu;
+    PCIBus *bus;
+    uint8_t devfn;
+    HostIOMMUDevice *dev;
+    QLIST_ENTRY(VirtioHostIOMMUDevice) next;
+} VirtioHostIOMMUDevice;
+
 typedef struct IOMMUPciBus {
     PCIBus       *bus;
     IOMMUDevice  *pbdev[]; /* Parent array is sparse, so dynamically alloc */
@@ -57,6 +65,7 @@ struct VirtIOIOMMU {
     struct virtio_iommu_config config;
     uint64_t features;
     GHashTable *as_by_busptr;
+    GHashTable *host_iommu_devices;
     IOMMUPciBus *iommu_pcibus_by_bus_num[PCI_BUS_MAX];
     PCIBus *primary_bus;
     ReservedRegion *prop_resv_regions;
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 1326c6ec41..0680a357f0 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -28,6 +28,7 @@
 #include "sysemu/kvm.h"
 #include "sysemu/reset.h"
 #include "sysemu/sysemu.h"
+#include "sysemu/host_iommu_device.h"
 #include "qemu/reserved-region.h"
 #include "qemu/units.h"
 #include "qapi/error.h"
@@ -69,6 +70,11 @@ typedef struct VirtIOIOMMUMapping {
     uint32_t flags;
 } VirtIOIOMMUMapping;
 
+struct hiod_key {
+    PCIBus *bus;
+    uint8_t devfn;
+};
+
 static inline uint16_t virtio_iommu_get_bdf(IOMMUDevice *dev)
 {
     return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
@@ -462,8 +468,86 @@ static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
     return &sdev->as;
 }
 
+static gboolean hiod_equal(gconstpointer v1, gconstpointer v2)
+{
+    const struct hiod_key *key1 = v1;
+    const struct hiod_key *key2 = v2;
+
+    return (key1->bus == key2->bus) && (key1->devfn == key2->devfn);
+}
+
+static guint hiod_hash(gconstpointer v)
+{
+    const struct hiod_key *key = v;
+    guint value = (guint)(uintptr_t)key->bus;
+
+    return (guint)(value << 8 | key->devfn);
+}
+
+static VirtioHostIOMMUDevice *
+get_host_iommu_device(VirtIOIOMMU *viommu, PCIBus *bus, int devfn) {
+    struct hiod_key key = {
+        .bus = bus,
+        .devfn = devfn,
+    };
+
+    return g_hash_table_lookup(viommu->host_iommu_devices, &key);
+}
+
+static bool virtio_iommu_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
+                                          HostIOMMUDevice *hiod, Error **errp)
+{
+    VirtIOIOMMU *viommu = opaque;
+    VirtioHostIOMMUDevice *vhiod;
+    struct hiod_key *new_key;
+
+    assert(hiod);
+
+    vhiod = get_host_iommu_device(viommu, bus, devfn);
+    if (vhiod) {
+        error_setg(errp, "VirtioHostIOMMUDevice already exists");
+        return false;
+    }
+
+    vhiod = g_malloc0(sizeof(VirtioHostIOMMUDevice));
+    vhiod->bus = bus;
+    vhiod->devfn = (uint8_t)devfn;
+    vhiod->viommu = viommu;
+    vhiod->dev = hiod;
+
+    new_key = g_malloc(sizeof(*new_key));
+    new_key->bus = bus;
+    new_key->devfn = devfn;
+
+    object_ref(hiod);
+    g_hash_table_insert(viommu->host_iommu_devices, new_key, vhiod);
+
+    return true;
+}
+
+static void
+virtio_iommu_unset_iommu_device(PCIBus *bus, void *opaque, int devfn)
+{
+    VirtIOIOMMU *viommu = opaque;
+    VirtioHostIOMMUDevice *vhiod;
+    struct hiod_key key = {
+        .bus = bus,
+        .devfn = devfn,
+    };
+
+    vhiod = g_hash_table_lookup(viommu->host_iommu_devices, &key);
+    if (!vhiod) {
+        return;
+    }
+
+    g_hash_table_remove(viommu->host_iommu_devices, &key);
+    object_unref(vhiod->dev);
+}
+
 static const PCIIOMMUOps virtio_iommu_ops = {
     .get_address_space = virtio_iommu_find_add_as,
+    .set_iommu_device = virtio_iommu_set_iommu_device,
+    .unset_iommu_device = virtio_iommu_unset_iommu_device,
 };
 
 static int virtio_iommu_attach(VirtIOIOMMU *s,
@@ -1357,6 +1441,9 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 
     s->as_by_busptr = g_hash_table_new_full(NULL, NULL, NULL, g_free);
 
+    s->host_iommu_devices = g_hash_table_new_full(hiod_hash, hiod_equal,
+                                                  g_free, g_free);
+
     if (s->primary_bus) {
         pci_setup_iommu(s->primary_bus, &virtio_iommu_ops, s);
     } else {
-- 
2.41.0



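The (bus, devfn)-keyed hash table introduced in this patch can be exercised in isolation. Below is a hypothetical standalone rendition of hiod_hash()/hiod_equal() (plain C, not a drop-in for the GLib-based code):

```c
#include <stdint.h>

/* Standalone model of the patch's hash-table key: a (PCIBus *, devfn)
 * pair, with the BDF-like devfn in the low byte. */
typedef struct {
    const void *bus;
    uint8_t devfn;
} HiodKey;

/* Fold the bus pointer and devfn into one hash value. The pointer is
 * truncated and shifted, so high pointer bits are lost; that is fine
 * for a hash, since exact matching is left to the equality test. */
static unsigned hiod_hash_sketch(const HiodKey *key)
{
    unsigned value = (unsigned)(uintptr_t)key->bus;

    return (unsigned)(value << 8 | key->devfn);
}

/* Two keys are the same device iff both bus and devfn match. */
static int hiod_equal_sketch(const HiodKey *k1, const HiodKey *k2)
{
    return k1->bus == k2->bus && k1->devfn == k2->devfn;
}
```

Equal keys must hash identically (so they land in the same bucket), while the equality test disambiguates any collisions caused by the pointer truncation.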

* [RFC v2 3/7] HostIOMMUDevice: Introduce get_iova_ranges callback
  2024-06-07 14:37 [RFC v2 0/7] VIRTIO-IOMMU/VFIO: Fix host iommu geometry handling for hotplugged devices Eric Auger
  2024-06-07 14:37 ` [RFC v2 1/7] HostIOMMUDevice: Store the VFIO/VDPA agent Eric Auger
  2024-06-07 14:37 ` [RFC v2 2/7] virtio-iommu: Implement [set|unset]_iommu_device() callbacks Eric Auger
@ 2024-06-07 14:37 ` Eric Auger
  2024-06-11  3:24   ` Duan, Zhenzhong
  2024-06-07 14:37 ` [RFC v2 4/7] virtio-iommu: Compute host reserved regions Eric Auger
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Eric Auger @ 2024-06-07 14:37 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, mst,
	jean-philippe, imammedo, peter.maydell, clg, yanghliu,
	zhenzhong.duan
  Cc: alex.williamson, jasowang, pbonzini, berrange

Introduce a new HostIOMMUDevice callback that allows retrieving
the usable IOVA ranges.

Implement this callback in the legacy VFIO and IOMMUFD VFIO
host IOMMU devices. This relies on the VFIODevice agent's
base container iova_ranges resource.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 include/sysemu/host_iommu_device.h |  8 ++++++++
 hw/vfio/container.c                | 14 ++++++++++++++
 hw/vfio/iommufd.c                  | 14 ++++++++++++++
 3 files changed, 36 insertions(+)

diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
index 3e5f058e7b..40e0fa13ef 100644
--- a/include/sysemu/host_iommu_device.h
+++ b/include/sysemu/host_iommu_device.h
@@ -80,6 +80,14 @@ struct HostIOMMUDeviceClass {
      * i.e., HOST_IOMMU_DEVICE_CAP_AW_BITS.
      */
     int (*get_cap)(HostIOMMUDevice *hiod, int cap, Error **errp);
+    /**
+     * @get_iova_ranges: Return the list of usable IOVA ranges of the
+     * @hiod host IOMMU device
+     *
+     * @hiod: handle to the host IOMMU device
+     * @errp: error handle
+     */
+    GList* (*get_iova_ranges)(HostIOMMUDevice *hiod, Error **errp);
 };
 
 /*
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index b728b978a2..edd0df6262 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -1164,12 +1164,26 @@ static int hiod_legacy_vfio_get_cap(HostIOMMUDevice *hiod, int cap,
     }
 }
 
+static GList *
+hiod_legacy_vfio_get_iova_ranges(HostIOMMUDevice *hiod, Error **errp)
+{
+    VFIODevice *vdev = hiod->agent;
+    GList *l = NULL;
+
+    if (vdev && vdev->bcontainer) {
+        l = g_list_copy(vdev->bcontainer->iova_ranges);
+    }
+
+    return l;
+}
+
 static void hiod_legacy_vfio_class_init(ObjectClass *oc, void *data)
 {
     HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
 
     hioc->realize = hiod_legacy_vfio_realize;
     hioc->get_cap = hiod_legacy_vfio_get_cap;
+    hioc->get_iova_ranges = hiod_legacy_vfio_get_iova_ranges;
 };
 
 static const TypeInfo types[] = {
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index dbdae1adbb..1706784063 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -645,11 +645,25 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
     return true;
 }
 
+static GList *
+hiod_iommufd_vfio_get_iova_ranges(HostIOMMUDevice *hiod, Error **errp)
+{
+    VFIODevice *vdev = hiod->agent;
+    GList *l = NULL;
+
+    if (vdev && vdev->bcontainer) {
+        l = g_list_copy(vdev->bcontainer->iova_ranges);
+    }
+
+    return l;
+}
+
 static void hiod_iommufd_vfio_class_init(ObjectClass *oc, void *data)
 {
     HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_CLASS(oc);
 
     hiodc->realize = hiod_iommufd_vfio_realize;
+    hiodc->get_iova_ranges = hiod_iommufd_vfio_get_iova_ranges;
 };
 
 static const TypeInfo types[] = {
-- 
2.41.0



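One ownership detail worth noting in this patch: g_list_copy() performs a shallow copy, so the returned list shares its Range payloads with bcontainer->iova_ranges, which may matter if a caller frees the copy with g_list_free_full(..., g_free). The difference can be sketched with a minimal stand-in list (hypothetical types, not GLib):

```c
#include <stdint.h>
#include <stdlib.h>

/* Minimal stand-in for a GList of Range pointers. */
typedef struct { uint64_t lob, upb; } SketchRange;

typedef struct SketchNode {
    SketchRange *data;
    struct SketchNode *next;
} SketchNode;

/* Shallow copy: new list cells, shared payloads (like g_list_copy).
 * Freeing payloads through the copy would also free the original's. */
static SketchNode *list_copy_shallow(const SketchNode *l)
{
    SketchNode *head = NULL, **tail = &head;

    for (; l; l = l->next) {
        SketchNode *n = malloc(sizeof(*n));
        n->data = l->data;          /* payload pointer is shared */
        n->next = NULL;
        *tail = n;
        tail = &n->next;
    }
    return head;
}

/* Deep copy: payloads duplicated too (like g_list_copy_deep), so the
 * copy can be freed payload-and-all without touching the original. */
static SketchNode *list_copy_deep(const SketchNode *l)
{
    SketchNode *head = NULL, **tail = &head;

    for (; l; l = l->next) {
        SketchNode *n = malloc(sizeof(*n));
        n->data = malloc(sizeof(*n->data));
        *n->data = *l->data;        /* payload duplicated */
        n->next = NULL;
        *tail = n;
        tail = &n->next;
    }
    return head;
}
```

If the caller is expected to deep-free the returned ranges, a deep copy (or a documented shared-ownership contract) avoids freeing ranges still referenced by the base container.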

* [RFC v2 4/7] virtio-iommu: Compute host reserved regions
  2024-06-07 14:37 [RFC v2 0/7] VIRTIO-IOMMU/VFIO: Fix host iommu geometry handling for hotplugged devices Eric Auger
                   ` (2 preceding siblings ...)
  2024-06-07 14:37 ` [RFC v2 3/7] HostIOMMUDevice: Introduce get_iova_ranges callback Eric Auger
@ 2024-06-07 14:37 ` Eric Auger
  2024-06-11  3:25   ` Duan, Zhenzhong
  2024-06-07 14:37 ` [RFC v2 5/7] virtio-iommu: Remove the implementation of iommu_set_iova_range Eric Auger
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Eric Auger @ 2024-06-07 14:37 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, mst,
	jean-philippe, imammedo, peter.maydell, clg, yanghliu,
	zhenzhong.duan
  Cc: alex.williamson, jasowang, pbonzini, berrange

Compute the host reserved regions in virtio_iommu_set_iommu_device().
The usable IOVA regions are retrieved from the HostIOMMUDevice.
The virtio_iommu_set_host_iova_ranges() helper turns usable regions
into complementary reserved regions while checking that they are
included in existing ones. virtio_iommu_set_host_iova_ranges() reuses
the implementation of virtio_iommu_set_iova_ranges(), which will be
removed in subsequent patches. rebuild_resv_regions() is just moved.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/virtio/virtio-iommu.c | 151 ++++++++++++++++++++++++++++++---------
 1 file changed, 117 insertions(+), 34 deletions(-)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 0680a357f0..33e9682b83 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -494,12 +494,114 @@ get_host_iommu_device(VirtIOIOMMU *viommu, PCIBus *bus, int devfn) {
     return g_hash_table_lookup(viommu->host_iommu_devices, &key);
 }
 
+/**
+ * rebuild_resv_regions: rebuild resv regions with both the
+ * info of host resv ranges and property set resv ranges
+ */
+static int rebuild_resv_regions(IOMMUDevice *sdev)
+{
+    GList *l;
+    int i = 0;
+
+    /* free the existing list and rebuild it from scratch */
+    g_list_free_full(sdev->resv_regions, g_free);
+    sdev->resv_regions = NULL;
+
+    /* First add host reserved regions if any, all tagged as RESERVED */
+    for (l = sdev->host_resv_ranges; l; l = l->next) {
+        ReservedRegion *reg = g_new0(ReservedRegion, 1);
+        Range *r = (Range *)l->data;
+
+        reg->type = VIRTIO_IOMMU_RESV_MEM_T_RESERVED;
+        range_set_bounds(&reg->range, range_lob(r), range_upb(r));
+        sdev->resv_regions = resv_region_list_insert(sdev->resv_regions, reg);
+        trace_virtio_iommu_host_resv_regions(sdev->iommu_mr.parent_obj.name, i,
+                                             range_lob(&reg->range),
+                                             range_upb(&reg->range));
+        i++;
+    }
+    /*
+     * then add higher priority reserved regions set by the machine
+     * through properties
+     */
+    add_prop_resv_regions(sdev);
+    return 0;
+}
+
+static int virtio_iommu_set_host_iova_ranges(VirtIOIOMMU *s, PCIBus *bus,
+                                             int devfn, GList *iova_ranges,
+                                             Error **errp)
+{
+    IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, bus);
+    IOMMUDevice *sdev;
+    GList *current_ranges;
+    GList *l, *tmp, *new_ranges = NULL;
+    int ret = -EINVAL;
+
+    if (!sbus) {
+        error_setg(errp, "%s: no sbus", __func__);
+        return ret;
+    }
+
+    sdev = sbus->pbdev[devfn];
+
+    current_ranges = sdev->host_resv_ranges;
+
+    if (sdev->probe_done) {
+        error_setg(errp,
+                   "%s: Notified about new host reserved regions after probe",
+                   __func__);
+        goto out;
+    }
+
+    /* check that each new resv region is included in an existing one */
+    if (sdev->host_resv_ranges) {
+        range_inverse_array(iova_ranges,
+                            &new_ranges,
+                            0, UINT64_MAX);
+
+        for (tmp = new_ranges; tmp; tmp = tmp->next) {
+            Range *newr = (Range *)tmp->data;
+            bool included = false;
+
+            for (l = current_ranges; l; l = l->next) {
+                Range * r = (Range *)l->data;
+
+                if (range_contains_range(r, newr)) {
+                    included = true;
+                    break;
+                }
+            }
+            if (!included) {
+                goto error;
+            }
+        }
+        /* all new reserved ranges are included in existing ones */
+        ret = 0;
+        goto out;
+    }
+
+    range_inverse_array(iova_ranges,
+                        &sdev->host_resv_ranges,
+                        0, UINT64_MAX);
+    rebuild_resv_regions(sdev);
+
+    return 0;
+error:
+    error_setg(errp, "%s Conflicting host reserved ranges set!",
+               __func__);
+out:
+    g_list_free_full(new_ranges, g_free);
+    return ret;
+}
+
 static bool virtio_iommu_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
                                           HostIOMMUDevice *hiod, Error **errp)
 {
     VirtIOIOMMU *viommu = opaque;
     VirtioHostIOMMUDevice *vhiod;
+    HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_GET_CLASS(hiod);
     struct hiod_key *new_key;
+    GList *host_iova_ranges = NULL;
 
     assert(hiod);
 
@@ -509,6 +611,20 @@ static bool virtio_iommu_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
         return false;
     }
 
+    if (hiodc->get_iova_ranges) {
+        int ret;
+        host_iova_ranges = hiodc->get_iova_ranges(hiod, errp);
+        if (!host_iova_ranges) {
+            return true; /* some old kernels may not support that capability */
+        }
+        ret = virtio_iommu_set_host_iova_ranges(viommu, bus, devfn,
+                                                host_iova_ranges, errp);
+        if (ret) {
+            g_list_free_full(host_iova_ranges, g_free);
+            return false;
+        }
+    }
+
     vhiod = g_malloc0(sizeof(VirtioHostIOMMUDevice));
     vhiod->bus = bus;
     vhiod->devfn = (uint8_t)devfn;
@@ -521,6 +637,7 @@ static bool virtio_iommu_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
 
     object_ref(hiod);
     g_hash_table_insert(viommu->host_iommu_devices, new_key, vhiod);
+    g_list_free_full(host_iova_ranges, g_free);
 
     return true;
 }
@@ -1243,40 +1360,6 @@ static int virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr,
     return 0;
 }
 
-/**
- * rebuild_resv_regions: rebuild resv regions with both the
- * info of host resv ranges and property set resv ranges
- */
-static int rebuild_resv_regions(IOMMUDevice *sdev)
-{
-    GList *l;
-    int i = 0;
-
-    /* free the existing list and rebuild it from scratch */
-    g_list_free_full(sdev->resv_regions, g_free);
-    sdev->resv_regions = NULL;
-
-    /* First add host reserved regions if any, all tagged as RESERVED */
-    for (l = sdev->host_resv_ranges; l; l = l->next) {
-        ReservedRegion *reg = g_new0(ReservedRegion, 1);
-        Range *r = (Range *)l->data;
-
-        reg->type = VIRTIO_IOMMU_RESV_MEM_T_RESERVED;
-        range_set_bounds(&reg->range, range_lob(r), range_upb(r));
-        sdev->resv_regions = resv_region_list_insert(sdev->resv_regions, reg);
-        trace_virtio_iommu_host_resv_regions(sdev->iommu_mr.parent_obj.name, i,
-                                             range_lob(&reg->range),
-                                             range_upb(&reg->range));
-        i++;
-    }
-    /*
-     * then add higher priority reserved regions set by the machine
-     * through properties
-     */
-    add_prop_resv_regions(sdev);
-    return 0;
-}
-
 /**
  * virtio_iommu_set_iova_ranges: Conveys the usable IOVA ranges
  *
-- 
2.41.0



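The inclusion check at the heart of virtio_iommu_set_host_iova_ranges() can be sketched standalone (hypothetical helpers mirroring range_contains_range() and the nested loop in the patch): once reserved ranges exist for an endpoint, each newly computed reserved range must be fully contained in an existing one, otherwise the update is rejected.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

/* Inclusive reserved range. */
typedef struct { uint64_t lob, upb; } ResvRange;

/* True iff 'inner' lies entirely within 'outer'. */
static bool resv_contains(const ResvRange *outer, const ResvRange *inner)
{
    return outer->lob <= inner->lob && inner->upb <= outer->upb;
}

/* True iff every new reserved range is contained in some current one;
 * this is the condition under which the update is accepted. */
static bool new_ranges_included(const ResvRange *cur, size_t ncur,
                                const ResvRange *newr, size_t nnew)
{
    for (size_t i = 0; i < nnew; i++) {
        bool included = false;

        for (size_t j = 0; j < ncur; j++) {
            if (resv_contains(&cur[j], &newr[i])) {
                included = true;
                break;
            }
        }
        if (!included) {
            return false; /* conflicting host reserved ranges */
        }
    }
    return true;
}
```

This is why a later notification can only shrink the usable space, never grow it: a new reserved range falling outside all existing ones signals conflicting host constraints.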

* [RFC v2 5/7] virtio-iommu: Remove the implementation of iommu_set_iova_range
  2024-06-07 14:37 [RFC v2 0/7] VIRTIO-IOMMU/VFIO: Fix host iommu geometry handling for hotplugged devices Eric Auger
                   ` (3 preceding siblings ...)
  2024-06-07 14:37 ` [RFC v2 4/7] virtio-iommu: Compute host reserved regions Eric Auger
@ 2024-06-07 14:37 ` Eric Auger
  2024-06-11  3:22   ` Duan, Zhenzhong
  2024-06-07 14:37 ` [RFC v2 6/7] hw/vfio: Remove memory_region_iommu_set_iova_ranges() call Eric Auger
  2024-06-07 14:37 ` [RFC v2 7/7] memory: Remove IOMMU MR iommu_set_iova_range API Eric Auger
  6 siblings, 1 reply; 19+ messages in thread
From: Eric Auger @ 2024-06-07 14:37 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, mst,
	jean-philippe, imammedo, peter.maydell, clg, yanghliu,
	zhenzhong.duan
  Cc: alex.williamson, jasowang, pbonzini, berrange

Now that we use the PCIIOMMUOps to convey information about usable
IOVA ranges, we no longer need to implement the iommu_set_iova_ranges
IOMMU MR callback.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/virtio/virtio-iommu.c | 67 ----------------------------------------
 1 file changed, 67 deletions(-)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 33e9682b83..643bbb060b 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -1360,72 +1360,6 @@ static int virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr,
     return 0;
 }
 
-/**
- * virtio_iommu_set_iova_ranges: Conveys the usable IOVA ranges
- *
- * The function turns those into reserved ranges. Once some
- * reserved ranges have been set, new reserved regions cannot be
- * added outside of the original ones.
- *
- * @mr: IOMMU MR
- * @iova_ranges: list of usable IOVA ranges
- * @errp: error handle
- */
-static int virtio_iommu_set_iova_ranges(IOMMUMemoryRegion *mr,
-                                        GList *iova_ranges,
-                                        Error **errp)
-{
-    IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
-    GList *current_ranges = sdev->host_resv_ranges;
-    GList *l, *tmp, *new_ranges = NULL;
-    int ret = -EINVAL;
-
-    /* check that each new resv region is included in an existing one */
-    if (sdev->host_resv_ranges) {
-        range_inverse_array(iova_ranges,
-                            &new_ranges,
-                            0, UINT64_MAX);
-
-        for (tmp = new_ranges; tmp; tmp = tmp->next) {
-            Range *newr = (Range *)tmp->data;
-            bool included = false;
-
-            for (l = current_ranges; l; l = l->next) {
-                Range * r = (Range *)l->data;
-
-                if (range_contains_range(r, newr)) {
-                    included = true;
-                    break;
-                }
-            }
-            if (!included) {
-                goto error;
-            }
-        }
-        /* all new reserved ranges are included in existing ones */
-        ret = 0;
-        goto out;
-    }
-
-    if (sdev->probe_done) {
-        warn_report("%s: Notified about new host reserved regions after probe",
-                    mr->parent_obj.name);
-    }
-
-    range_inverse_array(iova_ranges,
-                        &sdev->host_resv_ranges,
-                        0, UINT64_MAX);
-    rebuild_resv_regions(sdev);
-
-    return 0;
-error:
-    error_setg(errp, "IOMMU mr=%s Conflicting host reserved ranges set!",
-               mr->parent_obj.name);
-out:
-    g_list_free_full(new_ranges, g_free);
-    return ret;
-}
-
 static void virtio_iommu_system_reset(void *opaque)
 {
     VirtIOIOMMU *s = opaque;
@@ -1751,7 +1685,6 @@ static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
     imrc->replay = virtio_iommu_replay;
     imrc->notify_flag_changed = virtio_iommu_notify_flag_changed;
     imrc->iommu_set_page_size_mask = virtio_iommu_set_page_size_mask;
-    imrc->iommu_set_iova_ranges = virtio_iommu_set_iova_ranges;
 }
 
 static const TypeInfo virtio_iommu_info = {
-- 
2.41.0




* [RFC v2 6/7] hw/vfio: Remove memory_region_iommu_set_iova_ranges() call
  2024-06-07 14:37 [RFC v2 0/7] VIRTIO-IOMMU/VFIO: Fix host iommu geometry handling for hotplugged devices Eric Auger
                   ` (4 preceding siblings ...)
  2024-06-07 14:37 ` [RFC v2 5/7] virtio-iommu: Remove the implementation of iommu_set_iova_range Eric Auger
@ 2024-06-07 14:37 ` Eric Auger
  2024-06-11  3:23   ` Duan, Zhenzhong
  2024-06-07 14:37 ` [RFC v2 7/7] memory: Remove IOMMU MR iommu_set_iova_range API Eric Auger
  6 siblings, 1 reply; 19+ messages in thread
From: Eric Auger @ 2024-06-07 14:37 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, mst,
	jean-philippe, imammedo, peter.maydell, clg, yanghliu,
	zhenzhong.duan
  Cc: alex.williamson, jasowang, pbonzini, berrange

As we have just removed the only implementation of
iommu_set_iova_ranges IOMMU MR callback in the virtio-iommu,
let's remove the call to the memory wrapper. Usable IOVA ranges
are now conveyed through the PCIIOMMUOps in VFIO-PCI.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/vfio/common.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index f20a7b5bba..9e4c0cc95f 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -630,16 +630,6 @@ static void vfio_listener_region_add(MemoryListener *listener,
             goto fail;
         }
 
-        if (bcontainer->iova_ranges) {
-            ret = memory_region_iommu_set_iova_ranges(giommu->iommu_mr,
-                                                      bcontainer->iova_ranges,
-                                                      &err);
-            if (ret) {
-                g_free(giommu);
-                goto fail;
-            }
-        }
-
         ret = memory_region_register_iommu_notifier(section->mr, &giommu->n,
                                                     &err);
         if (ret) {
-- 
2.41.0




* [RFC v2 7/7] memory: Remove IOMMU MR iommu_set_iova_range API
  2024-06-07 14:37 [RFC v2 0/7] VIRTIO-IOMMU/VFIO: Fix host iommu geometry handling for hotplugged devices Eric Auger
                   ` (5 preceding siblings ...)
  2024-06-07 14:37 ` [RFC v2 6/7] hw/vfio: Remove memory_region_iommu_set_iova_ranges() call Eric Auger
@ 2024-06-07 14:37 ` Eric Auger
  2024-06-11  3:23   ` Duan, Zhenzhong
  6 siblings, 1 reply; 19+ messages in thread
From: Eric Auger @ 2024-06-07 14:37 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, mst,
	jean-philippe, imammedo, peter.maydell, clg, yanghliu,
	zhenzhong.duan
  Cc: alex.williamson, jasowang, pbonzini, berrange

Since the host IOVA ranges are now passed through the
PCIIOMMUOps set_host_resv_regions and we have removed
both the only implementation of iommu_set_iova_ranges() in
the virtio-iommu and its only call site in vfio/common,
let's retire the IOMMU MR API and its memory wrapper.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 include/exec/memory.h | 32 --------------------------------
 system/memory.c       | 13 -------------
 2 files changed, 45 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 9cdd64e9c6..35d772e52b 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -530,26 +530,6 @@ struct IOMMUMemoryRegionClass {
      int (*iommu_set_page_size_mask)(IOMMUMemoryRegion *iommu,
                                      uint64_t page_size_mask,
                                      Error **errp);
-    /**
-     * @iommu_set_iova_ranges:
-     *
-     * Propagate information about the usable IOVA ranges for a given IOMMU
-     * memory region. Used for example to propagate host physical device
-     * reserved memory region constraints to the virtual IOMMU.
-     *
-     * Optional method: if this method is not provided, then the default IOVA
-     * aperture is used.
-     *
-     * @iommu: the IOMMUMemoryRegion
-     *
-     * @iova_ranges: list of ordered IOVA ranges (at least one range)
-     *
-     * Returns 0 on success, or a negative error. In case of failure, the error
-     * object must be created.
-     */
-     int (*iommu_set_iova_ranges)(IOMMUMemoryRegion *iommu,
-                                  GList *iova_ranges,
-                                  Error **errp);
 };
 
 typedef struct RamDiscardListener RamDiscardListener;
@@ -1945,18 +1925,6 @@ int memory_region_iommu_set_page_size_mask(IOMMUMemoryRegion *iommu_mr,
                                            uint64_t page_size_mask,
                                            Error **errp);
 
-/**
- * memory_region_iommu_set_iova_ranges - Set the usable IOVA ranges
- * for a given IOMMU MR region
- *
- * @iommu: IOMMU memory region
- * @iova_ranges: list of ordered IOVA ranges (at least one range)
- * @errp: pointer to Error*, to store an error if it happens.
- */
-int memory_region_iommu_set_iova_ranges(IOMMUMemoryRegion *iommu,
-                                        GList *iova_ranges,
-                                        Error **errp);
-
 /**
  * memory_region_name: get a memory region's name
  *
diff --git a/system/memory.c b/system/memory.c
index 9540caa8a1..248d514f83 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -1914,19 +1914,6 @@ int memory_region_iommu_set_page_size_mask(IOMMUMemoryRegion *iommu_mr,
     return ret;
 }
 
-int memory_region_iommu_set_iova_ranges(IOMMUMemoryRegion *iommu_mr,
-                                        GList *iova_ranges,
-                                        Error **errp)
-{
-    IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_GET_CLASS(iommu_mr);
-    int ret = 0;
-
-    if (imrc->iommu_set_iova_ranges) {
-        ret = imrc->iommu_set_iova_ranges(iommu_mr, iova_ranges, errp);
-    }
-    return ret;
-}
-
 int memory_region_register_iommu_notifier(MemoryRegion *mr,
                                           IOMMUNotifier *n, Error **errp)
 {
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* RE: [RFC v2 2/7] virtio-iommu: Implement [set|unset]_iommu_device() callbacks
  2024-06-07 14:37 ` [RFC v2 2/7] virtio-iommu: Implement [set|unset]_iommu_device() callbacks Eric Auger
@ 2024-06-11  2:38   ` Duan, Zhenzhong
  2024-06-13  8:13     ` Eric Auger
  0 siblings, 1 reply; 19+ messages in thread
From: Duan, Zhenzhong @ 2024-06-11  2:38 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro@gmail.com, qemu-devel@nongnu.org,
	qemu-arm@nongnu.org, mst@redhat.com, jean-philippe@linaro.org,
	imammedo@redhat.com, peter.maydell@linaro.org, clg@redhat.com,
	yanghliu@redhat.com
  Cc: alex.williamson@redhat.com, jasowang@redhat.com,
	pbonzini@redhat.com, berrange@redhat.com

Hi Eric,

>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: [RFC v2 2/7] virtio-iommu: Implement [set|unset]_iommu_device()
>callbacks
>
>Implement the PCIIOMMUOps [set|unset]_iommu_device() callbacks.
>In set(), a VirtioHostIOMMUDevice is allocated which holds
>a reference to the HostIOMMUDevice. This object is stored in a hash
>table indexed by PCI BDF. The handle to the host IOMMU device
>will allow retrieving information related to the physical IOMMU.
>
>Signed-off-by: Eric Auger <eric.auger@redhat.com>
>---
> include/hw/virtio/virtio-iommu.h |  9 ++++
> hw/virtio/virtio-iommu.c         | 87
>++++++++++++++++++++++++++++++++
> 2 files changed, 96 insertions(+)
>
>diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-
>iommu.h
>index 83a52cc446..4f664ea0c4 100644
>--- a/include/hw/virtio/virtio-iommu.h
>+++ b/include/hw/virtio/virtio-iommu.h
>@@ -45,6 +45,14 @@ typedef struct IOMMUDevice {
>     bool probe_done;
> } IOMMUDevice;
>
>+typedef struct VirtioHostIOMMUDevice {
>+    void *viommu;
>+    PCIBus *bus;
>+    uint8_t devfn;
>+    HostIOMMUDevice *dev;
>+    QLIST_ENTRY(VirtioHostIOMMUDevice) next;
>+} VirtioHostIOMMUDevice;
>+
> typedef struct IOMMUPciBus {
>     PCIBus       *bus;
>     IOMMUDevice  *pbdev[]; /* Parent array is sparse, so dynamically alloc
>*/
>@@ -57,6 +65,7 @@ struct VirtIOIOMMU {
>     struct virtio_iommu_config config;
>     uint64_t features;
>     GHashTable *as_by_busptr;
>+    GHashTable *host_iommu_devices;
>     IOMMUPciBus *iommu_pcibus_by_bus_num[PCI_BUS_MAX];
>     PCIBus *primary_bus;
>     ReservedRegion *prop_resv_regions;
>diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>index 1326c6ec41..0680a357f0 100644
>--- a/hw/virtio/virtio-iommu.c
>+++ b/hw/virtio/virtio-iommu.c
>@@ -28,6 +28,7 @@
> #include "sysemu/kvm.h"
> #include "sysemu/reset.h"
> #include "sysemu/sysemu.h"
>+#include "sysemu/host_iommu_device.h"

Not sure if it would be better to move this to include/hw/virtio/virtio-iommu.h,
as HostIOMMUDevice is used there.

> #include "qemu/reserved-region.h"
> #include "qemu/units.h"
> #include "qapi/error.h"
>@@ -69,6 +70,11 @@ typedef struct VirtIOIOMMUMapping {
>     uint32_t flags;
> } VirtIOIOMMUMapping;
>
>+struct hiod_key {
>+    PCIBus *bus;
>+    uint8_t devfn;
>+};
>+
> static inline uint16_t virtio_iommu_get_bdf(IOMMUDevice *dev)
> {
>     return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
>@@ -462,8 +468,86 @@ static AddressSpace
>*virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
>     return &sdev->as;
> }
>
>+static gboolean hiod_equal(gconstpointer v1, gconstpointer v2)
>+{
>+    const struct hiod_key *key1 = v1;
>+    const struct hiod_key *key2 = v2;
>+
>+    return (key1->bus == key2->bus) && (key1->devfn == key2->devfn);
>+}
>+
>+static guint hiod_hash(gconstpointer v)
>+{
>+    const struct hiod_key *key = v;
>+    guint value = (guint)(uintptr_t)key->bus;
>+
>+    return (guint)(value << 8 | key->devfn);
>+}
>+
>+static VirtioHostIOMMUDevice *
>+get_host_iommu_device(VirtIOIOMMU *viommu, PCIBus *bus, int devfn) {
>+    struct hiod_key key = {
>+        .bus = bus,
>+        .devfn = devfn,
>+    };
>+
>+    return g_hash_table_lookup(viommu->host_iommu_devices, &key);
>+}
>+
>+static bool virtio_iommu_set_iommu_device(PCIBus *bus, void *opaque,
>int devfn,
>+                                          HostIOMMUDevice *hiod, Error **errp)
>+{
>+    VirtIOIOMMU *viommu = opaque;
>+    VirtioHostIOMMUDevice *vhiod;
>+    struct hiod_key *new_key;
>+
>+    assert(hiod);
>+
>+    vhiod = get_host_iommu_device(viommu, bus, devfn);
>+    if (vhiod) {
>+        error_setg(errp, "VirtioHostIOMMUDevice already exists");
>+        return false;
>+    }
>+
>+    vhiod = g_malloc0(sizeof(VirtioHostIOMMUDevice));
>+    vhiod->bus = bus;
>+    vhiod->devfn = (uint8_t)devfn;
>+    vhiod->viommu = viommu;
>+    vhiod->dev = hiod;
>+
>+    new_key = g_malloc(sizeof(*new_key));
>+    new_key->bus = bus;
>+    new_key->devfn = devfn;
>+
>+    object_ref(hiod);
>+    g_hash_table_insert(viommu->host_iommu_devices, new_key, vhiod);
>+
>+    return true;
>+}
>+
>+static void
>+virtio_iommu_unset_iommu_device(PCIBus *bus, void *opaque, int devfn)
>+{
>+    VirtIOIOMMU *viommu = opaque;
>+    VirtioHostIOMMUDevice *vhiod;
>+    struct hiod_key key = {
>+        .bus = bus,
>+        .devfn = devfn,
>+    };
>+
>+    vhiod = g_hash_table_lookup(viommu->host_iommu_devices, &key);
>+    if (!vhiod) {
>+        return;
>+    }
>+
>+    g_hash_table_remove(viommu->host_iommu_devices, &key);
>+    object_unref(vhiod->dev);

This looks like a use-after-free.

Thanks
Zhenzhong

>+}
>+
> static const PCIIOMMUOps virtio_iommu_ops = {
>     .get_address_space = virtio_iommu_find_add_as,
>+    .set_iommu_device = virtio_iommu_set_iommu_device,
>+    .unset_iommu_device = virtio_iommu_unset_iommu_device,
> };
>
> static int virtio_iommu_attach(VirtIOIOMMU *s,
>@@ -1357,6 +1441,9 @@ static void
>virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>
>     s->as_by_busptr = g_hash_table_new_full(NULL, NULL, NULL, g_free);
>
>+    s->host_iommu_devices = g_hash_table_new_full(hiod_hash,
>hiod_equal,
>+                                                  g_free, g_free);
>+
>     if (s->primary_bus) {
>         pci_setup_iommu(s->primary_bus, &virtio_iommu_ops, s);
>     } else {
>--
>2.41.0



^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [RFC v2 1/7] HostIOMMUDevice: Store the VFIO/VDPA agent
  2024-06-07 14:37 ` [RFC v2 1/7] HostIOMMUDevice: Store the VFIO/VDPA agent Eric Auger
@ 2024-06-11  3:22   ` Duan, Zhenzhong
  0 siblings, 0 replies; 19+ messages in thread
From: Duan, Zhenzhong @ 2024-06-11  3:22 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro@gmail.com, qemu-devel@nongnu.org,
	qemu-arm@nongnu.org, mst@redhat.com, jean-philippe@linaro.org,
	imammedo@redhat.com, peter.maydell@linaro.org, clg@redhat.com,
	yanghliu@redhat.com
  Cc: alex.williamson@redhat.com, jasowang@redhat.com,
	pbonzini@redhat.com, berrange@redhat.com



>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: [RFC v2 1/7] HostIOMMUDevice: Store the VFIO/VDPA agent
>
>Store the agent device (VFIO or VDPA) in the host IOMMU device.
>This will allow easy access to some of its resources.
>
>Signed-off-by: Eric Auger <eric.auger@redhat.com>
>---

Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> 

Thanks
Zhenzhong

> include/sysemu/host_iommu_device.h | 1 +
> hw/vfio/container.c                | 1 +
> hw/vfio/iommufd.c                  | 2 ++
> 3 files changed, 4 insertions(+)
>
>diff --git a/include/sysemu/host_iommu_device.h
>b/include/sysemu/host_iommu_device.h
>index a57873958b..3e5f058e7b 100644
>--- a/include/sysemu/host_iommu_device.h
>+++ b/include/sysemu/host_iommu_device.h
>@@ -34,6 +34,7 @@ struct HostIOMMUDevice {
>     Object parent_obj;
>
>     char *name;
>+    void *agent; /* pointer to agent device, ie. VFIO or VDPA device */
>     HostIOMMUDeviceCaps caps;
> };
>
>diff --git a/hw/vfio/container.c b/hw/vfio/container.c
>index 26e6f7fb4f..b728b978a2 100644
>--- a/hw/vfio/container.c
>+++ b/hw/vfio/container.c
>@@ -1145,6 +1145,7 @@ static bool
>hiod_legacy_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
>
>     hiod->name = g_strdup(vdev->name);
>     hiod->caps.aw_bits = vfio_device_get_aw_bits(vdev);
>+    hiod->agent = opaque;
>
>     return true;
> }
>diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>index 409ed3dcc9..dbdae1adbb 100644
>--- a/hw/vfio/iommufd.c
>+++ b/hw/vfio/iommufd.c
>@@ -631,6 +631,8 @@ static bool
>hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
>         struct iommu_hw_info_vtd vtd;
>     } data;
>
>+    hiod->agent = opaque;
>+
>     if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid,
>                                          &type, &data, sizeof(data), errp)) {
>         return false;
>--
>2.41.0



^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [RFC v2 5/7] virtio-iommu: Remove the implementation of iommu_set_iova_range
  2024-06-07 14:37 ` [RFC v2 5/7] virtio-iommu: Remove the implementation of iommu_set_iova_range Eric Auger
@ 2024-06-11  3:22   ` Duan, Zhenzhong
  0 siblings, 0 replies; 19+ messages in thread
From: Duan, Zhenzhong @ 2024-06-11  3:22 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro@gmail.com, qemu-devel@nongnu.org,
	qemu-arm@nongnu.org, mst@redhat.com, jean-philippe@linaro.org,
	imammedo@redhat.com, peter.maydell@linaro.org, clg@redhat.com,
	yanghliu@redhat.com
  Cc: alex.williamson@redhat.com, jasowang@redhat.com,
	pbonzini@redhat.com, berrange@redhat.com



>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: [RFC v2 5/7] virtio-iommu: Remove the implementation of
>iommu_set_iova_range
>
>Now that we use PCIIOMMUOps to convey information about usable IOVA
>ranges, we no longer need to implement the iommu_set_iova_ranges IOMMU
>MR callback.
>
>Signed-off-by: Eric Auger <eric.auger@redhat.com>

Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> 

Thanks
Zhenzhong

>---
> hw/virtio/virtio-iommu.c | 67 ----------------------------------------
> 1 file changed, 67 deletions(-)
>
>diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>index 33e9682b83..643bbb060b 100644
>--- a/hw/virtio/virtio-iommu.c
>+++ b/hw/virtio/virtio-iommu.c
>@@ -1360,72 +1360,6 @@ static int
>virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr,
>     return 0;
> }
>
>-/**
>- * virtio_iommu_set_iova_ranges: Conveys the usable IOVA ranges
>- *
>- * The function turns those into reserved ranges. Once some
>- * reserved ranges have been set, new reserved regions cannot be
>- * added outside of the original ones.
>- *
>- * @mr: IOMMU MR
>- * @iova_ranges: list of usable IOVA ranges
>- * @errp: error handle
>- */
>-static int virtio_iommu_set_iova_ranges(IOMMUMemoryRegion *mr,
>-                                        GList *iova_ranges,
>-                                        Error **errp)
>-{
>-    IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
>-    GList *current_ranges = sdev->host_resv_ranges;
>-    GList *l, *tmp, *new_ranges = NULL;
>-    int ret = -EINVAL;
>-
>-    /* check that each new resv region is included in an existing one */
>-    if (sdev->host_resv_ranges) {
>-        range_inverse_array(iova_ranges,
>-                            &new_ranges,
>-                            0, UINT64_MAX);
>-
>-        for (tmp = new_ranges; tmp; tmp = tmp->next) {
>-            Range *newr = (Range *)tmp->data;
>-            bool included = false;
>-
>-            for (l = current_ranges; l; l = l->next) {
>-                Range * r = (Range *)l->data;
>-
>-                if (range_contains_range(r, newr)) {
>-                    included = true;
>-                    break;
>-                }
>-            }
>-            if (!included) {
>-                goto error;
>-            }
>-        }
>-        /* all new reserved ranges are included in existing ones */
>-        ret = 0;
>-        goto out;
>-    }
>-
>-    if (sdev->probe_done) {
>-        warn_report("%s: Notified about new host reserved regions after
>probe",
>-                    mr->parent_obj.name);
>-    }
>-
>-    range_inverse_array(iova_ranges,
>-                        &sdev->host_resv_ranges,
>-                        0, UINT64_MAX);
>-    rebuild_resv_regions(sdev);
>-
>-    return 0;
>-error:
>-    error_setg(errp, "IOMMU mr=%s Conflicting host reserved ranges set!",
>-               mr->parent_obj.name);
>-out:
>-    g_list_free_full(new_ranges, g_free);
>-    return ret;
>-}
>-
> static void virtio_iommu_system_reset(void *opaque)
> {
>     VirtIOIOMMU *s = opaque;
>@@ -1751,7 +1685,6 @@ static void
>virtio_iommu_memory_region_class_init(ObjectClass *klass,
>     imrc->replay = virtio_iommu_replay;
>     imrc->notify_flag_changed = virtio_iommu_notify_flag_changed;
>     imrc->iommu_set_page_size_mask = virtio_iommu_set_page_size_mask;
>-    imrc->iommu_set_iova_ranges = virtio_iommu_set_iova_ranges;
> }
>
> static const TypeInfo virtio_iommu_info = {
>--
>2.41.0



^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [RFC v2 6/7] hw/vfio: Remove memory_region_iommu_set_iova_ranges() call
  2024-06-07 14:37 ` [RFC v2 6/7] hw/vfio: Remove memory_region_iommu_set_iova_ranges() call Eric Auger
@ 2024-06-11  3:23   ` Duan, Zhenzhong
  0 siblings, 0 replies; 19+ messages in thread
From: Duan, Zhenzhong @ 2024-06-11  3:23 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro@gmail.com, qemu-devel@nongnu.org,
	qemu-arm@nongnu.org, mst@redhat.com, jean-philippe@linaro.org,
	imammedo@redhat.com, peter.maydell@linaro.org, clg@redhat.com,
	yanghliu@redhat.com
  Cc: alex.williamson@redhat.com, jasowang@redhat.com,
	pbonzini@redhat.com, berrange@redhat.com



>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: [RFC v2 6/7] hw/vfio: Remove
>memory_region_iommu_set_iova_ranges() call
>
>As we have just removed the only implementation of
>iommu_set_iova_ranges IOMMU MR callback in the virtio-iommu,
>let's remove the call to the memory wrapper. Usable IOVA ranges
>are now conveyed through the PCIIOMMUOps in VFIO-PCI.
>
>Signed-off-by: Eric Auger <eric.auger@redhat.com>

Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> 

Thanks
Zhenzhong

>---
> hw/vfio/common.c | 10 ----------
> 1 file changed, 10 deletions(-)
>
>diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>index f20a7b5bba..9e4c0cc95f 100644
>--- a/hw/vfio/common.c
>+++ b/hw/vfio/common.c
>@@ -630,16 +630,6 @@ static void
>vfio_listener_region_add(MemoryListener *listener,
>             goto fail;
>         }
>
>-        if (bcontainer->iova_ranges) {
>-            ret = memory_region_iommu_set_iova_ranges(giommu-
>>iommu_mr,
>-                                                      bcontainer->iova_ranges,
>-                                                      &err);
>-            if (ret) {
>-                g_free(giommu);
>-                goto fail;
>-            }
>-        }
>-
>         ret = memory_region_register_iommu_notifier(section->mr,
>&giommu->n,
>                                                     &err);
>         if (ret) {
>--
>2.41.0



^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [RFC v2 7/7] memory: Remove IOMMU MR iommu_set_iova_range API
  2024-06-07 14:37 ` [RFC v2 7/7] memory: Remove IOMMU MR iommu_set_iova_range API Eric Auger
@ 2024-06-11  3:23   ` Duan, Zhenzhong
  0 siblings, 0 replies; 19+ messages in thread
From: Duan, Zhenzhong @ 2024-06-11  3:23 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro@gmail.com, qemu-devel@nongnu.org,
	qemu-arm@nongnu.org, mst@redhat.com, jean-philippe@linaro.org,
	imammedo@redhat.com, peter.maydell@linaro.org, clg@redhat.com,
	yanghliu@redhat.com
  Cc: alex.williamson@redhat.com, jasowang@redhat.com,
	pbonzini@redhat.com, berrange@redhat.com



>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: [RFC v2 7/7] memory: Remove IOMMU MR iommu_set_iova_range
>API
>
>Since the host IOVA ranges are now passed through the
>PCIIOMMUOps set_host_resv_regions and we have removed
>the only implementation of iommu_set_iova_range() in
>the virtio-iommu and the only call site in vfio/common,
>let's retire the IOMMU MR API and its memory wrapper.
>
>Signed-off-by: Eric Auger <eric.auger@redhat.com>

Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> 

Thanks
Zhenzhong

>---
> include/exec/memory.h | 32 --------------------------------
> system/memory.c       | 13 -------------
> 2 files changed, 45 deletions(-)
>
>diff --git a/include/exec/memory.h b/include/exec/memory.h
>index 9cdd64e9c6..35d772e52b 100644
>--- a/include/exec/memory.h
>+++ b/include/exec/memory.h
>@@ -530,26 +530,6 @@ struct IOMMUMemoryRegionClass {
>      int (*iommu_set_page_size_mask)(IOMMUMemoryRegion *iommu,
>                                      uint64_t page_size_mask,
>                                      Error **errp);
>-    /**
>-     * @iommu_set_iova_ranges:
>-     *
>-     * Propagate information about the usable IOVA ranges for a given
>IOMMU
>-     * memory region. Used for example to propagate host physical device
>-     * reserved memory region constraints to the virtual IOMMU.
>-     *
>-     * Optional method: if this method is not provided, then the default IOVA
>-     * aperture is used.
>-     *
>-     * @iommu: the IOMMUMemoryRegion
>-     *
>-     * @iova_ranges: list of ordered IOVA ranges (at least one range)
>-     *
>-     * Returns 0 on success, or a negative error. In case of failure, the error
>-     * object must be created.
>-     */
>-     int (*iommu_set_iova_ranges)(IOMMUMemoryRegion *iommu,
>-                                  GList *iova_ranges,
>-                                  Error **errp);
> };
>
> typedef struct RamDiscardListener RamDiscardListener;
>@@ -1945,18 +1925,6 @@ int
>memory_region_iommu_set_page_size_mask(IOMMUMemoryRegion
>*iommu_mr,
>                                            uint64_t page_size_mask,
>                                            Error **errp);
>
>-/**
>- * memory_region_iommu_set_iova_ranges - Set the usable IOVA ranges
>- * for a given IOMMU MR region
>- *
>- * @iommu: IOMMU memory region
>- * @iova_ranges: list of ordered IOVA ranges (at least one range)
>- * @errp: pointer to Error*, to store an error if it happens.
>- */
>-int memory_region_iommu_set_iova_ranges(IOMMUMemoryRegion
>*iommu,
>-                                        GList *iova_ranges,
>-                                        Error **errp);
>-
> /**
>  * memory_region_name: get a memory region's name
>  *
>diff --git a/system/memory.c b/system/memory.c
>index 9540caa8a1..248d514f83 100644
>--- a/system/memory.c
>+++ b/system/memory.c
>@@ -1914,19 +1914,6 @@ int
>memory_region_iommu_set_page_size_mask(IOMMUMemoryRegion
>*iommu_mr,
>     return ret;
> }
>
>-int memory_region_iommu_set_iova_ranges(IOMMUMemoryRegion
>*iommu_mr,
>-                                        GList *iova_ranges,
>-                                        Error **errp)
>-{
>-    IOMMUMemoryRegionClass *imrc =
>IOMMU_MEMORY_REGION_GET_CLASS(iommu_mr);
>-    int ret = 0;
>-
>-    if (imrc->iommu_set_iova_ranges) {
>-        ret = imrc->iommu_set_iova_ranges(iommu_mr, iova_ranges, errp);
>-    }
>-    return ret;
>-}
>-
> int memory_region_register_iommu_notifier(MemoryRegion *mr,
>                                           IOMMUNotifier *n, Error **errp)
> {
>--
>2.41.0



^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [RFC v2 3/7] HostIOMMUDevice: Introduce get_iova_ranges callback
  2024-06-07 14:37 ` [RFC v2 3/7] HostIOMMUDevice: Introduce get_iova_ranges callback Eric Auger
@ 2024-06-11  3:24   ` Duan, Zhenzhong
  2024-06-13  8:19     ` Eric Auger
  0 siblings, 1 reply; 19+ messages in thread
From: Duan, Zhenzhong @ 2024-06-11  3:24 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro@gmail.com, qemu-devel@nongnu.org,
	qemu-arm@nongnu.org, mst@redhat.com, jean-philippe@linaro.org,
	imammedo@redhat.com, peter.maydell@linaro.org, clg@redhat.com,
	yanghliu@redhat.com
  Cc: alex.williamson@redhat.com, jasowang@redhat.com,
	pbonzini@redhat.com, berrange@redhat.com



>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: [RFC v2 3/7] HostIOMMUDevice: Introduce get_iova_ranges
>callback
>
>Introduce a new HostIOMMUDevice callback that allows retrieving
>the usable IOVA ranges.
>
>Implement this callback in the legacy VFIO and IOMMUFD VFIO
>host iommu devices. This relies on the VFIODevice agent's
>base container iova_ranges resource.
>
>Signed-off-by: Eric Auger <eric.auger@redhat.com>
>---
> include/sysemu/host_iommu_device.h |  8 ++++++++
> hw/vfio/container.c                | 14 ++++++++++++++
> hw/vfio/iommufd.c                  | 14 ++++++++++++++
> 3 files changed, 36 insertions(+)
>
>diff --git a/include/sysemu/host_iommu_device.h
>b/include/sysemu/host_iommu_device.h
>index 3e5f058e7b..40e0fa13ef 100644
>--- a/include/sysemu/host_iommu_device.h
>+++ b/include/sysemu/host_iommu_device.h
>@@ -80,6 +80,14 @@ struct HostIOMMUDeviceClass {
>      * i.e., HOST_IOMMU_DEVICE_CAP_AW_BITS.
>      */
>     int (*get_cap)(HostIOMMUDevice *hiod, int cap, Error **errp);
>+    /**
>+     * @get_iova_ranges: Return the list of usable IOVA ranges
>+     * associated with the @hiod host IOMMU device
>+     *
>+     * @hiod: handle to the host IOMMU device
>+     * @errp: error handle
>+     */
>+    GList* (*get_iova_ranges)(HostIOMMUDevice *hiod, Error **errp);

Previously I thought to expose iova_ranges directly in HostIOMMUDevice::caps.iova_ranges,
but a new callback looks better for a GList pointer.

> };
>
> /*
>diff --git a/hw/vfio/container.c b/hw/vfio/container.c
>index b728b978a2..edd0df6262 100644
>--- a/hw/vfio/container.c
>+++ b/hw/vfio/container.c
>@@ -1164,12 +1164,26 @@ static int
>hiod_legacy_vfio_get_cap(HostIOMMUDevice *hiod, int cap,
>     }
> }
>
>+static GList *
>+hiod_legacy_vfio_get_iova_ranges(HostIOMMUDevice *hiod, Error **errp)
>+{
>+    VFIODevice *vdev = hiod->agent;
>+    GList *l = NULL;

g_assert(vdev)?

>+
>+    if (vdev && vdev->bcontainer) {
>+        l = g_list_copy(vdev->bcontainer->iova_ranges);
>+    }
>+
>+    return l;
>+}
>+
> static void hiod_legacy_vfio_class_init(ObjectClass *oc, void *data)
> {
>     HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
>
>     hioc->realize = hiod_legacy_vfio_realize;
>     hioc->get_cap = hiod_legacy_vfio_get_cap;
>+    hioc->get_iova_ranges = hiod_legacy_vfio_get_iova_ranges;
> };
>
> static const TypeInfo types[] = {
>diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>index dbdae1adbb..1706784063 100644
>--- a/hw/vfio/iommufd.c
>+++ b/hw/vfio/iommufd.c
>@@ -645,11 +645,25 @@ static bool
>hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
>     return true;
> }
>
>+static GList *
>+hiod_iommufd_vfio_get_iova_ranges(HostIOMMUDevice *hiod, Error
>**errp)
>+{
>+    VFIODevice *vdev = hiod->agent;
>+    GList *l = NULL;
>+

Same here.

Thanks
Zhenzhong

>+    if (vdev && vdev->bcontainer) {
>+        l = g_list_copy(vdev->bcontainer->iova_ranges);
>+    }
>+
>+    return l;
>+}
>+
> static void hiod_iommufd_vfio_class_init(ObjectClass *oc, void *data)
> {
>     HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_CLASS(oc);
>
>     hiodc->realize = hiod_iommufd_vfio_realize;
>+    hiodc->get_iova_ranges = hiod_iommufd_vfio_get_iova_ranges;
> };
>
> static const TypeInfo types[] = {
>--
>2.41.0



^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [RFC v2 4/7] virtio-iommu: Compute host reserved regions
  2024-06-07 14:37 ` [RFC v2 4/7] virtio-iommu: Compute host reserved regions Eric Auger
@ 2024-06-11  3:25   ` Duan, Zhenzhong
  2024-06-13  9:01     ` Eric Auger
  0 siblings, 1 reply; 19+ messages in thread
From: Duan, Zhenzhong @ 2024-06-11  3:25 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro@gmail.com, qemu-devel@nongnu.org,
	qemu-arm@nongnu.org, mst@redhat.com, jean-philippe@linaro.org,
	imammedo@redhat.com, peter.maydell@linaro.org, clg@redhat.com,
	yanghliu@redhat.com
  Cc: alex.williamson@redhat.com, jasowang@redhat.com,
	pbonzini@redhat.com, berrange@redhat.com



>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: [RFC v2 4/7] virtio-iommu: Compute host reserved regions
>
>Compute the host reserved regions in virtio_iommu_set_iommu_device().
>The usable IOVA regions are retrieved from the HostIOMMUDevice.
>The virtio_iommu_set_host_iova_ranges() helper turns usable regions
>into complementary reserved regions while checking their inclusion
>in the existing ones. virtio_iommu_set_host_iova_ranges() reuses the
>implementation of virtio_iommu_set_iova_ranges() which will be
>removed in subsequent patches. rebuild_resv_regions() is just moved.
>
>Signed-off-by: Eric Auger <eric.auger@redhat.com>
>---
> hw/virtio/virtio-iommu.c | 151 ++++++++++++++++++++++++++++++--------
>-
> 1 file changed, 117 insertions(+), 34 deletions(-)
>
>diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>index 0680a357f0..33e9682b83 100644
>--- a/hw/virtio/virtio-iommu.c
>+++ b/hw/virtio/virtio-iommu.c
>@@ -494,12 +494,114 @@ get_host_iommu_device(VirtIOIOMMU
>*viommu, PCIBus *bus, int devfn) {
>     return g_hash_table_lookup(viommu->host_iommu_devices, &key);
> }
>
>+/**
>+ * rebuild_resv_regions: rebuild resv regions with both the
>+ * info of host resv ranges and property set resv ranges
>+ */
>+static int rebuild_resv_regions(IOMMUDevice *sdev)
>+{
>+    GList *l;
>+    int i = 0;
>+
>+    /* free the existing list and rebuild it from scratch */
>+    g_list_free_full(sdev->resv_regions, g_free);
>+    sdev->resv_regions = NULL;
>+
>+    /* First add host reserved regions if any, all tagged as RESERVED */
>+    for (l = sdev->host_resv_ranges; l; l = l->next) {
>+        ReservedRegion *reg = g_new0(ReservedRegion, 1);
>+        Range *r = (Range *)l->data;
>+
>+        reg->type = VIRTIO_IOMMU_RESV_MEM_T_RESERVED;
>+        range_set_bounds(&reg->range, range_lob(r), range_upb(r));
>+        sdev->resv_regions = resv_region_list_insert(sdev->resv_regions, reg);
>+        trace_virtio_iommu_host_resv_regions(sdev-
>>iommu_mr.parent_obj.name, i,
>+                                             range_lob(&reg->range),
>+                                             range_upb(&reg->range));
>+        i++;
>+    }
>+    /*
>+     * then add higher priority reserved regions set by the machine
>+     * through properties
>+     */
>+    add_prop_resv_regions(sdev);
>+    return 0;
>+}
>+
>+static int virtio_iommu_set_host_iova_ranges(VirtIOIOMMU *s, PCIBus
>*bus,
>+                                             int devfn, GList *iova_ranges,
>+                                             Error **errp)
>+{
>+    IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, bus);
>+    IOMMUDevice *sdev;
>+    GList *current_ranges;
>+    GList *l, *tmp, *new_ranges = NULL;
>+    int ret = -EINVAL;
>+
>+    if (!sbus) {
>+        error_report("%s no sbus", __func__);
>+    }
>+
>+    sdev = sbus->pbdev[devfn];
>+
>+    current_ranges = sdev->host_resv_ranges;
>+
>+    if (sdev->probe_done) {

Will this still happen with the new interface?

>+        error_setg(errp,
>+                   "%s: Notified about new host reserved regions after probe",
>+                   __func__);
>+        goto out;
>+    }
>+
>+    /* check that each new resv region is included in an existing one */
>+    if (sdev->host_resv_ranges) {

Same here.

>+        range_inverse_array(iova_ranges,
>+                            &new_ranges,
>+                            0, UINT64_MAX);
>+
>+        for (tmp = new_ranges; tmp; tmp = tmp->next) {
>+            Range *newr = (Range *)tmp->data;
>+            bool included = false;
>+
>+            for (l = current_ranges; l; l = l->next) {
>+                Range * r = (Range *)l->data;
>+
>+                if (range_contains_range(r, newr)) {
>+                    included = true;
>+                    break;
>+                }
>+            }
>+            if (!included) {
>+                goto error;
>+            }
>+        }
>+        /* all new reserved ranges are included in existing ones */
>+        ret = 0;
>+        goto out;
>+    }
>+
>+    range_inverse_array(iova_ranges,
>+                        &sdev->host_resv_ranges,
>+                        0, UINT64_MAX);
>+    rebuild_resv_regions(sdev);
>+
>+    return 0;
>+error:
>+    error_setg(errp, "%s Conflicting host reserved ranges set!",
>+               __func__);
>+out:
>+    g_list_free_full(new_ranges, g_free);
>+    return ret;
>+}
>+
> static bool virtio_iommu_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
>                                           HostIOMMUDevice *hiod, Error **errp)
> {
>     VirtIOIOMMU *viommu = opaque;
>     VirtioHostIOMMUDevice *vhiod;
>+    HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_GET_CLASS(hiod);
>     struct hiod_key *new_key;
>+    GList *host_iova_ranges = NULL;

g_autoptr(GList)?

Thanks
Zhenzhong

>
>     assert(hiod);
>
>@@ -509,6 +611,20 @@ static bool virtio_iommu_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
>         return false;
>     }
>
>+    if (hiodc->get_iova_ranges) {
>+        int ret;
>+        host_iova_ranges = hiodc->get_iova_ranges(hiod, errp);
>+        if (!host_iova_ranges) {
>+            return true; /* some old kernels may not support that capability */
>+        }
>+        ret = virtio_iommu_set_host_iova_ranges(viommu, bus, devfn,
>+                                                host_iova_ranges, errp);
>+        if (ret) {
>+            g_list_free_full(host_iova_ranges, g_free);
>+            return false;
>+        }
>+    }
>+
>     vhiod = g_malloc0(sizeof(VirtioHostIOMMUDevice));
>     vhiod->bus = bus;
>     vhiod->devfn = (uint8_t)devfn;
>@@ -521,6 +637,7 @@ static bool virtio_iommu_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
>
>     object_ref(hiod);
>     g_hash_table_insert(viommu->host_iommu_devices, new_key, vhiod);
>+    g_list_free_full(host_iova_ranges, g_free);
>
>     return true;
> }
>@@ -1243,40 +1360,6 @@ static int virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr,
>     return 0;
> }
>
>-/**
>- * rebuild_resv_regions: rebuild resv regions with both the
>- * info of host resv ranges and property set resv ranges
>- */
>-static int rebuild_resv_regions(IOMMUDevice *sdev)
>-{
>-    GList *l;
>-    int i = 0;
>-
>-    /* free the existing list and rebuild it from scratch */
>-    g_list_free_full(sdev->resv_regions, g_free);
>-    sdev->resv_regions = NULL;
>-
>-    /* First add host reserved regions if any, all tagged as RESERVED */
>-    for (l = sdev->host_resv_ranges; l; l = l->next) {
>-        ReservedRegion *reg = g_new0(ReservedRegion, 1);
>-        Range *r = (Range *)l->data;
>-
>-        reg->type = VIRTIO_IOMMU_RESV_MEM_T_RESERVED;
>-        range_set_bounds(&reg->range, range_lob(r), range_upb(r));
>-        sdev->resv_regions = resv_region_list_insert(sdev->resv_regions, reg);
>-        trace_virtio_iommu_host_resv_regions(sdev->iommu_mr.parent_obj.name, i,
>-                                             range_lob(&reg->range),
>-                                             range_upb(&reg->range));
>-        i++;
>-    }
>-    /*
>-     * then add higher priority reserved regions set by the machine
>-     * through properties
>-     */
>-    add_prop_resv_regions(sdev);
>-    return 0;
>-}
>-
> /**
>  * virtio_iommu_set_iova_ranges: Conveys the usable IOVA ranges
>  *
>--
>2.41.0



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC v2 2/7] virtio-iommu: Implement [set|unset]_iommu_device() callbacks
  2024-06-11  2:38   ` Duan, Zhenzhong
@ 2024-06-13  8:13     ` Eric Auger
  0 siblings, 0 replies; 19+ messages in thread
From: Eric Auger @ 2024-06-13  8:13 UTC (permalink / raw)
  To: Duan, Zhenzhong, eric.auger.pro@gmail.com, qemu-devel@nongnu.org,
	qemu-arm@nongnu.org, mst@redhat.com, jean-philippe@linaro.org,
	imammedo@redhat.com, peter.maydell@linaro.org, clg@redhat.com,
	yanghliu@redhat.com
  Cc: alex.williamson@redhat.com, jasowang@redhat.com,
	pbonzini@redhat.com, berrange@redhat.com

Hi Zhenzhong,

On 6/11/24 04:38, Duan, Zhenzhong wrote:
> Hi Eric,
>
>> -----Original Message-----
>> From: Eric Auger <eric.auger@redhat.com>
>> Subject: [RFC v2 2/7] virtio-iommu: Implement [set|unset]_iommu_device() callbacks
>>
>> Implement PCIIOMMUOPs [set|unset]_iommu_device() callbacks.
>> In set(), a VirtioHostIOMMUDevice is allocated which holds
>> a reference to the HostIOMMUDevice. This object is stored in a hash
>> table indexed by PCI BDF. The handle to the Host IOMMU device
>> will allow to retrieve information related to the physical IOMMU.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> ---
>> include/hw/virtio/virtio-iommu.h |  9 ++++
>> hw/virtio/virtio-iommu.c         | 87 ++++++++++++++++++++++++++++++++
>> 2 files changed, 96 insertions(+)
>>
>> diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
>> index 83a52cc446..4f664ea0c4 100644
>> --- a/include/hw/virtio/virtio-iommu.h
>> +++ b/include/hw/virtio/virtio-iommu.h
>> @@ -45,6 +45,14 @@ typedef struct IOMMUDevice {
>>     bool probe_done;
>> } IOMMUDevice;
>>
>> +typedef struct VirtioHostIOMMUDevice {
>> +    void *viommu;
>> +    PCIBus *bus;
>> +    uint8_t devfn;
>> +    HostIOMMUDevice *dev;
>> +    QLIST_ENTRY(VirtioHostIOMMUDevice) next;
>> +} VirtioHostIOMMUDevice;
>> +
>> typedef struct IOMMUPciBus {
>>     PCIBus       *bus;
>>     IOMMUDevice  *pbdev[]; /* Parent array is sparse, so dynamically alloc */
>> @@ -57,6 +65,7 @@ struct VirtIOIOMMU {
>>     struct virtio_iommu_config config;
>>     uint64_t features;
>>     GHashTable *as_by_busptr;
>> +    GHashTable *host_iommu_devices;
>>     IOMMUPciBus *iommu_pcibus_by_bus_num[PCI_BUS_MAX];
>>     PCIBus *primary_bus;
>>     ReservedRegion *prop_resv_regions;
>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>> index 1326c6ec41..0680a357f0 100644
>> --- a/hw/virtio/virtio-iommu.c
>> +++ b/hw/virtio/virtio-iommu.c
>> @@ -28,6 +28,7 @@
>> #include "sysemu/kvm.h"
>> #include "sysemu/reset.h"
>> #include "sysemu/sysemu.h"
>> +#include "sysemu/host_iommu_device.h"
> Not sure if it's better to move this to include/hw/virtio/virtio-iommu.h
> as HostIOMMUDevice is used there.
agreed!
>
>> #include "qemu/reserved-region.h"
>> #include "qemu/units.h"
>> #include "qapi/error.h"
>> @@ -69,6 +70,11 @@ typedef struct VirtIOIOMMUMapping {
>>     uint32_t flags;
>> } VirtIOIOMMUMapping;
>>
>> +struct hiod_key {
>> +    PCIBus *bus;
>> +    uint8_t devfn;
>> +};
>> +
>> static inline uint16_t virtio_iommu_get_bdf(IOMMUDevice *dev)
>> {
>>     return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
>> @@ -462,8 +468,86 @@ static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
>>     return &sdev->as;
>> }
>>
>> +static gboolean hiod_equal(gconstpointer v1, gconstpointer v2)
>> +{
>> +    const struct hiod_key *key1 = v1;
>> +    const struct hiod_key *key2 = v2;
>> +
>> +    return (key1->bus == key2->bus) && (key1->devfn == key2->devfn);
>> +}
>> +
>> +static guint hiod_hash(gconstpointer v)
>> +{
>> +    const struct hiod_key *key = v;
>> +    guint value = (guint)(uintptr_t)key->bus;
>> +
>> +    return (guint)(value << 8 | key->devfn);
>> +}
>> +
>> +static VirtioHostIOMMUDevice *
>> +get_host_iommu_device(VirtIOIOMMU *viommu, PCIBus *bus, int devfn) {
>> +    struct hiod_key key = {
>> +        .bus = bus,
>> +        .devfn = devfn,
>> +    };
>> +
>> +    return g_hash_table_lookup(viommu->host_iommu_devices, &key);
>> +}
>> +
>> +static bool virtio_iommu_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
>> +                                          HostIOMMUDevice *hiod, Error **errp)
>> +{
>> +    VirtIOIOMMU *viommu = opaque;
>> +    VirtioHostIOMMUDevice *vhiod;
>> +    struct hiod_key *new_key;
>> +
>> +    assert(hiod);
>> +
>> +    vhiod = get_host_iommu_device(viommu, bus, devfn);
>> +    if (vhiod) {
>> +        error_setg(errp, "VirtioHostIOMMUDevice already exists");
>> +        return false;
>> +    }
>> +
>> +    vhiod = g_malloc0(sizeof(VirtioHostIOMMUDevice));
>> +    vhiod->bus = bus;
>> +    vhiod->devfn = (uint8_t)devfn;
>> +    vhiod->viommu = viommu;
>> +    vhiod->dev = hiod;
>> +
>> +    new_key = g_malloc(sizeof(*new_key));
>> +    new_key->bus = bus;
>> +    new_key->devfn = devfn;
>> +
>> +    object_ref(hiod);
>> +    g_hash_table_insert(viommu->host_iommu_devices, new_key, vhiod);
>> +
>> +    return true;
>> +}
>> +
>> +static void
>> +virtio_iommu_unset_iommu_device(PCIBus *bus, void *opaque, int devfn)
>> +{
>> +    VirtIOIOMMU *viommu = opaque;
>> +    VirtioHostIOMMUDevice *vhiod;
>> +    struct hiod_key key = {
>> +        .bus = bus,
>> +        .devfn = devfn,
>> +    };
>> +
>> +    vhiod = g_hash_table_lookup(viommu->host_iommu_devices, &key);
>> +    if (!vhiod) {
>> +        return;
>> +    }
>> +
>> +    g_hash_table_remove(viommu->host_iommu_devices, &key);
>> +    object_unref(vhiod->dev);
> This looks like a use-after-free.
Yes. I am adopting the destroy function now in place in the intel-iommu code.

Thanks!

Eric
>
> Thanks
> Zhenzhong
>
>> +}
>> +
>> static const PCIIOMMUOps virtio_iommu_ops = {
>>     .get_address_space = virtio_iommu_find_add_as,
>> +    .set_iommu_device = virtio_iommu_set_iommu_device,
>> +    .unset_iommu_device = virtio_iommu_unset_iommu_device,
>> };
>>
>> static int virtio_iommu_attach(VirtIOIOMMU *s,
>> @@ -1357,6 +1441,9 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>>
>>     s->as_by_busptr = g_hash_table_new_full(NULL, NULL, NULL, g_free);
>>
>> +    s->host_iommu_devices = g_hash_table_new_full(hiod_hash, hiod_equal,
>> +                                                  g_free, g_free);
>> +
>>     if (s->primary_bus) {
>>         pci_setup_iommu(s->primary_bus, &virtio_iommu_ops, s);
>>     } else {
>> --
>> 2.41.0




* Re: [RFC v2 3/7] HostIOMMUDevice: Introduce get_iova_ranges callback
  2024-06-11  3:24   ` Duan, Zhenzhong
@ 2024-06-13  8:19     ` Eric Auger
  0 siblings, 0 replies; 19+ messages in thread
From: Eric Auger @ 2024-06-13  8:19 UTC (permalink / raw)
  To: Duan, Zhenzhong, eric.auger.pro@gmail.com, qemu-devel@nongnu.org,
	qemu-arm@nongnu.org, mst@redhat.com, jean-philippe@linaro.org,
	imammedo@redhat.com, peter.maydell@linaro.org, clg@redhat.com,
	yanghliu@redhat.com
  Cc: alex.williamson@redhat.com, jasowang@redhat.com,
	pbonzini@redhat.com, berrange@redhat.com

Hi,

On 6/11/24 05:24, Duan, Zhenzhong wrote:
>
>> -----Original Message-----
>> From: Eric Auger <eric.auger@redhat.com>
>> Subject: [RFC v2 3/7] HostIOMMUDevice: Introduce get_iova_ranges callback
>>
>> Introduce a new HostIOMMUDevice callback that allows to
>> retrieve the usable IOVA ranges.
>>
>> Implement this callback in the legacy VFIO and IOMMUFD VFIO
>> host iommu devices. This relies on the VFIODevice agent's
>> base container iova_ranges resource.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> ---
>> include/sysemu/host_iommu_device.h |  8 ++++++++
>> hw/vfio/container.c                | 14 ++++++++++++++
>> hw/vfio/iommufd.c                  | 14 ++++++++++++++
>> 3 files changed, 36 insertions(+)
>>
>> diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
>> index 3e5f058e7b..40e0fa13ef 100644
>> --- a/include/sysemu/host_iommu_device.h
>> +++ b/include/sysemu/host_iommu_device.h
>> @@ -80,6 +80,14 @@ struct HostIOMMUDeviceClass {
>>      * i.e., HOST_IOMMU_DEVICE_CAP_AW_BITS.
>>      */
>>     int (*get_cap)(HostIOMMUDevice *hiod, int cap, Error **errp);
>> +    /**
>> +     * @get_iova_ranges: Return the list of usable iova_ranges along with
>> +     * @hiod Host IOMMU device
>> +     *
>> +     * @hiod: handle to the host IOMMU device
>> +     * @errp: error handle
>> +     */
>> +    GList* (*get_iova_ranges)(HostIOMMUDevice *hiod, Error **errp);
> Previously I thought to expose iova_ranges directly in HostIOMMUDevice::caps.iova_ranges,
> but a new callback looks better for a GList pointer.
OK so we are aligned.
>
>> };
>>
>> /*
>> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
>> index b728b978a2..edd0df6262 100644
>> --- a/hw/vfio/container.c
>> +++ b/hw/vfio/container.c
>> @@ -1164,12 +1164,26 @@ static int hiod_legacy_vfio_get_cap(HostIOMMUDevice *hiod, int cap,
>>     }
>> }
>>
>> +static GList *
>> +hiod_legacy_vfio_get_iova_ranges(HostIOMMUDevice *hiod, Error **errp)
>> +{
>> +    VFIODevice *vdev = hiod->agent;
>> +    GList *l = NULL;
> g_assert(vdev)?
yes
>
>> +
>> +    if (vdev && vdev->bcontainer) {
>> +        l = g_list_copy(vdev->bcontainer->iova_ranges);
>> +    }
>> +
>> +    return l;
>> +}
>> +
>> static void hiod_legacy_vfio_class_init(ObjectClass *oc, void *data)
>> {
>>     HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
>>
>>     hioc->realize = hiod_legacy_vfio_realize;
>>     hioc->get_cap = hiod_legacy_vfio_get_cap;
>> +    hioc->get_iova_ranges = hiod_legacy_vfio_get_iova_ranges;
>> };
>>
>> static const TypeInfo types[] = {
>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>> index dbdae1adbb..1706784063 100644
>> --- a/hw/vfio/iommufd.c
>> +++ b/hw/vfio/iommufd.c
>> @@ -645,11 +645,25 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
>>     return true;
>> }
>>
>> +static GList *
>> +hiod_iommufd_vfio_get_iova_ranges(HostIOMMUDevice *hiod, Error **errp)
>> +{
>> +    VFIODevice *vdev = hiod->agent;
>> +    GList *l = NULL;
>> +
> Same here.
yes

Thanks

Eric
>
> Thanks
> Zhenzhong
>
>> +    if (vdev && vdev->bcontainer) {
>> +        l = g_list_copy(vdev->bcontainer->iova_ranges);
>> +    }
>> +
>> +    return l;
>> +}
>> +
>> static void hiod_iommufd_vfio_class_init(ObjectClass *oc, void *data)
>> {
>>     HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_CLASS(oc);
>>
>>     hiodc->realize = hiod_iommufd_vfio_realize;
>> +    hiodc->get_iova_ranges = hiod_iommufd_vfio_get_iova_ranges;
>> };
>>
>> static const TypeInfo types[] = {
>> --
>> 2.41.0




* Re: [RFC v2 4/7] virtio-iommu: Compute host reserved regions
  2024-06-11  3:25   ` Duan, Zhenzhong
@ 2024-06-13  9:01     ` Eric Auger
  2024-06-13  9:52       ` Duan, Zhenzhong
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Auger @ 2024-06-13  9:01 UTC (permalink / raw)
  To: Duan, Zhenzhong, eric.auger.pro@gmail.com, qemu-devel@nongnu.org,
	qemu-arm@nongnu.org, mst@redhat.com, jean-philippe@linaro.org,
	imammedo@redhat.com, peter.maydell@linaro.org, clg@redhat.com,
	yanghliu@redhat.com
  Cc: alex.williamson@redhat.com, jasowang@redhat.com,
	pbonzini@redhat.com, berrange@redhat.com

Hi Zhenzhong,

On 6/11/24 05:25, Duan, Zhenzhong wrote:
>
>> -----Original Message-----
>> From: Eric Auger <eric.auger@redhat.com>
>> Subject: [RFC v2 4/7] virtio-iommu: Compute host reserved regions
>>
>> Compute the host reserved regions in virtio_iommu_set_iommu_device().
>> The usable IOVA regions are retrieved from the HOSTIOMMUDevice.
>> The virtio_iommu_set_host_iova_ranges() helper turns usable regions
>> into complementary reserved regions while testing the inclusion
>> into existing ones. virtio_iommu_set_host_iova_ranges() reuse the
>> implementation of virtio_iommu_set_iova_ranges() which will be	
>> removed in subsequent patches. rebuild_resv_regions() is just moved.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> ---
>> hw/virtio/virtio-iommu.c | 151 ++++++++++++++++++++++++++++++---------
>> 1 file changed, 117 insertions(+), 34 deletions(-)
>>
>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>> index 0680a357f0..33e9682b83 100644
>> --- a/hw/virtio/virtio-iommu.c
>> +++ b/hw/virtio/virtio-iommu.c
>> @@ -494,12 +494,114 @@ get_host_iommu_device(VirtIOIOMMU *viommu, PCIBus *bus, int devfn) {
>>     return g_hash_table_lookup(viommu->host_iommu_devices, &key);
>> }
>>
>> +/**
>> + * rebuild_resv_regions: rebuild resv regions with both the
>> + * info of host resv ranges and property set resv ranges
>> + */
>> +static int rebuild_resv_regions(IOMMUDevice *sdev)
>> +{
>> +    GList *l;
>> +    int i = 0;
>> +
>> +    /* free the existing list and rebuild it from scratch */
>> +    g_list_free_full(sdev->resv_regions, g_free);
>> +    sdev->resv_regions = NULL;
>> +
>> +    /* First add host reserved regions if any, all tagged as RESERVED */
>> +    for (l = sdev->host_resv_ranges; l; l = l->next) {
>> +        ReservedRegion *reg = g_new0(ReservedRegion, 1);
>> +        Range *r = (Range *)l->data;
>> +
>> +        reg->type = VIRTIO_IOMMU_RESV_MEM_T_RESERVED;
>> +        range_set_bounds(&reg->range, range_lob(r), range_upb(r));
>> +        sdev->resv_regions = resv_region_list_insert(sdev->resv_regions, reg);
>> +        trace_virtio_iommu_host_resv_regions(sdev->iommu_mr.parent_obj.name, i,
>> +                                             range_lob(&reg->range),
>> +                                             range_upb(&reg->range));
>> +        i++;
>> +    }
>> +    /*
>> +     * then add higher priority reserved regions set by the machine
>> +     * through properties
>> +     */
>> +    add_prop_resv_regions(sdev);
>> +    return 0;
>> +}
>> +
>> +static int virtio_iommu_set_host_iova_ranges(VirtIOIOMMU *s, PCIBus *bus,
>> +                                             int devfn, GList *iova_ranges,
>> +                                             Error **errp)
>> +{
>> +    IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, bus);
>> +    IOMMUDevice *sdev;
>> +    GList *current_ranges;
>> +    GList *l, *tmp, *new_ranges = NULL;
>> +    int ret = -EINVAL;
>> +
>> +    if (!sbus) {
>> +        error_report("%s no sbus", __func__);
>> +    }
>> +
>> +    sdev = sbus->pbdev[devfn];
>> +
>> +    current_ranges = sdev->host_resv_ranges;
>> +
>> +    if (sdev->probe_done) {
> Will this still happen with the new interface?
no this shouldn't. Replaced by a g_assert(!sdev->probe_done) to make
sure the i/f is used properly.
>
>> +        error_setg(errp,
>> +                   "%s: Notified about new host reserved regions after probe",
>> +                   __func__);
>> +        goto out;
>> +    }
>> +
>> +    /* check that each new resv region is included in an existing one */
>> +    if (sdev->host_resv_ranges) {
> Same here.
To me this one can still happen in case several devices belong to the
same group.
>
>> +        range_inverse_array(iova_ranges,
>> +                            &new_ranges,
>> +                            0, UINT64_MAX);
>> +
>> +        for (tmp = new_ranges; tmp; tmp = tmp->next) {
>> +            Range *newr = (Range *)tmp->data;
>> +            bool included = false;
>> +
>> +            for (l = current_ranges; l; l = l->next) {
>> +                Range * r = (Range *)l->data;
>> +
>> +                if (range_contains_range(r, newr)) {
>> +                    included = true;
>> +                    break;
>> +                }
>> +            }
>> +            if (!included) {
>> +                goto error;
>> +            }
>> +        }
>> +        /* all new reserved ranges are included in existing ones */
>> +        ret = 0;
>> +        goto out;
>> +    }
>> +
>> +    range_inverse_array(iova_ranges,
>> +                        &sdev->host_resv_ranges,
>> +                        0, UINT64_MAX);
>> +    rebuild_resv_regions(sdev);
>> +
>> +    return 0;
>> +error:
>> +    error_setg(errp, "%s Conflicting host reserved ranges set!",
>> +               __func__);
>> +out:
>> +    g_list_free_full(new_ranges, g_free);
>> +    return ret;
>> +}
>> +
>> static bool virtio_iommu_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
>>                                           HostIOMMUDevice *hiod, Error **errp)
>> {
>>     VirtIOIOMMU *viommu = opaque;
>>     VirtioHostIOMMUDevice *vhiod;
>> +    HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_GET_CLASS(hiod);
>>     struct hiod_key *new_key;
>> +    GList *host_iova_ranges = NULL;
> g_autoptr(GList)?
are you sure this frees all the elements of the list? As of now I would
be tempted to leave the code as is.

Thanks

Eric
>
> Thanks
> Zhenzhong
>
>>     assert(hiod);
>>
>> @@ -509,6 +611,20 @@ static bool virtio_iommu_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
>>         return false;
>>     }
>>
>> +    if (hiodc->get_iova_ranges) {
>> +        int ret;
>> +        host_iova_ranges = hiodc->get_iova_ranges(hiod, errp);
>> +        if (!host_iova_ranges) {
>> +            return true; /* some old kernels may not support that capability */
>> +        }
>> +        ret = virtio_iommu_set_host_iova_ranges(viommu, bus, devfn,
>> +                                                host_iova_ranges, errp);
>> +        if (ret) {
>> +            g_list_free_full(host_iova_ranges, g_free);
>> +            return false;
>> +        }
>> +    }
>> +
>>     vhiod = g_malloc0(sizeof(VirtioHostIOMMUDevice));
>>     vhiod->bus = bus;
>>     vhiod->devfn = (uint8_t)devfn;
>> @@ -521,6 +637,7 @@ static bool virtio_iommu_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
>>
>>     object_ref(hiod);
>>     g_hash_table_insert(viommu->host_iommu_devices, new_key, vhiod);
>> +    g_list_free_full(host_iova_ranges, g_free);
>>
>>     return true;
>> }
>> @@ -1243,40 +1360,6 @@ static int virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr,
>>     return 0;
>> }
>>
>> -/**
>> - * rebuild_resv_regions: rebuild resv regions with both the
>> - * info of host resv ranges and property set resv ranges
>> - */
>> -static int rebuild_resv_regions(IOMMUDevice *sdev)
>> -{
>> -    GList *l;
>> -    int i = 0;
>> -
>> -    /* free the existing list and rebuild it from scratch */
>> -    g_list_free_full(sdev->resv_regions, g_free);
>> -    sdev->resv_regions = NULL;
>> -
>> -    /* First add host reserved regions if any, all tagged as RESERVED */
>> -    for (l = sdev->host_resv_ranges; l; l = l->next) {
>> -        ReservedRegion *reg = g_new0(ReservedRegion, 1);
>> -        Range *r = (Range *)l->data;
>> -
>> -        reg->type = VIRTIO_IOMMU_RESV_MEM_T_RESERVED;
>> -        range_set_bounds(&reg->range, range_lob(r), range_upb(r));
>> -        sdev->resv_regions = resv_region_list_insert(sdev->resv_regions, reg);
>> -        trace_virtio_iommu_host_resv_regions(sdev->iommu_mr.parent_obj.name, i,
>> -                                             range_lob(&reg->range),
>> -                                             range_upb(&reg->range));
>> -        i++;
>> -    }
>> -    /*
>> -     * then add higher priority reserved regions set by the machine
>> -     * through properties
>> -     */
>> -    add_prop_resv_regions(sdev);
>> -    return 0;
>> -}
>> -
>> /**
>>  * virtio_iommu_set_iova_ranges: Conveys the usable IOVA ranges
>>  *
>> --
>> 2.41.0




* RE: [RFC v2 4/7] virtio-iommu: Compute host reserved regions
  2024-06-13  9:01     ` Eric Auger
@ 2024-06-13  9:52       ` Duan, Zhenzhong
  0 siblings, 0 replies; 19+ messages in thread
From: Duan, Zhenzhong @ 2024-06-13  9:52 UTC (permalink / raw)
  To: eric.auger@redhat.com, eric.auger.pro@gmail.com,
	qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com,
	jean-philippe@linaro.org, imammedo@redhat.com,
	peter.maydell@linaro.org, clg@redhat.com, yanghliu@redhat.com
  Cc: alex.williamson@redhat.com, jasowang@redhat.com,
	pbonzini@redhat.com, berrange@redhat.com



>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [RFC v2 4/7] virtio-iommu: Compute host reserved regions
>
>Hi Zhenzhong,
>
>On 6/11/24 05:25, Duan, Zhenzhong wrote:
>>
>>> -----Original Message-----
>>> From: Eric Auger <eric.auger@redhat.com>
>>> Subject: [RFC v2 4/7] virtio-iommu: Compute host reserved regions
>>>
>>> Compute the host reserved regions in virtio_iommu_set_iommu_device().
>>> The usable IOVA regions are retrieved from the HOSTIOMMUDevice.
>>> The virtio_iommu_set_host_iova_ranges() helper turns usable regions
>>> into complementary reserved regions while testing the inclusion
>>> into existing ones. virtio_iommu_set_host_iova_ranges() reuse the
>>> implementation of virtio_iommu_set_iova_ranges() which will be
>>> removed in subsequent patches. rebuild_resv_regions() is just moved.
>>>
>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>> ---
>>> hw/virtio/virtio-iommu.c | 151 ++++++++++++++++++++++++++++++---------
>>> 1 file changed, 117 insertions(+), 34 deletions(-)
>>>
>>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>>> index 0680a357f0..33e9682b83 100644
>>> --- a/hw/virtio/virtio-iommu.c
>>> +++ b/hw/virtio/virtio-iommu.c
>>> @@ -494,12 +494,114 @@ get_host_iommu_device(VirtIOIOMMU *viommu, PCIBus *bus, int devfn) {
>>>     return g_hash_table_lookup(viommu->host_iommu_devices, &key);
>>> }
>>>
>>> +/**
>>> + * rebuild_resv_regions: rebuild resv regions with both the
>>> + * info of host resv ranges and property set resv ranges
>>> + */
>>> +static int rebuild_resv_regions(IOMMUDevice *sdev)
>>> +{
>>> +    GList *l;
>>> +    int i = 0;
>>> +
>>> +    /* free the existing list and rebuild it from scratch */
>>> +    g_list_free_full(sdev->resv_regions, g_free);
>>> +    sdev->resv_regions = NULL;
>>> +
>>> +    /* First add host reserved regions if any, all tagged as RESERVED */
>>> +    for (l = sdev->host_resv_ranges; l; l = l->next) {
>>> +        ReservedRegion *reg = g_new0(ReservedRegion, 1);
>>> +        Range *r = (Range *)l->data;
>>> +
>>> +        reg->type = VIRTIO_IOMMU_RESV_MEM_T_RESERVED;
>>> +        range_set_bounds(&reg->range, range_lob(r), range_upb(r));
>>> +        sdev->resv_regions = resv_region_list_insert(sdev->resv_regions, reg);
>>> +        trace_virtio_iommu_host_resv_regions(sdev->iommu_mr.parent_obj.name, i,
>>> +                                             range_lob(&reg->range),
>>> +                                             range_upb(&reg->range));
>>> +        i++;
>>> +    }
>>> +    /*
>>> +     * then add higher priority reserved regions set by the machine
>>> +     * through properties
>>> +     */
>>> +    add_prop_resv_regions(sdev);
>>> +    return 0;
>>> +}
>>> +
>>> +static int virtio_iommu_set_host_iova_ranges(VirtIOIOMMU *s, PCIBus *bus,
>>> +                                             int devfn, GList *iova_ranges,
>>> +                                             Error **errp)
>>> +{
>>> +    IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, bus);
>>> +    IOMMUDevice *sdev;
>>> +    GList *current_ranges;
>>> +    GList *l, *tmp, *new_ranges = NULL;
>>> +    int ret = -EINVAL;
>>> +
>>> +    if (!sbus) {
>>> +        error_report("%s no sbus", __func__);
>>> +    }
>>> +
>>> +    sdev = sbus->pbdev[devfn];
>>> +
>>> +    current_ranges = sdev->host_resv_ranges;
>>> +
>>> +    if (sdev->probe_done) {
>> Will this still happen with the new interface?
>no this shouldn't. Replaced by a g_assert(!sdev->probe_done) to make
>sure the i/f is used properly.
>>
>>> +        error_setg(errp,
>>> +                   "%s: Notified about new host reserved regions after probe",
>>> +                   __func__);
>>> +        goto out;
>>> +    }
>>> +
>>> +    /* check that each new resv region is included in an existing one */
>>> +    if (sdev->host_resv_ranges) {
>> Same here.
>To me this one can still happen in case several devices belong to the
>same group.

If the same slot is used to plug/unplug VFIO devices with different reserved
ranges, the plug of the second device may fail this check.
It looks like sdev->host_resv_ranges is not freed after unplug.

>>
>>> +        range_inverse_array(iova_ranges,
>>> +                            &new_ranges,
>>> +                            0, UINT64_MAX);
>>> +
>>> +        for (tmp = new_ranges; tmp; tmp = tmp->next) {
>>> +            Range *newr = (Range *)tmp->data;
>>> +            bool included = false;
>>> +
>>> +            for (l = current_ranges; l; l = l->next) {
>>> +                Range * r = (Range *)l->data;
>>> +
>>> +                if (range_contains_range(r, newr)) {
>>> +                    included = true;
>>> +                    break;
>>> +                }
>>> +            }
>>> +            if (!included) {
>>> +                goto error;
>>> +            }
>>> +        }
>>> +        /* all new reserved ranges are included in existing ones */
>>> +        ret = 0;
>>> +        goto out;
>>> +    }
>>> +
>>> +    range_inverse_array(iova_ranges,
>>> +                        &sdev->host_resv_ranges,
>>> +                        0, UINT64_MAX);
>>> +    rebuild_resv_regions(sdev);
>>> +
>>> +    return 0;
>>> +error:
>>> +    error_setg(errp, "%s Conflicting host reserved ranges set!",
>>> +               __func__);
>>> +out:
>>> +    g_list_free_full(new_ranges, g_free);
>>> +    return ret;
>>> +}
>>> +
>>> static bool virtio_iommu_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
>>>                                           HostIOMMUDevice *hiod, Error **errp)
>>> {
>>>     VirtIOIOMMU *viommu = opaque;
>>>     VirtioHostIOMMUDevice *vhiod;
>>> +    HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_GET_CLASS(hiod);
>>>     struct hiod_key *new_key;
>>> +    GList *host_iova_ranges = NULL;
>> g_autoptr(GList)?
>are you sure this frees all the elements of the list?	

Not quite sure, I just saw some code in QEMU use it that way.

> As of now I would
>be tempted to leave the code as is.

Sure, that's fine.

Thanks
Zhenzhong



end of thread, other threads:[~2024-06-13  9:53 UTC | newest]

Thread overview: 19+ messages
2024-06-07 14:37 [RFC v2 0/7] VIRTIO-IOMMU/VFIO: Fix host iommu geometry handling for hotplugged devices Eric Auger
2024-06-07 14:37 ` [RFC v2 1/7] HostIOMMUDevice: Store the VFIO/VDPA agent Eric Auger
2024-06-11  3:22   ` Duan, Zhenzhong
2024-06-07 14:37 ` [RFC v2 2/7] virtio-iommu: Implement [set|unset]_iommu_device() callbacks Eric Auger
2024-06-11  2:38   ` Duan, Zhenzhong
2024-06-13  8:13     ` Eric Auger
2024-06-07 14:37 ` [RFC v2 3/7] HostIOMMUDevice: Introduce get_iova_ranges callback Eric Auger
2024-06-11  3:24   ` Duan, Zhenzhong
2024-06-13  8:19     ` Eric Auger
2024-06-07 14:37 ` [RFC v2 4/7] virtio-iommu: Compute host reserved regions Eric Auger
2024-06-11  3:25   ` Duan, Zhenzhong
2024-06-13  9:01     ` Eric Auger
2024-06-13  9:52       ` Duan, Zhenzhong
2024-06-07 14:37 ` [RFC v2 5/7] virtio-iommu: Remove the implementation of iommu_set_iova_range Eric Auger
2024-06-11  3:22   ` Duan, Zhenzhong
2024-06-07 14:37 ` [RFC v2 6/7] hw/vfio: Remove memory_region_iommu_set_iova_ranges() call Eric Auger
2024-06-11  3:23   ` Duan, Zhenzhong
2024-06-07 14:37 ` [RFC v2 7/7] memory: Remove IOMMU MR iommu_set_iova_range API Eric Auger
2024-06-11  3:23   ` Duan, Zhenzhong
