qemu-devel.nongnu.org archive mirror
* [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU
@ 2024-06-05  8:30 Zhenzhong Duan
  2024-06-05  8:30 ` [PATCH v7 01/17] backends: Introduce HostIOMMUDevice abstract Zhenzhong Duan
                   ` (20 more replies)
  0 siblings, 21 replies; 25+ messages in thread
From: Zhenzhong Duan @ 2024-06-05  8:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
	yi.l.liu, chao.p.peng, Zhenzhong Duan

Hi,

This series introduces a HostIOMMUDevice abstraction and its sub-classes.
It also adds a HostIOMMUDeviceCaps structure to HostIOMMUDevice and a new
interface between the vIOMMU and HostIOMMUDevice.

A HostIOMMUDevice is an abstraction for an assigned device that is protected
by a physical IOMMU (aka host IOMMU). The userspace interaction with this
physical IOMMU can be done either through the VFIO IOMMU type 1 legacy
backend or the new iommufd backend. The assigned device can be a VFIO device
or a VDPA device. The HostIOMMUDevice is needed to interact with the host
IOMMU that protects the assigned device. It is especially useful when the
device is also protected by a virtual IOMMU, as the latter uses the translation
services of the physical IOMMU and is constrained by it. In that context the
HostIOMMUDevice can be passed to the virtual IOMMU to collect physical IOMMU
capabilities such as the supported address width. In the future, the virtual
IOMMU will use the HostIOMMUDevice to program the guest page tables in the
first translation stage of the physical IOMMU.

HostIOMMUDeviceClass::realize() is introduced to initialize
HostIOMMUDeviceCaps and other fields of HostIOMMUDevice variants.

HostIOMMUDeviceClass::get_cap() is introduced to query host IOMMU
device capabilities.

The class tree is as below:

                              HostIOMMUDevice
                                     | .caps
                                     | .realize()
                                     | .get_cap()
                                     |
            .-----------------------------------------------.
            |                        |                      |
HostIOMMUDeviceLegacyVFIO  {HostIOMMUDeviceLegacyVDPA}  HostIOMMUDeviceIOMMUFD
            |                        |                      | [.iommufd]
                                                            | [.devid]
                                                            | [.ioas_id]
                                                            | [.attach_hwpt()]
                                                            | [.detach_hwpt()]
                                                            |
                                            .----------------------.
                                            |                      |
                         HostIOMMUDeviceIOMMUFDVFIO  {HostIOMMUDeviceIOMMUFDVDPA}
                                          | [.vdev]                | {.vdev}

* The attributes in [] will be implemented in the nesting series.
* The classes in {} will be implemented in the future.
* .vdev in different classes points to a different agent device,
  i.e., VFIODevice or VDPADevice.
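
The way callbacks are inherited along the class tree above can be modelled
in plain C. The sketch below is not QEMU code and does not use QOM: every
Sketch*/sketch_* name is hypothetical, the agent device is reduced to a
single byte, and returning -1 when no class implements .get_cap() is a
choice made for this sketch only. It illustrates that an iommufd-style
intermediate class can provide .get_cap() while its VFIO leaf class
provides .realize().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CAP_AW_BITS 1

/* Hypothetical stand-ins for HostIOMMUDevice and its class; not QOM. */
typedef struct SketchHiod SketchHiod;
typedef struct SketchHiodClass SketchHiodClass;

struct SketchHiodClass {
    const SketchHiodClass *parent;                    /* NULL for the base */
    bool (*realize)(SketchHiod *hiod, void *opaque);  /* may be NULL */
    int (*get_cap)(SketchHiod *hiod, int cap);        /* may be NULL */
};

struct SketchHiod {
    const SketchHiodClass *klass;
    uint8_t aw_bits;               /* models HostIOMMUDeviceCaps.aw_bits */
};

/* Provided by the iommufd-style intermediate class. */
static int sketch_iommufd_get_cap(SketchHiod *hiod, int cap)
{
    return cap == SKETCH_CAP_AW_BITS ? hiod->aw_bits : 0;
}

/* Provided by the VFIO leaf class; opaque points to the agent device. */
static bool sketch_iommufd_vfio_realize(SketchHiod *hiod, void *opaque)
{
    const uint8_t *agent_aw_bits = opaque;

    hiod->aw_bits = *agent_aw_bits;
    return true;
}

const SketchHiodClass sketch_base_class = { NULL, NULL, NULL };
const SketchHiodClass sketch_iommufd_class = {
    &sketch_base_class, NULL, sketch_iommufd_get_cap
};
const SketchHiodClass sketch_iommufd_vfio_class = {
    &sketch_iommufd_class, sketch_iommufd_vfio_realize, NULL
};

/* Walk up the class chain until some class implements the callback. */
bool sketch_realize(SketchHiod *hiod, void *opaque)
{
    assert(hiod != NULL);
    for (const SketchHiodClass *c = hiod->klass; c; c = c->parent) {
        if (c->realize) {
            return c->realize(hiod, opaque);
        }
    }
    return false;                  /* .realize() is mandatory */
}

int sketch_get_cap(SketchHiod *hiod, int cap)
{
    assert(hiod != NULL);
    for (const SketchHiodClass *c = hiod->klass; c; c = c->parent) {
        if (c->get_cap) {
            return c->get_cap(hiod, cap);
        }
    }
    return -1;                     /* sketch choice: cannot query */
}
```

In QEMU itself this lookup is handled by QOM class initialization rather
than by walking a parent chain at call time; the chain is used here only
to keep the example self-contained.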

PATCH1-4: Introduce HostIOMMUDevice and its sub classes
PATCH5-10: Implement .realize() and .get_cap() handlers
PATCH11-14: Create HostIOMMUDevice instance and pass it to vIOMMU
PATCH15-17: Implement compatibility check between host IOMMU and vIOMMU (intel_iommu)

Tests done:
make check
vfio device hotplug/unplug with different backends on Linux
reboot, kexec
build test on Linux and Windows 11

QEMU code can be found at:
https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_preq_v7

Besides the compatibility check in this series, the nesting series extends
this host IOMMU device for much wider usage. For anyone interested
in the nesting series, here is the link:
https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_rfcv2

Thanks
Zhenzhong

Changelog:
v7:
- drop config CONFIG_HOST_IOMMU_DEVICE (Cédric)
- introduce HOST_IOMMU_DEVICE_CAP_AW_BITS_MAX (Eric)
- use iova_ranges method in iommufd.realize() (Eric)
- introduce HostIOMMUDevice::name to facilitate tracing (Eric)
- implement a custom destroy hash function (Cédric)
- drop VTDHostIOMMUDevice and save HostIOMMUDevice in hash table (Eric)
- move patch5 after patch1 (Eric)
- squash patch3 and 4, squash patch12 and 13 (Eric)
- refine comments (Eric)
- collect Eric's R-B

v6:
- open coded host_iommu_device_get_cap() to avoid #ifdef in intel_iommu.c (Cédric)

v5:
- pci_device_set_iommu_device return true (Cédric)
- fix build failure on Windows (thanks to Cédric for finding the issue)

v4:
- move properties vdev, iommufd and devid to the nesting series where they are needed (Cédric)
- fix 32bit build with clz64 (Cédric)
- change check_cap naming to get_cap (Cédric)
- return bool if error is passed through errp (Cédric)
- drop HostIOMMUDevice[LegacyVFIO|IOMMUFD|IOMMUFDVFIO] declaration (Cédric)
- drop HOST_IOMMU_DEVICE_CAP_IOMMUFD (Cédric)
- replace include directive with forward declaration (Cédric)

v3:
- refine declaration and doc for HostIOMMUDevice (Cédric, Philippe)
- introduce HostIOMMUDeviceCaps, .realize() and .check_cap() (Cédric)
- introduce helper range_get_last_bit() for range operation (Cédric)
- separate pci_device_get_iommu_bus_devfn() in a prereq patch (Cédric)
- replace HIOD_ abbreviation with HOST_IOMMU_DEVICE_ (Cédric)
- add header in include/sysemu/iommufd.h (Cédric)

v2:
- use QOM to abstract host IOMMU device and its sub-classes (Cédric)
- move host IOMMU device creation in attach_device() (Cédric)
- refine pci_device_set/unset_iommu_device doc further (Eric)
- define host IOMMU info format of different backend
- implement get_host_iommu_info() for different backend (Cédric)
- drop cap/ecap update logic (MST)
- check aw-bits from get_host_iommu_info() in legacy mode

v1:
- use HostIOMMUDevice handle instead of union in VFIODevice (Eric)
- change host_iommu_device_init to host_iommu_device_create
- allocate HostIOMMUDevice in host_iommu_device_create callback
  and set the VFIODevice base_hdev handle (Eric)
- refine pci_device_set/unset_iommu_device doc (Eric)
- use HostIOMMUDevice handle instead of union in VTDHostIOMMUDevice (Eric)
- convert HostIOMMUDevice to sub object pointer in vtd_check_hdev

rfcv2:
- introduce common abstract HostIOMMUDevice and sub struct for different BEs (Eric, Cédric)
- remove iommufd_device.[ch] (Cédric)
- remove duplicate iommufd/devid define from VFIODevice (Eric)
- drop the p in aliased_pbus and aliased_pdevfn (Eric)
- assert devfn and iommu_bus in pci_device_get_iommu_bus_devfn (Cédric, Eric)
- use errp in iommufd_device_get_info (Eric)
- split and simplify cap/ecap check/sync code in intel_iommu.c (Cédric)
- move VTDHostIOMMUDevice declaration to intel_iommu_internal.h (Cédric)
- make '(vtd->cap_reg >> 16) & 0x3fULL' a MACRO and add missed '+1' (Cédric)
- block migration if vIOMMU cap/ecap updated based on host IOMMU cap/ecap
- add R-B

Yi Liu (2):
  hw/pci: Introduce pci_device_[set|unset]_iommu_device()
  intel_iommu: Implement [set|unset]_iommu_device() callbacks

Zhenzhong Duan (15):
  backends: Introduce HostIOMMUDevice abstract
  backends/host_iommu_device: Introduce HostIOMMUDeviceCaps
  vfio/container: Introduce TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO device
  backends/iommufd: Introduce TYPE_HOST_IOMMU_DEVICE_IOMMUFD[_VFIO]
    devices
  range: Introduce range_get_last_bit()
  vfio/container: Implement HostIOMMUDeviceClass::realize() handler
  backends/iommufd: Introduce helper function
    iommufd_backend_get_device_info()
  vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler
  vfio/container: Implement HostIOMMUDeviceClass::get_cap() handler
  backends/iommufd: Implement HostIOMMUDeviceClass::get_cap() handler
  vfio: Create host IOMMU device instance
  hw/pci: Introduce helper function pci_device_get_iommu_bus_devfn()
  vfio/pci: Pass HostIOMMUDevice to vIOMMU
  intel_iommu: Extract out vtd_cap_init() to initialize cap/ecap
  intel_iommu: Check compatibility with host IOMMU capabilities

 MAINTAINERS                           |   2 +
 include/hw/i386/intel_iommu.h         |   2 +
 include/hw/pci/pci.h                  |  38 ++++-
 include/hw/vfio/vfio-common.h         |   8 +
 include/hw/vfio/vfio-container-base.h |   3 +
 include/qemu/range.h                  |  11 ++
 include/sysemu/host_iommu_device.h    |  91 ++++++++++++
 include/sysemu/iommufd.h              |  19 +++
 backends/host_iommu_device.c          |  33 +++++
 backends/iommufd.c                    |  76 ++++++++--
 hw/i386/intel_iommu.c                 | 203 ++++++++++++++++++++------
 hw/pci/pci.c                          |  75 +++++++++-
 hw/vfio/common.c                      |  16 +-
 hw/vfio/container.c                   |  41 +++++-
 hw/vfio/helpers.c                     |  17 +++
 hw/vfio/iommufd.c                     |  37 ++++-
 hw/vfio/pci.c                         |  19 ++-
 backends/meson.build                  |   1 +
 18 files changed, 623 insertions(+), 69 deletions(-)
 create mode 100644 include/sysemu/host_iommu_device.h
 create mode 100644 backends/host_iommu_device.c

-- 
2.34.1




* [PATCH v7 01/17] backends: Introduce HostIOMMUDevice abstract
  2024-06-05  8:30 [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
@ 2024-06-05  8:30 ` Zhenzhong Duan
  2024-06-05  8:30 ` [PATCH v7 02/17] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps Zhenzhong Duan
                   ` (19 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Zhenzhong Duan @ 2024-06-05  8:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
	yi.l.liu, chao.p.peng, Zhenzhong Duan

A HostIOMMUDevice is an abstraction for an assigned device that is protected
by a physical IOMMU (aka host IOMMU). The userspace interaction with this
physical IOMMU can be done either through the VFIO IOMMU type 1 legacy
backend or the new iommufd backend. The assigned device can be a VFIO device
or a VDPA device. The HostIOMMUDevice is needed to interact with the host
IOMMU that protects the assigned device. It is especially useful when the
device is also protected by a virtual IOMMU, as the latter uses the translation
services of the physical IOMMU and is constrained by it. In that context the
HostIOMMUDevice can be passed to the virtual IOMMU to collect physical IOMMU
capabilities such as the supported address width. In the future, the virtual
IOMMU will use the HostIOMMUDevice to program the guest page tables in the
first translation stage of the physical IOMMU.

Introduce .realize() to initialize HostIOMMUDevice further after instance init.

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 MAINTAINERS                        |  2 ++
 include/sysemu/host_iommu_device.h | 53 ++++++++++++++++++++++++++++++
 backends/host_iommu_device.c       | 33 +++++++++++++++++++
 backends/meson.build               |  1 +
 4 files changed, 89 insertions(+)
 create mode 100644 include/sysemu/host_iommu_device.h
 create mode 100644 backends/host_iommu_device.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 448dc951c5..1cf2b25beb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2196,6 +2196,8 @@ M: Zhenzhong Duan <zhenzhong.duan@intel.com>
 S: Supported
 F: backends/iommufd.c
 F: include/sysemu/iommufd.h
+F: backends/host_iommu_device.c
+F: include/sysemu/host_iommu_device.h
 F: include/qemu/chardev_open.h
 F: util/chardev_open.c
 F: docs/devel/vfio-iommufd.rst
diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
new file mode 100644
index 0000000000..db47a16189
--- /dev/null
+++ b/include/sysemu/host_iommu_device.h
@@ -0,0 +1,53 @@
+/*
+ * Host IOMMU device abstract declaration
+ *
+ * Copyright (C) 2024 Intel Corporation.
+ *
+ * Authors: Zhenzhong Duan <zhenzhong.duan@intel.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ */
+
+#ifndef HOST_IOMMU_DEVICE_H
+#define HOST_IOMMU_DEVICE_H
+
+#include "qom/object.h"
+#include "qapi/error.h"
+
+#define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
+OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass, HOST_IOMMU_DEVICE)
+
+struct HostIOMMUDevice {
+    Object parent_obj;
+
+    char *name;
+};
+
+/**
+ * struct HostIOMMUDeviceClass - The base class for all host IOMMU devices.
+ *
+ * Different types of host devices (e.g., VFIO or VDPA device) or devices
+ * with different backend (e.g., VFIO legacy container or IOMMUFD backend)
+ * will have different implementations of the HostIOMMUDeviceClass.
+ */
+struct HostIOMMUDeviceClass {
+    ObjectClass parent_class;
+
+    /**
+     * @realize: initialize host IOMMU device instance further.
+     *
+     * Mandatory callback.
+     *
+     * @hiod: pointer to a host IOMMU device instance.
+     *
+     * @opaque: pointer to agent device of this host IOMMU device,
+     *          e.g., VFIO base device or VDPA device.
+     *
+     * @errp: pass an Error out when realize fails.
+     *
+     * Returns: true on success, false on failure.
+     */
+    bool (*realize)(HostIOMMUDevice *hiod, void *opaque, Error **errp);
+};
+#endif
diff --git a/backends/host_iommu_device.c b/backends/host_iommu_device.c
new file mode 100644
index 0000000000..8f2dda1beb
--- /dev/null
+++ b/backends/host_iommu_device.c
@@ -0,0 +1,33 @@
+/*
+ * Host IOMMU device abstract
+ *
+ * Copyright (C) 2024 Intel Corporation.
+ *
+ * Authors: Zhenzhong Duan <zhenzhong.duan@intel.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "sysemu/host_iommu_device.h"
+
+OBJECT_DEFINE_ABSTRACT_TYPE(HostIOMMUDevice,
+                            host_iommu_device,
+                            HOST_IOMMU_DEVICE,
+                            OBJECT)
+
+static void host_iommu_device_class_init(ObjectClass *oc, void *data)
+{
+}
+
+static void host_iommu_device_init(Object *obj)
+{
+}
+
+static void host_iommu_device_finalize(Object *obj)
+{
+    HostIOMMUDevice *hiod = HOST_IOMMU_DEVICE(obj);
+
+    g_free(hiod->name);
+}
diff --git a/backends/meson.build b/backends/meson.build
index 8b2b111497..106312f0c8 100644
--- a/backends/meson.build
+++ b/backends/meson.build
@@ -16,6 +16,7 @@ if host_os != 'windows'
 endif
 if host_os == 'linux'
   system_ss.add(files('hostmem-memfd.c'))
+  system_ss.add(files('host_iommu_device.c'))
 endif
 if keyutils.found()
     system_ss.add(keyutils, files('cryptodev-lkcf.c'))
-- 
2.34.1




* [PATCH v7 02/17] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps
  2024-06-05  8:30 [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
  2024-06-05  8:30 ` [PATCH v7 01/17] backends: Introduce HostIOMMUDevice abstract Zhenzhong Duan
@ 2024-06-05  8:30 ` Zhenzhong Duan
  2024-06-05  8:30 ` [PATCH v7 03/17] vfio/container: Introduce TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO device Zhenzhong Duan
                   ` (18 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Zhenzhong Duan @ 2024-06-05  8:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
	yi.l.liu, chao.p.peng, Zhenzhong Duan

The elements of HostIOMMUDeviceCaps map to the host IOMMU's capabilities.
Different platform IOMMUs can support different elements.

Currently there are only two elements, type and aw_bits: type indicates the
host platform IOMMU type, e.g., Intel VT-d, ARM SMMU, etc.; aw_bits indicates
the host IOMMU address width.

Introduce .get_cap() handler to check if HOST_IOMMU_DEVICE_CAP_XXX
is supported.
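
The return convention of .get_cap() described in this patch (negative on
failure, 0 when the capability is unsupported, a positive value otherwise)
can be illustrated with a standalone sketch. The SKETCH_* constants and the
sketch_* functions are hypothetical and only mirror the names in the patch.
Error reporting through errp is left out, returning -EINVAL for an unknown
capability is an assumption, and so is the caller that compares a guest
address width against the host value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CAP_IOMMU_TYPE 0
#define SKETCH_CAP_AW_BITS    1

/* Mirrors the two fields of HostIOMMUDeviceCaps. */
typedef struct SketchCaps {
    uint32_t type;
    uint8_t aw_bits;
} SketchCaps;

/*
 * Returns <0 on failure, 0 if the capability is unsupported, or a
 * positive value carrying the capability (e.g. the address width).
 */
int sketch_caps_query(const SketchCaps *caps, int cap)
{
    assert(caps != NULL);

    switch (cap) {
    case SKETCH_CAP_IOMMU_TYPE:
        return (int)caps->type;
    case SKETCH_CAP_AW_BITS:
        return caps->aw_bits;
    default:
        return -EINVAL;            /* assumption: unknown cap is an error */
    }
}

/*
 * Hypothetical caller: returns 1 if the guest address width is covered
 * by the host, 0 if it is not, and <0 if the query itself failed.
 */
int sketch_check_aw_bits(const SketchCaps *caps, uint8_t guest_aw_bits)
{
    int host_aw_bits = sketch_caps_query(caps, SKETCH_CAP_AW_BITS);

    if (host_aw_bits < 0) {
        return host_aw_bits;
    }
    return guest_aw_bits <= host_aw_bits;
}
```

Returning the value itself for HOST_IOMMU_DEVICE_CAP_AW_BITS lets a single
callback serve both yes/no capabilities and capabilities that carry a
number.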

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 include/sysemu/host_iommu_device.h | 38 ++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
index db47a16189..a57873958b 100644
--- a/include/sysemu/host_iommu_device.h
+++ b/include/sysemu/host_iommu_device.h
@@ -15,6 +15,18 @@
 #include "qom/object.h"
 #include "qapi/error.h"
 
+/**
+ * struct HostIOMMUDeviceCaps - Define host IOMMU device capabilities.
+ *
+ * @type: host platform IOMMU type.
+ *
+ * @aw_bits: host IOMMU address width. 0xff if no limitation.
+ */
+typedef struct HostIOMMUDeviceCaps {
+    uint32_t type;
+    uint8_t aw_bits;
+} HostIOMMUDeviceCaps;
+
 #define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
 OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass, HOST_IOMMU_DEVICE)
 
@@ -22,6 +34,7 @@ struct HostIOMMUDevice {
     Object parent_obj;
 
     char *name;
+    HostIOMMUDeviceCaps caps;
 };
 
 /**
@@ -49,5 +62,30 @@ struct HostIOMMUDeviceClass {
      * Returns: true on success, false on failure.
      */
     bool (*realize)(HostIOMMUDevice *hiod, void *opaque, Error **errp);
+    /**
+     * @get_cap: check if a host IOMMU device capability is supported.
+     *
+     * Optional callback, if not implemented, hint not supporting query
+     * of @cap.
+     *
+     * @hiod: pointer to a host IOMMU device instance.
+     *
+     * @cap: capability to check.
+     *
+     * @errp: pass an Error out when fails to query capability.
+     *
+     * Returns: <0 on failure, 0 if a @cap is unsupported, or else
+     * 1 or some positive value for some special @cap,
+     * i.e., HOST_IOMMU_DEVICE_CAP_AW_BITS.
+     */
+    int (*get_cap)(HostIOMMUDevice *hiod, int cap, Error **errp);
 };
+
+/*
+ * Host IOMMU device capability list.
+ */
+#define HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE        0
+#define HOST_IOMMU_DEVICE_CAP_AW_BITS           1
+
+#define HOST_IOMMU_DEVICE_CAP_AW_BITS_MAX       64
 #endif
-- 
2.34.1




* [PATCH v7 03/17] vfio/container: Introduce TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO device
  2024-06-05  8:30 [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
  2024-06-05  8:30 ` [PATCH v7 01/17] backends: Introduce HostIOMMUDevice abstract Zhenzhong Duan
  2024-06-05  8:30 ` [PATCH v7 02/17] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps Zhenzhong Duan
@ 2024-06-05  8:30 ` Zhenzhong Duan
  2024-06-05  8:30 ` [PATCH v7 04/17] backends/iommufd: Introduce TYPE_HOST_IOMMU_DEVICE_IOMMUFD[_VFIO] devices Zhenzhong Duan
                   ` (17 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Zhenzhong Duan @ 2024-06-05  8:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
	yi.l.liu, chao.p.peng, Zhenzhong Duan

TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO represents a host IOMMU device under
the VFIO legacy container backend.

It will have its own realize implementation.

Suggested-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 include/hw/vfio/vfio-common.h | 3 +++
 hw/vfio/container.c           | 5 ++++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 4cb1ab8645..75b167979a 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -31,6 +31,7 @@
 #endif
 #include "sysemu/sysemu.h"
 #include "hw/vfio/vfio-container-base.h"
+#include "sysemu/host_iommu_device.h"
 
 #define VFIO_MSG_PREFIX "vfio %s: "
 
@@ -171,6 +172,8 @@ typedef struct VFIOGroup {
     bool ram_block_discard_allowed;
 } VFIOGroup;
 
+#define TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO TYPE_HOST_IOMMU_DEVICE "-legacy-vfio"
+
 typedef struct VFIODMABuf {
     QemuDmaBuf *buf;
     uint32_t pos_x, pos_y, pos_updates;
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 096cc97258..c4fca2dfca 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -1141,7 +1141,10 @@ static const TypeInfo types[] = {
         .name = TYPE_VFIO_IOMMU_LEGACY,
         .parent = TYPE_VFIO_IOMMU,
         .class_init = vfio_iommu_legacy_class_init,
-    },
+    }, {
+        .name = TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO,
+        .parent = TYPE_HOST_IOMMU_DEVICE,
+    }
 };
 
 DEFINE_TYPES(types)
-- 
2.34.1




* [PATCH v7 04/17] backends/iommufd: Introduce TYPE_HOST_IOMMU_DEVICE_IOMMUFD[_VFIO] devices
  2024-06-05  8:30 [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (2 preceding siblings ...)
  2024-06-05  8:30 ` [PATCH v7 03/17] vfio/container: Introduce TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO device Zhenzhong Duan
@ 2024-06-05  8:30 ` Zhenzhong Duan
  2024-06-05  8:30 ` [PATCH v7 05/17] range: Introduce range_get_last_bit() Zhenzhong Duan
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Zhenzhong Duan @ 2024-06-05  8:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
	yi.l.liu, chao.p.peng, Zhenzhong Duan

TYPE_HOST_IOMMU_DEVICE_IOMMUFD represents a host IOMMU device under the
iommufd backend. It is abstract because it will be derived into VFIO-
or VDPA-typed devices.

It will have its own .get_cap() implementation.

TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO is a sub-class of
TYPE_HOST_IOMMU_DEVICE_IOMMUFD that represents a VFIO-typed host IOMMU
device under the iommufd backend. It will be created when a VFIO device
is attached and then passed to the vIOMMU.

It will have its own .realize() implementation.

Opportunistically, add the missing header to include/sysemu/iommufd.h.

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 include/hw/vfio/vfio-common.h |  3 +++
 include/sysemu/iommufd.h      | 16 ++++++++++++++++
 backends/iommufd.c            | 35 ++++++++++++++++++-----------------
 hw/vfio/iommufd.c             |  5 ++++-
 4 files changed, 41 insertions(+), 18 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 75b167979a..56d1717211 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -32,6 +32,7 @@
 #include "sysemu/sysemu.h"
 #include "hw/vfio/vfio-container-base.h"
 #include "sysemu/host_iommu_device.h"
+#include "sysemu/iommufd.h"
 
 #define VFIO_MSG_PREFIX "vfio %s: "
 
@@ -173,6 +174,8 @@ typedef struct VFIOGroup {
 } VFIOGroup;
 
 #define TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO TYPE_HOST_IOMMU_DEVICE "-legacy-vfio"
+#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO \
+            TYPE_HOST_IOMMU_DEVICE_IOMMUFD "-vfio"
 
 typedef struct VFIODMABuf {
     QemuDmaBuf *buf;
diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
index 293bfbe967..f6e6d6e1f9 100644
--- a/include/sysemu/iommufd.h
+++ b/include/sysemu/iommufd.h
@@ -1,9 +1,23 @@
+/*
+ * iommufd container backend declaration
+ *
+ * Copyright (C) 2024 Intel Corporation.
+ * Copyright Red Hat, Inc. 2024
+ *
+ * Authors: Yi Liu <yi.l.liu@intel.com>
+ *          Eric Auger <eric.auger@redhat.com>
+ *          Zhenzhong Duan <zhenzhong.duan@intel.com>
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
 #ifndef SYSEMU_IOMMUFD_H
 #define SYSEMU_IOMMUFD_H
 
 #include "qom/object.h"
 #include "exec/hwaddr.h"
 #include "exec/cpu-common.h"
+#include "sysemu/host_iommu_device.h"
 
 #define TYPE_IOMMUFD_BACKEND "iommufd"
 OBJECT_DECLARE_TYPE(IOMMUFDBackend, IOMMUFDBackendClass, IOMMUFD_BACKEND)
@@ -33,4 +47,6 @@ int iommufd_backend_map_dma(IOMMUFDBackend *be, uint32_t ioas_id, hwaddr iova,
                             ram_addr_t size, void *vaddr, bool readonly);
 int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
                               hwaddr iova, ram_addr_t size);
+
+#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
 #endif
diff --git a/backends/iommufd.c b/backends/iommufd.c
index c506afbdac..012f18d8d8 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -208,23 +208,24 @@ int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
     return ret;
 }
 
-static const TypeInfo iommufd_backend_info = {
-    .name = TYPE_IOMMUFD_BACKEND,
-    .parent = TYPE_OBJECT,
-    .instance_size = sizeof(IOMMUFDBackend),
-    .instance_init = iommufd_backend_init,
-    .instance_finalize = iommufd_backend_finalize,
-    .class_size = sizeof(IOMMUFDBackendClass),
-    .class_init = iommufd_backend_class_init,
-    .interfaces = (InterfaceInfo[]) {
-        { TYPE_USER_CREATABLE },
-        { }
+static const TypeInfo types[] = {
+    {
+        .name = TYPE_IOMMUFD_BACKEND,
+        .parent = TYPE_OBJECT,
+        .instance_size = sizeof(IOMMUFDBackend),
+        .instance_init = iommufd_backend_init,
+        .instance_finalize = iommufd_backend_finalize,
+        .class_size = sizeof(IOMMUFDBackendClass),
+        .class_init = iommufd_backend_class_init,
+        .interfaces = (InterfaceInfo[]) {
+            { TYPE_USER_CREATABLE },
+            { }
+        }
+    }, {
+        .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
+        .parent = TYPE_HOST_IOMMU_DEVICE,
+        .abstract = true,
     }
 };
 
-static void register_types(void)
-{
-    type_register_static(&iommufd_backend_info);
-}
-
-type_init(register_types);
+DEFINE_TYPES(types)
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 554f9a6292..e4a507d55c 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -624,7 +624,10 @@ static const TypeInfo types[] = {
         .name = TYPE_VFIO_IOMMU_IOMMUFD,
         .parent = TYPE_VFIO_IOMMU,
         .class_init = vfio_iommu_iommufd_class_init,
-    },
+    }, {
+        .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO,
+        .parent = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
+    }
 };
 
 DEFINE_TYPES(types)
-- 
2.34.1




* [PATCH v7 05/17] range: Introduce range_get_last_bit()
  2024-06-05  8:30 [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (3 preceding siblings ...)
  2024-06-05  8:30 ` [PATCH v7 04/17] backends/iommufd: Introduce TYPE_HOST_IOMMU_DEVICE_IOMMUFD[_VFIO] devices Zhenzhong Duan
@ 2024-06-05  8:30 ` Zhenzhong Duan
  2024-06-05  8:30 ` [PATCH v7 06/17] vfio/container: Implement HostIOMMUDeviceClass::realize() handler Zhenzhong Duan
                   ` (15 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Zhenzhong Duan @ 2024-06-05  8:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
	yi.l.liu, chao.p.peng, Zhenzhong Duan

This helper gets the position of the highest set bit of the upper bound.

If the range is empty or the upper bound is zero, -1 is returned.
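
The computation can be tried outside of QEMU with the standalone sketch
below. SketchRange and the sketch_* functions are hypothetical; the sketch
assumes inclusive bounds and treats lob > upb as an empty range, and it
uses a portable loop in place of clz64(). What matters is that the
leading-zero count is 64 for an input of zero, which is what makes a zero
upper bound yield -1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for Range: both bounds are inclusive. */
typedef struct SketchRange {
    uint64_t lob;
    uint64_t upb;
} SketchRange;

/* Count leading zero bits; returns 64 when val is 0. */
static int sketch_clz64(uint64_t val)
{
    int n = 0;

    if (val == 0) {
        return 64;
    }
    while (!(val & (UINT64_C(1) << 63))) {
        val <<= 1;
        n++;
    }
    return n;
}

/* Position of the highest set bit of the upper bound, or -1. */
int sketch_range_get_last_bit(const SketchRange *range)
{
    assert(range != NULL);

    if (range->lob > range->upb) {     /* empty range */
        return -1;
    }
    return 63 - sketch_clz64(range->upb);
}
```

For example, a range whose upper bound is 0xffffffff gives 31, and a range
covering the whole 64-bit space gives 63.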

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
---
 include/qemu/range.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/include/qemu/range.h b/include/qemu/range.h
index 205e1da76d..4ce694a398 100644
--- a/include/qemu/range.h
+++ b/include/qemu/range.h
@@ -20,6 +20,8 @@
 #ifndef QEMU_RANGE_H
 #define QEMU_RANGE_H
 
+#include "qemu/bitops.h"
+
 /*
  * Operations on 64 bit address ranges.
  * Notes:
@@ -217,6 +219,15 @@ static inline int ranges_overlap(uint64_t first1, uint64_t len1,
     return !(last2 < first1 || last1 < first2);
 }
 
+/* Get highest non-zero bit position of a range */
+static inline int range_get_last_bit(Range *range)
+{
+    if (range_is_empty(range)) {
+        return -1;
+    }
+    return 63 - clz64(range->upb);
+}
+
 /*
  * Return -1 if @a < @b, 1 @a > @b, and 0 if they touch or overlap.
  * Both @a and @b must not be empty.
-- 
2.34.1




* [PATCH v7 06/17] vfio/container: Implement HostIOMMUDeviceClass::realize() handler
  2024-06-05  8:30 [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (4 preceding siblings ...)
  2024-06-05  8:30 ` [PATCH v7 05/17] range: Introduce range_get_last_bit() Zhenzhong Duan
@ 2024-06-05  8:30 ` Zhenzhong Duan
  2024-06-05  8:30 ` [PATCH v7 07/17] backends/iommufd: Introduce helper function iommufd_backend_get_device_info() Zhenzhong Duan
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Zhenzhong Duan @ 2024-06-05  8:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
	yi.l.liu, chao.p.peng, Zhenzhong Duan

The realize function populates the capabilities. For now only the
aw_bits cap is computed for the legacy backend.

Introduce a helper function vfio_device_get_aw_bits() which calls
range_get_last_bit() to get the host aw_bits and packages it in
HostIOMMUDeviceCaps for query with .get_cap(). This helper will
also be used by the iommufd backend.
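
As a standalone illustration of how aw_bits is derived, the sketch below
replaces the GList of iova ranges with a plain sorted array. SketchRange,
SKETCH_AW_BITS_MAX and the sketch_* functions are hypothetical names; the
logic follows the helper described above: take the last (highest) range,
find the highest set bit of its upper bound and add one, or fall back to
64 when no ranges are available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_AW_BITS_MAX 64

/* Simplified stand-in for Range: both bounds are inclusive. */
typedef struct SketchRange {
    uint64_t lob;
    uint64_t upb;
} SketchRange;

/* Position of the highest set bit, or -1 if upb is 0. */
static int sketch_last_bit(uint64_t upb)
{
    int bit = -1;

    while (upb) {
        upb >>= 1;
        bit++;
    }
    return bit;
}

/*
 * ranges must be sorted in ascending order. With no ranges (e.g. the
 * kernel cannot report them), no limitation is assumed.
 */
int sketch_get_aw_bits(const SketchRange *ranges, size_t n)
{
    assert(ranges != NULL || n == 0);

    if (n == 0) {
        return SKETCH_AW_BITS_MAX;
    }
    return sketch_last_bit(ranges[n - 1].upb) + 1;
}
```

With usable ranges ending at 2^39 - 1, the result is 39, i.e. a 39-bit
host address width.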

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 include/hw/vfio/vfio-common.h |  1 +
 hw/vfio/container.c           | 19 +++++++++++++++++++
 hw/vfio/helpers.c             | 17 +++++++++++++++++
 3 files changed, 37 insertions(+)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 56d1717211..105b8b7e80 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -289,4 +289,5 @@ bool vfio_device_get_name(VFIODevice *vbasedev, Error **errp);
 void vfio_device_set_fd(VFIODevice *vbasedev, const char *str, Error **errp);
 void vfio_device_init(VFIODevice *vbasedev, int type, VFIODeviceOps *ops,
                       DeviceState *dev, bool ram_discard);
+int vfio_device_get_aw_bits(VFIODevice *vdev);
 #endif /* HW_VFIO_VFIO_COMMON_H */
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index c4fca2dfca..2f62c13214 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -1136,6 +1136,24 @@ static void vfio_iommu_legacy_class_init(ObjectClass *klass, void *data)
     vioc->pci_hot_reset = vfio_legacy_pci_hot_reset;
 };
 
+static bool hiod_legacy_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
+                                     Error **errp)
+{
+    VFIODevice *vdev = opaque;
+
+    hiod->name = g_strdup(vdev->name);
+    hiod->caps.aw_bits = vfio_device_get_aw_bits(vdev);
+
+    return true;
+}
+
+static void hiod_legacy_vfio_class_init(ObjectClass *oc, void *data)
+{
+    HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
+
+    hioc->realize = hiod_legacy_vfio_realize;
+};
+
 static const TypeInfo types[] = {
     {
         .name = TYPE_VFIO_IOMMU_LEGACY,
@@ -1144,6 +1162,7 @@ static const TypeInfo types[] = {
     }, {
         .name = TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO,
         .parent = TYPE_HOST_IOMMU_DEVICE,
+        .class_init = hiod_legacy_vfio_class_init,
     }
 };
 
diff --git a/hw/vfio/helpers.c b/hw/vfio/helpers.c
index 27ea26aa48..b14edd46ed 100644
--- a/hw/vfio/helpers.c
+++ b/hw/vfio/helpers.c
@@ -658,3 +658,20 @@ void vfio_device_init(VFIODevice *vbasedev, int type, VFIODeviceOps *ops,
 
     vbasedev->ram_block_discard_allowed = ram_discard;
 }
+
+int vfio_device_get_aw_bits(VFIODevice *vdev)
+{
+    /*
+     * iova_ranges is a sorted list. For old kernels that support
+     * VFIO but not support query of iova ranges, iova_ranges is NULL,
+     * in this case HOST_IOMMU_DEVICE_CAP_AW_BITS_MAX(64) is returned.
+     */
+    GList *l = g_list_last(vdev->bcontainer->iova_ranges);
+
+    if (l) {
+        Range *range = l->data;
+        return range_get_last_bit(range) + 1;
+    }
+
+    return HOST_IOMMU_DEVICE_CAP_AW_BITS_MAX;
+}
-- 
2.34.1




* [PATCH v7 07/17] backends/iommufd: Introduce helper function iommufd_backend_get_device_info()
  2024-06-05  8:30 [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (5 preceding siblings ...)
  2024-06-05  8:30 ` [PATCH v7 06/17] vfio/container: Implement HostIOMMUDeviceClass::realize() handler Zhenzhong Duan
@ 2024-06-05  8:30 ` Zhenzhong Duan
  2024-06-05  8:30 ` [PATCH v7 08/17] vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler Zhenzhong Duan
                   ` (13 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Zhenzhong Duan @ 2024-06-05  8:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
	yi.l.liu, chao.p.peng, Zhenzhong Duan, Yi Sun

Introduce a helper function iommufd_backend_get_device_info() to get
host IOMMU related information through the iommufd uAPI.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 include/sysemu/iommufd.h |  3 +++
 backends/iommufd.c       | 22 ++++++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
index f6e6d6e1f9..9edfec6045 100644
--- a/include/sysemu/iommufd.h
+++ b/include/sysemu/iommufd.h
@@ -47,6 +47,9 @@ int iommufd_backend_map_dma(IOMMUFDBackend *be, uint32_t ioas_id, hwaddr iova,
                             ram_addr_t size, void *vaddr, bool readonly);
 int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
                               hwaddr iova, ram_addr_t size);
+bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
+                                     uint32_t *type, void *data, uint32_t len,
+                                     Error **errp);
 
 #define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
 #endif
diff --git a/backends/iommufd.c b/backends/iommufd.c
index 012f18d8d8..c7e969d6f7 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -208,6 +208,28 @@ int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
     return ret;
 }
 
+bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
+                                     uint32_t *type, void *data, uint32_t len,
+                                     Error **errp)
+{
+    struct iommu_hw_info info = {
+        .size = sizeof(info),
+        .dev_id = devid,
+        .data_len = len,
+        .data_uptr = (uintptr_t)data,
+    };
+
+    if (ioctl(be->fd, IOMMU_GET_HW_INFO, &info)) {
+        error_setg_errno(errp, errno, "Failed to get hardware info");
+        return false;
+    }
+
+    g_assert(type);
+    *type = info.out_data_type;
+
+    return true;
+}
+
 static const TypeInfo types[] = {
     {
         .name = TYPE_IOMMUFD_BACKEND,
-- 
2.34.1




* [PATCH v7 08/17] vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler
  2024-06-05  8:30 [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (6 preceding siblings ...)
  2024-06-05  8:30 ` [PATCH v7 07/17] backends/iommufd: Introduce helper function iommufd_backend_get_device_info() Zhenzhong Duan
@ 2024-06-05  8:30 ` Zhenzhong Duan
  2024-06-05  8:30 ` [PATCH v7 09/17] vfio/container: Implement HostIOMMUDeviceClass::get_cap() handler Zhenzhong Duan
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Zhenzhong Duan @ 2024-06-05  8:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
	yi.l.liu, chao.p.peng, Zhenzhong Duan

The handler calls iommufd_backend_get_device_info() to get host IOMMU
related information and translates it into HostIOMMUDeviceCaps so that
it can be queried with .get_cap().

For aw_bits, reuse the approach of the legacy backend and call
vfio_device_get_aw_bits(), which is common to all vendor IOMMUs.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
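Note: the aw_bits derivation that both backends share can be sketched in
isolation as below. This is only an illustration, not the QEMU code: the
struct iova_range type, AW_BITS_MAX and aw_bits_from_ranges() are
hypothetical stand-ins for Range, HOST_IOMMU_DEVICE_CAP_AW_BITS_MAX and
vfio_device_get_aw_bits(), and __builtin_clzll() assumes GCC or Clang.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for HOST_IOMMU_DEVICE_CAP_AW_BITS_MAX (64 in the patch). */
#define AW_BITS_MAX 64

/* Hypothetical inclusive IOVA range; not QEMU's Range type. */
struct iova_range {
    uint64_t lob; /* lowest address */
    uint64_t upb; /* highest address, inclusive */
};

/* Index of the highest set bit of a non-zero value. */
static int last_bit(uint64_t v)
{
    return 63 - __builtin_clzll(v);
}

/*
 * ranges[] is sorted in ascending order, so the last entry carries the
 * highest usable address. When no ranges are reported (old kernels),
 * fall back to the maximum width.
 */
static int aw_bits_from_ranges(const struct iova_range *ranges, size_t n)
{
    /* upb == 0 is degenerate; this sketch treats it like missing data. */
    if (n == 0 || ranges[n - 1].upb == 0) {
        return AW_BITS_MAX;
    }
    return last_bit(ranges[n - 1].upb) + 1;
}
```

With a last range ending at 0x7fffffffff the highest set bit is bit 38,
so the sketch reports a 39-bit address width.
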
 hw/vfio/iommufd.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index e4a507d55c..1674c61227 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -619,6 +619,35 @@ static void vfio_iommu_iommufd_class_init(ObjectClass *klass, void *data)
     vioc->pci_hot_reset = iommufd_cdev_pci_hot_reset;
 };
 
+static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
+                                      Error **errp)
+{
+    VFIODevice *vdev = opaque;
+    HostIOMMUDeviceCaps *caps = &hiod->caps;
+    enum iommu_hw_info_type type;
+    union {
+        struct iommu_hw_info_vtd vtd;
+    } data;
+
+    if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid,
+                                         &type, &data, sizeof(data), errp)) {
+        return false;
+    }
+
+    hiod->name = g_strdup(vdev->name);
+    caps->type = type;
+    caps->aw_bits = vfio_device_get_aw_bits(vdev);
+
+    return true;
+}
+
+static void hiod_iommufd_vfio_class_init(ObjectClass *oc, void *data)
+{
+    HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_CLASS(oc);
+
+    hiodc->realize = hiod_iommufd_vfio_realize;
+};
+
 static const TypeInfo types[] = {
     {
         .name = TYPE_VFIO_IOMMU_IOMMUFD,
@@ -627,6 +656,7 @@ static const TypeInfo types[] = {
     }, {
         .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO,
         .parent = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
+        .class_init = hiod_iommufd_vfio_class_init,
     }
 };
 
-- 
2.34.1




* [PATCH v7 09/17] vfio/container: Implement HostIOMMUDeviceClass::get_cap() handler
  2024-06-05  8:30 [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (7 preceding siblings ...)
  2024-06-05  8:30 ` [PATCH v7 08/17] vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler Zhenzhong Duan
@ 2024-06-05  8:30 ` Zhenzhong Duan
  2024-06-05  8:30 ` [PATCH v7 10/17] backends/iommufd: " Zhenzhong Duan
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Zhenzhong Duan @ 2024-06-05  8:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
	yi.l.liu, chao.p.peng, Zhenzhong Duan

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 hw/vfio/container.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 2f62c13214..99beeba422 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -1147,11 +1147,26 @@ static bool hiod_legacy_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
     return true;
 }
 
+static int hiod_legacy_vfio_get_cap(HostIOMMUDevice *hiod, int cap,
+                                    Error **errp)
+{
+    HostIOMMUDeviceCaps *caps = &hiod->caps;
+
+    switch (cap) {
+    case HOST_IOMMU_DEVICE_CAP_AW_BITS:
+        return caps->aw_bits;
+    default:
+        error_setg(errp, "%s: unsupported capability %x", hiod->name, cap);
+        return -EINVAL;
+    }
+}
+
 static void hiod_legacy_vfio_class_init(ObjectClass *oc, void *data)
 {
     HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
 
     hioc->realize = hiod_legacy_vfio_realize;
+    hioc->get_cap = hiod_legacy_vfio_get_cap;
 };
 
 static const TypeInfo types[] = {
-- 
2.34.1




* [PATCH v7 10/17] backends/iommufd: Implement HostIOMMUDeviceClass::get_cap() handler
  2024-06-05  8:30 [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (8 preceding siblings ...)
  2024-06-05  8:30 ` [PATCH v7 09/17] vfio/container: Implement HostIOMMUDeviceClass::get_cap() handler Zhenzhong Duan
@ 2024-06-05  8:30 ` Zhenzhong Duan
  2024-06-05  8:30 ` [PATCH v7 11/17] vfio: Create host IOMMU device instance Zhenzhong Duan
                   ` (10 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Zhenzhong Duan @ 2024-06-05  8:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
	yi.l.liu, chao.p.peng, Zhenzhong Duan

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
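Note: a caller of .get_cap() has to tell a capability value apart from a
negative errno. The sketch below shows one way a consumer might do that.
All names and enum values here (struct caps, CAP_AW_BITS, get_cap(),
host_width_ok()) are hypothetical, and the comparison in host_width_ok()
is an assumed policy for illustration, not something this patch defines.

```c
#include <errno.h>

/* Hypothetical capability identifiers; the values are illustrative. */
enum { CAP_IOMMU_TYPE = 0, CAP_AW_BITS = 1 };

/* Hypothetical stand-in for HostIOMMUDeviceCaps. */
struct caps {
    unsigned int type;
    unsigned char aw_bits;
};

/* Same shape as the handler: a value on success, -EINVAL if unknown. */
static int get_cap(const struct caps *caps, int cap)
{
    switch (cap) {
    case CAP_IOMMU_TYPE:
        return (int)caps->type;
    case CAP_AW_BITS:
        return caps->aw_bits;
    default:
        return -EINVAL;
    }
}

/*
 * Assumed caller policy: returns 1 if the host address width covers the
 * guest width, 0 if it does not, or a negative errno if the query failed.
 */
static int host_width_ok(const struct caps *caps, int guest_aw_bits)
{
    int ret = get_cap(caps, CAP_AW_BITS);

    if (ret < 0) {
        return ret;
    }
    return ret >= guest_aw_bits;
}
```

Since capability values share the return channel with error codes, this
scheme only works while every capability is non-negative.
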
 backends/iommufd.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/backends/iommufd.c b/backends/iommufd.c
index c7e969d6f7..84fefbc9ee 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -230,6 +230,28 @@ bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
     return true;
 }
 
+static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
+{
+    HostIOMMUDeviceCaps *caps = &hiod->caps;
+
+    switch (cap) {
+    case HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE:
+        return caps->type;
+    case HOST_IOMMU_DEVICE_CAP_AW_BITS:
+        return caps->aw_bits;
+    default:
+        error_setg(errp, "%s: unsupported capability %x", hiod->name, cap);
+        return -EINVAL;
+    }
+}
+
+static void hiod_iommufd_class_init(ObjectClass *oc, void *data)
+{
+    HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
+
+    hioc->get_cap = hiod_iommufd_get_cap;
+};
+
 static const TypeInfo types[] = {
     {
         .name = TYPE_IOMMUFD_BACKEND,
@@ -246,6 +268,7 @@ static const TypeInfo types[] = {
     }, {
         .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
         .parent = TYPE_HOST_IOMMU_DEVICE,
+        .class_init = hiod_iommufd_class_init,
         .abstract = true,
     }
 };
-- 
2.34.1




* [PATCH v7 11/17] vfio: Create host IOMMU device instance
  2024-06-05  8:30 [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (9 preceding siblings ...)
  2024-06-05  8:30 ` [PATCH v7 10/17] backends/iommufd: " Zhenzhong Duan
@ 2024-06-05  8:30 ` Zhenzhong Duan
  2024-06-05  8:30 ` [PATCH v7 12/17] hw/pci: Introduce helper function pci_device_get_iommu_bus_devfn() Zhenzhong Duan
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Zhenzhong Duan @ 2024-06-05  8:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
	yi.l.liu, chao.p.peng, Zhenzhong Duan

Create host IOMMU device instance in vfio_attach_device() and call
.realize() to initialize it further.

Introduce attribute VFIOIOMMUClass::hiod_typename and initialize
it based on VFIO backend type. It will facilitate HostIOMMUDevice
creation in vfio_attach_device().

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
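Note: the control flow added to vfio_attach_device() is attach, create,
realize, and undo the attach if realize fails. A stripped-down sketch of
that ordering follows. The types and helpers (struct hiod, struct dev,
backend_attach(), attach_device() and so on) are hypothetical and use
plain calloc()/free() instead of QOM object_new()/object_unref().

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for HostIOMMUDevice and VFIODevice. */
struct hiod { int aw_bits; };
struct dev { bool attached; struct hiod *hiod; };

typedef bool (*realize_fn)(struct hiod *h, struct dev *d);

static bool backend_attach(struct dev *d) { d->attached = true; return true; }
static void backend_detach(struct dev *d) { d->attached = false; }

/* Attach first, then create and realize the host IOMMU device. */
static bool attach_device(struct dev *d, realize_fn realize)
{
    struct hiod *h;

    if (!backend_attach(d)) {
        return false;
    }

    h = calloc(1, sizeof(*h));
    if (!h || !realize(h, d)) {
        /* Undo in reverse order: drop the object, then detach. */
        free(h);
        backend_detach(d);
        return false;
    }
    d->hiod = h;
    return true;
}

static void detach_device(struct dev *d)
{
    if (!d->attached) {
        return;
    }
    free(d->hiod);
    d->hiod = NULL; /* the sketch clears the pointer; the patch does not */
    backend_detach(d);
}

/* Two example realize handlers, one succeeding and one failing. */
static bool realize_ok(struct hiod *h, struct dev *d)
{
    (void)d;
    h->aw_bits = 48;
    return true;
}

static bool realize_fail(struct hiod *h, struct dev *d)
{
    (void)h;
    (void)d;
    return false;
}
```

After a failed realize the device is left detached and without a host
IOMMU device, which is the same state as before the call.
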
 include/hw/vfio/vfio-common.h         |  1 +
 include/hw/vfio/vfio-container-base.h |  3 +++
 hw/vfio/common.c                      | 16 +++++++++++++++-
 hw/vfio/container.c                   |  2 ++
 hw/vfio/iommufd.c                     |  2 ++
 5 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 105b8b7e80..776de8064f 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -127,6 +127,7 @@ typedef struct VFIODevice {
     OnOffAuto pre_copy_dirty_page_tracking;
     bool dirty_pages_supported;
     bool dirty_tracking;
+    HostIOMMUDevice *hiod;
     int devid;
     IOMMUFDBackend *iommufd;
 } VFIODevice;
diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h
index 2776481fc9..442c0dfc4c 100644
--- a/include/hw/vfio/vfio-container-base.h
+++ b/include/hw/vfio/vfio-container-base.h
@@ -109,6 +109,9 @@ DECLARE_CLASS_CHECKERS(VFIOIOMMUClass, VFIO_IOMMU, TYPE_VFIO_IOMMU)
 struct VFIOIOMMUClass {
     InterfaceClass parent_class;
 
+    /* Properties */
+    const char *hiod_typename;
+
     /* basic feature */
     bool (*setup)(VFIOContainerBase *bcontainer, Error **errp);
     int (*dma_map)(const VFIOContainerBase *bcontainer,
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index f9619a1dfb..f20a7b5bba 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1528,6 +1528,7 @@ bool vfio_attach_device(char *name, VFIODevice *vbasedev,
 {
     const VFIOIOMMUClass *ops =
         VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_LEGACY));
+    HostIOMMUDevice *hiod;
 
     if (vbasedev->iommufd) {
         ops = VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_IOMMUFD));
@@ -1535,7 +1536,19 @@ bool vfio_attach_device(char *name, VFIODevice *vbasedev,
 
     assert(ops);
 
-    return ops->attach_device(name, vbasedev, as, errp);
+    if (!ops->attach_device(name, vbasedev, as, errp)) {
+        return false;
+    }
+
+    hiod = HOST_IOMMU_DEVICE(object_new(ops->hiod_typename));
+    if (!HOST_IOMMU_DEVICE_GET_CLASS(hiod)->realize(hiod, vbasedev, errp)) {
+        object_unref(hiod);
+        ops->detach_device(vbasedev);
+        return false;
+    }
+    vbasedev->hiod = hiod;
+
+    return true;
 }
 
 void vfio_detach_device(VFIODevice *vbasedev)
@@ -1543,5 +1556,6 @@ void vfio_detach_device(VFIODevice *vbasedev)
     if (!vbasedev->bcontainer) {
         return;
     }
+    object_unref(vbasedev->hiod);
     vbasedev->bcontainer->ops->detach_device(vbasedev);
 }
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 99beeba422..26e6f7fb4f 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -1126,6 +1126,8 @@ static void vfio_iommu_legacy_class_init(ObjectClass *klass, void *data)
 {
     VFIOIOMMUClass *vioc = VFIO_IOMMU_CLASS(klass);
 
+    vioc->hiod_typename = TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO;
+
     vioc->setup = vfio_legacy_setup;
     vioc->dma_map = vfio_legacy_dma_map;
     vioc->dma_unmap = vfio_legacy_dma_unmap;
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 1674c61227..409ed3dcc9 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -612,6 +612,8 @@ static void vfio_iommu_iommufd_class_init(ObjectClass *klass, void *data)
 {
     VFIOIOMMUClass *vioc = VFIO_IOMMU_CLASS(klass);
 
+    vioc->hiod_typename = TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO;
+
     vioc->dma_map = iommufd_cdev_map;
     vioc->dma_unmap = iommufd_cdev_unmap;
     vioc->attach_device = iommufd_cdev_attach;
-- 
2.34.1




* [PATCH v7 12/17] hw/pci: Introduce helper function pci_device_get_iommu_bus_devfn()
  2024-06-05  8:30 [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (10 preceding siblings ...)
  2024-06-05  8:30 ` [PATCH v7 11/17] vfio: Create host IOMMU device instance Zhenzhong Duan
@ 2024-06-05  8:30 ` Zhenzhong Duan
  2024-06-05  8:30 ` [PATCH v7 13/17] hw/pci: Introduce pci_device_[set|unset]_iommu_device() Zhenzhong Duan
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Zhenzhong Duan @ 2024-06-05  8:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
	yi.l.liu, chao.p.peng, Zhenzhong Duan, Yi Sun, Marcel Apfelbaum

Extract pci_device_get_iommu_bus_devfn() from
pci_device_iommu_address_space() to facilitate the
implementation of pci_device_[set|unset]_iommu_device()
in the following patch.

No functional change intended.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
---
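Note: the helper's contract is one mandatory out-parameter (the IOMMU
root bus, or NULL when the device bypasses the IOMMU or no IOMMU is
registered) plus two optional ones. The sketch below models only that
contract. struct bus and get_iommu_bus_devfn() are hypothetical, and the
requester ID aliasing done by the real function is deliberately left
out, so the "aliased" outputs simply echo the inputs here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much reduced model of a PCIBus. */
struct bus {
    struct bus *parent;   /* NULL for the root bus */
    bool has_iommu_ops;   /* an IOMMU registered its ops on this bus */
    bool bypass_iommu;    /* the bus is configured to bypass the IOMMU */
};

static void get_iommu_bus_devfn(struct bus *bus, int devfn,
                                struct bus **piommu_bus,
                                struct bus **aliased_bus,
                                int *aliased_devfn)
{
    struct bus *iommu_bus = bus;

    /* Walk up until a bus with IOMMU ops or the root is reached. */
    while (iommu_bus && !iommu_bus->has_iommu_ops && iommu_bus->parent) {
        iommu_bus = iommu_bus->parent;
    }

    if (bus->bypass_iommu || !iommu_bus->has_iommu_ops) {
        iommu_bus = NULL;
    }

    *piommu_bus = iommu_bus;    /* mandatory output */

    if (aliased_bus) {          /* optional output */
        *aliased_bus = bus;
    }
    if (aliased_devfn) {        /* optional output */
        *aliased_devfn = devfn;
    }
}
```

A caller that only needs the IOMMU root bus passes NULL for both
optional pointers, which is how the [set|unset]_iommu_device() wrappers
in the next patch use the helper.
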
 hw/pci/pci.c | 48 +++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 45 insertions(+), 3 deletions(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 324c1302d2..02a4bb2af6 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2648,11 +2648,27 @@ static void pci_device_class_base_init(ObjectClass *klass, void *data)
     }
 }
 
-AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
+/*
+ * Get IOMMU root bus, aliased bus and devfn of a PCI device
+ *
+ * IOMMU root bus is needed by all call sites to call into iommu_ops.
+ * For call sites which don't need aliased BDF, passing NULL to
+ * aliased_[bus|devfn] is allowed.
+ *
+ * @piommu_bus: return root #PCIBus backed by an IOMMU for the PCI device.
+ *
+ * @aliased_bus: return aliased #PCIBus of the PCI device, optional.
+ *
+ * @aliased_devfn: return aliased devfn of the PCI device, optional.
+ */
+static void pci_device_get_iommu_bus_devfn(PCIDevice *dev,
+                                           PCIBus **piommu_bus,
+                                           PCIBus **aliased_bus,
+                                           int *aliased_devfn)
 {
     PCIBus *bus = pci_get_bus(dev);
     PCIBus *iommu_bus = bus;
-    uint8_t devfn = dev->devfn;
+    int devfn = dev->devfn;
 
     while (iommu_bus && !iommu_bus->iommu_ops && iommu_bus->parent_dev) {
         PCIBus *parent_bus = pci_get_bus(iommu_bus->parent_dev);
@@ -2693,7 +2709,33 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
 
         iommu_bus = parent_bus;
     }
-    if (!pci_bus_bypass_iommu(bus) && iommu_bus->iommu_ops) {
+
+    assert(0 <= devfn && devfn < PCI_DEVFN_MAX);
+    assert(iommu_bus);
+
+    if (pci_bus_bypass_iommu(bus) || !iommu_bus->iommu_ops) {
+        iommu_bus = NULL;
+    }
+
+    *piommu_bus = iommu_bus;
+
+    if (aliased_bus) {
+        *aliased_bus = bus;
+    }
+
+    if (aliased_devfn) {
+        *aliased_devfn = devfn;
+    }
+}
+
+AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
+{
+    PCIBus *bus;
+    PCIBus *iommu_bus;
+    int devfn;
+
+    pci_device_get_iommu_bus_devfn(dev, &iommu_bus, &bus, &devfn);
+    if (iommu_bus) {
         return iommu_bus->iommu_ops->get_address_space(bus,
                                  iommu_bus->iommu_opaque, devfn);
     }
-- 
2.34.1




* [PATCH v7 13/17] hw/pci: Introduce pci_device_[set|unset]_iommu_device()
  2024-06-05  8:30 [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (11 preceding siblings ...)
  2024-06-05  8:30 ` [PATCH v7 12/17] hw/pci: Introduce helper function pci_device_get_iommu_bus_devfn() Zhenzhong Duan
@ 2024-06-05  8:30 ` Zhenzhong Duan
  2024-06-05  8:30 ` [PATCH v7 14/17] vfio/pci: Pass HostIOMMUDevice to vIOMMU Zhenzhong Duan
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Zhenzhong Duan @ 2024-06-05  8:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
	yi.l.liu, chao.p.peng, Yi Sun, Zhenzhong Duan, Marcel Apfelbaum

From: Yi Liu <yi.l.liu@intel.com>

pci_device_[set|unset]_iommu_device() call pci_device_get_iommu_bus_devfn()
to get iommu_bus->iommu_ops and call [set|unset]_iommu_device callback to
set/unset HostIOMMUDevice for a given PCI device.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
---
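Note: both callbacks are optional, so the wrappers treat a missing
callback as success. The sketch below shows that dispatch together with
a toy vIOMMU that keeps one slot per devfn. Everything in it (struct
iommu_ops, struct viommu, the slot table, and the rule that a second set
on the same devfn is rejected) is a hypothetical illustration, not the
behaviour of any vIOMMU in this series.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEVFN_MAX 256

/* Hypothetical reduced ops table; both callbacks may be NULL. */
struct iommu_ops {
    bool (*set_iommu_device)(void *opaque, int devfn, void *hiod);
    void (*unset_iommu_device)(void *opaque, int devfn);
};

/* Wrapper: a vIOMMU without the callback is not an error. */
static bool set_iommu_device(const struct iommu_ops *ops, void *opaque,
                             int devfn, void *hiod)
{
    if (ops && ops->set_iommu_device) {
        return ops->set_iommu_device(opaque, devfn, hiod);
    }
    return true;
}

static void unset_iommu_device(const struct iommu_ops *ops, void *opaque,
                               int devfn)
{
    if (ops && ops->unset_iommu_device) {
        ops->unset_iommu_device(opaque, devfn);
    }
}

/* Toy vIOMMU: remembers one host IOMMU device per devfn. */
struct viommu { void *slots[DEVFN_MAX]; };

static bool viommu_set(void *opaque, int devfn, void *hiod)
{
    struct viommu *v = opaque;

    if (v->slots[devfn]) {
        return false; /* assumed policy: reject a second attach */
    }
    v->slots[devfn] = hiod;
    return true;
}

static void viommu_unset(void *opaque, int devfn)
{
    struct viommu *v = opaque;

    v->slots[devfn] = NULL;
}
```

Returning true when no callback is present keeps devices working behind
vIOMMUs that have not implemented the new interface.
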
 include/hw/pci/pci.h | 38 +++++++++++++++++++++++++++++++++++++-
 hw/pci/pci.c         | 27 +++++++++++++++++++++++++++
 2 files changed, 64 insertions(+), 1 deletion(-)

diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index eaa3fc99d8..eb26cac810 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -3,6 +3,7 @@
 
 #include "exec/memory.h"
 #include "sysemu/dma.h"
+#include "sysemu/host_iommu_device.h"
 
 /* PCI includes legacy ISA access.  */
 #include "hw/isa/isa.h"
@@ -383,10 +384,45 @@ typedef struct PCIIOMMUOps {
      *
      * @devfn: device and function number
      */
-   AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int devfn);
+    AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int devfn);
+    /**
+     * @set_iommu_device: attach a HostIOMMUDevice to a vIOMMU
+     *
+     * Optional callback, if not implemented in vIOMMU, then vIOMMU can't
+     * retrieve host information from the associated HostIOMMUDevice.
+     *
+     * @bus: the #PCIBus of the PCI device.
+     *
+     * @opaque: the data passed to pci_setup_iommu().
+     *
+     * @devfn: device and function number of the PCI device.
+     *
+     * @dev: the #HostIOMMUDevice to attach.
+     *
+     * @errp: pass an Error out only when return false
+     *
+     * Returns: true if HostIOMMUDevice is attached or else false with errp set.
+     */
+    bool (*set_iommu_device)(PCIBus *bus, void *opaque, int devfn,
+                             HostIOMMUDevice *dev, Error **errp);
+    /**
+     * @unset_iommu_device: detach a HostIOMMUDevice from a vIOMMU
+     *
+     * Optional callback.
+     *
+     * @bus: the #PCIBus of the PCI device.
+     *
+     * @opaque: the data passed to pci_setup_iommu().
+     *
+     * @devfn: device and function number of the PCI device.
+     */
+    void (*unset_iommu_device)(PCIBus *bus, void *opaque, int devfn);
 } PCIIOMMUOps;
 
 AddressSpace *pci_device_iommu_address_space(PCIDevice *dev);
+bool pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice *hiod,
+                                 Error **errp);
+void pci_device_unset_iommu_device(PCIDevice *dev);
 
 /**
  * pci_setup_iommu: Initialize specific IOMMU handlers for a PCIBus
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 02a4bb2af6..c8a8aab306 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2742,6 +2742,33 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
     return &address_space_memory;
 }
 
+bool pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice *hiod,
+                                 Error **errp)
+{
+    PCIBus *iommu_bus;
+
+    /* set_iommu_device requires device's direct BDF instead of aliased BDF */
+    pci_device_get_iommu_bus_devfn(dev, &iommu_bus, NULL, NULL);
+    if (iommu_bus && iommu_bus->iommu_ops->set_iommu_device) {
+        return iommu_bus->iommu_ops->set_iommu_device(pci_get_bus(dev),
+                                                      iommu_bus->iommu_opaque,
+                                                      dev->devfn, hiod, errp);
+    }
+    return true;
+}
+
+void pci_device_unset_iommu_device(PCIDevice *dev)
+{
+    PCIBus *iommu_bus;
+
+    pci_device_get_iommu_bus_devfn(dev, &iommu_bus, NULL, NULL);
+    if (iommu_bus && iommu_bus->iommu_ops->unset_iommu_device) {
+        return iommu_bus->iommu_ops->unset_iommu_device(pci_get_bus(dev),
+                                                        iommu_bus->iommu_opaque,
+                                                        dev->devfn);
+    }
+}
+
 void pci_setup_iommu(PCIBus *bus, const PCIIOMMUOps *ops, void *opaque)
 {
     /*
-- 
2.34.1




* [PATCH v7 14/17] vfio/pci: Pass HostIOMMUDevice to vIOMMU
  2024-06-05  8:30 [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (12 preceding siblings ...)
  2024-06-05  8:30 ` [PATCH v7 13/17] hw/pci: Introduce pci_device_[set|unset]_iommu_device() Zhenzhong Duan
@ 2024-06-05  8:30 ` Zhenzhong Duan
  2024-06-05  8:30 ` [PATCH v7 15/17] intel_iommu: Extract out vtd_cap_init() to initialize cap/ecap Zhenzhong Duan
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Zhenzhong Duan @ 2024-06-05  8:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
	yi.l.liu, chao.p.peng, Zhenzhong Duan, Yi Sun

With HostIOMMUDevice passed, vIOMMU can check compatibility with host
IOMMU, call into IOMMUFD specific methods, etc.

Originally-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
---
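Note: the new out_unset_idev label sits above out_teardown, so a failure
after pci_device_set_iommu_device() undoes the attachment first and then
falls through to the existing teardown. The sketch below reproduces only
that label ordering. enum fail_at, struct trace and realize() are
hypothetical and stand in for the much longer vfio_realize().

```c
#include <stdbool.h>

/* Hypothetical failure injection points. */
enum fail_at { FAIL_NONE, FAIL_SET_IDEV, FAIL_ADD_CAPS };

/* Records which resources are currently held. */
struct trace {
    bool bars_registered;
    bool idev_set;
};

static bool realize(struct trace *t, enum fail_at fail)
{
    t->bars_registered = true;

    if (fail == FAIL_SET_IDEV) {
        /* Nothing was attached, so skip the unset step. */
        goto out_teardown;
    }
    t->idev_set = true;

    if (fail == FAIL_ADD_CAPS) {
        /* Attached already, so unwind through the unset step. */
        goto out_unset_idev;
    }
    return true;

out_unset_idev:
    t->idev_set = false;
    /* fall through */
out_teardown:
    t->bars_registered = false;
    return false;
}
```

Each later failure jumps to the label that releases the most recently
acquired resource, and the labels appear in reverse acquisition order.
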
 hw/vfio/pci.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 74a79bdf61..d8a76c1ee0 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3121,10 +3121,15 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
 
     vfio_bars_register(vdev);
 
-    if (!vfio_add_capabilities(vdev, errp)) {
+    if (!pci_device_set_iommu_device(pdev, vbasedev->hiod, errp)) {
+        error_prepend(errp, "Failed to set iommu_device: ");
         goto out_teardown;
     }
 
+    if (!vfio_add_capabilities(vdev, errp)) {
+        goto out_unset_idev;
+    }
+
     if (vdev->vga) {
         vfio_vga_quirk_setup(vdev);
     }
@@ -3141,7 +3146,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
             error_setg(errp,
                        "cannot support IGD OpRegion feature on hotplugged "
                        "device");
-            goto out_teardown;
+            goto out_unset_idev;
         }
 
         ret = vfio_get_dev_region_info(vbasedev,
@@ -3150,11 +3155,11 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
         if (ret) {
             error_setg_errno(errp, -ret,
                              "does not support requested IGD OpRegion feature");
-            goto out_teardown;
+            goto out_unset_idev;
         }
 
         if (!vfio_pci_igd_opregion_init(vdev, opregion, errp)) {
-            goto out_teardown;
+            goto out_unset_idev;
         }
     }
 
@@ -3238,6 +3243,8 @@ out_deregister:
     if (vdev->intx.mmap_timer) {
         timer_free(vdev->intx.mmap_timer);
     }
+out_unset_idev:
+    pci_device_unset_iommu_device(pdev);
 out_teardown:
     vfio_teardown_msi(vdev);
     vfio_bars_exit(vdev);
@@ -3266,6 +3273,7 @@ static void vfio_instance_finalize(Object *obj)
 static void vfio_exitfn(PCIDevice *pdev)
 {
     VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIODevice *vbasedev = &vdev->vbasedev;
 
     vfio_unregister_req_notifier(vdev);
     vfio_unregister_err_notifier(vdev);
@@ -3280,7 +3288,8 @@ static void vfio_exitfn(PCIDevice *pdev)
     vfio_teardown_msi(vdev);
     vfio_pci_disable_rp_atomics(vdev);
     vfio_bars_exit(vdev);
-    vfio_migration_exit(&vdev->vbasedev);
+    vfio_migration_exit(vbasedev);
+    pci_device_unset_iommu_device(pdev);
 }
 
 static void vfio_pci_reset(DeviceState *dev)
-- 
2.34.1




* [PATCH v7 15/17] intel_iommu: Extract out vtd_cap_init() to initialize cap/ecap
  2024-06-05  8:30 [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (13 preceding siblings ...)
  2024-06-05  8:30 ` [PATCH v7 14/17] vfio/pci: Pass HostIOMMUDevice to vIOMMU Zhenzhong Duan
@ 2024-06-05  8:30 ` Zhenzhong Duan
  2024-06-05  8:30 ` [PATCH v7 16/17] intel_iommu: Implement [set|unset]_iommu_device() callbacks Zhenzhong Duan
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Zhenzhong Duan @ 2024-06-05  8:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
	yi.l.liu, chao.p.peng, Zhenzhong Duan, Paolo Bonzini,
	Richard Henderson, Eduardo Habkost, Marcel Apfelbaum

Extract cap/ecap initialization into vtd_cap_init() to make the code
cleaner.

No functional change intended.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 hw/i386/intel_iommu.c | 93 ++++++++++++++++++++++++-------------------
 1 file changed, 51 insertions(+), 42 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index cc8e59674e..519063c8f8 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3934,30 +3934,10 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
     return;
 }
 
-/* Do the initialization. It will also be called when reset, so pay
- * attention when adding new initialization stuff.
- */
-static void vtd_init(IntelIOMMUState *s)
+static void vtd_cap_init(IntelIOMMUState *s)
 {
     X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
 
-    memset(s->csr, 0, DMAR_REG_SIZE);
-    memset(s->wmask, 0, DMAR_REG_SIZE);
-    memset(s->w1cmask, 0, DMAR_REG_SIZE);
-    memset(s->womask, 0, DMAR_REG_SIZE);
-
-    s->root = 0;
-    s->root_scalable = false;
-    s->dmar_enabled = false;
-    s->intr_enabled = false;
-    s->iq_head = 0;
-    s->iq_tail = 0;
-    s->iq = 0;
-    s->iq_size = 0;
-    s->qi_enabled = false;
-    s->iq_last_desc_type = VTD_INV_DESC_NONE;
-    s->iq_dw = false;
-    s->next_frcd_reg = 0;
     s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND |
              VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS |
              VTD_CAP_MGAW(s->aw_bits);
@@ -3974,27 +3954,6 @@ static void vtd_init(IntelIOMMUState *s)
     }
     s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO;
 
-    /*
-     * Rsvd field masks for spte
-     */
-    vtd_spte_rsvd[0] = ~0ULL;
-    vtd_spte_rsvd[1] = VTD_SPTE_PAGE_L1_RSVD_MASK(s->aw_bits,
-                                                  x86_iommu->dt_supported);
-    vtd_spte_rsvd[2] = VTD_SPTE_PAGE_L2_RSVD_MASK(s->aw_bits);
-    vtd_spte_rsvd[3] = VTD_SPTE_PAGE_L3_RSVD_MASK(s->aw_bits);
-    vtd_spte_rsvd[4] = VTD_SPTE_PAGE_L4_RSVD_MASK(s->aw_bits);
-
-    vtd_spte_rsvd_large[2] = VTD_SPTE_LPAGE_L2_RSVD_MASK(s->aw_bits,
-                                                         x86_iommu->dt_supported);
-    vtd_spte_rsvd_large[3] = VTD_SPTE_LPAGE_L3_RSVD_MASK(s->aw_bits,
-                                                         x86_iommu->dt_supported);
-
-    if (s->scalable_mode || s->snoop_control) {
-        vtd_spte_rsvd[1] &= ~VTD_SPTE_SNP;
-        vtd_spte_rsvd_large[2] &= ~VTD_SPTE_SNP;
-        vtd_spte_rsvd_large[3] &= ~VTD_SPTE_SNP;
-    }
-
     if (x86_iommu_ir_supported(x86_iommu)) {
         s->ecap |= VTD_ECAP_IR | VTD_ECAP_MHMV;
         if (s->intr_eim == ON_OFF_AUTO_ON) {
@@ -4027,6 +3986,56 @@ static void vtd_init(IntelIOMMUState *s)
     if (s->pasid) {
         s->ecap |= VTD_ECAP_PASID;
     }
+}
+
+/*
+ * Do the initialization. It will also be called when reset, so pay
+ * attention when adding new initialization stuff.
+ */
+static void vtd_init(IntelIOMMUState *s)
+{
+    X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
+
+    memset(s->csr, 0, DMAR_REG_SIZE);
+    memset(s->wmask, 0, DMAR_REG_SIZE);
+    memset(s->w1cmask, 0, DMAR_REG_SIZE);
+    memset(s->womask, 0, DMAR_REG_SIZE);
+
+    s->root = 0;
+    s->root_scalable = false;
+    s->dmar_enabled = false;
+    s->intr_enabled = false;
+    s->iq_head = 0;
+    s->iq_tail = 0;
+    s->iq = 0;
+    s->iq_size = 0;
+    s->qi_enabled = false;
+    s->iq_last_desc_type = VTD_INV_DESC_NONE;
+    s->iq_dw = false;
+    s->next_frcd_reg = 0;
+
+    vtd_cap_init(s);
+
+    /*
+     * Rsvd field masks for spte
+     */
+    vtd_spte_rsvd[0] = ~0ULL;
+    vtd_spte_rsvd[1] = VTD_SPTE_PAGE_L1_RSVD_MASK(s->aw_bits,
+                                                  x86_iommu->dt_supported);
+    vtd_spte_rsvd[2] = VTD_SPTE_PAGE_L2_RSVD_MASK(s->aw_bits);
+    vtd_spte_rsvd[3] = VTD_SPTE_PAGE_L3_RSVD_MASK(s->aw_bits);
+    vtd_spte_rsvd[4] = VTD_SPTE_PAGE_L4_RSVD_MASK(s->aw_bits);
+
+    vtd_spte_rsvd_large[2] = VTD_SPTE_LPAGE_L2_RSVD_MASK(s->aw_bits,
+                                                    x86_iommu->dt_supported);
+    vtd_spte_rsvd_large[3] = VTD_SPTE_LPAGE_L3_RSVD_MASK(s->aw_bits,
+                                                    x86_iommu->dt_supported);
+
+    if (s->scalable_mode || s->snoop_control) {
+        vtd_spte_rsvd[1] &= ~VTD_SPTE_SNP;
+        vtd_spte_rsvd_large[2] &= ~VTD_SPTE_SNP;
+        vtd_spte_rsvd_large[3] &= ~VTD_SPTE_SNP;
+    }
 
     vtd_reset_caches(s);
 
-- 
2.34.1




* [PATCH v7 16/17] intel_iommu: Implement [set|unset]_iommu_device() callbacks
  2024-06-05  8:30 [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (14 preceding siblings ...)
  2024-06-05  8:30 ` [PATCH v7 15/17] intel_iommu: Extract out vtd_cap_init() to initialize cap/ecap Zhenzhong Duan
@ 2024-06-05  8:30 ` Zhenzhong Duan
  2024-06-05  8:30 ` [PATCH v7 17/17] intel_iommu: Check compatibility with host IOMMU capabilities Zhenzhong Duan
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Zhenzhong Duan @ 2024-06-05  8:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
	yi.l.liu, chao.p.peng, Yi Sun, Zhenzhong Duan, Marcel Apfelbaum,
	Paolo Bonzini, Richard Henderson, Eduardo Habkost

From: Yi Liu <yi.l.liu@intel.com>

Implement the [set|unset]_iommu_device() callbacks in the Intel vIOMMU.
In the set callback, we take a reference on the HostIOMMUDevice and
store it in a hash table indexed by PCI BDF.

Note that this BDF index is the device's real BDF, not the aliased one,
so it differs from the index of VTDAddressSpace. Multiple assigned
devices can sit under the same virtual IOMMU group and share the same
VTDAddressSpace, but each has its own HostIOMMUDevice.
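
For illustration only, here is a minimal standalone sketch of the
bookkeeping described above. It is not the code in this patch: Bus, Hiod,
HiodTable and the hiod_*() helpers are hypothetical names, a fixed-size
array stands in for the GHashTable, and a plain counter stands in for
object_ref()/object_unref().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for PCIBus and HostIOMMUDevice. */
typedef struct { int num; } Bus;
typedef struct { int refcount; } Hiod;

#define HIOD_SLOTS 16

/* One slot per assigned device, keyed by its real (not aliased) bus/devfn. */
typedef struct {
    const Bus *bus;
    uint8_t devfn;
    Hiod *hiod;                 /* NULL means the slot is free */
} HiodSlot;

typedef struct {
    HiodSlot slots[HIOD_SLOTS];
} HiodTable;

static HiodSlot *hiod_lookup(HiodTable *t, const Bus *bus, uint8_t devfn)
{
    for (size_t i = 0; i < HIOD_SLOTS; i++) {
        HiodSlot *s = &t->slots[i];
        if (s->hiod && s->bus == bus && s->devfn == devfn) {
            return s;
        }
    }
    return NULL;
}

/* "set": refuse a second device at the same BDF, otherwise take a reference. */
static bool hiod_set(HiodTable *t, const Bus *bus, uint8_t devfn, Hiod *hiod)
{
    assert(hiod != NULL);

    if (hiod_lookup(t, bus, devfn)) {
        return false;
    }
    for (size_t i = 0; i < HIOD_SLOTS; i++) {
        HiodSlot *s = &t->slots[i];
        if (!s->hiod) {
            s->bus = bus;
            s->devfn = devfn;
            s->hiod = hiod;
            hiod->refcount++;
            return true;
        }
    }
    return false;               /* table full (limit of this sketch only) */
}

/* "unset": drop the reference; an unknown BDF is silently ignored. */
static void hiod_unset(HiodTable *t, const Bus *bus, uint8_t devfn)
{
    HiodSlot *s = hiod_lookup(t, bus, devfn);

    if (s) {
        s->hiod->refcount--;
        s->hiod = NULL;
    }
}

/* Two devices on one bus, each with its own entry. Returns 0 on success. */
static int hiod_demo(void)
{
    HiodTable table = { 0 };
    Bus bus = { 1 };
    Hiod a = { 1 }, b = { 1 };
    bool ok;

    ok = hiod_set(&table, &bus, 0x08, &a) &&
         hiod_set(&table, &bus, 0x10, &b) &&
         !hiod_set(&table, &bus, 0x08, &b);   /* duplicate BDF is rejected */
    hiod_unset(&table, &bus, 0x08);
    hiod_unset(&table, &bus, 0x08);           /* second unset is a no-op */

    return (ok && a.refcount == 1 && b.refcount == 2) ? 0 : -1;
}
```

In the sketch, two devices that would share one address space still get
separate entries because the key is their own bus/devfn pair.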

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 include/hw/i386/intel_iommu.h |  2 +
 hw/i386/intel_iommu.c         | 81 +++++++++++++++++++++++++++++++++++
 2 files changed, 83 insertions(+)

diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 7fa0a695c8..1eb05c29fc 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -292,6 +292,8 @@ struct IntelIOMMUState {
     /* list of registered notifiers */
     QLIST_HEAD(, VTDAddressSpace) vtd_as_with_notifiers;
 
+    GHashTable *vtd_host_iommu_dev;             /* HostIOMMUDevice */
+
     /* interrupt remapping */
     bool intr_enabled;              /* Whether guest enabled IR */
     dma_addr_t intr_root;           /* Interrupt remapping table pointer */
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 519063c8f8..07e897ad7a 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -61,6 +61,12 @@ struct vtd_as_key {
     uint32_t pasid;
 };
 
+/* bus/devfn is PCI device's real BDF not the aliased one */
+struct vtd_hiod_key {
+    PCIBus *bus;
+    uint8_t devfn;
+};
+
 struct vtd_iotlb_key {
     uint64_t gfn;
     uint32_t pasid;
@@ -250,6 +256,25 @@ static guint vtd_as_hash(gconstpointer v)
     return (guint)(value << 8 | key->devfn);
 }
 
+/* Same implementation as vtd_as_hash() */
+static guint vtd_hiod_hash(gconstpointer v)
+{
+    return vtd_as_hash(v);
+}
+
+static gboolean vtd_hiod_equal(gconstpointer v1, gconstpointer v2)
+{
+    const struct vtd_hiod_key *key1 = v1;
+    const struct vtd_hiod_key *key2 = v2;
+
+    return (key1->bus == key2->bus) && (key1->devfn == key2->devfn);
+}
+
+static void vtd_hiod_destroy(gpointer v)
+{
+    object_unref(v);
+}
+
 static gboolean vtd_hash_remove_by_domain(gpointer key, gpointer value,
                                           gpointer user_data)
 {
@@ -3812,6 +3837,58 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
     return vtd_dev_as;
 }
 
+static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
+                                     HostIOMMUDevice *hiod, Error **errp)
+{
+    IntelIOMMUState *s = opaque;
+    struct vtd_as_key key = {
+        .bus = bus,
+        .devfn = devfn,
+    };
+    struct vtd_as_key *new_key;
+
+    assert(hiod);
+
+    vtd_iommu_lock(s);
+
+    if (g_hash_table_lookup(s->vtd_host_iommu_dev, &key)) {
+        error_setg(errp, "Host IOMMU device already exists");
+        vtd_iommu_unlock(s);
+        return false;
+    }
+
+    new_key = g_malloc(sizeof(*new_key));
+    new_key->bus = bus;
+    new_key->devfn = devfn;
+
+    object_ref(hiod);
+    g_hash_table_insert(s->vtd_host_iommu_dev, new_key, hiod);
+
+    vtd_iommu_unlock(s);
+
+    return true;
+}
+
+static void vtd_dev_unset_iommu_device(PCIBus *bus, void *opaque, int devfn)
+{
+    IntelIOMMUState *s = opaque;
+    struct vtd_as_key key = {
+        .bus = bus,
+        .devfn = devfn,
+    };
+
+    vtd_iommu_lock(s);
+
+    if (!g_hash_table_lookup(s->vtd_host_iommu_dev, &key)) {
+        vtd_iommu_unlock(s);
+        return;
+    }
+
+    g_hash_table_remove(s->vtd_host_iommu_dev, &key);
+
+    vtd_iommu_unlock(s);
+}
+
 /* Unmap the whole range in the notifier's scope. */
 static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
 {
@@ -4116,6 +4193,8 @@ static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
 
 static PCIIOMMUOps vtd_iommu_ops = {
     .get_address_space = vtd_host_dma_iommu,
+    .set_iommu_device = vtd_dev_set_iommu_device,
+    .unset_iommu_device = vtd_dev_unset_iommu_device,
 };
 
 static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
@@ -4235,6 +4314,8 @@ static void vtd_realize(DeviceState *dev, Error **errp)
                                      g_free, g_free);
     s->vtd_address_spaces = g_hash_table_new_full(vtd_as_hash, vtd_as_equal,
                                       g_free, g_free);
+    s->vtd_host_iommu_dev = g_hash_table_new_full(vtd_hiod_hash, vtd_hiod_equal,
+                                                  g_free, vtd_hiod_destroy);
     vtd_init(s);
     pci_setup_iommu(bus, &vtd_iommu_ops, dev);
     /* Pseudo address space under root PCI bus. */
-- 
2.34.1




* [PATCH v7 17/17] intel_iommu: Check compatibility with host IOMMU capabilities
  2024-06-05  8:30 [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (15 preceding siblings ...)
  2024-06-05  8:30 ` [PATCH v7 16/17] intel_iommu: Implement [set|unset]_iommu_device() callbacks Zhenzhong Duan
@ 2024-06-05  8:30 ` Zhenzhong Duan
  2024-06-07 15:00 ` [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Eric Auger
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 25+ messages in thread
From: Zhenzhong Duan @ 2024-06-05  8:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
	yi.l.liu, chao.p.peng, Zhenzhong Duan, Paolo Bonzini,
	Richard Henderson, Eduardo Habkost, Marcel Apfelbaum

If the check fails, the host device (either a VFIO or a VDPA device) is
not compatible with the current vIOMMU configuration and should not be
passed to the guest.

Only aw_bits is checked for now; we don't care about other capabilities
until scalable modern mode is introduced.
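
As a rough standalone illustration of the rule above (not the code in
this patch), the check boils down to comparing the vIOMMU address width
against the one reported by the host. HostDev, host_dev_aw_bits() and
viommu_check_host() are hypothetical names, and a caller-supplied buffer
stands in for the Error **errp convention.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a host IOMMU device and its capability query. */
typedef struct HostDev HostDev;
struct HostDev {
    /* Returns the host address width in bits, or a negative value on error. */
    int (*get_aw_bits)(const HostDev *dev);
    int aw_bits;
};

static int host_dev_aw_bits(const HostDev *dev)
{
    return dev->aw_bits;
}

/*
 * Return true if a vIOMMU configured with viommu_aw_bits can sit in front
 * of this host device; on failure, describe the reason in err.
 */
static bool viommu_check_host(int viommu_aw_bits, const HostDev *dev,
                              char *err, size_t errlen)
{
    int host_aw_bits;

    assert(dev != NULL);

    if (!dev->get_aw_bits) {
        snprintf(err, errlen, "capability query not implemented");
        return false;
    }

    host_aw_bits = dev->get_aw_bits(dev);
    if (host_aw_bits < 0) {
        snprintf(err, errlen, "capability query failed");
        return false;
    }

    /* The guest must not be promised a wider address space than the host has. */
    if (viommu_aw_bits > host_aw_bits) {
        snprintf(err, errlen, "aw-bits %d > host aw-bits %d",
                 viommu_aw_bits, host_aw_bits);
        return false;
    }

    return true;
}

/* A 39-bit host accepts a 39-bit vIOMMU and rejects a 48-bit one. */
static int check_demo(void)
{
    HostDev host = { host_dev_aw_bits, 39 };
    char err[64] = "";
    bool ok39 = viommu_check_host(39, &host, err, sizeof(err));
    bool ok48 = viommu_check_host(48, &host, err, sizeof(err));

    if (!ok48) {
        printf("rejected: %s\n", err);
    }
    return (ok39 && !ok48) ? 0 : -1;
}
```

The sketch accepts an equal or narrower vIOMMU width and rejects only the
case where the guest would see more address bits than the host provides.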

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 hw/i386/intel_iommu.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 07e897ad7a..f592082444 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3837,6 +3837,30 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
     return vtd_dev_as;
 }
 
+static bool vtd_check_hiod(IntelIOMMUState *s, HostIOMMUDevice *hiod,
+                           Error **errp)
+{
+    HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_GET_CLASS(hiod);
+    int ret;
+
+    if (!hiodc->get_cap) {
+        error_setg(errp, ".get_cap() not implemented");
+        return false;
+    }
+
+    /* Common checks */
+    ret = hiodc->get_cap(hiod, HOST_IOMMU_DEVICE_CAP_AW_BITS, errp);
+    if (ret < 0) {
+        return false;
+    }
+    if (s->aw_bits > ret) {
+        error_setg(errp, "aw-bits %d > host aw-bits %d", s->aw_bits, ret);
+        return false;
+    }
+
+    return true;
+}
+
 static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
                                      HostIOMMUDevice *hiod, Error **errp)
 {
@@ -3857,6 +3881,11 @@ static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
         return false;
     }
 
+    if (!vtd_check_hiod(s, hiod, errp)) {
+        vtd_iommu_unlock(s);
+        return false;
+    }
+
     new_key = g_malloc(sizeof(*new_key));
     new_key->bus = bus;
     new_key->devfn = devfn;
-- 
2.34.1




* Re: [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU
  2024-06-05  8:30 [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (16 preceding siblings ...)
  2024-06-05  8:30 ` [PATCH v7 17/17] intel_iommu: Check compatibility with host IOMMU capabilities Zhenzhong Duan
@ 2024-06-07 15:00 ` Eric Auger
  2024-06-11  2:32   ` Duan, Zhenzhong
  2024-06-17 17:36 ` Cédric Le Goater
                   ` (2 subsequent siblings)
  20 siblings, 1 reply; 25+ messages in thread
From: Eric Auger @ 2024-06-07 15:00 UTC (permalink / raw)
  To: Zhenzhong Duan, qemu-devel
  Cc: alex.williamson, clg, mst, peterx, jasowang, jgg, nicolinc,
	joao.m.martins, clement.mathieu--drif, kevin.tian, yi.l.liu,
	chao.p.peng

Hi Zhenzhong,

On 6/5/24 10:30, Zhenzhong Duan wrote:
> Hi,
>
> This series introduce a HostIOMMUDevice abstraction and sub-classes.
> Also HostIOMMUDeviceCaps structure in HostIOMMUDevice and a new interface
> between vIOMMU and HostIOMMUDevice.
>
> A HostIOMMUDevice is an abstraction for an assigned device that is protected
> by a physical IOMMU (aka host IOMMU). The userspace interaction with this
> physical IOMMU can be done either through the VFIO IOMMU type 1 legacy
> backend or the new iommufd backend. The assigned device can be a VFIO device
> or a VDPA device. The HostIOMMUDevice is needed to interact with the host
> IOMMU that protects the assigned device. It is especially useful when the
> device is also protected by a virtual IOMMU as this latter use the translation
> services of the physical IOMMU and is constrained by it. In that context the
> HostIOMMUDevice can be passed to the virtual IOMMU to collect physical IOMMU
> capabilities such as the supported address width. In the future, the virtual
> IOMMU will use the HostIOMMUDevice to program the guest page tables in the
> first translation stage of the physical IOMMU.
>
> HostIOMMUDeviceClass::realize() is introduced to initialize
> HostIOMMUDeviceCaps and other fields of HostIOMMUDevice variants.
>
> HostIOMMUDeviceClass::get_cap() is introduced to query host IOMMU
> device capabilities.
>
> The class tree is as below:
>
>                               HostIOMMUDevice
>                                      | .caps
>                                      | .realize()
>                                      | .get_cap()
>                                      |
>             .-----------------------------------------------.
>             |                        |                      |
> HostIOMMUDeviceLegacyVFIO  {HostIOMMUDeviceLegacyVDPA}  HostIOMMUDeviceIOMMUFD
>             |                        |                      | [.iommufd]
>                                                             | [.devid]
>                                                             | [.ioas_id]
>                                                             | [.attach_hwpt()]
>                                                             | [.detach_hwpt()]
>                                                             |
>                                             .----------------------.
>                                             |                      |
>                          HostIOMMUDeviceIOMMUFDVFIO  {HostIOMMUDeviceIOMMUFDVDPA}
>                                           | [.vdev]                | {.vdev}
>
> * The attributes in [] will be implemented in nesting series.
> * The classes in {} will be implemented in future.
> * .vdev in different class points to different agent device,
> * i.e., VFIODevice or VDPADevice.
>
> PATCH1-4: Introduce HostIOMMUDevice and its sub classes
> PATCH5-10: Implement .realize() and .get_cap() handler
> PATCH11-14: Create HostIOMMUDevice instance and pass to vIOMMU
> PATCH15-17: Implement compatibility check between host IOMMU and vIOMMU(intel_iommu)
>
> Test done:
> make check
> vfio device hotplug/unplug with different backend on linux
> reboot, kexec
> build test on linux and windows11
>
> Qemu code can be found at:
> https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_preq_v7
>
> Besides the compatibility check in this series, in nesting series, this
> host IOMMU device is extended for much wider usage. For anyone interested
> on the nesting series, here is the link:
> https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_rfcv2
>
> Thanks
> Zhenzhong
>
> Changelog:
> v7:
> - drop config CONFIG_HOST_IOMMU_DEVICE (Cédric)
> - introduce HOST_IOMMU_DEVICE_CAP_AW_BITS_MAX (Eric)
> - use iova_ranges method in iommufd.realize() (Eric)
> - introduce HostIOMMUDevice::name to facilitate tracing (Eric)
> - implement a custom destroy hash function (Cédric)
> - drop VTDHostIOMMUDevice and save HostIOMMUDevice in hash table (Eric)
> - move patch5 after patch1 (Eric)
> - squash patch3 and 4, squash patch12 and 13 (Eric)
> - refine comments (Eric)
> - collect Eric's R-B

for the whole series:
Reviewed-by: Eric Auger <eric.auger@redhat.com>

I exercised part of it using virtio-iommu, with this series applied on top:
[RFC v2 0/7] VIRTIO-IOMMU/VFIO: Fix host iommu geometry handling for
hotplugged devices

Thanks

Eric
>
> v6:
> - open coded host_iommu_device_get_cap() to avoid #ifdef in intel_iommu.c (Cédric)
>
> v5:
> - pci_device_set_iommu_device return true (Cédric)
> - fix build failure on windows (thanks Cédric found that issue)
>
> v4:
> - move properties vdev, iommufd and devid to nesting series where need it (Cédric)
> - fix 32bit build with clz64 (Cédric)
> - change check_cap naming to get_cap (Cédric)
> - return bool if error is passed through errp (Cédric)
> - drop HostIOMMUDevice[LegacyVFIO|IOMMUFD|IOMMUFDVFIO] declaration (Cédric)
> - drop HOST_IOMMU_DEVICE_CAP_IOMMUFD (Cédric)
> - replace include directive with forward declaration (Cédric)
>
> v3:
> - refine declaration and doc for HostIOMMUDevice (Cédric, Philippe)
> - introduce HostIOMMUDeviceCaps, .realize() and .check_cap() (Cédric)
> - introduce helper range_get_last_bit() for range operation (Cédric)
> - separate pci_device_get_iommu_bus_devfn() in a prereq patch (Cédric)
> - replace HIOD_ abbreviation with HOST_IOMMU_DEVICE_ (Cédric)
> - add header in include/sysemu/iommufd.h (Cédric)
>
> v2:
> - use QOM to abstract host IOMMU device and its sub-classes (Cédric)
> - move host IOMMU device creation in attach_device() (Cédric)
> - refine pci_device_set/unset_iommu_device doc further (Eric)
> - define host IOMMU info format of different backend
> - implement get_host_iommu_info() for different backend (Cédric)
> - drop cap/ecap update logic (MST)
> - check aw-bits from get_host_iommu_info() in legacy mode
>
> v1:
> - use HostIOMMUDevice handle instead of union in VFIODevice (Eric)
> - change host_iommu_device_init to host_iommu_device_create
> - allocate HostIOMMUDevice in host_iommu_device_create callback
>   and set the VFIODevice base_hdev handle (Eric)
> - refine pci_device_set/unset_iommu_device doc (Eric)
> - use HostIOMMUDevice handle instead of union in VTDHostIOMMUDevice (Eric)
> - convert HostIOMMUDevice to sub object pointer in vtd_check_hdev
>
> rfcv2:
> - introduce common abstract HostIOMMUDevice and sub struct for different BEs (Eric, Cédric)
> - remove iommufd_device.[ch] (Cédric)
> - remove duplicate iommufd/devid define from VFIODevice (Eric)
> - drop the p in aliased_pbus and aliased_pdevfn (Eric)
> - assert devfn and iommu_bus in pci_device_get_iommu_bus_devfn (Cédric, Eric)
> - use errp in iommufd_device_get_info (Eric)
> - split and simplify cap/ecap check/sync code in intel_iommu.c (Cédric)
> - move VTDHostIOMMUDevice declaration to intel_iommu_internal.h (Cédric)
> - make '(vtd->cap_reg >> 16) & 0x3fULL' a MACRO and add missed '+1' (Cédric)
> - block migration if vIOMMU cap/ecap updated based on host IOMMU cap/ecap
> - add R-B
>
> Yi Liu (2):
>   hw/pci: Introduce pci_device_[set|unset]_iommu_device()
>   intel_iommu: Implement [set|unset]_iommu_device() callbacks
>
> Zhenzhong Duan (15):
>   backends: Introduce HostIOMMUDevice abstract
>   backends/host_iommu_device: Introduce HostIOMMUDeviceCaps
>   vfio/container: Introduce TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO device
>   backends/iommufd: Introduce TYPE_HOST_IOMMU_DEVICE_IOMMUFD[_VFIO]
>     devices
>   range: Introduce range_get_last_bit()
>   vfio/container: Implement HostIOMMUDeviceClass::realize() handler
>   backends/iommufd: Introduce helper function
>     iommufd_backend_get_device_info()
>   vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler
>   vfio/container: Implement HostIOMMUDeviceClass::get_cap() handler
>   backends/iommufd: Implement HostIOMMUDeviceClass::get_cap() handler
>   vfio: Create host IOMMU device instance
>   hw/pci: Introduce helper function pci_device_get_iommu_bus_devfn()
>   vfio/pci: Pass HostIOMMUDevice to vIOMMU
>   intel_iommu: Extract out vtd_cap_init() to initialize cap/ecap
>   intel_iommu: Check compatibility with host IOMMU capabilities
>
>  MAINTAINERS                           |   2 +
>  include/hw/i386/intel_iommu.h         |   2 +
>  include/hw/pci/pci.h                  |  38 ++++-
>  include/hw/vfio/vfio-common.h         |   8 +
>  include/hw/vfio/vfio-container-base.h |   3 +
>  include/qemu/range.h                  |  11 ++
>  include/sysemu/host_iommu_device.h    |  91 ++++++++++++
>  include/sysemu/iommufd.h              |  19 +++
>  backends/host_iommu_device.c          |  33 +++++
>  backends/iommufd.c                    |  76 ++++++++--
>  hw/i386/intel_iommu.c                 | 203 ++++++++++++++++++++------
>  hw/pci/pci.c                          |  75 +++++++++-
>  hw/vfio/common.c                      |  16 +-
>  hw/vfio/container.c                   |  41 +++++-
>  hw/vfio/helpers.c                     |  17 +++
>  hw/vfio/iommufd.c                     |  37 ++++-
>  hw/vfio/pci.c                         |  19 ++-
>  backends/meson.build                  |   1 +
>  18 files changed, 623 insertions(+), 69 deletions(-)
>  create mode 100644 include/sysemu/host_iommu_device.h
>  create mode 100644 backends/host_iommu_device.c
>




* RE: [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU
  2024-06-07 15:00 ` [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Eric Auger
@ 2024-06-11  2:32   ` Duan, Zhenzhong
  0 siblings, 0 replies; 25+ messages in thread
From: Duan, Zhenzhong @ 2024-06-11  2:32 UTC (permalink / raw)
  To: eric.auger@redhat.com, qemu-devel@nongnu.org
  Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
	peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
	nicolinc@nvidia.com, joao.m.martins@oracle.com,
	clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
	Peng, Chao P



>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [PATCH v7 00/17] Add a host IOMMU device abstraction to
>check with vIOMMU
>
>Hi Zhenzhong,
>
>On 6/5/24 10:30, Zhenzhong Duan wrote:
>> Hi,
>>
>> This series introduce a HostIOMMUDevice abstraction and sub-classes.
>> Also HostIOMMUDeviceCaps structure in HostIOMMUDevice and a new interface
>> between vIOMMU and HostIOMMUDevice.
>>
>> A HostIOMMUDevice is an abstraction for an assigned device that is protected
>> by a physical IOMMU (aka host IOMMU). The userspace interaction with this
>> physical IOMMU can be done either through the VFIO IOMMU type 1 legacy
>> backend or the new iommufd backend. The assigned device can be a VFIO device
>> or a VDPA device. The HostIOMMUDevice is needed to interact with the host
>> IOMMU that protects the assigned device. It is especially useful when the
>> device is also protected by a virtual IOMMU as this latter use the translation
>> services of the physical IOMMU and is constrained by it. In that context the
>> HostIOMMUDevice can be passed to the virtual IOMMU to collect physical IOMMU
>> capabilities such as the supported address width. In the future, the virtual
>> IOMMU will use the HostIOMMUDevice to program the guest page tables in the
>> first translation stage of the physical IOMMU.
>>
>> HostIOMMUDeviceClass::realize() is introduced to initialize
>> HostIOMMUDeviceCaps and other fields of HostIOMMUDevice variants.
>>
>> HostIOMMUDeviceClass::get_cap() is introduced to query host IOMMU
>> device capabilities.
>>
>> The class tree is as below:
>>
>>                               HostIOMMUDevice
>>                                      | .caps
>>                                      | .realize()
>>                                      | .get_cap()
>>                                      |
>>             .-----------------------------------------------.
>>             |                        |                      |
>> HostIOMMUDeviceLegacyVFIO  {HostIOMMUDeviceLegacyVDPA}  HostIOMMUDeviceIOMMUFD
>>             |                        |                      | [.iommufd]
>>                                                             | [.devid]
>>                                                             | [.ioas_id]
>>                                                             | [.attach_hwpt()]
>>                                                             | [.detach_hwpt()]
>>                                                             |
>>                                             .----------------------.
>>                                             |                      |
>>                          HostIOMMUDeviceIOMMUFDVFIO  {HostIOMMUDeviceIOMMUFDVDPA}
>>                                           | [.vdev]                | {.vdev}
>>
>> * The attributes in [] will be implemented in nesting series.
>> * The classes in {} will be implemented in future.
>> * .vdev in different class points to different agent device,
>> * i.e., VFIODevice or VDPADevice.
>>
>> PATCH1-4: Introduce HostIOMMUDevice and its sub classes
>> PATCH5-10: Implement .realize() and .get_cap() handler
>> PATCH11-14: Create HostIOMMUDevice instance and pass to vIOMMU
>> PATCH15-17: Implement compatibility check between host IOMMU and vIOMMU(intel_iommu)
>>
>> Test done:
>> make check
>> vfio device hotplug/unplug with different backend on linux
>> reboot, kexec
>> build test on linux and windows11
>>
>> Qemu code can be found at:
>> https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_preq_v7
>>
>> Besides the compatibility check in this series, in nesting series, this
>> host IOMMU device is extended for much wider usage. For anyone interested
>> on the nesting series, here is the link:
>> https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_rfcv2
>>
>> Thanks
>> Zhenzhong
>>
>> Changelog:
>> v7:
>> - drop config CONFIG_HOST_IOMMU_DEVICE (Cédric)
>> - introduce HOST_IOMMU_DEVICE_CAP_AW_BITS_MAX (Eric)
>> - use iova_ranges method in iommufd.realize() (Eric)
>> - introduce HostIOMMUDevice::name to facilitate tracing (Eric)
>> - implement a custom destroy hash function (Cédric)
>> - drop VTDHostIOMMUDevice and save HostIOMMUDevice in hash table (Eric)
>> - move patch5 after patch1 (Eric)
>> - squash patch3 and 4, squash patch12 and 13 (Eric)
>> - refine comments (Eric)
>> - collect Eric's R-B
>
>for the whole series:
>Reviewed-by: Eric Auger <eric.auger@redhat.com>

Thanks Eric.

>
>I exercised part of it using the virtio-iommu and this series on top
>[RFC v2 0/7] VIRTIO-IOMMU/VFIO: Fix host iommu geometry handling for
>hotplugged devices

You are super-efficient😊

BRs.
Zhenzhong


* Re: [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU
  2024-06-05  8:30 [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (17 preceding siblings ...)
  2024-06-07 15:00 ` [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Eric Auger
@ 2024-06-17 17:36 ` Cédric Le Goater
  2024-06-24 10:26 ` Michael S. Tsirkin
  2024-06-24 21:16 ` Cédric Le Goater
  20 siblings, 0 replies; 25+ messages in thread
From: Cédric Le Goater @ 2024-06-17 17:36 UTC (permalink / raw)
  To: Zhenzhong Duan, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg, nicolinc,
	joao.m.martins, clement.mathieu--drif, kevin.tian, yi.l.liu,
	chao.p.peng

Hello Michael,

On 6/5/24 10:30 AM, Zhenzhong Duan wrote:
> Hi,
> 
> This series introduce a HostIOMMUDevice abstraction and sub-classes.
> Also HostIOMMUDeviceCaps structure in HostIOMMUDevice and a new interface
> between vIOMMU and HostIOMMUDevice.
> 
> A HostIOMMUDevice is an abstraction for an assigned device that is protected
> by a physical IOMMU (aka host IOMMU). The userspace interaction with this
> physical IOMMU can be done either through the VFIO IOMMU type 1 legacy
> backend or the new iommufd backend. The assigned device can be a VFIO device
> or a VDPA device. The HostIOMMUDevice is needed to interact with the host
> IOMMU that protects the assigned device. It is especially useful when the
> device is also protected by a virtual IOMMU as this latter use the translation
> services of the physical IOMMU and is constrained by it. In that context the
> HostIOMMUDevice can be passed to the virtual IOMMU to collect physical IOMMU
> capabilities such as the supported address width. In the future, the virtual
> IOMMU will use the HostIOMMUDevice to program the guest page tables in the
> first translation stage of the physical IOMMU.


This series has been the subject of reviews and tests on various
architectures and platforms. It prepares the ground for more IOMMU
changes related to the new IOMMUFD backend.

I have queued it in the VFIO 9.1 tree for now, awaiting approval from
the PCI maintainers. Could you please take a look at the PCI part, which
introduces new IOMMU callbacks?

Thanks,

C



> HostIOMMUDeviceClass::realize() is introduced to initialize
> HostIOMMUDeviceCaps and other fields of HostIOMMUDevice variants.
> 
> HostIOMMUDeviceClass::get_cap() is introduced to query host IOMMU
> device capabilities.
> 
> The class tree is as below:
> 
>                                HostIOMMUDevice
>                                       | .caps
>                                       | .realize()
>                                       | .get_cap()
>                                       |
>              .-----------------------------------------------.
>              |                        |                      |
> HostIOMMUDeviceLegacyVFIO  {HostIOMMUDeviceLegacyVDPA}  HostIOMMUDeviceIOMMUFD
>              |                        |                      | [.iommufd]
>                                                              | [.devid]
>                                                              | [.ioas_id]
>                                                              | [.attach_hwpt()]
>                                                              | [.detach_hwpt()]
>                                                              |
>                                              .----------------------.
>                                              |                      |
>                           HostIOMMUDeviceIOMMUFDVFIO  {HostIOMMUDeviceIOMMUFDVDPA}
>                                            | [.vdev]                | {.vdev}
> 
> * The attributes in [] will be implemented in nesting series.
> * The classes in {} will be implemented in future.
> * .vdev in different class points to different agent device,
> * i.e., VFIODevice or VDPADevice.
> 
> PATCH1-4: Introduce HostIOMMUDevice and its sub classes
> PATCH5-10: Implement .realize() and .get_cap() handler
> PATCH11-14: Create HostIOMMUDevice instance and pass to vIOMMU
> PATCH15-17: Implement compatibility check between host IOMMU and vIOMMU(intel_iommu)
> 
> Test done:
> make check
> vfio device hotplug/unplug with different backend on linux
> reboot, kexec
> build test on linux and windows11
> 
> Qemu code can be found at:
> https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_preq_v7
> 
> Besides the compatibility check in this series, in nesting series, this
> host IOMMU device is extended for much wider usage. For anyone interested
> on the nesting series, here is the link:
> https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_rfcv2
> 
> Thanks
> Zhenzhong
> 
> Changelog:
> v7:
> - drop config CONFIG_HOST_IOMMU_DEVICE (Cédric)
> - introduce HOST_IOMMU_DEVICE_CAP_AW_BITS_MAX (Eric)
> - use iova_ranges method in iommufd.realize() (Eric)
> - introduce HostIOMMUDevice::name to facilitate tracing (Eric)
> - implement a custom destroy hash function (Cédric)
> - drop VTDHostIOMMUDevice and save HostIOMMUDevice in hash table (Eric)
> - move patch5 after patch1 (Eric)
> - squash patch3 and 4, squash patch12 and 13 (Eric)
> - refine comments (Eric)
> - collect Eric's R-B
> 
> v6:
> - open coded host_iommu_device_get_cap() to avoid #ifdef in intel_iommu.c (Cédric)
> 
> v5:
> - pci_device_set_iommu_device return true (Cédric)
> - fix build failure on windows (thanks Cédric found that issue)
> 
> v4:
> - move properties vdev, iommufd and devid to nesting series where need it (Cédric)
> - fix 32bit build with clz64 (Cédric)
> - change check_cap naming to get_cap (Cédric)
> - return bool if error is passed through errp (Cédric)
> - drop HostIOMMUDevice[LegacyVFIO|IOMMUFD|IOMMUFDVFIO] declaration (Cédric)
> - drop HOST_IOMMU_DEVICE_CAP_IOMMUFD (Cédric)
> - replace include directive with forward declaration (Cédric)
> 
> v3:
> - refine declaration and doc for HostIOMMUDevice (Cédric, Philippe)
> - introduce HostIOMMUDeviceCaps, .realize() and .check_cap() (Cédric)
> - introduce helper range_get_last_bit() for range operation (Cédric)
> - separate pci_device_get_iommu_bus_devfn() in a prereq patch (Cédric)
> - replace HIOD_ abbreviation with HOST_IOMMU_DEVICE_ (Cédric)
> - add header in include/sysemu/iommufd.h (Cédric)
> 
> v2:
> - use QOM to abstract host IOMMU device and its sub-classes (Cédric)
> - move host IOMMU device creation in attach_device() (Cédric)
> - refine pci_device_set/unset_iommu_device doc further (Eric)
> - define host IOMMU info format of different backend
> - implement get_host_iommu_info() for different backend (Cédric)
> - drop cap/ecap update logic (MST)
> - check aw-bits from get_host_iommu_info() in legacy mode
> 
> v1:
> - use HostIOMMUDevice handle instead of union in VFIODevice (Eric)
> - change host_iommu_device_init to host_iommu_device_create
> - allocate HostIOMMUDevice in host_iommu_device_create callback
>    and set the VFIODevice base_hdev handle (Eric)
> - refine pci_device_set/unset_iommu_device doc (Eric)
> - use HostIOMMUDevice handle instead of union in VTDHostIOMMUDevice (Eric)
> - convert HostIOMMUDevice to sub object pointer in vtd_check_hdev
> 
> rfcv2:
> - introduce common abstract HostIOMMUDevice and sub struct for different BEs (Eric, Cédric)
> - remove iommufd_device.[ch] (Cédric)
> - remove duplicate iommufd/devid define from VFIODevice (Eric)
> - drop the p in aliased_pbus and aliased_pdevfn (Eric)
> - assert devfn and iommu_bus in pci_device_get_iommu_bus_devfn (Cédric, Eric)
> - use errp in iommufd_device_get_info (Eric)
> - split and simplify cap/ecap check/sync code in intel_iommu.c (Cédric)
> - move VTDHostIOMMUDevice declaration to intel_iommu_internal.h (Cédric)
> - make '(vtd->cap_reg >> 16) & 0x3fULL' a MACRO and add missed '+1' (Cédric)
> - block migration if vIOMMU cap/ecap updated based on host IOMMU cap/ecap
> - add R-B
> 
> Yi Liu (2):
>    hw/pci: Introduce pci_device_[set|unset]_iommu_device()
>    intel_iommu: Implement [set|unset]_iommu_device() callbacks
> 
> Zhenzhong Duan (15):
>    backends: Introduce HostIOMMUDevice abstract
>    backends/host_iommu_device: Introduce HostIOMMUDeviceCaps
>    vfio/container: Introduce TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO device
>    backends/iommufd: Introduce TYPE_HOST_IOMMU_DEVICE_IOMMUFD[_VFIO]
>      devices
>    range: Introduce range_get_last_bit()
>    vfio/container: Implement HostIOMMUDeviceClass::realize() handler
>    backends/iommufd: Introduce helper function
>      iommufd_backend_get_device_info()
>    vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler
>    vfio/container: Implement HostIOMMUDeviceClass::get_cap() handler
>    backends/iommufd: Implement HostIOMMUDeviceClass::get_cap() handler
>    vfio: Create host IOMMU device instance
>    hw/pci: Introduce helper function pci_device_get_iommu_bus_devfn()
>    vfio/pci: Pass HostIOMMUDevice to vIOMMU
>    intel_iommu: Extract out vtd_cap_init() to initialize cap/ecap
>    intel_iommu: Check compatibility with host IOMMU capabilities
> 
>   MAINTAINERS                           |   2 +
>   include/hw/i386/intel_iommu.h         |   2 +
>   include/hw/pci/pci.h                  |  38 ++++-
>   include/hw/vfio/vfio-common.h         |   8 +
>   include/hw/vfio/vfio-container-base.h |   3 +
>   include/qemu/range.h                  |  11 ++
>   include/sysemu/host_iommu_device.h    |  91 ++++++++++++
>   include/sysemu/iommufd.h              |  19 +++
>   backends/host_iommu_device.c          |  33 +++++
>   backends/iommufd.c                    |  76 ++++++++--
>   hw/i386/intel_iommu.c                 | 203 ++++++++++++++++++++------
>   hw/pci/pci.c                          |  75 +++++++++-
>   hw/vfio/common.c                      |  16 +-
>   hw/vfio/container.c                   |  41 +++++-
>   hw/vfio/helpers.c                     |  17 +++
>   hw/vfio/iommufd.c                     |  37 ++++-
>   hw/vfio/pci.c                         |  19 ++-
>   backends/meson.build                  |   1 +
>   18 files changed, 623 insertions(+), 69 deletions(-)
>   create mode 100644 include/sysemu/host_iommu_device.h
>   create mode 100644 backends/host_iommu_device.c
> 



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU
  2024-06-05  8:30 [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (18 preceding siblings ...)
  2024-06-17 17:36 ` Cédric Le Goater
@ 2024-06-24 10:26 ` Michael S. Tsirkin
  2024-06-24 15:12   ` Cédric Le Goater
  2024-06-24 21:16 ` Cédric Le Goater
  20 siblings, 1 reply; 25+ messages in thread
From: Michael S. Tsirkin @ 2024-06-24 10:26 UTC (permalink / raw)
  To: Zhenzhong Duan
  Cc: qemu-devel, alex.williamson, clg, eric.auger, peterx, jasowang,
	jgg, nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
	yi.l.liu, chao.p.peng

On Wed, Jun 05, 2024 at 04:30:26PM +0800, Zhenzhong Duan wrote:
> Hi,
> 
> This series introduces a HostIOMMUDevice abstraction and its sub-classes,
> along with a HostIOMMUDeviceCaps structure in HostIOMMUDevice and a new
> interface between vIOMMU and HostIOMMUDevice.

Reviewed-by: Michael S. Tsirkin <mst@redhat.com>

Who is merging this? Me? Or Alex?






^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU
  2024-06-24 10:26 ` Michael S. Tsirkin
@ 2024-06-24 15:12   ` Cédric Le Goater
  2024-06-24 15:24     ` Michael S. Tsirkin
  0 siblings, 1 reply; 25+ messages in thread
From: Cédric Le Goater @ 2024-06-24 15:12 UTC (permalink / raw)
  To: Michael S. Tsirkin, Zhenzhong Duan
  Cc: qemu-devel, alex.williamson, eric.auger, peterx, jasowang, jgg,
	nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
	yi.l.liu, chao.p.peng

On 6/24/24 12:26 PM, Michael S. Tsirkin wrote:
> On Wed, Jun 05, 2024 at 04:30:26PM +0800, Zhenzhong Duan wrote:
>> Hi,
>>
>> This series introduces a HostIOMMUDevice abstraction and its sub-classes,
>> along with a HostIOMMUDeviceCaps structure in HostIOMMUDevice and a new
>> interface between vIOMMU and HostIOMMUDevice.
> 
> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> 
> Who is merging this? Me? Or Alex?

I will, and I will also include this series:

   [v4] VIRTIO-IOMMU/VFIO: Fix host iommu geometry
   https://lore.kernel.org/all/20240614095402.904691-1-eric.auger@redhat.com

Thanks,

C.





^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU
  2024-06-24 15:12   ` Cédric Le Goater
@ 2024-06-24 15:24     ` Michael S. Tsirkin
  0 siblings, 0 replies; 25+ messages in thread
From: Michael S. Tsirkin @ 2024-06-24 15:24 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: Zhenzhong Duan, qemu-devel, alex.williamson, eric.auger, peterx,
	jasowang, jgg, nicolinc, joao.m.martins, clement.mathieu--drif,
	kevin.tian, yi.l.liu, chao.p.peng

On Mon, Jun 24, 2024 at 05:12:13PM +0200, Cédric Le Goater wrote:
> On 6/24/24 12:26 PM, Michael S. Tsirkin wrote:
> > On Wed, Jun 05, 2024 at 04:30:26PM +0800, Zhenzhong Duan wrote:
> > > Hi,
> > > 
> > > This series introduces a HostIOMMUDevice abstraction and its sub-classes,
> > > along with a HostIOMMUDeviceCaps structure in HostIOMMUDevice and a new
> > > interface between vIOMMU and HostIOMMUDevice.
> > 
> > Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> > 
> > Who is merging this? Me? Or Alex?
> 
> I will, and I will also include this series:
> 
>   [v4] VIRTIO-IOMMU/VFIO: Fix host iommu geometry
>   https://lore.kernel.org/all/20240614095402.904691-1-eric.auger@redhat.com
> 
> Thanks,
> 
> C.


Sounds good. I sent ack for merging, both reviewed:

Reviewed-by: Michael S. Tsirkin <mst@redhat.com>

> 
> > 
> > 
> > 
> > > A HostIOMMUDevice is an abstraction for an assigned device that is protected
> > > by a physical IOMMU (aka host IOMMU). The userspace interaction with this
> > > physical IOMMU can be done either through the VFIO IOMMU type 1 legacy
> > > backend or the new iommufd backend. The assigned device can be a VFIO device
> > > or a VDPA device. The HostIOMMUDevice is needed to interact with the host
> > > IOMMU that protects the assigned device. It is especially useful when the
> > > device is also protected by a virtual IOMMU, as the latter uses the translation
> > > services of the physical IOMMU and is constrained by it. In that context the
> > > HostIOMMUDevice can be passed to the virtual IOMMU to collect physical IOMMU
> > > capabilities such as the supported address width. In the future, the virtual
> > > IOMMU will use the HostIOMMUDevice to program the guest page tables in the
> > > first translation stage of the physical IOMMU.
> > > 
> > > HostIOMMUDeviceClass::realize() is introduced to initialize
> > > HostIOMMUDeviceCaps and other fields of HostIOMMUDevice variants.
> > > 
> > > HostIOMMUDeviceClass::get_cap() is introduced to query host IOMMU
> > > device capabilities.
> > > 
> > > The class tree is as below:
> > > 
> > >                                HostIOMMUDevice
> > >                                       | .caps
> > >                                       | .realize()
> > >                                       | .get_cap()
> > >                                       |
> > >              .-----------------------------------------------.
> > >              |                        |                      |
> > > HostIOMMUDeviceLegacyVFIO  {HostIOMMUDeviceLegacyVDPA}  HostIOMMUDeviceIOMMUFD
> > >              |                        |                      | [.iommufd]
> > >                                                              | [.devid]
> > >                                                              | [.ioas_id]
> > >                                                              | [.attach_hwpt()]
> > >                                                              | [.detach_hwpt()]
> > >                                                              |
> > >                                              .----------------------.
> > >                                              |                      |
> > >                           HostIOMMUDeviceIOMMUFDVFIO  {HostIOMMUDeviceIOMMUFDVDPA}
> > >                                            | [.vdev]                | {.vdev}
> > > 
> > > * The attributes in [] will be implemented in nesting series.
> > > * The classes in {} will be implemented in future.
> > > * .vdev in different class points to different agent device,
> > > * i.e., VFIODevice or VDPADevice.
> > > 
> > > PATCH1-4: Introduce HostIOMMUDevice and its sub classes
> > > PATCH5-10: Implement .realize() and .get_cap() handler
> > > PATCH11-14: Create HostIOMMUDevice instance and pass to vIOMMU
> > > PATCH15-17: Implement compatibility check between host IOMMU and vIOMMU(intel_iommu)
> > > 
> > > Test done:
> > > make check
> > > vfio device hotplug/unplug with different backend on linux
> > > reboot, kexec
> > > build test on linux and windows11
> > > 
> > > Qemu code can be found at:
> > > https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_preq_v7
> > > 
> > > Besides the compatibility check in this series, the nesting series extends
> > > this host IOMMU device for much wider usage. For anyone interested in the
> > > nesting series, here is the link:
> > > https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_rfcv2
> > > 
> > > Thanks
> > > Zhenzhong
> > > 
> > > Changelog:
> > > v7:
> > > - drop config CONFIG_HOST_IOMMU_DEVICE (Cédric)
> > > - introduce HOST_IOMMU_DEVICE_CAP_AW_BITS_MAX (Eric)
> > > - use iova_ranges method in iommufd.realize() (Eric)
> > > - introduce HostIOMMUDevice::name to facilitate tracing (Eric)
> > > - implement a custom destroy hash function (Cédric)
> > > - drop VTDHostIOMMUDevice and save HostIOMMUDevice in hash table (Eric)
> > > - move patch5 after patch1 (Eric)
> > > - squash patch3 and 4, squash patch12 and 13 (Eric)
> > > - refine comments (Eric)
> > > - collect Eric's R-B
> > > 
> > > v6:
> > > - open coded host_iommu_device_get_cap() to avoid #ifdef in intel_iommu.c (Cédric)
> > > 
> > > v5:
> > > - pci_device_set_iommu_device return true (Cédric)
> > > - fix build failure on Windows (thanks to Cédric for finding that issue)
> > > 
> > > v4:
> > > - move properties vdev, iommufd and devid to nesting series where they are needed (Cédric)
> > > - fix 32bit build with clz64 (Cédric)
> > > - change check_cap naming to get_cap (Cédric)
> > > - return bool if error is passed through errp (Cédric)
> > > - drop HostIOMMUDevice[LegacyVFIO|IOMMUFD|IOMMUFDVFIO] declaration (Cédric)
> > > - drop HOST_IOMMU_DEVICE_CAP_IOMMUFD (Cédric)
> > > - replace include directive with forward declaration (Cédric)
> > > 
> > > v3:
> > > - refine declaration and doc for HostIOMMUDevice (Cédric, Philippe)
> > > - introduce HostIOMMUDeviceCaps, .realize() and .check_cap() (Cédric)
> > > - introduce helper range_get_last_bit() for range operation (Cédric)
> > > - separate pci_device_get_iommu_bus_devfn() in a prereq patch (Cédric)
> > > - replace HIOD_ abbreviation with HOST_IOMMU_DEVICE_ (Cédric)
> > > - add header in include/sysemu/iommufd.h (Cédric)
> > > 
> > > v2:
> > > - use QOM to abstract host IOMMU device and its sub-classes (Cédric)
> > > - move host IOMMU device creation in attach_device() (Cédric)
> > > - refine pci_device_set/unset_iommu_device doc further (Eric)
> > > - define host IOMMU info format of different backend
> > > - implement get_host_iommu_info() for different backend (Cédric)
> > > - drop cap/ecap update logic (MST)
> > > - check aw-bits from get_host_iommu_info() in legacy mode
> > > 
> > > v1:
> > > - use HostIOMMUDevice handle instead of union in VFIODevice (Eric)
> > > - change host_iommu_device_init to host_iommu_device_create
> > > - allocate HostIOMMUDevice in host_iommu_device_create callback
> > >    and set the VFIODevice base_hdev handle (Eric)
> > > - refine pci_device_set/unset_iommu_device doc (Eric)
> > > - use HostIOMMUDevice handle instead of union in VTDHostIOMMUDevice (Eric)
> > > - convert HostIOMMUDevice to sub object pointer in vtd_check_hdev
> > > 
> > > rfcv2:
> > > - introduce common abstract HostIOMMUDevice and sub struct for different BEs (Eric, Cédric)
> > > - remove iommufd_device.[ch] (Cédric)
> > > - remove duplicate iommufd/devid define from VFIODevice (Eric)
> > > - drop the p in aliased_pbus and aliased_pdevfn (Eric)
> > > - assert devfn and iommu_bus in pci_device_get_iommu_bus_devfn (Cédric, Eric)
> > > - use errp in iommufd_device_get_info (Eric)
> > > - split and simplify cap/ecap check/sync code in intel_iommu.c (Cédric)
> > > - move VTDHostIOMMUDevice declaration to intel_iommu_internal.h (Cédric)
> > > - make '(vtd->cap_reg >> 16) & 0x3fULL' a MACRO and add missed '+1' (Cédric)
> > > - block migration if vIOMMU cap/ecap updated based on host IOMMU cap/ecap
> > > - add R-B
> > > 
> > > Yi Liu (2):
> > >    hw/pci: Introduce pci_device_[set|unset]_iommu_device()
> > >    intel_iommu: Implement [set|unset]_iommu_device() callbacks
> > > 
> > > Zhenzhong Duan (15):
> > >    backends: Introduce HostIOMMUDevice abstract
> > >    backends/host_iommu_device: Introduce HostIOMMUDeviceCaps
> > >    vfio/container: Introduce TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO device
> > >    backends/iommufd: Introduce TYPE_HOST_IOMMU_DEVICE_IOMMUFD[_VFIO]
> > >      devices
> > >    range: Introduce range_get_last_bit()
> > >    vfio/container: Implement HostIOMMUDeviceClass::realize() handler
> > >    backends/iommufd: Introduce helper function
> > >      iommufd_backend_get_device_info()
> > >    vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler
> > >    vfio/container: Implement HostIOMMUDeviceClass::get_cap() handler
> > >    backends/iommufd: Implement HostIOMMUDeviceClass::get_cap() handler
> > >    vfio: Create host IOMMU device instance
> > >    hw/pci: Introduce helper function pci_device_get_iommu_bus_devfn()
> > >    vfio/pci: Pass HostIOMMUDevice to vIOMMU
> > >    intel_iommu: Extract out vtd_cap_init() to initialize cap/ecap
> > >    intel_iommu: Check compatibility with host IOMMU capabilities
> > > 
> > >   MAINTAINERS                           |   2 +
> > >   include/hw/i386/intel_iommu.h         |   2 +
> > >   include/hw/pci/pci.h                  |  38 ++++-
> > >   include/hw/vfio/vfio-common.h         |   8 +
> > >   include/hw/vfio/vfio-container-base.h |   3 +
> > >   include/qemu/range.h                  |  11 ++
> > >   include/sysemu/host_iommu_device.h    |  91 ++++++++++++
> > >   include/sysemu/iommufd.h              |  19 +++
> > >   backends/host_iommu_device.c          |  33 +++++
> > >   backends/iommufd.c                    |  76 ++++++++--
> > >   hw/i386/intel_iommu.c                 | 203 ++++++++++++++++++++------
> > >   hw/pci/pci.c                          |  75 +++++++++-
> > >   hw/vfio/common.c                      |  16 +-
> > >   hw/vfio/container.c                   |  41 +++++-
> > >   hw/vfio/helpers.c                     |  17 +++
> > >   hw/vfio/iommufd.c                     |  37 ++++-
> > >   hw/vfio/pci.c                         |  19 ++-
> > >   backends/meson.build                  |   1 +
> > >   18 files changed, 623 insertions(+), 69 deletions(-)
> > >   create mode 100644 include/sysemu/host_iommu_device.h
> > >   create mode 100644 backends/host_iommu_device.c
> > > 
> > > -- 
> > > 2.34.1
> > 




* Re: [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU
  2024-06-05  8:30 [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
                   ` (19 preceding siblings ...)
  2024-06-24 10:26 ` Michael S. Tsirkin
@ 2024-06-24 21:16 ` Cédric Le Goater
  20 siblings, 0 replies; 25+ messages in thread
From: Cédric Le Goater @ 2024-06-24 21:16 UTC (permalink / raw)
  To: Zhenzhong Duan, qemu-devel
  Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg, nicolinc,
	joao.m.martins, clement.mathieu--drif, kevin.tian, yi.l.liu,
	chao.p.peng

On 6/5/24 10:30 AM, Zhenzhong Duan wrote:
> Hi,
> 
> This series introduces a HostIOMMUDevice abstraction and its sub-classes,
> a HostIOMMUDeviceCaps structure in HostIOMMUDevice, and a new interface
> between vIOMMU and HostIOMMUDevice.
> 
> A HostIOMMUDevice is an abstraction for an assigned device that is protected
> by a physical IOMMU (aka host IOMMU). The userspace interaction with this
> physical IOMMU can be done either through the VFIO IOMMU type 1 legacy
> backend or the new iommufd backend. The assigned device can be a VFIO device
> or a VDPA device. The HostIOMMUDevice is needed to interact with the host
> IOMMU that protects the assigned device. It is especially useful when the
> device is also protected by a virtual IOMMU as the latter uses the translation
> services of the physical IOMMU and is constrained by it. In that context the
> HostIOMMUDevice can be passed to the virtual IOMMU to collect physical IOMMU
> capabilities such as the supported address width. In the future, the virtual
> IOMMU will use the HostIOMMUDevice to program the guest page tables in the
> first translation stage of the physical IOMMU.
> 
> HostIOMMUDeviceClass::realize() is introduced to initialize
> HostIOMMUDeviceCaps and other fields of HostIOMMUDevice variants.
> 
> HostIOMMUDeviceClass::get_cap() is introduced to query host IOMMU
> device capabilities.
> 
> The class tree is as below:
> 
>                                HostIOMMUDevice
>                                       | .caps
>                                       | .realize()
>                                       | .get_cap()
>                                       |
>              .-----------------------------------------------.
>              |                        |                      |
> HostIOMMUDeviceLegacyVFIO  {HostIOMMUDeviceLegacyVDPA}  HostIOMMUDeviceIOMMUFD
>              |                        |                      | [.iommufd]
>                                                              | [.devid]
>                                                              | [.ioas_id]
>                                                              | [.attach_hwpt()]
>                                                              | [.detach_hwpt()]
>                                                              |
>                                              .----------------------.
>                                              |                      |
>                           HostIOMMUDeviceIOMMUFDVFIO  {HostIOMMUDeviceIOMMUFDVDPA}
>                                            | [.vdev]                | {.vdev}
> 
> * The attributes in [] will be implemented in the nesting series.
> * The classes in {} will be implemented in the future.
> * .vdev in different classes points to a different agent device,
>   i.e., VFIODevice or VDPADevice.
> 
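
As a rough illustration of the realize()/get_cap() split described above, here is a minimal standalone C sketch. It deliberately avoids QOM: every Sketch* name, the capability selectors and the backend callbacks are hypothetical and are not the definitions added by this series. It also assumes that the vIOMMU rejects a device when its own address width exceeds the one reported by the host IOMMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability selectors; not QEMU's actual values. */
enum { SKETCH_CAP_IOMMU_TYPE = 0, SKETCH_CAP_AW_BITS = 1 };

typedef struct SketchHostIOMMUDevice SketchHostIOMMUDevice;

/* Capabilities cached by realize() and read back through get_cap(). */
typedef struct SketchCaps {
    uint32_t type;
    uint8_t aw_bits;
} SketchCaps;

/* Per-backend handlers, standing in for the class methods. */
typedef struct SketchClass {
    bool (*realize)(SketchHostIOMMUDevice *dev, void *opaque);
    int (*get_cap)(SketchHostIOMMUDevice *dev, int cap);
} SketchClass;

struct SketchHostIOMMUDevice {
    const SketchClass *klass;
    SketchCaps caps;
};

/* Assumed backend: 'opaque' points at the host address width in bits. */
static bool sketch_legacy_realize(SketchHostIOMMUDevice *dev, void *opaque)
{
    const uint8_t *host_aw_bits = opaque;

    if (host_aw_bits == NULL || *host_aw_bits == 0 || *host_aw_bits > 64) {
        return false;
    }
    dev->caps.type = 0;
    dev->caps.aw_bits = *host_aw_bits;
    return true;
}

/* Returns the capability value, or -1 for an unknown selector. */
static int sketch_legacy_get_cap(SketchHostIOMMUDevice *dev, int cap)
{
    switch (cap) {
    case SKETCH_CAP_IOMMU_TYPE:
        return (int)dev->caps.type;
    case SKETCH_CAP_AW_BITS:
        return dev->caps.aw_bits;
    default:
        return -1;
    }
}

static const SketchClass sketch_legacy_class = {
    .realize = sketch_legacy_realize,
    .get_cap = sketch_legacy_get_cap,
};

/* vIOMMU side: accept the device only if the guest width fits the host. */
static bool sketch_viommu_check(SketchHostIOMMUDevice *dev,
                                uint8_t guest_aw_bits)
{
    int host_aw_bits;

    assert(dev != NULL && dev->klass != NULL && dev->klass->get_cap != NULL);
    host_aw_bits = dev->klass->get_cap(dev, SKETCH_CAP_AW_BITS);
    return host_aw_bits >= 0 && guest_aw_bits <= host_aw_bits;
}
```

A caller would realize the device once at attach time and then hand it to the vIOMMU, which only ever goes through get_cap(), so the vIOMMU code needs no knowledge of which backend filled in the capabilities.
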
> PATCH1-4: Introduce HostIOMMUDevice and its sub-classes
> PATCH5-10: Implement .realize() and .get_cap() handlers
> PATCH11-14: Create HostIOMMUDevice instance and pass it to vIOMMU
> PATCH15-17: Implement compatibility check between host IOMMU and vIOMMU (intel_iommu)
> 
> Tests done:
> - make check
> - VFIO device hotplug/unplug with different backends on Linux
> - reboot, kexec
> - build test on Linux and Windows 11
> 
> QEMU code can be found at:
> https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_preq_v7
> 
> Besides the compatibility check in this series, the nesting series extends
> this host IOMMU device for much wider usage. For anyone interested in the
> nesting series, here is the link:
> https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_rfcv2
> 
> Thanks
> Zhenzhong
> 
> Changelog:
> v7:
> - drop config CONFIG_HOST_IOMMU_DEVICE (Cédric)
> - introduce HOST_IOMMU_DEVICE_CAP_AW_BITS_MAX (Eric)
> - use iova_ranges method in iommufd.realize() (Eric)
> - introduce HostIOMMUDevice::name to facilitate tracing (Eric)
> - implement a custom destroy hash function (Cédric)
> - drop VTDHostIOMMUDevice and save HostIOMMUDevice in hash table (Eric)
> - move patch5 after patch1 (Eric)
> - squash patch3 and 4, squash patch12 and 13 (Eric)
> - refine comments (Eric)
> - collect Eric's R-B
> 
> v6:
> - open coded host_iommu_device_get_cap() to avoid #ifdef in intel_iommu.c (Cédric)
> 
> v5:
> - pci_device_set_iommu_device return true (Cédric)
> - fix build failure on windows (thanks Cédric found that issue)
> 
> v4:
> - move properties vdev, iommufd and devid to nesting series where need it (Cédric)
> - fix 32bit build with clz64 (Cédric)
> - change check_cap naming to get_cap (Cédric)
> - return bool if error is passed through errp (Cédric)
> - drop HostIOMMUDevice[LegacyVFIO|IOMMUFD|IOMMUFDVFIO] declaration (Cédric)
> - drop HOST_IOMMU_DEVICE_CAP_IOMMUFD (Cédric)
> - replace include directive with forward declaration (Cédric)
> 
> v3:
> - refine declaration and doc for HostIOMMUDevice (Cédric, Philippe)
> - introduce HostIOMMUDeviceCaps, .realize() and .check_cap() (Cédric)
> - introduce helper range_get_last_bit() for range operation (Cédric)
> - separate pci_device_get_iommu_bus_devfn() in a prereq patch (Cédric)
> - replace HIOD_ abbreviation with HOST_IOMMU_DEVICE_ (Cédric)
> - add header in include/sysemu/iommufd.h (Cédric)
> 
> v2:
> - use QOM to abstract host IOMMU device and its sub-classes (Cédric)
> - move host IOMMU device creation in attach_device() (Cédric)
> - refine pci_device_set/unset_iommu_device doc further (Eric)
> - define host IOMMU info format of different backend
> - implement get_host_iommu_info() for different backend (Cédric)
> - drop cap/ecap update logic (MST)
> - check aw-bits from get_host_iommu_info() in legacy mode
> 
> v1:
> - use HostIOMMUDevice handle instead of union in VFIODevice (Eric)
> - change host_iommu_device_init to host_iommu_device_create
> - allocate HostIOMMUDevice in host_iommu_device_create callback
>    and set the VFIODevice base_hdev handle (Eric)
> - refine pci_device_set/unset_iommu_device doc (Eric)
> - use HostIOMMUDevice handle instead of union in VTDHostIOMMUDevice (Eric)
> - convert HostIOMMUDevice to sub object pointer in vtd_check_hdev
> 
> rfcv2:
> - introduce common abstract HostIOMMUDevice and sub struct for different BEs (Eric, Cédric)
> - remove iommufd_device.[ch] (Cédric)
> - remove duplicate iommufd/devid define from VFIODevice (Eric)
> - drop the p in aliased_pbus and aliased_pdevfn (Eric)
> - assert devfn and iommu_bus in pci_device_get_iommu_bus_devfn (Cédric, Eric)
> - use errp in iommufd_device_get_info (Eric)
> - split and simplify cap/ecap check/sync code in intel_iommu.c (Cédric)
> - move VTDHostIOMMUDevice declaration to intel_iommu_internal.h (Cédric)
> - make '(vtd->cap_reg >> 16) & 0x3fULL' a MACRO and add missed '+1' (Cédric)
> - block migration if vIOMMU cap/ecap updated based on host IOMMU cap/ecap
> - add R-B
> 
> Yi Liu (2):
>    hw/pci: Introduce pci_device_[set|unset]_iommu_device()
>    intel_iommu: Implement [set|unset]_iommu_device() callbacks
> 
> Zhenzhong Duan (15):
>    backends: Introduce HostIOMMUDevice abstract
>    backends/host_iommu_device: Introduce HostIOMMUDeviceCaps
>    vfio/container: Introduce TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO device
>    backends/iommufd: Introduce TYPE_HOST_IOMMU_DEVICE_IOMMUFD[_VFIO]
>      devices
>    range: Introduce range_get_last_bit()
>    vfio/container: Implement HostIOMMUDeviceClass::realize() handler
>    backends/iommufd: Introduce helper function
>      iommufd_backend_get_device_info()
>    vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler
>    vfio/container: Implement HostIOMMUDeviceClass::get_cap() handler
>    backends/iommufd: Implement HostIOMMUDeviceClass::get_cap() handler
>    vfio: Create host IOMMU device instance
>    hw/pci: Introduce helper function pci_device_get_iommu_bus_devfn()
>    vfio/pci: Pass HostIOMMUDevice to vIOMMU
>    intel_iommu: Extract out vtd_cap_init() to initialize cap/ecap
>    intel_iommu: Check compatibility with host IOMMU capabilities
> 
>   MAINTAINERS                           |   2 +
>   include/hw/i386/intel_iommu.h         |   2 +
>   include/hw/pci/pci.h                  |  38 ++++-
>   include/hw/vfio/vfio-common.h         |   8 +
>   include/hw/vfio/vfio-container-base.h |   3 +
>   include/qemu/range.h                  |  11 ++
>   include/sysemu/host_iommu_device.h    |  91 ++++++++++++
>   include/sysemu/iommufd.h              |  19 +++
>   backends/host_iommu_device.c          |  33 +++++
>   backends/iommufd.c                    |  76 ++++++++--
>   hw/i386/intel_iommu.c                 | 203 ++++++++++++++++++++------
>   hw/pci/pci.c                          |  75 +++++++++-
>   hw/vfio/common.c                      |  16 +-
>   hw/vfio/container.c                   |  41 +++++-
>   hw/vfio/helpers.c                     |  17 +++
>   hw/vfio/iommufd.c                     |  37 ++++-
>   hw/vfio/pci.c                         |  19 ++-
>   backends/meson.build                  |   1 +
>   18 files changed, 623 insertions(+), 69 deletions(-)
>   create mode 100644 include/sysemu/host_iommu_device.h
>   create mode 100644 backends/host_iommu_device.c
> 


Applied to vfio-next.

Thanks,

C.






end of thread, 2024-06-24 21:17 UTC

Thread overview: 25+ messages
2024-06-05  8:30 [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
2024-06-05  8:30 ` [PATCH v7 01/17] backends: Introduce HostIOMMUDevice abstract Zhenzhong Duan
2024-06-05  8:30 ` [PATCH v7 02/17] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps Zhenzhong Duan
2024-06-05  8:30 ` [PATCH v7 03/17] vfio/container: Introduce TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO device Zhenzhong Duan
2024-06-05  8:30 ` [PATCH v7 04/17] backends/iommufd: Introduce TYPE_HOST_IOMMU_DEVICE_IOMMUFD[_VFIO] devices Zhenzhong Duan
2024-06-05  8:30 ` [PATCH v7 05/17] range: Introduce range_get_last_bit() Zhenzhong Duan
2024-06-05  8:30 ` [PATCH v7 06/17] vfio/container: Implement HostIOMMUDeviceClass::realize() handler Zhenzhong Duan
2024-06-05  8:30 ` [PATCH v7 07/17] backends/iommufd: Introduce helper function iommufd_backend_get_device_info() Zhenzhong Duan
2024-06-05  8:30 ` [PATCH v7 08/17] vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler Zhenzhong Duan
2024-06-05  8:30 ` [PATCH v7 09/17] vfio/container: Implement HostIOMMUDeviceClass::get_cap() handler Zhenzhong Duan
2024-06-05  8:30 ` [PATCH v7 10/17] backends/iommufd: " Zhenzhong Duan
2024-06-05  8:30 ` [PATCH v7 11/17] vfio: Create host IOMMU device instance Zhenzhong Duan
2024-06-05  8:30 ` [PATCH v7 12/17] hw/pci: Introduce helper function pci_device_get_iommu_bus_devfn() Zhenzhong Duan
2024-06-05  8:30 ` [PATCH v7 13/17] hw/pci: Introduce pci_device_[set|unset]_iommu_device() Zhenzhong Duan
2024-06-05  8:30 ` [PATCH v7 14/17] vfio/pci: Pass HostIOMMUDevice to vIOMMU Zhenzhong Duan
2024-06-05  8:30 ` [PATCH v7 15/17] intel_iommu: Extract out vtd_cap_init() to initialize cap/ecap Zhenzhong Duan
2024-06-05  8:30 ` [PATCH v7 16/17] intel_iommu: Implement [set|unset]_iommu_device() callbacks Zhenzhong Duan
2024-06-05  8:30 ` [PATCH v7 17/17] intel_iommu: Check compatibility with host IOMMU capabilities Zhenzhong Duan
2024-06-07 15:00 ` [PATCH v7 00/17] Add a host IOMMU device abstraction to check with vIOMMU Eric Auger
2024-06-11  2:32   ` Duan, Zhenzhong
2024-06-17 17:36 ` Cédric Le Goater
2024-06-24 10:26 ` Michael S. Tsirkin
2024-06-24 15:12   ` Cédric Le Goater
2024-06-24 15:24     ` Michael S. Tsirkin
2024-06-24 21:16 ` Cédric Le Goater
