qemu-devel.nongnu.org archive mirror
* [PULL 00/16] vfio queue
@ 2023-06-30  5:22 Cédric Le Goater
  2023-06-30  9:55 ` Richard Henderson
  0 siblings, 1 reply; 19+ messages in thread
From: Cédric Le Goater @ 2023-06-30  5:22 UTC (permalink / raw)
  To: qemu-devel; +Cc: Richard Henderson, Alex Williamson, Cédric Le Goater

The following changes since commit 4d541f63e90c81112c298cbb35ed53e9c79deb00:

  Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging (2023-06-29 13:16:06 +0200)

are available in the Git repository at:

  https://github.com/legoater/qemu/ tags/pull-vfio-20230630

for you to fetch changes up to 0cc889c8826cefa5b80110d31a62273b56aa1832:

  vfio/pci: Free leaked timer in vfio_realize error path (2023-06-30 06:02:51 +0200)

----------------------------------------------------------------
vfio queue:

* migration: New switchover ack to reduce downtime
* VFIO migration pre-copy support
* Removal of the VFIO migration experimental flag
* Alternate offset for GPUDirect Cliques
* Misc fixes

----------------------------------------------------------------
Alex Williamson (3):
      vfio: Implement a common device info helper
      hw/vfio/pci-quirks: Support alternate offset for GPUDirect Cliques
      MAINTAINERS: Promote Cédric to VFIO co-maintainer

Avihai Horon (10):
      migration: Add switchover ack capability
      migration: Implement switchover ack logic
      migration: Enable switchover ack capability
      tests: Add migration switchover ack capability test
      vfio/migration: Refactor vfio_save_block() to return saved data size
      vfio/migration: Store VFIO migration flags in VFIOMigration
      vfio/migration: Add VFIO migration pre-copy support
      vfio/migration: Add support for switchover ack capability
      vfio/migration: Reset bytes_transferred properly
      vfio/migration: Make VFIO migration non-experimental

Shameer Kolothum (1):
      vfio/pci: Call vfio_prepare_kvm_msi_virq_batch() in MSI retry path

Zhenzhong Duan (2):
      vfio/pci: Fix a segfault in vfio_realize
      vfio/pci: Free leaked timer in vfio_realize error path

 MAINTAINERS                   |   2 +-
 docs/devel/vfio-migration.rst |  45 +++++--
 qapi/migration.json           |  12 +-
 include/hw/vfio/vfio-common.h |  12 +-
 include/migration/register.h  |   2 +
 migration/migration.h         |  15 +++
 migration/options.h           |   1 +
 migration/savevm.h            |   1 +
 hw/s390x/s390-pci-vfio.c      |  37 +----
 hw/vfio/common.c              |  68 +++++++---
 hw/vfio/migration.c           | 305 ++++++++++++++++++++++++++++++++++++------
 hw/vfio/pci-quirks.c          |  41 +++++-
 hw/vfio/pci.c                 |  15 ++-
 migration/migration.c         |  33 ++++-
 migration/options.c           |  17 +++
 migration/savevm.c            |  55 ++++++++
 migration/target.c            |  17 ++-
 tests/qtest/migration-test.c  |  31 +++++
 hw/vfio/trace-events          |   6 +-
 migration/trace-events        |   3 +
 20 files changed, 600 insertions(+), 118 deletions(-)



* Re: [PULL 00/16] vfio queue
  2023-06-30  5:22 Cédric Le Goater
@ 2023-06-30  9:55 ` Richard Henderson
  0 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2023-06-30  9:55 UTC (permalink / raw)
  To: Cédric Le Goater, qemu-devel; +Cc: Alex Williamson

On 6/30/23 07:22, Cédric Le Goater wrote:
> The following changes since commit 4d541f63e90c81112c298cbb35ed53e9c79deb00:
> 
>    Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging (2023-06-29 13:16:06 +0200)
> 
> are available in the Git repository at:
> 
>    https://github.com/legoater/qemu/ tags/pull-vfio-20230630
> 
> for you to fetch changes up to 0cc889c8826cefa5b80110d31a62273b56aa1832:
> 
>    vfio/pci: Free leaked timer in vfio_realize error path (2023-06-30 06:02:51 +0200)
> 
> ----------------------------------------------------------------
> vfio queue:
> 
> * migration: New switchover ack to reduce downtime
> * VFIO migration pre-copy support
> * Removal of the VFIO migration experimental flag
> * Alternate offset for GPUDirect Cliques
> * Misc fixes

Applied, thanks.  Please update https://wiki.qemu.org/ChangeLog/8.1 as appropriate.


r~




* [PULL 00/16] vfio queue
@ 2024-07-23 14:00 Cédric Le Goater
  2024-07-23 14:00 ` [PULL 01/16] hw/vfio/container: Fix SIGSEV on vfio_container_instance_finalize() Cédric Le Goater
                   ` (13 more replies)
  0 siblings, 14 replies; 19+ messages in thread
From: Cédric Le Goater @ 2024-07-23 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alex Williamson, Cédric Le Goater

The following changes since commit 6af69d02706c821797802cfd56acdac13a7c9422:

  Merge tag 'nvme-next-pull-request' of https://gitlab.com/birkelund/qemu into staging (2024-07-23 13:55:45 +1000)

are available in the Git repository at:

  https://github.com/legoater/qemu/ tags/pull-vfio-20240723

for you to fetch changes up to 6ac9efe6805af60de14481fdde7d340080d38324:

  vfio/common: Allow disabling device dirty page tracking (2024-07-23 11:10:10 +0200)

----------------------------------------------------------------
vfio queue:

* IOMMUFD Dirty Tracking support
* Fix for a possible SEGV in IOMMU type1 container
* Dropped initialization of host IOMMU device with mdev devices

----------------------------------------------------------------
Eric Auger (1):
      hw/vfio/container: Fix SIGSEV on vfio_container_instance_finalize()

Joao Martins (13):
      vfio/pci: Extract mdev check into an helper
      vfio/iommufd: Don't initialize nor set a HOST_IOMMU_DEVICE with mdev
      backends/iommufd: Extend iommufd_backend_get_device_info() to fetch HW capabilities
      vfio/iommufd: Return errno in iommufd_cdev_attach_ioas_hwpt()
      vfio/iommufd: Introduce auto domain creation
      vfio/{iommufd,container}: Remove caps::aw_bits
      vfio/iommufd: Add hw_caps field to HostIOMMUDeviceCaps
      vfio/{iommufd, container}: Invoke HostIOMMUDevice::realize() during attach_device()
      vfio/iommufd: Probe and request hwpt dirty tracking capability
      vfio/iommufd: Implement VFIOIOMMUClass::set_dirty_tracking support
      vfio/iommufd: Implement VFIOIOMMUClass::query_dirty_bitmap support
      vfio/migration: Don't block migration device dirty tracking is unsupported
      vfio/common: Allow disabling device dirty page tracking

Zhenzhong Duan (2):
      vfio/ap: Don't initialize HOST_IOMMU_DEVICE with mdev
      vfio/ccw: Don't initialize HOST_IOMMU_DEVICE with mdev

 include/hw/vfio/vfio-common.h      |  15 +++
 include/sysemu/host_iommu_device.h |   5 +-
 include/sysemu/iommufd.h           |  13 ++-
 backends/iommufd.c                 |  89 ++++++++++++++++-
 hw/vfio/ap.c                       |   3 +
 hw/vfio/ccw.c                      |   3 +
 hw/vfio/common.c                   |  17 ++--
 hw/vfio/container.c                |  10 +-
 hw/vfio/helpers.c                  |  25 +++++
 hw/vfio/iommufd.c                  | 196 +++++++++++++++++++++++++++++++++++--
 hw/vfio/migration.c                |  12 ++-
 hw/vfio/pci.c                      |  26 ++---
 backends/trace-events              |   3 +
 13 files changed, 377 insertions(+), 40 deletions(-)




* [PULL 01/16] hw/vfio/container: Fix SIGSEV on vfio_container_instance_finalize()
  2024-07-23 14:00 [PULL 00/16] vfio queue Cédric Le Goater
@ 2024-07-23 14:00 ` Cédric Le Goater
  2024-07-23 14:00 ` [PULL 02/16] vfio/pci: Extract mdev check into an helper Cédric Le Goater
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Cédric Le Goater @ 2024-07-23 14:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Eric Auger, Cédric Le Goater,
	Zhenzhong Duan

From: Eric Auger <eric.auger@redhat.com>

In vfio_connect_container's error path, the base container is
removed twice from the VFIOAddressSpace QLIST: first on the
listener_release_exit label and second, on free_container_exit
label, through object_unref(container), which calls
vfio_container_instance_finalize().

Let's remove the first instance.

Fixes: 938026053f4 ("vfio/container: Switch to QOM")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 hw/vfio/container.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 38a9df34964a4e5a4d349c14d54f66585728d5ca..ce9a858e56218a9e9c803b4f5cf4c9f7cfc4edda 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -656,7 +656,6 @@ static bool vfio_connect_container(VFIOGroup *group, AddressSpace *as,
     return true;
 listener_release_exit:
     QLIST_REMOVE(group, container_next);
-    QLIST_REMOVE(bcontainer, next);
     vfio_kvm_device_del_group(group);
     memory_listener_unregister(&bcontainer->listener);
     if (vioc->release) {
-- 
2.45.2
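
The double removal described in the commit message can be illustrated with a minimal sketch. It assumes a hypothetical intrusive list (struct node and the node_* helpers are invented for illustration and are not QEMU's QLIST macros) whose removal routine clears the element's links, so a second removal would write through a NULL pointer. The patch itself simply drops the duplicate QLIST_REMOVE(); the guarded helper below is only one alternative way to make a cleanup step safe to reach from more than one error label.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical intrusive list node; not QEMU's QLIST. */
struct node {
    struct node *next;
    struct node **pprev;   /* address of the pointer that points at us */
};

static void node_insert_head(struct node **head, struct node *n)
{
    n->next = *head;
    if (*head) {
        (*head)->pprev = &n->next;
    }
    *head = n;
    n->pprev = head;
}

static bool node_is_linked(const struct node *n)
{
    return n->pprev != NULL;
}

/* Unlink n and clear its links so a later check can see it is gone. */
static void node_remove(struct node *n)
{
    assert(node_is_linked(n));   /* a second removal would dereference NULL */
    if (n->next) {
        n->next->pprev = n->pprev;
    }
    *n->pprev = n->next;
    n->next = NULL;
    n->pprev = NULL;
}

/* Error-path helper: safe to call from more than one cleanup label. */
static void node_remove_once(struct node *n)
{
    if (node_is_linked(n)) {
        node_remove(n);
    }
}
```

Calling node_remove() twice on the same element trips the assertion, which mirrors the crash the patch avoids by removing the element in exactly one place.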




* [PULL 02/16] vfio/pci: Extract mdev check into an helper
  2024-07-23 14:00 [PULL 00/16] vfio queue Cédric Le Goater
  2024-07-23 14:00 ` [PULL 01/16] hw/vfio/container: Fix SIGSEV on vfio_container_instance_finalize() Cédric Le Goater
@ 2024-07-23 14:00 ` Cédric Le Goater
  2024-07-23 14:00 ` [PULL 03/16] vfio/iommufd: Don't initialize nor set a HOST_IOMMU_DEVICE with mdev Cédric Le Goater
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Cédric Le Goater @ 2024-07-23 14:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Joao Martins, Cédric Le Goater,
	Zhenzhong Duan, Eric Auger

From: Joao Martins <joao.m.martins@oracle.com>

In preparation to skip initialization of the HostIOMMUDevice for mdev,
extract the checks that validate if a device is an mdev into helpers.

A vfio_device_is_mdev() is created, and subsystems consult VFIODevice::mdev
to check if it's mdev or not.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
---
 include/hw/vfio/vfio-common.h |  2 ++
 hw/vfio/helpers.c             | 14 ++++++++++++++
 hw/vfio/pci.c                 | 12 +++---------
 3 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index e8ddf92bb18547f0d3b811b3d757cbae7fec8b8d..98acae8c1c975390c6cd0fdc02a1282f64ea2987 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -116,6 +116,7 @@ typedef struct VFIODevice {
     DeviceState *dev;
     int fd;
     int type;
+    bool mdev;
     bool reset_works;
     bool needs_reset;
     bool no_mmap;
@@ -231,6 +232,7 @@ void vfio_region_exit(VFIORegion *region);
 void vfio_region_finalize(VFIORegion *region);
 void vfio_reset_handler(void *opaque);
 struct vfio_device_info *vfio_get_device_info(int fd);
+bool vfio_device_is_mdev(VFIODevice *vbasedev);
 bool vfio_attach_device(char *name, VFIODevice *vbasedev,
                         AddressSpace *as, Error **errp);
 void vfio_detach_device(VFIODevice *vbasedev);
diff --git a/hw/vfio/helpers.c b/hw/vfio/helpers.c
index b14edd46edc9069bb148359a1b419253ff4e5ef0..7e23e9080c9d2860dea51ca5ef5fbc840d42a32d 100644
--- a/hw/vfio/helpers.c
+++ b/hw/vfio/helpers.c
@@ -675,3 +675,17 @@ int vfio_device_get_aw_bits(VFIODevice *vdev)
 
     return HOST_IOMMU_DEVICE_CAP_AW_BITS_MAX;
 }
+
+bool vfio_device_is_mdev(VFIODevice *vbasedev)
+{
+    g_autofree char *subsys = NULL;
+    g_autofree char *tmp = NULL;
+
+    if (!vbasedev->sysfsdev) {
+        return false;
+    }
+
+    tmp = g_strdup_printf("%s/subsystem", vbasedev->sysfsdev);
+    subsys = realpath(tmp, NULL);
+    return subsys && (strcmp(subsys, "/sys/bus/mdev") == 0);
+}
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index e03d9f3ba5461f55f6351d937aba5d522a9128ec..b34e91468a533ab4d550bf2392e940b867f7b34c 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2963,12 +2963,9 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
     ERRP_GUARD();
     VFIOPCIDevice *vdev = VFIO_PCI(pdev);
     VFIODevice *vbasedev = &vdev->vbasedev;
-    char *subsys;
     int i, ret;
-    bool is_mdev;
     char uuid[UUID_STR_LEN];
     g_autofree char *name = NULL;
-    g_autofree char *tmp = NULL;
 
     if (vbasedev->fd < 0 && !vbasedev->sysfsdev) {
         if (!(~vdev->host.domain || ~vdev->host.bus ||
@@ -2997,14 +2994,11 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
      * stays in sync with the active working set of the guest driver.  Prevent
      * the x-balloon-allowed option unless this is minimally an mdev device.
      */
-    tmp = g_strdup_printf("%s/subsystem", vbasedev->sysfsdev);
-    subsys = realpath(tmp, NULL);
-    is_mdev = subsys && (strcmp(subsys, "/sys/bus/mdev") == 0);
-    free(subsys);
+    vbasedev->mdev = vfio_device_is_mdev(vbasedev);
 
-    trace_vfio_mdev(vbasedev->name, is_mdev);
+    trace_vfio_mdev(vbasedev->name, vbasedev->mdev);
 
-    if (vbasedev->ram_block_discard_allowed && !is_mdev) {
+    if (vbasedev->ram_block_discard_allowed && !vbasedev->mdev) {
         error_setg(errp, "x-balloon-allowed only potentially compatible "
                    "with mdev devices");
         goto error;
-- 
2.45.2
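
For readers who want to try the check outside of QEMU, here is a standalone sketch of the same idea using only libc instead of GLib. The function name sysfs_path_is_mdev() is an assumption made for this example; only the "%s/subsystem" path and the "/sys/bus/mdev" comparison come from the patch.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for realpath() */
#endif
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Standalone sketch: report whether the "subsystem" symlink of a sysfs
 * device directory resolves to /sys/bus/mdev.
 */
bool sysfs_path_is_mdev(const char *sysfsdev)
{
    char link[PATH_MAX];
    char *subsys;
    bool is_mdev;
    int n;

    if (!sysfsdev) {
        return false;
    }
    n = snprintf(link, sizeof(link), "%s/subsystem", sysfsdev);
    if (n < 0 || (size_t)n >= sizeof(link)) {
        return false;   /* path does not fit in the buffer */
    }
    subsys = realpath(link, NULL);   /* malloc'ed result, or NULL on failure */
    is_mdev = subsys && strcmp(subsys, "/sys/bus/mdev") == 0;
    free(subsys);
    return is_mdev;
}
```

Unlike the QEMU helper, which relies on g_autofree, this version frees the realpath() result explicitly; a missing or unresolvable path is treated as "not an mdev".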




* [PULL 03/16] vfio/iommufd: Don't initialize nor set a HOST_IOMMU_DEVICE with mdev
  2024-07-23 14:00 [PULL 00/16] vfio queue Cédric Le Goater
  2024-07-23 14:00 ` [PULL 01/16] hw/vfio/container: Fix SIGSEV on vfio_container_instance_finalize() Cédric Le Goater
  2024-07-23 14:00 ` [PULL 02/16] vfio/pci: Extract mdev check into an helper Cédric Le Goater
@ 2024-07-23 14:00 ` Cédric Le Goater
  2024-07-23 14:00 ` [PULL 04/16] backends/iommufd: Extend iommufd_backend_get_device_info() to fetch HW capabilities Cédric Le Goater
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Cédric Le Goater @ 2024-07-23 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alex Williamson, Joao Martins, Zhenzhong Duan, Eric Auger

From: Joao Martins <joao.m.martins@oracle.com>

mdevs aren't "physical" devices, and asking for backing IOMMU info for one
fails the entire provisioning of the guest. Fix that by skipping
HostIOMMUDevice initialization in the presence of mdevs, and by not setting
an iommu device when the device is known to be an mdev.

Cc: Zhenzhong Duan <zhenzhong.duan@intel.com>
Fixes: 930589520128 ("vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler")
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 hw/vfio/common.c |  4 ++++
 hw/vfio/pci.c    | 11 ++++++++---
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 6d15b36e0bbbdaeb9437725167e61fdf5502555a..d7f02be595b5e71558d7e2d75d21d28f05968252 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1548,6 +1548,10 @@ bool vfio_attach_device(char *name, VFIODevice *vbasedev,
         return false;
     }
 
+    if (vbasedev->mdev) {
+        return true;
+    }
+
     hiod = HOST_IOMMU_DEVICE(object_new(ops->hiod_typename));
     if (!HOST_IOMMU_DEVICE_GET_CLASS(hiod)->realize(hiod, vbasedev, errp)) {
         object_unref(hiod);
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index b34e91468a533ab4d550bf2392e940b867f7b34c..265d3cb82ffc2a6ada02547c0d8e306318442ef7 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3115,7 +3115,8 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
 
     vfio_bars_register(vdev);
 
-    if (!pci_device_set_iommu_device(pdev, vbasedev->hiod, errp)) {
+    if (!vbasedev->mdev &&
+        !pci_device_set_iommu_device(pdev, vbasedev->hiod, errp)) {
         error_prepend(errp, "Failed to set iommu_device: ");
         goto out_teardown;
     }
@@ -3238,7 +3239,9 @@ out_deregister:
         timer_free(vdev->intx.mmap_timer);
     }
 out_unset_idev:
-    pci_device_unset_iommu_device(pdev);
+    if (!vbasedev->mdev) {
+        pci_device_unset_iommu_device(pdev);
+    }
 out_teardown:
     vfio_teardown_msi(vdev);
     vfio_bars_exit(vdev);
@@ -3283,7 +3286,9 @@ static void vfio_exitfn(PCIDevice *pdev)
     vfio_pci_disable_rp_atomics(vdev);
     vfio_bars_exit(vdev);
     vfio_migration_exit(vbasedev);
-    pci_device_unset_iommu_device(pdev);
+    if (!vbasedev->mdev) {
+        pci_device_unset_iommu_device(pdev);
+    }
 }
 
 static void vfio_pci_reset(DeviceState *dev)
-- 
2.45.2




* [PULL 04/16] backends/iommufd: Extend iommufd_backend_get_device_info() to fetch HW capabilities
  2024-07-23 14:00 [PULL 00/16] vfio queue Cédric Le Goater
                   ` (2 preceding siblings ...)
  2024-07-23 14:00 ` [PULL 03/16] vfio/iommufd: Don't initialize nor set a HOST_IOMMU_DEVICE with mdev Cédric Le Goater
@ 2024-07-23 14:00 ` Cédric Le Goater
  2024-07-23 14:00 ` [PULL 05/16] vfio/iommufd: Return errno in iommufd_cdev_attach_ioas_hwpt() Cédric Le Goater
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Cédric Le Goater @ 2024-07-23 14:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Joao Martins, Zhenzhong Duan,
	Cédric Le Goater, Eric Auger

From: Joao Martins <joao.m.martins@oracle.com>

The helper will be able to fetch vendor-agnostic IOMMU capabilities
supported both by hardware and software. Right now the only such
capability is IOMMU dirty tracking.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
---
 include/sysemu/iommufd.h | 2 +-
 backends/iommufd.c       | 4 +++-
 hw/vfio/iommufd.c        | 4 +++-
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
index 9edfec604595c7ed0e4032472bb73c9b4d2ea559..57d502a1c79a65e0447989f398e4e54c37839531 100644
--- a/include/sysemu/iommufd.h
+++ b/include/sysemu/iommufd.h
@@ -49,7 +49,7 @@ int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
                               hwaddr iova, ram_addr_t size);
 bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
                                      uint32_t *type, void *data, uint32_t len,
-                                     Error **errp);
+                                     uint64_t *caps, Error **errp);
 
 #define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
 #endif
diff --git a/backends/iommufd.c b/backends/iommufd.c
index cabd1b50025d6910d072fd61d8702765c7ffb7ef..48dfd39624740e05217fb55be98ff5e054a32670 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -209,7 +209,7 @@ int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
 
 bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
                                      uint32_t *type, void *data, uint32_t len,
-                                     Error **errp)
+                                     uint64_t *caps, Error **errp)
 {
     struct iommu_hw_info info = {
         .size = sizeof(info),
@@ -225,6 +225,8 @@ bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
 
     g_assert(type);
     *type = info.out_data_type;
+    g_assert(caps);
+    *caps = info.out_capabilities;
 
     return true;
 }
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 7b5f87a1488111f7b88ce7588db4f5e5bd976978..7c1b9e0284a3e84f68d13031cd517bffc47376d8 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -628,11 +628,13 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
     union {
         struct iommu_hw_info_vtd vtd;
     } data;
+    uint64_t hw_caps;
 
     hiod->agent = opaque;
 
     if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid,
-                                         &type, &data, sizeof(data), errp)) {
+                                         &type, &data, sizeof(data),
+                                         &hw_caps, errp)) {
         return false;
     }
 
-- 
2.45.2




* [PULL 05/16] vfio/iommufd: Return errno in iommufd_cdev_attach_ioas_hwpt()
  2024-07-23 14:00 [PULL 00/16] vfio queue Cédric Le Goater
                   ` (3 preceding siblings ...)
  2024-07-23 14:00 ` [PULL 04/16] backends/iommufd: Extend iommufd_backend_get_device_info() to fetch HW capabilities Cédric Le Goater
@ 2024-07-23 14:00 ` Cédric Le Goater
  2024-07-23 14:00 ` [PULL 06/16] vfio/ap: Don't initialize HOST_IOMMU_DEVICE with mdev Cédric Le Goater
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Cédric Le Goater @ 2024-07-23 14:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Joao Martins, Cédric Le Goater, Eric Auger,
	Zhenzhong Duan

From: Joao Martins <joao.m.martins@oracle.com>

In preparation for implementing auto domains, have the attach function
return the errno it got during domain attach instead of a bool.

-EINVAL is used to detect domain incompatibilities and to decide whether
to create a new IOMMU domain.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 hw/vfio/iommufd.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 7c1b9e0284a3e84f68d13031cd517bffc47376d8..7390621ee92762c5d752c0fae907e71380b6e980 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -172,7 +172,7 @@ out:
     return ret;
 }
 
-static bool iommufd_cdev_attach_ioas_hwpt(VFIODevice *vbasedev, uint32_t id,
+static int iommufd_cdev_attach_ioas_hwpt(VFIODevice *vbasedev, uint32_t id,
                                          Error **errp)
 {
     int iommufd = vbasedev->iommufd->fd;
@@ -187,12 +187,12 @@ static bool iommufd_cdev_attach_ioas_hwpt(VFIODevice *vbasedev, uint32_t id,
         error_setg_errno(errp, errno,
                          "[iommufd=%d] error attach %s (%d) to id=%d",
                          iommufd, vbasedev->name, vbasedev->fd, id);
-        return false;
+        return -errno;
     }
 
     trace_iommufd_cdev_attach_ioas_hwpt(iommufd, vbasedev->name,
                                         vbasedev->fd, id);
-    return true;
+    return 0;
 }
 
 static bool iommufd_cdev_detach_ioas_hwpt(VFIODevice *vbasedev, Error **errp)
@@ -216,7 +216,7 @@ static bool iommufd_cdev_attach_container(VFIODevice *vbasedev,
                                           VFIOIOMMUFDContainer *container,
                                           Error **errp)
 {
-    return iommufd_cdev_attach_ioas_hwpt(vbasedev, container->ioas_id, errp);
+    return !iommufd_cdev_attach_ioas_hwpt(vbasedev, container->ioas_id, errp);
 }
 
 static void iommufd_cdev_detach_container(VFIODevice *vbasedev,
-- 
2.45.2
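
The return convention introduced here can be sketched in isolation. Everything below is hypothetical: fake_attach_ioctl() stands in for the real attach ioctl, and attach_first_compatible() is only a guess at how a caller might use -EINVAL to skip incompatible domains, not the code from a later patch in this series.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the attach ioctl: fails with EINVAL for odd ids. */
static int fake_attach_ioctl(unsigned id)
{
    if (id & 1) {
        errno = EINVAL;
        return -1;
    }
    return 0;
}

/* Returns 0 on success or a negative errno value on failure. */
int attach_hwpt(unsigned id)
{
    if (fake_attach_ioctl(id)) {
        int err = errno;   /* save before anything else can overwrite it */
        return -err;
    }
    return 0;
}

/* Bool-returning wrapper for callers that only care about success. */
bool attach_container(unsigned id)
{
    return !attach_hwpt(id);
}

/* Try each candidate; skip incompatible ones, stop on any other error. */
int attach_first_compatible(const unsigned *ids, int n)
{
    for (int i = 0; i < n; i++) {
        int ret = attach_hwpt(ids[i]);
        if (ret == 0) {
            return i;          /* index of the id that accepted the device */
        }
        if (ret != -EINVAL) {
            return ret;        /* hard failure */
        }
    }
    return -ENOENT;            /* nothing compatible: caller would create one */
}
```

The sketch copies errno into a local before returning; this is a defensive habit in the example rather than something the patch does.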




* [PULL 06/16] vfio/ap: Don't initialize HOST_IOMMU_DEVICE with mdev
  2024-07-23 14:00 [PULL 00/16] vfio queue Cédric Le Goater
                   ` (4 preceding siblings ...)
  2024-07-23 14:00 ` [PULL 05/16] vfio/iommufd: Return errno in iommufd_cdev_attach_ioas_hwpt() Cédric Le Goater
@ 2024-07-23 14:00 ` Cédric Le Goater
  2024-07-23 14:00 ` [PULL 07/16] vfio/ccw: " Cédric Le Goater
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Cédric Le Goater @ 2024-07-23 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alex Williamson, Zhenzhong Duan, Joao Martins, Eric Auger

From: Zhenzhong Duan <zhenzhong.duan@intel.com>

mdevs aren't "physical" devices, and asking for backing IOMMU info for one
fails the entire provisioning of the guest. Fix that by setting
vbasedev->mdev to true, so that HostIOMMUDevice initialization is skipped
in the presence of mdevs.

Fixes: 930589520128 ("vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
---
 hw/vfio/ap.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c
index 0c4354e3e70169ec072e16da0919936647d1d351..71bf32b83c50b9892b018db10e2f2ae0cf312e97 100644
--- a/hw/vfio/ap.c
+++ b/hw/vfio/ap.c
@@ -230,6 +230,9 @@ static void vfio_ap_instance_init(Object *obj)
      */
     vfio_device_init(vbasedev, VFIO_DEVICE_TYPE_AP, &vfio_ap_ops,
                      DEVICE(vapdev), true);
+
+    /* AP device is mdev type device */
+    vbasedev->mdev = true;
 }
 
 #ifdef CONFIG_IOMMUFD
-- 
2.45.2




* [PULL 07/16] vfio/ccw: Don't initialize HOST_IOMMU_DEVICE with mdev
  2024-07-23 14:00 [PULL 00/16] vfio queue Cédric Le Goater
                   ` (5 preceding siblings ...)
  2024-07-23 14:00 ` [PULL 06/16] vfio/ap: Don't initialize HOST_IOMMU_DEVICE with mdev Cédric Le Goater
@ 2024-07-23 14:00 ` Cédric Le Goater
  2024-07-23 14:00 ` [PULL 08/16] vfio/iommufd: Introduce auto domain creation Cédric Le Goater
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Cédric Le Goater @ 2024-07-23 14:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Zhenzhong Duan, Joao Martins, Eric Farman,
	Eric Auger

From: Zhenzhong Duan <zhenzhong.duan@intel.com>

mdevs aren't "physical" devices, and asking for backing IOMMU info for one
fails the entire provisioning of the guest. Fix that by setting
vbasedev->mdev to true, so that HostIOMMUDevice initialization is skipped
in the presence of mdevs.

Fixes: 930589520128 ("vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Acked-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
---
 hw/vfio/ccw.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index 1f8e1272c7555cd0a770481d1ae92988f6e2e62e..115862f43036442d99c33f0328e5bc599ba1f2b9 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -675,6 +675,9 @@ static void vfio_ccw_instance_init(Object *obj)
     VFIOCCWDevice *vcdev = VFIO_CCW(obj);
     VFIODevice *vbasedev = &vcdev->vdev;
 
+    /* CCW device is mdev type device */
+    vbasedev->mdev = true;
+
     /*
      * All vfio-ccw devices are believed to operate in a way compatible with
      * discarding of memory in RAM blocks, ie. pages pinned in the host are
-- 
2.45.2




* [PULL 08/16] vfio/iommufd: Introduce auto domain creation
  2024-07-23 14:00 [PULL 00/16] vfio queue Cédric Le Goater
                   ` (6 preceding siblings ...)
  2024-07-23 14:00 ` [PULL 07/16] vfio/ccw: " Cédric Le Goater
@ 2024-07-23 14:00 ` Cédric Le Goater
  2024-07-23 14:00 ` [PULL 09/16] vfio/{iommufd,container}: Remove caps::aw_bits Cédric Le Goater
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Cédric Le Goater @ 2024-07-23 14:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Joao Martins, Zhenzhong Duan,
	Cédric Le Goater, Eric Auger

From: Joao Martins <joao.m.martins@oracle.com>

There are generally two modes of operation for IOMMUFD:

1) The simple user API which intends to perform relatively simple things
with IOMMUs e.g. DPDK. The process generally creates an IOAS and attaches
to VFIO and mainly performs IOAS_MAP and UNMAP.

2) The native IOMMUFD API where you have fine-grained control of the
IOMMU domain and model it accordingly. This is where most new features
are being steered to.

For dirty tracking, 2) is required, as it needs to ensure that
the stage-2/parent IOMMU domain will only attach devices
that support dirty tracking (so far it is all homogeneous in x86, likely
not the case for smmuv3). Such an invariant on dirty tracking provides a
useful guarantee to VMMs that will refuse incompatible device
attachments for IOMMU domains.

Dirty tracking assurance is enforced via HWPT_ALLOC, which is
responsible for creating an IOMMU domain. This is in contrast to the
'simple API', where the IOMMU domain is created by IOMMUFD automatically
when it attaches to VFIO (usually referred to as autodomains), though the
simple API does have the needed handling for mdevs.

To support dirty tracking with the advanced IOMMUFD API, similar logic
is needed, where IOMMU domains are created and devices are attached to
compatible domains, essentially mimicking the kernel's
iommufd_device_auto_get_domain(). For mdevs, given that there is no IOMMU
domain, it falls back to IOAS attach.

The auto domain logic allows some IOMMU domains to be created without
DMA dirty tracking when it is not desired (and the VF can provide it), and
others with it. Here it is not used in this way, given that VFIODevice
migration state is initialized after the device attachment. Such a mixed
mode of IOMMU dirty tracking + device dirty tracking is an improvement that
can be added later. For now, keep the 'all or nothing' type1 approach that
we have been using so far between container and device dirty tracking.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
[ clg: Added ERRP_GUARD() in iommufd_cdev_autodomains_get() ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
---
 include/hw/vfio/vfio-common.h |  9 ++++
 include/sysemu/iommufd.h      |  5 +++
 backends/iommufd.c            | 30 +++++++++++++
 hw/vfio/iommufd.c             | 85 +++++++++++++++++++++++++++++++++++
 backends/trace-events         |  1 +
 5 files changed, 130 insertions(+)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 98acae8c1c975390c6cd0fdc02a1282f64ea2987..1a96678f8c384e7ff4a1db1e0ba90a5f9624bcff 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -95,10 +95,17 @@ typedef struct VFIOHostDMAWindow {
 
 typedef struct IOMMUFDBackend IOMMUFDBackend;
 
+typedef struct VFIOIOASHwpt {
+    uint32_t hwpt_id;
+    QLIST_HEAD(, VFIODevice) device_list;
+    QLIST_ENTRY(VFIOIOASHwpt) next;
+} VFIOIOASHwpt;
+
 typedef struct VFIOIOMMUFDContainer {
     VFIOContainerBase bcontainer;
     IOMMUFDBackend *be;
     uint32_t ioas_id;
+    QLIST_HEAD(, VFIOIOASHwpt) hwpt_list;
 } VFIOIOMMUFDContainer;
 
 OBJECT_DECLARE_SIMPLE_TYPE(VFIOIOMMUFDContainer, VFIO_IOMMU_IOMMUFD);
@@ -135,6 +142,8 @@ typedef struct VFIODevice {
     HostIOMMUDevice *hiod;
     int devid;
     IOMMUFDBackend *iommufd;
+    VFIOIOASHwpt *hwpt;
+    QLIST_ENTRY(VFIODevice) hwpt_next;
 } VFIODevice;
 
 struct VFIODeviceOps {
diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
index 57d502a1c79a65e0447989f398e4e54c37839531..e917e7591d050bd02945f6feb8d268e6d51d49aa 100644
--- a/include/sysemu/iommufd.h
+++ b/include/sysemu/iommufd.h
@@ -50,6 +50,11 @@ int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
 bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
                                      uint32_t *type, void *data, uint32_t len,
                                      uint64_t *caps, Error **errp);
+bool iommufd_backend_alloc_hwpt(IOMMUFDBackend *be, uint32_t dev_id,
+                                uint32_t pt_id, uint32_t flags,
+                                uint32_t data_type, uint32_t data_len,
+                                void *data_ptr, uint32_t *out_hwpt,
+                                Error **errp);
 
 #define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
 #endif
diff --git a/backends/iommufd.c b/backends/iommufd.c
index 48dfd39624740e05217fb55be98ff5e054a32670..60a3d14bfab4b96186509886d3e8665b249b3415 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -207,6 +207,36 @@ int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
     return ret;
 }
 
+bool iommufd_backend_alloc_hwpt(IOMMUFDBackend *be, uint32_t dev_id,
+                                uint32_t pt_id, uint32_t flags,
+                                uint32_t data_type, uint32_t data_len,
+                                void *data_ptr, uint32_t *out_hwpt,
+                                Error **errp)
+{
+    int ret, fd = be->fd;
+    struct iommu_hwpt_alloc alloc_hwpt = {
+        .size = sizeof(struct iommu_hwpt_alloc),
+        .flags = flags,
+        .dev_id = dev_id,
+        .pt_id = pt_id,
+        .data_type = data_type,
+        .data_len = data_len,
+        .data_uptr = (uintptr_t)data_ptr,
+    };
+
+    ret = ioctl(fd, IOMMU_HWPT_ALLOC, &alloc_hwpt);
+    trace_iommufd_backend_alloc_hwpt(fd, dev_id, pt_id, flags, data_type,
+                                     data_len, (uintptr_t)data_ptr,
+                                     alloc_hwpt.out_hwpt_id, ret);
+    if (ret) {
+        error_setg_errno(errp, errno, "Failed to allocate hwpt");
+        return false;
+    }
+
+    *out_hwpt = alloc_hwpt.out_hwpt_id;
+    return true;
+}
+
 bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
                                      uint32_t *type, void *data, uint32_t len,
                                      uint64_t *caps, Error **errp)
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 7390621ee92762c5d752c0fae907e71380b6e980..58c11c93086e0c2aba20a80b147f3b980015c7bb 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -212,10 +212,89 @@ static bool iommufd_cdev_detach_ioas_hwpt(VFIODevice *vbasedev, Error **errp)
     return true;
 }
 
+static bool iommufd_cdev_autodomains_get(VFIODevice *vbasedev,
+                                         VFIOIOMMUFDContainer *container,
+                                         Error **errp)
+{
+    ERRP_GUARD();
+    IOMMUFDBackend *iommufd = vbasedev->iommufd;
+    uint32_t flags = 0;
+    VFIOIOASHwpt *hwpt;
+    uint32_t hwpt_id;
+    int ret;
+
+    /* Try to find a domain */
+    QLIST_FOREACH(hwpt, &container->hwpt_list, next) {
+        ret = iommufd_cdev_attach_ioas_hwpt(vbasedev, hwpt->hwpt_id, errp);
+        if (ret) {
+            /* -EINVAL means the domain is incompatible with the device. */
+            if (ret == -EINVAL) {
+                /*
+                 * It is an expected failure and it just means we will try
+                 * another domain, or create one if no existing compatible
+                 * domain is found. Hence why the error is discarded below.
+                 */
+                error_free(*errp);
+                *errp = NULL;
+                continue;
+            }
+
+            return false;
+        } else {
+            vbasedev->hwpt = hwpt;
+            QLIST_INSERT_HEAD(&hwpt->device_list, vbasedev, hwpt_next);
+            return true;
+        }
+    }
+
+    if (!iommufd_backend_alloc_hwpt(iommufd, vbasedev->devid,
+                                    container->ioas_id, flags,
+                                    IOMMU_HWPT_DATA_NONE, 0, NULL,
+                                    &hwpt_id, errp)) {
+        return false;
+    }
+
+    hwpt = g_malloc0(sizeof(*hwpt));
+    hwpt->hwpt_id = hwpt_id;
+    QLIST_INIT(&hwpt->device_list);
+
+    ret = iommufd_cdev_attach_ioas_hwpt(vbasedev, hwpt->hwpt_id, errp);
+    if (ret) {
+        iommufd_backend_free_id(container->be, hwpt->hwpt_id);
+        g_free(hwpt);
+        return false;
+    }
+
+    vbasedev->hwpt = hwpt;
+    QLIST_INSERT_HEAD(&hwpt->device_list, vbasedev, hwpt_next);
+    QLIST_INSERT_HEAD(&container->hwpt_list, hwpt, next);
+    return true;
+}
+
+static void iommufd_cdev_autodomains_put(VFIODevice *vbasedev,
+                                         VFIOIOMMUFDContainer *container)
+{
+    VFIOIOASHwpt *hwpt = vbasedev->hwpt;
+
+    QLIST_REMOVE(vbasedev, hwpt_next);
+    vbasedev->hwpt = NULL;
+
+    if (QLIST_EMPTY(&hwpt->device_list)) {
+        QLIST_REMOVE(hwpt, next);
+        iommufd_backend_free_id(container->be, hwpt->hwpt_id);
+        g_free(hwpt);
+    }
+}
+
 static bool iommufd_cdev_attach_container(VFIODevice *vbasedev,
                                           VFIOIOMMUFDContainer *container,
                                           Error **errp)
 {
+    /* mdevs aren't physical devices and will fail with auto domains */
+    if (!vbasedev->mdev) {
+        return iommufd_cdev_autodomains_get(vbasedev, container, errp);
+    }
+
     return !iommufd_cdev_attach_ioas_hwpt(vbasedev, container->ioas_id, errp);
 }
 
@@ -227,6 +306,11 @@ static void iommufd_cdev_detach_container(VFIODevice *vbasedev,
     if (!iommufd_cdev_detach_ioas_hwpt(vbasedev, &err)) {
         error_report_err(err);
     }
+
+    if (vbasedev->hwpt) {
+        iommufd_cdev_autodomains_put(vbasedev, container);
+    }
+
 }
 
 static void iommufd_cdev_container_destroy(VFIOIOMMUFDContainer *container)
@@ -354,6 +438,7 @@ static bool iommufd_cdev_attach(const char *name, VFIODevice *vbasedev,
     container = VFIO_IOMMU_IOMMUFD(object_new(TYPE_VFIO_IOMMU_IOMMUFD));
     container->be = vbasedev->iommufd;
     container->ioas_id = ioas_id;
+    QLIST_INIT(&container->hwpt_list);
 
     bcontainer = &container->bcontainer;
     vfio_address_space_insert(space, bcontainer);
diff --git a/backends/trace-events b/backends/trace-events
index 211e6f374adcef25be0409ce3e42cbed6f31b744..4d8ac02fe7d6c6d3780dfef48406872ee46fd4df 100644
--- a/backends/trace-events
+++ b/backends/trace-events
@@ -14,4 +14,5 @@ iommufd_backend_map_dma(int iommufd, uint32_t ioas, uint64_t iova, uint64_t size
 iommufd_backend_unmap_dma_non_exist(int iommufd, uint32_t ioas, uint64_t iova, uint64_t size, int ret) " Unmap nonexistent mapping: iommufd=%d ioas=%d iova=0x%"PRIx64" size=0x%"PRIx64" (%d)"
 iommufd_backend_unmap_dma(int iommufd, uint32_t ioas, uint64_t iova, uint64_t size, int ret) " iommufd=%d ioas=%d iova=0x%"PRIx64" size=0x%"PRIx64" (%d)"
 iommufd_backend_alloc_ioas(int iommufd, uint32_t ioas) " iommufd=%d ioas=%d"
+iommufd_backend_alloc_hwpt(int iommufd, uint32_t dev_id, uint32_t pt_id, uint32_t flags, uint32_t hwpt_type, uint32_t len, uint64_t data_ptr, uint32_t out_hwpt_id, int ret) " iommufd=%d dev_id=%u pt_id=%u flags=0x%x hwpt_type=%u len=%u data_ptr=0x%"PRIx64" out_hwpt=%u (%d)"
 iommufd_backend_free_id(int iommufd, uint32_t id, int ret) " iommufd=%d id=%d (%d)"
-- 
2.45.2




* [PULL 09/16] vfio/{iommufd,container}: Remove caps::aw_bits
  2024-07-23 14:00 [PULL 00/16] vfio queue Cédric Le Goater
                   ` (7 preceding siblings ...)
  2024-07-23 14:00 ` [PULL 08/16] vfio/iommufd: Introduce auto domain creation Cédric Le Goater
@ 2024-07-23 14:00 ` Cédric Le Goater
  2024-07-23 14:00 ` [PULL 10/16] vfio/iommufd: Add hw_caps field to HostIOMMUDeviceCaps Cédric Le Goater
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Cédric Le Goater @ 2024-07-23 14:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Joao Martins, Zhenzhong Duan,
	Cédric Le Goater, Eric Auger

From: Joao Martins <joao.m.martins@oracle.com>

Remove caps::aw_bits, which requires bcontainer::iova_ranges to be
initialized, something that only happens after the device is actually
attached. Instead, defer the lookup to .get_cap() and call
vfio_device_get_aw_bits() directly.

This is in preparation for HostIOMMUDevice::realize() being called early
during attach_device().

Suggested-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
---
 include/sysemu/host_iommu_device.h | 3 ---
 backends/iommufd.c                 | 3 ++-
 hw/vfio/container.c                | 5 +----
 hw/vfio/iommufd.c                  | 1 -
 4 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
index c1bf74ae2c7a729b22d6512f3ca37ce65fa6bcec..d1c10ff7c239d9a0ae31894abe929e1e96b63ef2 100644
--- a/include/sysemu/host_iommu_device.h
+++ b/include/sysemu/host_iommu_device.h
@@ -19,12 +19,9 @@
  * struct HostIOMMUDeviceCaps - Define host IOMMU device capabilities.
  *
  * @type: host platform IOMMU type.
- *
- * @aw_bits: host IOMMU address width. 0xff if no limitation.
  */
 typedef struct HostIOMMUDeviceCaps {
     uint32_t type;
-    uint8_t aw_bits;
 } HostIOMMUDeviceCaps;
 
 #define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
diff --git a/backends/iommufd.c b/backends/iommufd.c
index 60a3d14bfab4b96186509886d3e8665b249b3415..06b135111f303ca95d55dc0f71ad3bbb76211337 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -18,6 +18,7 @@
 #include "qemu/error-report.h"
 #include "monitor/monitor.h"
 #include "trace.h"
+#include "hw/vfio/vfio-common.h"
 #include <sys/ioctl.h>
 #include <linux/iommufd.h>
 
@@ -269,7 +270,7 @@ static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
     case HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE:
         return caps->type;
     case HOST_IOMMU_DEVICE_CAP_AW_BITS:
-        return caps->aw_bits;
+        return vfio_device_get_aw_bits(hiod->agent);
     default:
         error_setg(errp, "%s: unsupported capability %x", hiod->name, cap);
         return -EINVAL;
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index ce9a858e56218a9e9c803b4f5cf4c9f7cfc4edda..10cb4b4320ac3d6b3a1da3625e964af5f2f2f9a7 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -1141,7 +1141,6 @@ static bool hiod_legacy_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
     VFIODevice *vdev = opaque;
 
     hiod->name = g_strdup(vdev->name);
-    hiod->caps.aw_bits = vfio_device_get_aw_bits(vdev);
     hiod->agent = opaque;
 
     return true;
@@ -1150,11 +1149,9 @@ static bool hiod_legacy_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
 static int hiod_legacy_vfio_get_cap(HostIOMMUDevice *hiod, int cap,
                                     Error **errp)
 {
-    HostIOMMUDeviceCaps *caps = &hiod->caps;
-
     switch (cap) {
     case HOST_IOMMU_DEVICE_CAP_AW_BITS:
-        return caps->aw_bits;
+        return vfio_device_get_aw_bits(hiod->agent);
     default:
         error_setg(errp, "%s: unsupported capability %x", hiod->name, cap);
         return -EINVAL;
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 58c11c93086e0c2aba20a80b147f3b980015c7bb..f1e7cf3e9cafde08a0353876da973f3713006df3 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -725,7 +725,6 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
 
     hiod->name = g_strdup(vdev->name);
     caps->type = type;
-    caps->aw_bits = vfio_device_get_aw_bits(vdev);
 
     return true;
 }
-- 
2.45.2




* [PULL 10/16] vfio/iommufd: Add hw_caps field to HostIOMMUDeviceCaps
  2024-07-23 14:00 [PULL 00/16] vfio queue Cédric Le Goater
                   ` (8 preceding siblings ...)
  2024-07-23 14:00 ` [PULL 09/16] vfio/{iommufd,container}: Remove caps::aw_bits Cédric Le Goater
@ 2024-07-23 14:00 ` Cédric Le Goater
  2024-07-23 14:00 ` [PULL 11/16] vfio/{iommufd, container}: Invoke HostIOMMUDevice::realize() during attach_device() Cédric Le Goater
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Cédric Le Goater @ 2024-07-23 14:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Joao Martins, Cédric Le Goater,
	Zhenzhong Duan, Eric Auger

From: Joao Martins <joao.m.martins@oracle.com>

Store the value of @caps returned by iommufd_backend_get_device_info()
in a new field HostIOMMUDeviceCaps::hw_caps. Right now the only value it
carries is whether the device's IOMMU supports dirty tracking
(IOMMU_HW_CAP_DIRTY_TRACKING).

This is in preparation for HostIOMMUDevice::realize() being called early
during attach_device().

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
---
 include/sysemu/host_iommu_device.h | 4 ++++
 hw/vfio/iommufd.c                  | 1 +
 2 files changed, 5 insertions(+)

diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
index d1c10ff7c239d9a0ae31894abe929e1e96b63ef2..809cced4ba5c56263132b474a382e4bd0ffdd3cd 100644
--- a/include/sysemu/host_iommu_device.h
+++ b/include/sysemu/host_iommu_device.h
@@ -19,9 +19,13 @@
  * struct HostIOMMUDeviceCaps - Define host IOMMU device capabilities.
  *
  * @type: host platform IOMMU type.
+ *
+ * @hw_caps: host platform IOMMU capabilities (e.g. on IOMMUFD this represents
+ *           the @out_capabilities value returned from IOMMU_GET_HW_INFO ioctl)
  */
 typedef struct HostIOMMUDeviceCaps {
     uint32_t type;
+    uint64_t hw_caps;
 } HostIOMMUDeviceCaps;
 
 #define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index f1e7cf3e9cafde08a0353876da973f3713006df3..fb87e64e443035bc239f4f4272ae1c28fa8ab8c9 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -725,6 +725,7 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
 
     hiod->name = g_strdup(vdev->name);
     caps->type = type;
+    caps->hw_caps = hw_caps;
 
     return true;
 }
-- 
2.45.2




* [PULL 11/16] vfio/{iommufd, container}: Invoke HostIOMMUDevice::realize() during attach_device()
  2024-07-23 14:00 [PULL 00/16] vfio queue Cédric Le Goater
                   ` (9 preceding siblings ...)
  2024-07-23 14:00 ` [PULL 10/16] vfio/iommufd: Add hw_caps field to HostIOMMUDeviceCaps Cédric Le Goater
@ 2024-07-23 14:00 ` Cédric Le Goater
  2024-07-23 14:00 ` [PULL 12/16] vfio/iommufd: Probe and request hwpt dirty tracking capability Cédric Le Goater
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 19+ messages in thread
From: Cédric Le Goater @ 2024-07-23 14:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Joao Martins, Cédric Le Goater,
	Zhenzhong Duan, Cédric Le Goater, Eric Auger

From: Joao Martins <joao.m.martins@oracle.com>

Move the HostIOMMUDevice::realize() call so that it is invoked during the
attach of the device, before we allocate IOMMUFD hardware pagetable objects
(HWPT). This allows the use of the hw_caps obtained by IOMMU_GET_HW_INFO,
which essentially tell whether the IOMMU behind the device supports dirty
tracking.

Note: The HostIOMMUDevice data from the legacy backend is static and
doesn't need any information from the (type1-iommu) backend to be
initialized. In contrast, the IOMMUFD HostIOMMUDevice data requires the
iommufd FD to be connected and a devid to be available in order to
successfully call GET_HW_INFO. This means vfio_device_hiod_realize() is
called in different places within each backend's .attach_device()
implementation.
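
As an illustration only (not part of the patch), the ordering described
above can be sketched as a toy model: the common code decides whether a
host IOMMU device exists, and the backend realizes it at the point where
its own prerequisites are met, before allocating a hardware pagetable.
All Demo* types and demo_* functions are hypothetical; the real code paths
are vfio_attach_device(), vfio_device_hiod_realize() and the backend
.attach_device() handlers.

```c
#include <stdbool.h>

/* Steps recorded in order so the sequence can be inspected afterwards. */
enum DemoStep { DEMO_BIND = 1, DEMO_REALIZE, DEMO_ALLOC_HWPT };

typedef struct DemoDevice {
    bool is_mdev;        /* mdevs get no host IOMMU device */
    bool has_hiod;       /* set by the common attach code */
    bool hiod_realized;
    int steps[3];
    int nsteps;
} DemoDevice;

static void demo_record(DemoDevice *dev, enum DemoStep step)
{
    dev->steps[dev->nsteps++] = step;
}

/* Mirrors the helper's shape: a missing hiod is not an error. */
static bool demo_hiod_realize(DemoDevice *dev)
{
    if (!dev->has_hiod) {
        return true;
    }
    demo_record(dev, DEMO_REALIZE);
    dev->hiod_realized = true;
    return true;
}

/* Backend attach: bind first, then realize, then allocate the HWPT. */
static bool demo_backend_attach(DemoDevice *dev)
{
    demo_record(dev, DEMO_BIND);
    if (!demo_hiod_realize(dev)) {
        return false;
    }
    demo_record(dev, DEMO_ALLOC_HWPT);
    return true;
}

/* Common attach: create the hiod up front, drop it if the backend fails. */
static bool demo_attach_device(DemoDevice *dev)
{
    dev->has_hiod = !dev->is_mdev;
    if (!demo_backend_attach(dev)) {
        dev->has_hiod = false;
        return false;
    }
    return true;
}
```

For a non-mdev device the recorded sequence is bind, realize, allocate; for
an mdev the realize step is skipped entirely.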

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
[ clg: Fixed error handling in iommufd_cdev_attach() ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
---
 include/hw/vfio/vfio-common.h |  1 +
 hw/vfio/common.c              | 16 ++++++----------
 hw/vfio/container.c           |  4 ++++
 hw/vfio/helpers.c             | 11 +++++++++++
 hw/vfio/iommufd.c             | 11 +++++++++++
 5 files changed, 33 insertions(+), 10 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 1a96678f8c384e7ff4a1db1e0ba90a5f9624bcff..4e44b26d3c453b5b47a819df371a21a4ca3b39c3 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -242,6 +242,7 @@ void vfio_region_finalize(VFIORegion *region);
 void vfio_reset_handler(void *opaque);
 struct vfio_device_info *vfio_get_device_info(int fd);
 bool vfio_device_is_mdev(VFIODevice *vbasedev);
+bool vfio_device_hiod_realize(VFIODevice *vbasedev, Error **errp);
 bool vfio_attach_device(char *name, VFIODevice *vbasedev,
                         AddressSpace *as, Error **errp);
 void vfio_detach_device(VFIODevice *vbasedev);
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index d7f02be595b5e71558d7e2d75d21d28f05968252..26e74fa430db4c7618698ded5d514d524f33d273 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1536,7 +1536,7 @@ bool vfio_attach_device(char *name, VFIODevice *vbasedev,
 {
     const VFIOIOMMUClass *ops =
         VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_LEGACY));
-    HostIOMMUDevice *hiod;
+    HostIOMMUDevice *hiod = NULL;
 
     if (vbasedev->iommufd) {
         ops = VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_IOMMUFD));
@@ -1544,21 +1544,17 @@ bool vfio_attach_device(char *name, VFIODevice *vbasedev,
 
     assert(ops);
 
-    if (!ops->attach_device(name, vbasedev, as, errp)) {
-        return false;
-    }
 
-    if (vbasedev->mdev) {
-        return true;
+    if (!vbasedev->mdev) {
+        hiod = HOST_IOMMU_DEVICE(object_new(ops->hiod_typename));
+        vbasedev->hiod = hiod;
     }
 
-    hiod = HOST_IOMMU_DEVICE(object_new(ops->hiod_typename));
-    if (!HOST_IOMMU_DEVICE_GET_CLASS(hiod)->realize(hiod, vbasedev, errp)) {
+    if (!ops->attach_device(name, vbasedev, as, errp)) {
         object_unref(hiod);
-        ops->detach_device(vbasedev);
+        vbasedev->hiod = NULL;
         return false;
     }
-    vbasedev->hiod = hiod;
 
     return true;
 }
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 10cb4b4320ac3d6b3a1da3625e964af5f2f2f9a7..9ccdb639ac84f885da40eace8a0059f397295619 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -914,6 +914,10 @@ static bool vfio_legacy_attach_device(const char *name, VFIODevice *vbasedev,
 
     trace_vfio_attach_device(vbasedev->name, groupid);
 
+    if (!vfio_device_hiod_realize(vbasedev, errp)) {
+        return false;
+    }
+
     group = vfio_get_group(groupid, as, errp);
     if (!group) {
         return false;
diff --git a/hw/vfio/helpers.c b/hw/vfio/helpers.c
index 7e23e9080c9d2860dea51ca5ef5fbc840d42a32d..ea15c79db0a3643f260fc1ce3abfeaa7001ab306 100644
--- a/hw/vfio/helpers.c
+++ b/hw/vfio/helpers.c
@@ -689,3 +689,14 @@ bool vfio_device_is_mdev(VFIODevice *vbasedev)
     subsys = realpath(tmp, NULL);
     return subsys && (strcmp(subsys, "/sys/bus/mdev") == 0);
 }
+
+bool vfio_device_hiod_realize(VFIODevice *vbasedev, Error **errp)
+{
+    HostIOMMUDevice *hiod = vbasedev->hiod;
+
+    if (!hiod) {
+        return true;
+    }
+
+    return HOST_IOMMU_DEVICE_GET_CLASS(hiod)->realize(hiod, vbasedev, errp);
+}
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index fb87e64e443035bc239f4f4272ae1c28fa8ab8c9..798c4798a55e0c839c5128b3cd9571356157dce9 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -404,6 +404,17 @@ static bool iommufd_cdev_attach(const char *name, VFIODevice *vbasedev,
 
     space = vfio_get_address_space(as);
 
+    /*
+     * The HostIOMMUDevice data from legacy backend is static and doesn't need
+     * any information from the (type1-iommu) backend to be initialized. In
+     * contrast however, the IOMMUFD HostIOMMUDevice data requires the iommufd
+     * FD to be connected and having a devid to be able to successfully call
+     * iommufd_backend_get_device_info().
+     */
+    if (!vfio_device_hiod_realize(vbasedev, errp)) {
+        goto err_alloc_ioas;
+    }
+
     /* try to attach to an existing container in this space */
     QLIST_FOREACH(bcontainer, &space->containers, next) {
         container = container_of(bcontainer, VFIOIOMMUFDContainer, bcontainer);
-- 
2.45.2




* [PULL 12/16] vfio/iommufd: Probe and request hwpt dirty tracking capability
  2024-07-23 14:00 [PULL 00/16] vfio queue Cédric Le Goater
                   ` (10 preceding siblings ...)
  2024-07-23 14:00 ` [PULL 11/16] vfio/{iommufd, container}: Invoke HostIOMMUDevice::realize() during attach_device() Cédric Le Goater
@ 2024-07-23 14:00 ` Cédric Le Goater
  2024-07-23 14:00 ` [PULL 13/16] vfio/iommufd: Implement VFIOIOMMUClass::set_dirty_tracking support Cédric Le Goater
  2024-07-23 15:13 ` [PULL 00/16] vfio queue Cédric Le Goater
  13 siblings, 0 replies; 19+ messages in thread
From: Cédric Le Goater @ 2024-07-23 14:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Williamson, Joao Martins, Cédric Le Goater,
	Zhenzhong Duan

From: Joao Martins <joao.m.martins@oracle.com>

In preparation for using the dirty tracking UAPI, probe whether the IOMMU
supports dirty tracking. This is done via the data stored in
hiod::caps::hw_caps initialized from GET_HW_INFO.

QEMU doesn't know if VF dirty tracking is supported when allocating a
hardware pagetable in iommufd_cdev_autodomains_get(). This is because the
VFIODevice migration state hasn't been initialized *yet*, hence it can't
pick between VF dirty tracking and IOMMU dirty tracking. So, if the IOMMU
supports dirty tracking, it always creates HWPTs with
IOMMU_HWPT_ALLOC_DIRTY_TRACKING even if later on VFIOMigration decides to
use VF dirty tracking instead.
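
As an illustration only (not part of the patch), the decision described
above reduces to two bit tests plus some per-container bookkeeping. The
DEMO_* constants are placeholders chosen for the sketch and should not be
read as the kernel UAPI values of IOMMU_HW_CAP_DIRTY_TRACKING or
IOMMU_HWPT_ALLOC_DIRTY_TRACKING; the demo_* functions are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values, assumed for this sketch only. */
#define DEMO_HW_CAP_DIRTY_TRACKING     (1ULL << 0)
#define DEMO_HWPT_ALLOC_DIRTY_TRACKING (1U << 1)

/* Pick HWPT allocation flags purely from the IOMMU hardware capabilities. */
static uint32_t demo_hwpt_alloc_flags(uint64_t hw_caps)
{
    return (hw_caps & DEMO_HW_CAP_DIRTY_TRACKING)
           ? DEMO_HWPT_ALLOC_DIRTY_TRACKING : 0;
}

/* A HWPT tracks dirty pages only if it was allocated with the flag. */
static bool demo_hwpt_dirty_tracking(uint32_t hwpt_flags)
{
    return (hwpt_flags & DEMO_HWPT_ALLOC_DIRTY_TRACKING) != 0;
}

/*
 * Fold one newly attached device into the container state.
 * Returns true when a "heterogeneous support" warning would be warranted:
 * the container already claims support but this device's HWPT lacks it.
 */
static bool demo_account_device(bool *container_supported,
                                uint32_t hwpt_flags)
{
    bool dev_tracking = demo_hwpt_dirty_tracking(hwpt_flags);

    *container_supported |= dev_tracking;
    return *container_supported && !dev_tracking;
}
```

In this sketch the warning condition only triggers once a tracking-capable
HWPT has already been accounted for in the same container.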

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[ clg: - Fixed vbasedev->iommu_dirty_tracking assignment in
         iommufd_cdev_autodomains_get()
       - Added warning for heterogeneous dirty page tracking support
         in iommufd_cdev_autodomains_get() ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 include/hw/vfio/vfio-common.h |  2 ++
 hw/vfio/iommufd.c             | 26 ++++++++++++++++++++++++++
 2 files changed, 28 insertions(+)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 4e44b26d3c453b5b47a819df371a21a4ca3b39c3..1e02c98b09babb6878fed1130cd1d6a00ac3641e 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -97,6 +97,7 @@ typedef struct IOMMUFDBackend IOMMUFDBackend;
 
 typedef struct VFIOIOASHwpt {
     uint32_t hwpt_id;
+    uint32_t hwpt_flags;
     QLIST_HEAD(, VFIODevice) device_list;
     QLIST_ENTRY(VFIOIOASHwpt) next;
 } VFIOIOASHwpt;
@@ -139,6 +140,7 @@ typedef struct VFIODevice {
     OnOffAuto pre_copy_dirty_page_tracking;
     bool dirty_pages_supported;
     bool dirty_tracking;
+    bool iommu_dirty_tracking;
     HostIOMMUDevice *hiod;
     int devid;
     IOMMUFDBackend *iommufd;
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 798c4798a55e0c839c5128b3cd9571356157dce9..240c476eaf5b9b79090b6767c18316ade3b3b794 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -110,6 +110,11 @@ static void iommufd_cdev_unbind_and_disconnect(VFIODevice *vbasedev)
     iommufd_backend_disconnect(vbasedev->iommufd);
 }
 
+static bool iommufd_hwpt_dirty_tracking(VFIOIOASHwpt *hwpt)
+{
+    return hwpt && hwpt->hwpt_flags & IOMMU_HWPT_ALLOC_DIRTY_TRACKING;
+}
+
 static int iommufd_cdev_getfd(const char *sysfs_path, Error **errp)
 {
     ERRP_GUARD();
@@ -243,10 +248,22 @@ static bool iommufd_cdev_autodomains_get(VFIODevice *vbasedev,
         } else {
             vbasedev->hwpt = hwpt;
             QLIST_INSERT_HEAD(&hwpt->device_list, vbasedev, hwpt_next);
+            vbasedev->iommu_dirty_tracking = iommufd_hwpt_dirty_tracking(hwpt);
             return true;
         }
     }
 
+    /*
+     * This is quite early and VFIO Migration state isn't yet fully
+     * initialized, thus rely only on IOMMU hardware capabilities as to
+     * whether IOMMU dirty tracking is going to be requested. Later
+     * vfio_migration_realize() may decide to use VF dirty tracking
+     * instead.
+     */
+    if (vbasedev->hiod->caps.hw_caps & IOMMU_HW_CAP_DIRTY_TRACKING) {
+        flags = IOMMU_HWPT_ALLOC_DIRTY_TRACKING;
+    }
+
     if (!iommufd_backend_alloc_hwpt(iommufd, vbasedev->devid,
                                     container->ioas_id, flags,
                                     IOMMU_HWPT_DATA_NONE, 0, NULL,
@@ -256,6 +273,7 @@ static bool iommufd_cdev_autodomains_get(VFIODevice *vbasedev,
 
     hwpt = g_malloc0(sizeof(*hwpt));
     hwpt->hwpt_id = hwpt_id;
+    hwpt->hwpt_flags = flags;
     QLIST_INIT(&hwpt->device_list);
 
     ret = iommufd_cdev_attach_ioas_hwpt(vbasedev, hwpt->hwpt_id, errp);
@@ -266,8 +284,16 @@ static bool iommufd_cdev_autodomains_get(VFIODevice *vbasedev,
     }
 
     vbasedev->hwpt = hwpt;
+    vbasedev->iommu_dirty_tracking = iommufd_hwpt_dirty_tracking(hwpt);
     QLIST_INSERT_HEAD(&hwpt->device_list, vbasedev, hwpt_next);
     QLIST_INSERT_HEAD(&container->hwpt_list, hwpt, next);
+    container->bcontainer.dirty_pages_supported |=
+                                vbasedev->iommu_dirty_tracking;
+    if (container->bcontainer.dirty_pages_supported &&
+        !vbasedev->iommu_dirty_tracking) {
+        warn_report("IOMMU instance for device %s doesn't support dirty tracking",
+                    vbasedev->name);
+    }
     return true;
 }
 
-- 
2.45.2




* [PULL 13/16] vfio/iommufd: Implement VFIOIOMMUClass::set_dirty_tracking support
  2024-07-23 14:00 [PULL 00/16] vfio queue Cédric Le Goater
                   ` (11 preceding siblings ...)
  2024-07-23 14:00 ` [PULL 12/16] vfio/iommufd: Probe and request hwpt dirty tracking capability Cédric Le Goater
@ 2024-07-23 14:00 ` Cédric Le Goater
  2024-07-23 15:13 ` [PULL 00/16] vfio queue Cédric Le Goater
  13 siblings, 0 replies; 19+ messages in thread
From: Cédric Le Goater @ 2024-07-23 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alex Williamson, Joao Martins, Zhenzhong Duan, Eric Auger

From: Joao Martins <joao.m.martins@oracle.com>

ioctl(iommufd, IOMMU_HWPT_SET_DIRTY_TRACKING, arg) is the UAPI that
enables or disables dirty page tracking. The ioctl is used only for hwpts
that were created with dirty tracking support (recorded in
hwpt::hwpt_flags), and it is called on the whole list of IOMMU domains.
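
As an illustration only (not part of the patch), the "apply to every
capable domain, undo on failure" pattern can be sketched over a plain
array. DemoHwpt, its fail_on_set test hook and the demo_* functions are
hypothetical; the sketch returns -1 on error where the patch returns
-EINVAL.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct DemoHwpt {
    bool supports_dirty;  /* allocated with dirty tracking support */
    bool tracking;        /* current tracking state */
    bool fail_on_set;     /* test hook: make enabling fail */
} DemoHwpt;

/* Toy replacement for the backend ioctl wrapper. */
static bool demo_backend_set(DemoHwpt *hwpt, bool start)
{
    if (start && hwpt->fail_on_set) {
        return false;
    }
    hwpt->tracking = start;
    return true;
}

/* Switch tracking on or off for all capable domains, or for none. */
static int demo_set_dirty_tracking(DemoHwpt *list, size_t n, bool start)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (!list[i].supports_dirty) {
            continue;   /* domains without support are skipped */
        }
        if (!demo_backend_set(&list[i], start)) {
            goto err;
        }
    }
    return 0;

err:
    /* Best-effort rollback over the whole list; failures are ignored. */
    for (i = 0; i < n; i++) {
        if (!list[i].supports_dirty) {
            continue;
        }
        demo_backend_set(&list[i], !start);
    }
    return -1;
}
```

The rollback deliberately walks every capable domain rather than only the
ones already switched, which keeps the loop simple at the cost of a few
redundant calls.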

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
---
 include/sysemu/iommufd.h |  2 ++
 backends/iommufd.c       | 23 +++++++++++++++++++++++
 hw/vfio/iommufd.c        | 32 ++++++++++++++++++++++++++++++++
 backends/trace-events    |  1 +
 4 files changed, 58 insertions(+)

diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
index e917e7591d050bd02945f6feb8d268e6d51d49aa..6fb412f61144e91ae02710291368ec9d577832f8 100644
--- a/include/sysemu/iommufd.h
+++ b/include/sysemu/iommufd.h
@@ -55,6 +55,8 @@ bool iommufd_backend_alloc_hwpt(IOMMUFDBackend *be, uint32_t dev_id,
                                 uint32_t data_type, uint32_t data_len,
                                 void *data_ptr, uint32_t *out_hwpt,
                                 Error **errp);
+bool iommufd_backend_set_dirty_tracking(IOMMUFDBackend *be, uint32_t hwpt_id,
+                                        bool start, Error **errp);
 
 #define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
 #endif
diff --git a/backends/iommufd.c b/backends/iommufd.c
index 06b135111f303ca95d55dc0f71ad3bbb76211337..b9788350388481f449b1969366efa8d7766fc080 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -238,6 +238,29 @@ bool iommufd_backend_alloc_hwpt(IOMMUFDBackend *be, uint32_t dev_id,
     return true;
 }
 
+bool iommufd_backend_set_dirty_tracking(IOMMUFDBackend *be,
+                                        uint32_t hwpt_id, bool start,
+                                        Error **errp)
+{
+    int ret;
+    struct iommu_hwpt_set_dirty_tracking set_dirty = {
+            .size = sizeof(set_dirty),
+            .hwpt_id = hwpt_id,
+            .flags = start ? IOMMU_HWPT_DIRTY_TRACKING_ENABLE : 0,
+    };
+
+    ret = ioctl(be->fd, IOMMU_HWPT_SET_DIRTY_TRACKING, &set_dirty);
+    trace_iommufd_backend_set_dirty(be->fd, hwpt_id, start, ret ? errno : 0);
+    if (ret) {
+        error_setg_errno(errp, errno,
+                         "IOMMU_HWPT_SET_DIRTY_TRACKING(hwpt_id %u) failed",
+                         hwpt_id);
+        return false;
+    }
+
+    return true;
+}
+
 bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
                                      uint32_t *type, void *data, uint32_t len,
                                      uint64_t *caps, Error **errp)
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 240c476eaf5b9b79090b6767c18316ade3b3b794..e39fbf4dd60a62321def4349fd727da335d82d8d 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -115,6 +115,37 @@ static bool iommufd_hwpt_dirty_tracking(VFIOIOASHwpt *hwpt)
     return hwpt && hwpt->hwpt_flags & IOMMU_HWPT_ALLOC_DIRTY_TRACKING;
 }
 
+static int iommufd_set_dirty_page_tracking(const VFIOContainerBase *bcontainer,
+                                           bool start, Error **errp)
+{
+    const VFIOIOMMUFDContainer *container =
+        container_of(bcontainer, VFIOIOMMUFDContainer, bcontainer);
+    VFIOIOASHwpt *hwpt;
+
+    QLIST_FOREACH(hwpt, &container->hwpt_list, next) {
+        if (!iommufd_hwpt_dirty_tracking(hwpt)) {
+            continue;
+        }
+
+        if (!iommufd_backend_set_dirty_tracking(container->be,
+                                                hwpt->hwpt_id, start, errp)) {
+            goto err;
+        }
+    }
+
+    return 0;
+
+err:
+    QLIST_FOREACH(hwpt, &container->hwpt_list, next) {
+        if (!iommufd_hwpt_dirty_tracking(hwpt)) {
+            continue;
+        }
+        iommufd_backend_set_dirty_tracking(container->be,
+                                           hwpt->hwpt_id, !start, NULL);
+    }
+    return -EINVAL;
+}
+
 static int iommufd_cdev_getfd(const char *sysfs_path, Error **errp)
 {
     ERRP_GUARD();
@@ -739,6 +770,7 @@ static void vfio_iommu_iommufd_class_init(ObjectClass *klass, void *data)
     vioc->attach_device = iommufd_cdev_attach;
     vioc->detach_device = iommufd_cdev_detach;
     vioc->pci_hot_reset = iommufd_cdev_pci_hot_reset;
+    vioc->set_dirty_page_tracking = iommufd_set_dirty_page_tracking;
 };
 
 static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
diff --git a/backends/trace-events b/backends/trace-events
index 4d8ac02fe7d6c6d3780dfef48406872ee46fd4df..28aca3b859d43f35511a95d6eba196046bfda835 100644
--- a/backends/trace-events
+++ b/backends/trace-events
@@ -16,3 +16,4 @@ iommufd_backend_unmap_dma(int iommufd, uint32_t ioas, uint64_t iova, uint64_t si
 iommufd_backend_alloc_ioas(int iommufd, uint32_t ioas) " iommufd=%d ioas=%d"
 iommufd_backend_alloc_hwpt(int iommufd, uint32_t dev_id, uint32_t pt_id, uint32_t flags, uint32_t hwpt_type, uint32_t len, uint64_t data_ptr, uint32_t out_hwpt_id, int ret) " iommufd=%d dev_id=%u pt_id=%u flags=0x%x hwpt_type=%u len=%u data_ptr=0x%"PRIx64" out_hwpt=%u (%d)"
 iommufd_backend_free_id(int iommufd, uint32_t id, int ret) " iommufd=%d id=%d (%d)"
+iommufd_backend_set_dirty(int iommufd, uint32_t hwpt_id, bool start, int ret) " iommufd=%d hwpt=%u enable=%d (%d)"
-- 
2.45.2




* Re: [PULL 00/16] vfio queue
  2024-07-23 14:00 [PULL 00/16] vfio queue Cédric Le Goater
                   ` (12 preceding siblings ...)
  2024-07-23 14:00 ` [PULL 13/16] vfio/iommufd: Implement VFIOIOMMUClass::set_dirty_tracking support Cédric Le Goater
@ 2024-07-23 15:13 ` Cédric Le Goater
  13 siblings, 0 replies; 19+ messages in thread
From: Cédric Le Goater @ 2024-07-23 15:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alex Williamson

On 7/23/24 16:00, Cédric Le Goater wrote:
> The following changes since commit 6af69d02706c821797802cfd56acdac13a7c9422:
> 
>    Merge tag 'nvme-next-pull-request' of https://gitlab.com/birkelund/qemu into staging (2024-07-23 13:55:45 +1000)
> 
> are available in the Git repository at:
> 
>    https://github.com/legoater/qemu/ tags/pull-vfio-20240723
> 
> for you to fetch changes up to 6ac9efe6805af60de14481fdde7d340080d38324:
> 
>    vfio/common: Allow disabling device dirty page tracking (2024-07-23 11:10:10 +0200)
> 
> ----------------------------------------------------------------
> vfio queue:
> 
> * IOMMUFD Dirty Tracking support
> * Fix for a possible SEGV in IOMMU type1 container
> * Dropped initialization of host IOMMU device with mdev devices

There is a problem with an email address in patch 13:

    Reviewed-by: Cédric Le Goater <clg@redhat.co>

I will repush and resend.


Thanks,

C.





* [PULL 00/16] vfio queue
@ 2025-06-05  8:42 Cédric Le Goater
  2025-06-05 19:00 ` Stefan Hajnoczi
  0 siblings, 1 reply; 19+ messages in thread
From: Cédric Le Goater @ 2025-06-05  8:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alex Williamson, Cédric Le Goater

The following changes since commit 09be8a511a2e278b45729d7b065d30c68dd699d0:

  Merge tag 'pull-qapi-2025-06-03' of https://repo.or.cz/qemu/armbru into staging (2025-06-03 09:19:26 -0400)

are available in the Git repository at:

  https://github.com/legoater/qemu/ tags/pull-vfio-20250605

for you to fetch changes up to 3ed34463a2d8ab8aabfa1d612f12b56600c87983:

  vfio: move vfio-cpr.h (2025-06-05 10:40:38 +0200)

----------------------------------------------------------------
vfio queue:

* Fixed OpRegion detection in IGD
* Added prerequisite rework for IOMMU nesting support
* Added prerequisite rework for vfio-user
* Added prerequisite rework for VFIO live update
* Modified memory_get_xlat_addr() to return a MemoryRegion

----------------------------------------------------------------
Edmund Raile (1):
      vfio/igd: OpRegion not found fix error typo

John Levon (5):
      vfio: add more VFIOIOMMUClass docs
      vfio: move more cleanup into vfio_pci_put_device()
      vfio: move config space read into vfio_pci_config_setup()
      vfio: refactor out IRQ signalling setup
      vfio/container: pass MemoryRegion to DMA operations

Steve Sistare (4):
      vfio: return mr from vfio_get_xlat_addr
      MAINTAINERS: Add reviewer for CPR
      vfio: vfio_find_ram_discard_listener
      vfio: move vfio-cpr.h

Tomita Moeko (1):
      vfio/igd: Fix incorrect error propagation in vfio_pci_igd_opregion_detect()

Zhenzhong Duan (5):
      vfio/iommufd: Add comment emphasizing no movement of hiod->realize() call
      backends/iommufd: Add a helper to invalidate user-managed HWPT
      vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFD
      vfio/iommufd: Implement [at|de]tach_hwpt handlers
      vfio/iommufd: Save vendor specific device info

 MAINTAINERS                           | 10 ++++
 hw/vfio/vfio-cpr.h                    | 15 ------
 include/hw/vfio/vfio-container-base.h | 83 ++++++++++++++++++++++++++++++--
 include/hw/vfio/vfio-cpr.h            | 18 +++++++
 include/system/host_iommu_device.h    | 15 ++++++
 include/system/iommufd.h              | 54 +++++++++++++++++++++
 include/system/memory.h               | 19 ++++----
 backends/iommufd.c                    | 58 +++++++++++++++++++++++
 hw/vfio/container-base.c              |  4 +-
 hw/vfio/container.c                   |  5 +-
 hw/vfio/cpr.c                         |  2 +-
 hw/vfio/igd.c                         | 22 ++++-----
 hw/vfio/iommufd.c                     | 45 +++++++++++++++---
 hw/vfio/listener.c                    | 74 ++++++++++++++++++-----------
 hw/vfio/pci.c                         | 89 +++++++++++++++++++----------------
 hw/virtio/vhost-vdpa.c                |  9 +++-
 system/memory.c                       | 32 +++----------
 backends/trace-events                 |  1 +
 18 files changed, 406 insertions(+), 149 deletions(-)
 delete mode 100644 hw/vfio/vfio-cpr.h
 create mode 100644 include/hw/vfio/vfio-cpr.h




* Re: [PULL 00/16] vfio queue
  2025-06-05  8:42 Cédric Le Goater
@ 2025-06-05 19:00 ` Stefan Hajnoczi
  0 siblings, 0 replies; 19+ messages in thread
From: Stefan Hajnoczi @ 2025-06-05 19:00 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: qemu-devel, Alex Williamson, Cédric Le Goater


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/10.1 for any user-visible changes.



Thread overview: 19+ messages
2024-07-23 14:00 [PULL 00/16] vfio queue Cédric Le Goater
2024-07-23 14:00 ` [PULL 01/16] hw/vfio/container: Fix SIGSEV on vfio_container_instance_finalize() Cédric Le Goater
2024-07-23 14:00 ` [PULL 02/16] vfio/pci: Extract mdev check into an helper Cédric Le Goater
2024-07-23 14:00 ` [PULL 03/16] vfio/iommufd: Don't initialize nor set a HOST_IOMMU_DEVICE with mdev Cédric Le Goater
2024-07-23 14:00 ` [PULL 04/16] backends/iommufd: Extend iommufd_backend_get_device_info() to fetch HW capabilities Cédric Le Goater
2024-07-23 14:00 ` [PULL 05/16] vfio/iommufd: Return errno in iommufd_cdev_attach_ioas_hwpt() Cédric Le Goater
2024-07-23 14:00 ` [PULL 06/16] vfio/ap: Don't initialize HOST_IOMMU_DEVICE with mdev Cédric Le Goater
2024-07-23 14:00 ` [PULL 07/16] vfio/ccw: " Cédric Le Goater
2024-07-23 14:00 ` [PULL 08/16] vfio/iommufd: Introduce auto domain creation Cédric Le Goater
2024-07-23 14:00 ` [PULL 09/16] vfio/{iommufd,container}: Remove caps::aw_bits Cédric Le Goater
2024-07-23 14:00 ` [PULL 10/16] vfio/iommufd: Add hw_caps field to HostIOMMUDeviceCaps Cédric Le Goater
2024-07-23 14:00 ` [PULL 11/16] vfio/{iommufd, container}: Invoke HostIOMMUDevice::realize() during attach_device() Cédric Le Goater
2024-07-23 14:00 ` [PULL 12/16] vfio/iommufd: Probe and request hwpt dirty tracking capability Cédric Le Goater
2024-07-23 14:00 ` [PULL 13/16] vfio/iommufd: Implement VFIOIOMMUClass::set_dirty_tracking support Cédric Le Goater
2024-07-23 15:13 ` [PULL 00/16] vfio queue Cédric Le Goater
  -- strict thread matches above, loose matches on Subject: below --
2025-06-05  8:42 Cédric Le Goater
2025-06-05 19:00 ` Stefan Hajnoczi
2023-06-30  5:22 Cédric Le Goater
2023-06-30  9:55 ` Richard Henderson
